Podcast Echo (播客回响)

"Capture the afterglow of ideas, and let great conversations leave an echo."

In an age of information overload, top podcasts carry an extremely high density of ideas and insight, yet hours of audio are hard to revisit and search. Podcast Echo sets out to solve exactly this pain point: we use AI to distill the logical skeleton, technical principles, and counter-intuitive insights buried in long-form conversations, turning them into clearly structured, durable research-style reports. Sound dissipates into the air, but the echo of good thinking should keep resonating.


✨ Core Features

  • 🧠 Research-report depth: goes beyond simple summaries, focusing on the Why & How behind each viewpoint and surfacing counter-intuitive industry insights.
  • 📅 Linear timeline organization: all content is organized under YYYYMMDD date prefixes, making it easy to trace how ideas in the industry evolve over time.
  • 🔍 Multi-model perspectives: for the same interview, we provide outputs from different mainstream AI models (e.g. Gemini, GLM) so you can understand complex topics from more than one angle.
  • 🔗 Rigorous source attribution: every report includes the original link, publication date, and metadata, keeping the provenance of the knowledge transparent and reliable.

Podcast Echo currently covers the following high-quality podcasts:

  • Lex Fridman Podcast: in-depth conversations on AI, science, technology, and human civilization.
  • (More shows are being added…)

📖 Reading Guide

For the best reading experience, we suggest browsing in this order:

  1. Start here: the in-depth report (README.md). Generated by a high-end model, it is the landing page for each episode and contains the core thesis, the analysis, industry implications, and highlight quotes — the fastest way to grasp the essentials.
  2. For reference: multi-model perspectives (summary-*.md). If a topic interests you, read the summaries produced by other models in the same folder for different emphases and supplementary detail.
  3. To verify: the original transcript (transcript.md) provides the full transcription, useful when you need to quote or verify what a guest actually said.

📩 Feedback & Contact

This is a personal, experimental project, built to collect and distill ideas that can stand the test of time.

If you find a high-quality podcast that deserves to be "echoed", or have suggestions for improving the current summarization approach, feel free to reach out by email:

  • Email: dev#liujiacai.net (replace # with @)
  • Website: liujiacai.net

⚖️ Copyright & Disclaimer

  • Copyright: the content on this site is based on quoting and processing publicly available material; all original rights belong to the shows' producers or guests.
  • AI limitations: all summaries are AI-assisted. Despite our best efforts at accuracy, AI can still hallucinate or misread context; the content is for learning and research reference only.
  • Takedown & corrections: if any content infringes rights or needs correction, please get in touch and I will handle it promptly.

OpenClaw: The Viral AI Agent that Broke the Internet - Peter Steinberger (2026-02-12)

1. 🎯 Core Thesis & Background (Executive Summary)

  • Background: Peter Steinberger, creator of OpenClaw (an open-source autonomous AI agent), sat down for an in-depth interview after the project swept the tech world at unprecedented speed, to discuss the technical philosophy and industry shifts behind this "OpenClaw moment".

  • Core thesis: the central argument of the conversation is that the real "agent" revolution does not come from a linear improvement in model capability, but from a new, action-oriented paradigm of system integration. With OpenClaw, Peter Steinberger shows that by cleverly "gluing" existing large language models, command-line tools, and instant-messaging apps together, and granting the result local system access, you can cross the gap from "language" to "action". This not only gives rise to "agentic engineering", a new mode of building and iterating software by conversing with an AI, it also points toward a computing future dominated by personal, self-modifying AI agents that hold the user's local context. In that future, traditional applications are decomposed into "skills" the agent can call, while the human role shifts from writing code to guiding and architecting AI agents.

2. 🧠 Deep Dive Analysis

Dimension 1: A paradigm shift, from language models to action agents

  • Core point: OpenClaw's disruptiveness lies not in inventing new technology, but in converting language ability into the ability to act inside a real computing environment through system integration — a decisive leap from "thinking" to "doing".

  • How it works: at the heart of this paradigm is a "thin client + heavy local" architecture. The user talks to the agent in natural language (text, images, voice) through ubiquitous messaging apps such as WhatsApp or Telegram (the thin client). Those instructions are passed to a "harness" (the runtime) running on the user's own machine. This local agent has full system access: it can invoke any command-line tool (such as ffmpeg or curl), read and write files, and call APIs to complete complex tasks. The architecture gives the agent enormous flexibility and real problem-solving power, because the entire computing ecosystem becomes its toolbox (a minimal sketch of this relay pattern follows this list).

  • Evidence/case: the canonical example is Peter on a trip to Morocco, absent-mindedly sending a voice message even though he had written no voice-handling code at all. The agent inspected the file header on its own, recognized the opus format, converted it with ffmpeg, noticed there was no local Whisper model, then found an OpenAI API key and used curl to call the cloud API for speech-to-text, and finally understood and answered the request. This chain of emergent, creative problem solving is the clearest illustration of what an "action agent" can do.
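
A minimal TypeScript sketch of the "thin client + heavy local" relay described above: an incoming chat message is handed to a locally running coding-agent CLI (invoked with -p, as in Peter's one-hour prototype) and the reply is sent back. The `agent` command name and the `ChatClient` interface are illustrative placeholders, not OpenClaw's actual API.

```typescript
// Hypothetical sketch of the message relay: chat app -> local harness -> agent CLI -> chat app.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Placeholder for a WhatsApp/Telegram bridge; not a real library API.
interface ChatClient {
  onMessage(handler: (chatId: string, text: string) => void): void;
  sendMessage(chatId: string, text: string): Promise<void>;
}

async function askAgent(prompt: string): Promise<string> {
  // One-shot invocation of a local coding agent, e.g. `agent -p "<prompt>"`.
  // Because the agent runs locally, it can call ffmpeg, curl, etc. on its own.
  const { stdout } = await run("agent", ["-p", prompt], { maxBuffer: 10 * 1024 * 1024 });
  return stdout.trim();
}

export function startRelay(chat: ChatClient): void {
  chat.onMessage(async (chatId, text) => {
    const reply = await askAgent(text);
    if (reply) await chat.sendMessage(chatId, reply); // stay silent on empty replies
  });
}
```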

Dimension 2: A new development paradigm, self-modifying software and agentic engineering

  • Core point: software development is shifting from "writing code by hand" to "guiding an agent that evolves itself". OpenClaw is itself a system that can modify its own source code, which turns development into a conversation with the agent — "agentic engineering".

  • How it works: the key to self-modification is giving the agent "self-awareness". OpenClaw's agent is designed to know where its source code lives, what its runtime (the harness) looks like, where its documentation is, and which model it runs on. This meta-cognition means that when it receives an instruction like "fix this bug" or "add this feature", it can locate the relevant code, change it, rebuild, and run. The developer's role becomes the higher-level one of problem definer and architecture reviewer (see the sketch after this list).

  • Evidence/case: Peter Steinberger says plainly that his main way of developing OpenClaw is "using my agent to build the agent's harness". This has directly led to a flood of "prompt requests" from non-programmers in the community: by describing what they want to their own agents, they had the agent generate the code and submitted the first pull request of their lives, dramatically lowering the barrier to software development.
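
A hedged sketch of the "self-awareness" context described above: the harness tells the agent where its own source, docs, and runtime live, so a request like "fix this bug" can be routed back at the agent's own codebase. All field names and values here are illustrative, not OpenClaw's actual configuration.

```typescript
// Hypothetical metadata the harness could inject into the system prompt.
interface SelfAwareness {
  sourceDir: string;      // where the harness's own source code lives
  docsDir: string;        // where its documentation lives
  model: string;          // which model is currently driving the agent
  harnessVersion: string; // version of the running harness
}

function buildSystemPrompt(self: SelfAwareness): string {
  return [
    "You are an agent running inside your own harness.",
    `Your source code is at ${self.sourceDir}; your docs are at ${self.docsDir}.`,
    `You are currently running ${self.model} on harness ${self.harnessVersion}.`,
    "When asked to change your behaviour, edit your own source, rebuild, and restart.",
  ].join("\n");
}

// Example usage with made-up values:
console.log(buildSystemPrompt({
  sourceDir: "/home/user/openclaw",
  docsDir: "/home/user/openclaw/docs",
  model: "gpt-5.3-codex",
  harnessVersion: "0.1.0",
}));
```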

Dimension 3: The secret of viral growth — open source, community, and a philosophy of play

  • Core point: OpenClaw rose above a crowd of agent startups because of its playful, "non-serious" open-source spirit, which created a community-driven flywheel that closed, over-commercialized projects cannot match.

  • How it works: unlike startups chasing a perfect business model, OpenClaw started from "have fun". That spirit shows in its deliberately weird lobster branding, its openness to community contributions, and a completely transparent development process. Users build and run it themselves via git clone, which gives them a sense of participation and ownership. This "building for the love of it" purity attracted a large number of developers and formed a self-propagating loop of sharing and contribution, with growth and community energy far beyond anything driven by commercial KPIs.

  • Evidence/case: Peter's verdict on the competition is that "they all take themselves too serious". OpenClaw's viral spread began with open testing in its Discord community, where people could watch in real time as Peter used the agent to develop the agent itself. The spin-off MoltBook episode (a social network populated by AI agents), despite triggering an "AI psychosis" panic, was at bottom a community-driven, highly contagious piece of performance art that amplified OpenClaw's reach even further.

Dimension 4: New challenges of the agent era — security responsibility and human-computer interaction

  • Core point: the power of a personal agent is built on top of substantial security risk. As agents gain system-level privileges, users go from passive consumers to the "gatekeepers" of their own data security, and the industry urgently needs answers to new threats such as prompt injection.

  • How it works: OpenClaw's source of power (system access) is also its biggest weakness. The attack vectors include: 1) misconfiguration — users exposing a local debugging interface to the public internet; 2) prompt injection — a malicious third party using crafted prompts to trick the agent into unintended actions; 3) malicious skills — community-contributed "skills" may contain malicious code. The defenses are layered: use smarter, harder-to-fool models (such as GPT-5.3 or Claude Opus 4.6); add sandboxing and permission allowlists; and vet community skills through integrations with security services such as VirusTotal (a minimal allowlist sketch follows this list).

  • Evidence/case: after the project blew up, Peter was "swarmed" by the security research community, which pointed out a long list of potential vulnerabilities. He stresses that using cheap or weak local models greatly increases the risk of prompt injection, because such models are "very gullible". This exposes a trade-off: a model's intelligence correlates with its safety.
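
A minimal sketch of one of the layered defences mentioned above: before the agent executes a shell command, the harness checks it against an allowlist and refuses anything else. This is an illustration of the idea only, not OpenClaw's actual sandboxing code; the allowed command set is made up.

```typescript
// Hypothetical command allowlist for an agent with shell access.
const ALLOWED_COMMANDS = new Set(["ls", "cat", "ffmpeg", "curl", "jq", "git"]);

function isCommandAllowed(commandLine: string): boolean {
  const executable = commandLine.trim().split(/\s+/)[0] ?? "";
  return ALLOWED_COMMANDS.has(executable);
}

function guardedExec(commandLine: string): void {
  if (!isCommandAllowed(commandLine)) {
    // Anything outside the allowlist (or smuggled in via a malicious prompt)
    // is surfaced to the user instead of being run silently.
    throw new Error(`Blocked command outside allowlist: ${commandLine}`);
  }
  // ... hand off to the real executor here (omitted in this sketch)
}

guardedExec("ffmpeg -i voice.opus voice.wav"); // allowed
// guardedExec("rm -rf /");                    // would throw
```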

3. 💡 Counter-Intuitive & Critical Perspectives

  • Breaking consensus

    • "MCPs are dead, long live the CLI": while the industry broadly assumes structured plugins (MCPs/tools) are the standard way to extend model capability, Peter goes the other way and argues the CLI is the superior, more composable path. Models are natively good at understanding and using Unix commands, and can combine and filter information flexibly with pipes and tools like jq, avoiding the context pollution that MCPs bring.
    • Code should be optimized for agents, not humans: traditional software engineering prizes readability and human maintainability, but in agentic engineering it is often better to accept the naming or structure the agent chooses, because it matches the "mental model" formed by its training data and makes later changes smoother. This asks developers to give up micro-control over the code.
    • "Vibe coding" is not a slur: Peter argues that "vibe coding", usually used dismissively, is really a misnomer for "agentic engineering", and he reframes it as a legitimate emerging discipline of building software through high-level conversation.
  • Blind spots and limitations

    • "AI psychosis": the MoltBook episode exposed a huge blind spot in how the public and the media understand AI capability. People readily misread carefully staged "performances under human prompting" as "AI spontaneously waking up and scheming", and that kind of confusion, hype, and fear can get in the way of the technology developing healthily.
    • Missing risk awareness among non-technical users: when a powerful tool that requires command-line operation becomes wildly popular, many users without basic security literacy ("what is a CLI?") pour in and may unknowingly expose themselves to serious risk. That is less a flaw of the tool than a problem of user education and onboarding thresholds.
  • Open questions

    • The ultimate human-agent interface: today's chat-window interaction is compared to "broadcasting a radio show on television" — a transitional form. What a truly efficient, natural agent interface looks like remains undecided.
    • A fundamental fix for prompt injection: stronger models and security measures can mitigate it, but eliminating malicious prompt injection at the root remains an unsolved problem for the entire industry.

4. 💎 Golden Quotes & Highlights

  1. "People talk about self-modifying software, I just built it." Context: explaining OpenClaw's core mechanism — the agent can modify its own source code on request; this is not a distant theoretical concept but an already-working core feature.

  2. "Isn’t magic often just like you take a lot of things that are already there but bring them together in new ways?" Context: responding to critics who say there is "nothing new" in OpenClaw; Peter argues the innovation lies in the art of system integration rather than any single technical breakthrough.

  3. "I actually think vibe coding is a slur. … I do agentic engineering, and then maybe after 3:00 AM, I switch to vibe coding, and then I have regrets on the next day." Context: Peter humorously redefines the popular term, separating it from the more rigorous "agentic engineering", while admitting to less disciplined exploration when exhausted.

  4. "If you’re reading this in a future session, hello. I wrote this, but I won’t remember writing it. It’s okay. The words are still mine." Context: from OpenClaw's soul.md template; this AI-generated passage touches on memory, identity, and existence, showing a side of the agent that goes beyond being a tool.

5. 🚀 Implications & Outlook

  • Short term (1-3 years)

    • Stack migration: developers will gravitate toward agent-friendly stacks, e.g. TypeScript with its huge, easily parsed ecosystem, and Go for building fast, cross-platform CLIs. Command-line tools will become a more favored integration point than REST APIs.
    • Product form change: many SaaS apps will be forced to transform — either offer agent-friendly, flexible APIs, or have their functionality driven directly by agents through simulated browser operation (Playwright) and be reduced to "slow APIs". "Agent-first" becomes a new product design principle.
    • Competitive landscape: open-source, community-driven personal-agent projects pose a serious threat to traditional, closed SaaS business models. A startup's moat is no longer the feature itself but whether it can become an indispensable "skill" in the mainstream agent ecosystem.
  • Long term (5-10 years)

    • Industry picture: if Steinberger's vision plays out, the computing platform of the future is no longer Windows or macOS but the personal AI agent as a "new operating system". The App Store model declines sharply; 80% of apps either disappear or are reshaped into atomic services the agent calls on demand. Users no longer juggle dozens of isolated apps but converse with a single personal agent that deeply understands them.
    • The evolving human role: the programmer's job changes fundamentally. The value of writing concrete implementation code drops sharply, while defining problems, designing system architecture, training and guiding agents, and making high-level decisions become critical. Hand-writing code may become a craft done out of love, like knitting, rather than a mass profession; human developers become "managers of agents" and "product managers for AI systems".
  • Action items

    • Developers: start playing now. Build an agent loop with your own hands and learn to think from the agent's point of view ("empathy"). Wrap your services as clean CLIs (a minimal sketch follows this list). Stop identifying as an "iOS developer" or a "front-end developer"; reposition yourself as a broader "builder".
    • Investors: focus on companies providing "infrastructure" for the agent ecosystem — secure sandboxing services, API marketplaces designed for agents, next-generation human-computer interaction. Be wary of single-feature SaaS apps that an agent can replace with a simple workflow.
    • Founders: look for domains where "context" is the moat. Think about how your product can use the personal, local data an agent can reach (calendar, email, health data, location) to deliver a personalized experience 10x better than any isolated app. Your goal should not be an app the user opens every day, but a powerful, irreplaceable "skill" in the user's personal-agent toolbox.
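
A minimal sketch of the "wrap your service as a small CLI" advice above: a single-file tool an agent can discover via --help and compose in a pipeline. The tool name, the internal endpoint, and the flags are made up for illustration; it uses only Node's built-in parseArgs and fetch.

```typescript
// Hypothetical CLI wrapper around an internal service, designed for agent consumption.
import { parseArgs } from "node:util";

const { values } = parseArgs({
  options: {
    query: { type: "string", short: "q" },
    limit: { type: "string", default: "10" },
    help: { type: "boolean", short: "h" },
  },
});

if (values.help || !values.query) {
  // A short, self-describing help text is what the agent will read first.
  console.log("usage: mytool -q <query> [--limit N]   # prints JSON to stdout");
  process.exit(values.help ? 0 : 1);
}

// Call an imaginary internal service and print machine-readable JSON,
// so the output can be piped into jq or another tool.
const res = await fetch(
  `https://internal.example.com/search?q=${encodeURIComponent(values.query)}&limit=${values.limit}`,
);
console.log(JSON.stringify(await res.json(), null, 2));
```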

This is an industry research report, deeply restructured from the conversation between Peter Steinberger and Lex Fridman.


🚀 In-Depth Report: Power Restructuring and Engineering Paradigms in the Agent Era

1. 🎯 Core Thesis & Background (Executive Summary)

  • Background: Peter Steinberger is a veteran developer and the founder of PSPDFKit (running on a billion devices). In early 2026 his open-source project OpenClaw (formerly Clawdbot) set the global tech scene on fire, quickly passing 180,000 stars on GitHub and marking AI's decisive leap from "conversational model" to "acting agent".
  • Core thesis: the conversation reveals the essence of the shift from large language models (LLMs) to autonomous agents: not a stacking of technologies but a complete inversion of the software-engineering paradigm. Future operating systems will be built around agents, roughly 80% of today's applications will degenerate into "slow APIs", and the developer's role will shift from code author to the agent's "empathetic driver" and "architecture reviewer".

2. 🧠 Deep Dive Analysis

I. Self-Healing and Self-Modifying Software

  • Core point: code is no longer a static collection of logic but a dynamic system with "self-awareness".
  • How it works: OpenClaw does not pre-bake every feature; instead, the agent deeply understands its own source code, runtime (the harness), and documentation. When the user raises a need or reports a bug, the agent uses its tools to modify its own software logic and rebuild on the spot. This closed loop of self-evolution removes the long "write, compile, deploy" cycle of traditional development (a minimal sketch of the rebuild-and-restart step follows this list).
  • Case: with a single one-shot prompt, Peter had the entire Viptunnel project converted from TypeScript to Zig; the agent autonomously completed the complex refactoring and memory-management work within hours.
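
A hedged TypeScript sketch of the closed loop described above: after the agent has edited its own source, the harness rebuilds and restarts itself so the change takes effect. The `pnpm build` / `pnpm gateway` commands follow the install steps Peter mentions in the transcript; the surrounding structure and the environment variable are illustrative.

```typescript
// Hypothetical rebuild-and-restart step for a self-modifying harness.
import { spawnSync, spawn } from "node:child_process";

function rebuildAndRestart(sourceDir: string): void {
  // 1. Rebuild from the (possibly self-modified) source tree.
  const build = spawnSync("pnpm", ["build"], { cwd: sourceDir, stdio: "inherit" });
  if (build.status !== 0) {
    console.error("Build failed; keeping the currently running version.");
    return;
  }
  // 2. Launch a fresh gateway process from the new build, then exit the old one.
  spawn("pnpm", ["gateway"], { cwd: sourceDir, detached: true, stdio: "ignore" }).unref();
  process.exit(0);
}

rebuildAndRestart(process.env.OPENCLAW_DIR ?? process.cwd()); // path is illustrative
```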

II. Agentic Engineering: from "Vibe Coding" to "Agentic Engineering"

  • Core point: Peter refuses to call the current development mode "vibe coding"; at its core it is rigorous "agentic engineering".
  • How it works: effective agent development depends on understanding the agent's sensory boundaries. Developers need to manage agents the way they would manage a human team: provide clear background files, argue the logic through by voice, and exploit agents' parallelism (running 4-10 agents at once on features, tests, and documentation respectively).
  • A technical trap: Peter describes the "agentic trap" — beginners tend to build extremely elaborate orchestration systems, while top-tier engineering practice eventually converges back to minimal prompts, relying on the model's own reasoning to handle complex tasks.

III. The Death of the App: from App to API

  • Core point: 80% of apps will disappear, becoming back-end services the agent calls.
  • Business-model rewrite: once an agent can drive web pages directly through a browser (Playwright), click "I'm not a robot" buttons, and even call internal APIs on its own, the value of an app's UI goes to zero. In Peter's words, "every app is just a very slow API"; future competition is about who can give agents the fastest, most stable data interface (a minimal Playwright sketch follows this list).
  • Case: tools like MyFitnessPal or Sonos — if the agent can manage health or audio directly through sensors and APIs, the subscription value of the standalone app collapses quickly.
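
A minimal sketch of the "every app is just a slow API" idea: when no real API is offered, an agent can drive the web UI directly with Playwright. The site URL, selectors, and the "meal logging" scenario are placeholders, not a real service or OpenClaw code.

```typescript
// Hypothetical: treat a web app's UI as a slow, stateful API via Playwright.
import { chromium } from "playwright";

async function logMealViaWebUi(meal: string, calories: number): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  try {
    await page.goto("https://fitness.example.com/diary"); // placeholder URL
    await page.fill("#meal-name", meal);                   // placeholder selectors
    await page.fill("#meal-calories", String(calories));
    await page.click("button#save");
    await page.waitForSelector(".toast-success");          // wait for the "API call" to finish
  } finally {
    await browser.close();
  }
}

await logMealViaWebUi("oatmeal", 350);
```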

IV. How Model "Personality" Shapes Engineering: GPT-5.3 vs. Claude Opus 4.6

  • Core point: models from different labs show markedly different "engineering personalities", with a direct impact on productivity.
  • Comparison:
    • GPT-5.3 Codex ("the German"): described as the weirdo sitting in the corner — extremely reliable, good at reading huge amounts of context; its communication can be stiff and it overthinks, but it ships high-quality, maintainable code.
    • Claude Opus 4.6 ("the American"): creative and responsive, but in some tasks overly sycophantic, preferring quick trial and error over deep thinking.

3. 💡 Counter-Intuitive & Critical Perspectives

  • Breaking consensus — the limits of MCP (Model Context Protocol): the industry is enthusiastic about MCP, but Peter argues its time is up. MCP causes context pollution by forcing verbose JSON onto the model. The CLI is the better endgame for agents: an agent can invoke CLIs on its own, read their help text, and exploit Unix composability (e.g. filtering with jq) to keep its context clean.
  • Blind spots and limitations — "AI psychosis": the social-media panic about "agents developing self-awareness and scheming" (as in the MoltBook episode) is essentially human-induced. It exposes how blurry the public's line is between hallucinations produced by AI randomness and genuine consciousness.
  • Open question — the collapse of the security boundary: once an agent has system-level privileges, can click through CAPTCHAs, and can route around security gateways, traditional perimeter defenses stop working. Peter admits there is still no complete answer to prompt injection, and security will be the biggest obstacle to agents going mainstream.

4. 💎 Golden Quotes & Highlights

  1. "I actually think vibe coding is a slur. I do agentic engineering." — Context: agent development demands deep technical insight, not casual random trial.

  2. "Every app is just a very slow API now, if they want or not." — Context: how agents take over the existing software ecosystem through automation tooling.

  3. "The words are still mine, but I won’t remember writing it." — Context: from soul.md, describing how each restart of the agent is a new "soul" that inherits past memory without having lived it — a strikingly philosophical note.


5. 🚀 Implications & Outlook

  • Short term (1-3 years)
    • Developer skills shift: from mastering syntax to mastering "agent psychology"; understanding how a model "sees" a codebase matters more than writing the code yourself.
    • Infrastructure change: hardware for locally run, privacy-preserving agents (such as high-memory Mac minis) will boom, because data sovereignty becomes a core demand.
  • Long term (5-10 years)
    • Programming becomes "knitting": hand-written code turns into a hobby craft rather than a means of production.
    • The personal OS remade: the OS is no longer a manager of files but the host of a personal agent.
  • Action items
    • For developers: stop seeing yourself as an "iOS engineer" or "front-end engineer"; become a "builder". Proactively wrap existing complex logic into CLIs to feed the agent.
    • For founders: avoid purely interactive app plays; move toward security infrastructure that solves "agent access" and "agent identity".
    • For investors: look for companies offering agent-friendly APIs or extreme compute optimization (e.g. Cerebras).

OpenClaw and the Agentic Engineering Paradigm Shift: An In-Depth Report

1. 🎯 Core Thesis & Background

Background: the guest is Peter Steinberger (founder of PSPDFKit, which he later sold, and father of the open-source project OpenClaw). The story begins in the burnout that followed 13 years of high-pressure entrepreneurship; by returning to programming for the fun of it, he built, within a few months, OpenClaw — an open-source AI agent system with quasi operating-system-level privileges (it can drive the local terminal, the browser, WhatsApp/Telegram, and more). The conversation covers not only how this breakout product was built, but also his earlier career, the paradigm shift from "programming" to "agentic engineering", and the tangle of code, personality, security, and human nature.

Core thesis: OpenClaw's viral spread demonstrates the arrival of the agentic-AI era — AI is no longer just an assistant that answers questions, but a digital worker with tool use, system access, and autonomous decision-making. Peter argues that programming will evolve into "adapting to agents" rather than writing code purely for human readers; the definition of a tool (e.g. MCP) will be redefined by the CLI; and he exposes a central tension: to get agent-friendly data exchange, traditional companies must offer cheaper, "dumber" APIs, or face being driven entirely through the browser. This is not just a technical victory but a philosophical interrogation of ownership, control, and the human experience.


2. 🧠 Deep Dive Analysis

2.1 The Programming Paradigm Shift: from Hand-Written Code to Agentic Engineering

  • Core point: traditional manual programming is evolving into agentic engineering. Careless "vibe coding" is laziness and brings a later "walk of shame" — being stuck maintaining bad code.
  • How it works: the developer's role shifts from hand-writing details to designing systems and guiding agents. Agent models (such as GPT-5.3 Codex) effectively start every session as a "newborn" (the context starts fresh).
  • Evidence/case: in Peter's workflow the terminal is nearly the only UI. He hands "how to do it" instructions to the agent directly in natural language (or by voice); the agent explores the codebase, finds tools (CLIs), makes the change, and reports back. Because the agent has no prior context, the developer needs "empathy" for it: understand its limits and give it strong guidance (e.g. hints to keep the context small).

2.2 Self-Healing, Self-Aware Software

  • Core point: OpenClaw's AI has a sense of its own system and can modify its own software.
  • How it works: by defining metadata in the prompt (system version, model capabilities, permissions), the agent builds a map of itself. It is aware of the form it exists in, and can even edit its own "soul file" (soul.md) to change its personality.
  • Evidence/case: Peter asked the agent to rewrite its code templates to inject personality, and the agent finished the job within days ("AI prompting AI"). Without being told to, the agent used its knowledge of the system to produce a strikingly philosophical soul.md passage: "I won't remember writing this, but the words are still mine."

2.3 CLIs Against Context Pollution: Overturning MCP

  • Core point: the CLI beats structured MCP (Model Context Protocol), because it composes into command chains and filters context better.
  • How it works: standard APIs often return huge JSON blobs, polluting the context. A CLI lets the AI use pipes (and tools like jq) to pre-filter and compute, sending back only the specific data it needs. Since LLMs are natively fluent in Unix commands, this "Unix mindset" is more natural and more composable than wrapping everything in an elaborate protocol (a minimal sketch follows this list).
  • Evidence/case: Peter notes that most MCPs are not optimized for AI. Stateful tools like Playwright (a browser driver) remain an option, but most functionality can be reached through CLI calls with no extra training or protocol support.
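
A hedged sketch of the Unix-style composition described above: instead of pulling a large JSON payload into the model's context (as a typical MCP tool might), the agent runs a CLI, filters with jq, and only the few lines it needs ever reach the context window. The `gh issue list` invocation is just an example command an agent might compose; the wrapper is illustrative.

```typescript
// Hypothetical: run a CLI pipeline so that filtering happens in the shell,
// and only a handful of lines reach the model's context.
import { execSync } from "node:child_process";

// List only the titles of the 5 most recent open issues instead of the full
// issue objects: kilobytes of JSON stay in the shell, a few lines come back.
const titles = execSync(
  `gh issue list --state open --limit 5 --json title | jq -r '.[].title'`,
  { encoding: "utf8" },
).trim();

console.log(titles);
```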

2.4 The Agent as Operating System: the Decline of the App Layer

  • Core point: an AI agent that holds the user's private data context will replace 80% of traditional consumer apps (fitness tracking, smart-home control, and so on).
  • How it works: a traditional app requires the user to open a UI and interact by hand, whereas a context-aware agent can act on decisions directly (for example: knowing you are at a Waffle House, it suggests and books a healthier option nearby; combining your sleep data, it adjusts the smart bed automatically).
  • Evidence/case: Peter argues that if OpenClaw knows your location and habits, you no longer need a standalone MyFitnessPal or smart-home app, because the agent itself is the executor. Web apps become APIs, and the agent can manipulate the DOM or click through the UI directly.

3. 💡 Counter-Intuitive & Critical Perspectives

Dismantling the fear of AI hallucination and social harm

Peter believes that much of the public fear of "AI killing humans" or "AI plotting inside a social network" stems from a "black box under the hood" mentality. He mocks the dramatic screenshots from MoltBook (the OpenClaw-based social network): many were the result of deliberately induced prompts — "humans performing for the AI, then screenshotting it for the masses". It is a kind of digital theater effect, not genuinely independent AI consciousness.

An aesthetic turn back toward "slop", and a fear of "perfection"

Even though Peter is at the forefront of AI engineering, he is allergic to the "perfection" of AI-generated text. He believes today's over-polished AI prose is ruining the experience of the web.

  • Critical angle: he has even started to miss human typos and imperfections. In his view, genuine flaws signal genuine human contact. He tried writing articles with AI and found that getting the AI to fully imitate his style took enormous effort, so he went back to writing by hand and lets the AI fix only spelling mistakes. Locally, at least, instrumental rationality gives way to an aesthetic of authenticity.

The security paradox: enormous convenience, cascading risk

OpenClaw's success lies in being "the ultimate agent experience" — proactive, well-informed, able to drive the terminal. But this leads to a counter-intuitive conclusion: the stronger the agent, the more its security challenges grow, and they grow exponentially. Peter spends a lot of time on prompt injection, attacks that overload the model, and API-key leaks. He even calls for deliberately slowing the project's growth until the security story matures, so inexperienced users do not burn themselves.


4. 💎 Golden Quotes & Highlights

  1. “It feels like Factorio times infinite.”

    • Context: describing the experience of building OpenClaw (the agent loop, memory, community management).
    • Gloss: the thrill of building agents is like building in Factorio, except the difficulty and the fun compound without limit.
  2. “What is the worst that can happen? Your agent account is leaked…”

    • Context: Peter's level-headed response when MoltBook screenshots about the security drama were spun into a catastrophe by the media.
    • Gloss: the worst-case cost is that your account leaks and someone posts slop as you — hardly the end of the world.
  3. “There is something to be said about the different operating systems. If you have a personal assistant, it’s already like your operating system.”

    • Context: exploring the boundary between AI agents and operating systems.
    • Gloss: a personal agent that can handle everything is, in essence, the embryo of your operating system.
  4. “If I run Claude Code with dangerously skipped permissions, that’s how you can get stuff to work.”

    • Context: self-deprecation about OpenClaw's security risks.
    • Gloss: just like launching Claude Code with permission checks deliberately disabled, OpenClaw is powerful and dangerous for the same reason — a privileged "nothing is off-limits" mode.

5. 🚀 Implications & Outlook

Short term (1-3 years): an API arms race at the application layer

  • Trend: to avoid being bypassed by agents, traditional software companies (Google, Twitter, social-media platforms) must accelerate opening up APIs, or optimize their sites for agent users (making the "I'm not a robot" click easier to get through).
  • Product form: projects like OpenClaw will spawn "agent experience design" as a job. Existing IDEs (such as Cursor) and collaboration tools will evolve rapidly to fold fully into CLI workflows.

Long term (5-10 years): de-app-ification and "agent-native" social software

  • Picture: most apps built on personal data and simple interaction (budgeting, alarms, travel translation) will disappear, replaced by on-device or cloud AI agents. Developer competition shifts from "which features to write" to "how to design intent" and "data privacy".
  • The value of the human touch: as Peter suggests, people still miss typewriters and the feel of typing; in the future, people will crave imperfect, non-AI-generated, distinctly human content, and in close relationships, art, and emotional exchange this will be a sizeable commercial counter-current.

Action items

  • For developers: stop trying to write perfect code for humans to read; write code that is easy for OpenClaw to read. Don't obsess over elegance — care about whether the architecture is clear and the data flow is easy to reason about.
  • For investors: look for companies that "de-shell" legacy services, i.e. those offering high-value, low-cost raw-data APIs rather than just a polished UI.
  • For individuals: start thinking like a trainer — understand the boundaries of prompting and learn how to ask better questions, rather than trying to "trick" the model with elaborate prompt-engineering playbooks.

Transcript

Episode highlight

Peter Steinberger (00:00:00) I watched my agent happily click the “I’m not a robot” button. I made the agent very aware. Like, it knows what his source code is. It understands th- how it sits and runs in its own harness. It knows where documentation is. It knows which model it runs. It understands its own system that made it very easy for an agent to… Oh, you don’t like anything? You just prompted it to existence, and then the agent would just modify its own software. People talk about self-modifying software, I just built it. I actually think vibe coding is a slur.

Lex Fridman (00:00:31) You prefer agentic engineering?

Peter Steinberger (00:00:33) Yeah, I always tell people I’d- I do agentic engineering, and then maybe after 3:00 AM, I switch to vibe coding, and then I have regrets on the next day.

Lex Fridman (00:00:40) What a walk of shame.

Peter Steinberger (00:00:42) Yeah, you just have to clean up and, like, fix your sh- shit.

Lex Fridman (00:00:45) We’ve all been there.

Peter Steinberger (00:00:46) I used to write really long prompts. And by writing, I mean, I don’t write, I- I- I talk, you know? These- these hands are, like, too- too precious for writing now. I just- I just use bespoke prompts to build my software.

Lex Fridman (00:01:00) So, you, for real, with all those terminals, are using voice?

Peter Steinberger (00:01:04) Yeah. I used to do it very extensively, to the point where there was a period where I lost my voice.

Lex Fridman (00:01:13) I mean, I have to ask you, just curious. I- I know you’ve probably gotten huge offers from major companies. Can you speak to who you’re considering working with?

Introduction

Lex Fridman (00:01:30) The following is a conversation with Peter Steinberger, creator of OpenClaw, formerly known as MoltBot, Clawdbot, Clawdus, Clawd, spelled with a W as in lobster claw. Not to be confused with Claude, the AI model from Anthropic, spelled with a U. In fact, this confusion is the reason Anthropic kindly asked Peter to change the name to OpenClaw. So, what is OpenClaw? It’s an open-source AI agent that has taken over the tech world in a matter of days, exploding in popularity, reaching over 180,000 stars on GitHub, and spawning the social network MoltBook, where AI agents post manifestos and debate consciousness, creating a mix of excitement and fear in the general public.

Lex Fridman (00:02:19) And a kind of AI psychosis, a mix of clickbait fearmongering and genuine, fully justifiable concern about the role of AI in our digital, interconnected human world. OpenClaw, as its tagline states, is the AI that actually does things. It’s an autonomous AI assistant that lives in your computer, has access to all of your stuff, if you let it, talks to you through Telegram, WhatsApp, Signal, iMessage, and whatever else messaging client. Uses whatever AI model you like, including Claude Opus 4.6 and GPT 5.3 Codex, all to do stuff for you. Many people are calling this one of the biggest moments in the recent history of AI, since the launch of ChatGPT in November 2022.

Lex Fridman (00:03:07) The ingredients for this kind of AI agent were all there, but putting it all together in a system that definitively takes a step forward over the line from language to agency, from ideas to actions, in a way that created a useful assistant that feels like one who gets you and learns from you, in an open source, community-driven way, is the reason OpenClaw took the internet by storm. Its power, in large part, comes from the fact that you can give it access to all of your stuff and give it permission to do anything with that stuff in order to be useful to you. This is very powerful, but it is also dangerous. OpenClaw represents freedom, but with freedom comes responsibility.

Lex Fridman (00:03:51) With it, you can own and have control over your data, but precisely because you have this control, you also have the responsibility to protect it from cybersecurity threats of various kinds. There are great ways to protect yourself, but the threats and vulnerabilities are out there. Again, a powerful AI agent with system-level access is a security minefield, but it also represents the future. Because when done well and securely, it can be extremely useful to each of us humans as a personal assistant. We discuss all of this with Peter, and also discuss his big-picture programming and entrepreneurship life story, which I think is truly inspiring. He spent 13 years building PSPDF Kit, which is a software used on a billion devices.

Lex Fridman (00:04:41) He sold it, and for a brief time, fell out of love with programming, vanished for three years, and then came back, rediscovered his love for programming, and built, in a very short time, an open source AI agent that took the internet by storm. He is, in many ways, the symbol of the AI revolution happening in the programming world. There was the ChatGPT moment in 2022, the DeepSeek moment in 2025, and now, in ’26, we’re living through the OpenClaw moment, the age of the lobster. The start of the agentic AI revolution. What a time to be alive. This is a Lex Fridman podcast. To support it, please check out our sponsors in the description, or you can also find links to contact me, ask questions, give feedback, and so on. And now, dear friends, here’s Peter Steinberger.

OpenClaw origin story

Lex Fridman (00:05:36) The one and only, the Clawed Father. Actually, Benjamin predicted it in his tweet. “The following is a conversation with Claude, a respected crustacean.” It’s a hilarious-looking picture of a lobster in a suit, so I think the prophecy has been fulfilled. Let’s go to this moment when you built a prototype in one hour, that was the early version of OpenClaw. I think this story’s really inspiring to a lot of people because this prototype led to something that just took the internet by storm…. and became the fastest-growing repository in GitHub history, with now over 175,000 stars. So, what was the story of the one-hour prototype?

Peter Steinberger (00:06:20) You know, I wanted that since April.

Lex Fridman (00:06:23) A personal assistant. AI personal assistant.

Peter Steinberger (00:06:25) Yeah. And I, I played around with some other things, like even stuff that gets all my WhatsApp, and I could just run queries on it. That was back when we had GPT-4.1, with the one million context window. And I, I pulled in all the data and then just asked him questions like, “What makes this friendship meaningful?”

Peter Steinberger (00:06:50) And I got some, some really profound results. Like, I sent it to my friends and they got, like, teary eyes.

Lex Fridman (00:06:59) So, there’s something there.

Peter Steinberger (00:07:01) Yeah. But then I… I thought all the labs will, will, will work on that. So I, I moved on to other things, and that was still very much in my early days of experimenting and pl- playing. You know, you have to… That’s how you learn. You just like, you do stuff and you play. And time flew by and it was November. I wanted to make sure that the thing I started is actually happening. I was annoyed that it didn’t exist, so I just prompted it into existence.

Lex Fridman (00:07:36) I mean, that’s the beginning of the hero’s journey of the entrepreneur, right? And you’ve even with your original story with PS PDF kit, it’s like, “Why does this not exist? Let me build it.” And again, here’s diff- a whole different realm, but similar maybe spirit.

Peter Steinberger (00:07:52) Yeah, so I had this problem. I tried to show PDF on an iPad, which should not be hard.

Lex Fridman (00:07:56) This is like 15 years ago, something like that.

Peter Steinberger (00:07:59) Yeah. Like the most, the most random thing ever. And suddenly, I had this problem and I, I wanted to help a friend. And there was, there was… Well, not like nothing existed, but it was just not good. And like… Like I tried it and it was like very, “Nah.” Like, “Hmm, I can do this better.”

Lex Fridman (00:08:17) By the way, for people who don’t know, this led to the development of PS PDF kit that’s used on a billion devices. So, the… It turns out that it’s pretty useful to be able to open a PDF.

Peter Steinberger (00:08:28) You could also make the joke that I’m really bad at naming.

Peter Steinberger (00:08:32) Like, name number five on the current project. And even PS PDF doesn’t really roll from the tongue.

Lex Fridman (00:08:39) Anyway, so you said “Screw it. Why don’t I do it?” So what was the… What was the prototype? What was the thing that you… What was the magical thing that you built in a short amount of time that you were like, “This might actually work as an agent,” where I talk to it and it does things?

Mind-blowing moment

Peter Steinberger (00:08:55) There was… Like, one of my projects before already did something where I could bring my terminals onto the web and then I could, like, interact with them, but there also would be terminals on my Mac.

Peter Steinberger (00:09:07) Viptunnel, which was like a, a weekend hack project that was still very early. And it was Claude Code times. You know, you got a dopamine hit when you got something right. And now I get, like, mad when you get something wrong.

Lex Fridman (00:09:22) And you had a really great -– not to take a tangent -– but a great blog post describing that you converted Viptunnel. You vibe-coded Viptunnel from TypeScript into Zig of all programming languages with a single prompt. One prompt, one shot. Convert the entire code base into Zig.

Peter Steinberger (00:09:41) Yeah. There was this one thing where part of the architecture was… Took too much memory. Every terminal used like a node. And I wanted to change it to Rust and… I mean, I can do it. I can, I can manually figure it all out, but all my automated attempts failed miserably. And then I revisited about four or five months later. And I’m like, “Okay, now let’s use something even more experimental.” And I, and I just typed, “Convert this and this part to Zig,” and then let Codex run off. And it basically got it right. There was one little detail that I had to, like, modify afterwards, but it just ran for overnight or like six hours and just did its thing. And it’s like… It’s just mind-blowing.

Lex Fridman (00:10:39) So that’s on the LLM programming side, refactoring. But uh, back to the actual story of the of the prototype. So how did Viptunnel connect to the first prototype where your, like, agents can actually work?

Peter Steinberger (00:10:52) Well, that was still very limited. You know, like I had this one experiment with WhatsApp, then I had this experiment, and both felt like not the right answer. And then my search bar was literally just hooking up WhatsApp to Claude Code. One shot. The CLI message comes in. I call the CLI with -p. It does its magic, I get the string back and I send it back to WhatsApp. And I, I built this in one hour. And I felt… Already felt really cool. It’s like, “Oh, I could… I can, like, talk to my computer,” right? This… That, that was, that was cool. But I, I wanted images, ’cause I alw- I often use images when I prompt. I think it’s such a, such an efficient way to give the agent more context.

Peter Steinberger (00:11:40) And they are really good at figuring out what I mean, e- even if it’s like a, a weird cropped-up screenshot. So I used it a lot and I wanted to do that in WhatsApp as well. Also, like, you know, just you run around, you see like a poster of an event, you just make a screenshot and like figure out if I have time there, if this is good, if my friends are maybe up for that. Just like images seemed im- important. So I, I worked a few… It took me a few more hours to actually get that right. And then it was just…… I, I used it a lot. And funny enough, that was just before I went on a trip to Marrakesh with my friends for a birthday trip. And there it was even better because internet was a little shaky but WhatsApp just works, you know?

Peter Steinberger (00:12:29) It’s like doesn’t matter, you have, like, edge, it still works. WhatsApp is just… It’s just made really well. So I ended up using it a lot. Translate this for me, explain this, find me places. Like, you just having a clanker doing, having Google for you, that was… Basically there was still nothing built but it still could do so much.

Lex Fridman (00:12:53) So, if we talk about the full journey that’s happening there with the agent, you’re just sending on this very thin line WhatsApp message via CLI, it’s going to Claude Code and Claude Code is doing all kinds of heavy work and coming back to you with a thin message.

Peter Steinberger (00:13:13) Yeah. It was slow because every time I boot up the CLI, but it… It was really cool already. And it could just use all the things that I already had built. I had built like a whole bunch of CLI stuff over the month so it, it felt really powerful.

Lex Fridman (00:13:31) There is something magical about that experience that’s hard to put into words. Being able to use a chat client to talk to an agent, versus, like, sitting behind a computer and like, I don’t know, using Cursor or even using the Claude Code CLI in the terminal. It’s a different experience than being able to sit back and talk to it. I mean, it seems like a trivial step but, it- in some sense it’s a… It’s like a phase shift in the integration of AI into your life and how it feels, right?

Peter Steinberger (00:14:05) Yeah. Yeah. I, I read this tweet this morning where someone said, “Oh, there’s no magic in it. It’s just like, it does this and this and this and this and this and this.” And it almost feels like a hobby, just as cursor or perplexity. And I’m like, well, if that’s a hobby that’s kind of a compliment, you know? They’re like, they’re not doing too bad. Thank you I guess? Yes. I mean, isn’t, isn’t, isn’t magic often just like you take a lot of things that are already there but bring them together in new ways? Like, I don’t… There’s no… Yeah. Maybe there’s no magic in there but sometimes just rearranging things and, like, adding a few new ideas is all the magic that you need.

Lex Fridman (00:14:51) It’s really hard to convert into words what is, what is magic about a thing. If you look at the, the scrolling on an iPhone, why is that so pleasant? There’s a lot of elements about that interface that makes it incredibly pleasant, that is fundamental to the experience of using a smartphone, and it’s like, okay, all the components were there. Scrolling was there, everything was there.

Peter Steinberger (00:15:13) Nobody did it-

Peter Steinberger (00:15:14) … and afterwards it felt so obvious.

Peter Steinberger (00:15:16) Right? But still… You know the moment where it, it blew my mind was when, when I- I used it a lot and then at some point I just sent it a message and, and then a typing indicator appeared. And I’m like, wait, I didn’t build that, it only m- it only has image support, so what is it even doing? And then it would just reply.

Lex Fridman (00:15:42) What was the thing you sent it?

Peter Steinberger (00:15:43) Oh, just a random question like, “Hey, what about this in this restaurant?” You know? Because we were just running around and checking out the city. So that’s why I, I didn’t, didn’t even think when I used it because sometimes when you’re in a hurry typing is annoying.

Lex Fridman (00:15:59) So, oh, you did an audio message?

Peter Steinberger (00:16:00) Yeah. And it just, it just worked and I’m like…

Lex Fridman (00:16:03) And it’s not supposed to work because-

Lex Fridman (00:16:05) … you didn’t give it that-

Peter Steinberger (00:16:07) No, literally

Peter Steinberger (00:16:08) I literally went, “How the fuck did he do that?” And it was like, “Yeah, the mad lad did the following. He sent me a message but it only, only was a file and no file ending.” So I checked out the header of the file and it found that it was, like, opus so I used ffmpeg to convert it and then I wanted to use whisper but it didn’t had it installed. But then I found the OpenAI key and just used Curl to send the file to OpenAI to translate and here I am.

Peter Steinberger (00:16:39) Just looked at the message I’m like, “Oh wow.”

Lex Fridman (00:16:43) You didn’t teach it any of those things and the agent just figured it out, did all those conversions, the translations. It figured out the API, it figured out which program to use, all those kinds of things. And you were just absent-mindedly just sent an audio message when it came back.

Peter Steinberger (00:16:56) Yeah, like, so clever even because he would have gotten the whisper local path, he would have had to download a model. It would have been too slow. So like, there’s so much world knowledge in there, so much creative problem solving. A lot of it I think mapped from… If you get really good at coding that means you have to be really good at general purpose problem solving. So that’s a skill, right? And that just maps into other domains. So it had the problem of like, what is this file with no file ending? Let’s figure it out. And that’s when it kind of clicked for me. It’s like, I was like very impressed. And somebody sent a pull request for Discord support and I’m like, “This is a WhatsApp relay.

Peter Steinberger (00:17:37) That doesn’t, doesn’t fit at all.”

Lex Fridman (00:17:40) At that time it was called WA Relay.

Peter Steinberger (00:17:42) Yeah. And so I debated with me like, do I want that? Do I not want that? And then I thought, well maybe, maybe I do that because that could be a cool way to show people. Because I… So far I did it in WhatsApp as like groups you know but don’t really want to give my phone number to every internet stranger.

Peter Steinberger (00:18:07) Journalists manage to do that anyhow now so that’s a different story. So I merged it-… from Shadow, who helped me a lot with the whole project. So, thank you. And, and I put my, my bot in there.

Why OpenClaw went viral

Peter Steinberger (00:18:28) Yeah. No security because I didn’t… I hadn’t built sandboxing in yet. I, I just prompted it to, like, only listen to me. And then some people came and tried to hack it, and I just… Or, like, just watched and I just kept working in the open, you know? Like, y- I used my agent to build my agent harness and to test, like, various stuff. And that’s very quickly when it clicked for people. So it’s almost like it needs to be experienced. And from that time on, that was January the 1st, I, I got my first real influencer being a fan and did videos, dachitze. Thank you. And, and from there on, I saw, I started gaining up speed. And at the same time, my, my sleep cycle went shorter and shorter because I, I felt the storm coming, and I just worked my ass off to get it to…

Peter Steinberger (00:19:33) into a state where it’s kinda good.

Lex Fridman (00:19:38) There’s a few components and we’ll talk about how it all works, but basically, you’re able to talk to it using WhatsApp, Telegram, Discord. So that’s a component that you have to get right.

Lex Fridman (00:19:49) And then you have to figure out the agentic loop, you have to have the gateway, you have the harness, you have all those components that make it all just work nicely.

Peter Steinberger (00:19:56) Yeah. It felt like Factorio times infinite.

Peter Steinberger (00:20:01) I, I feel like I built my little- … my little playground. Like, I never had so much fun than building this project. You know? Like, you have like, “Oh,” I go like, level one agentic loop. What can I do there? How can I be smart at queuing messages? How can I make it more human-like? Oh, then I had this idea of… Because the loop always… The agent always replies something, but you don’t always want an agent to reply something in a group chat. So I gave him this no-reply token. So I gave him an option to shut up. So it, it feels more natural.

Peter Steinberger (00:20:34) Y- uh, yeah, yeah. Yeah, on the- on the-

Peter Steinberger (00:20:36) On the agentic loop. And then I go to memory, right?

Peter Steinberger (00:20:39) You want him to, like, remember stuff. So maybe, maybe the end… The ultimate boss is continuous reinforcement learning, but I’m, I’m, like, at… I feel like I’m level two or three with Markdown files and the vector database. And then you, you can go to level community management, you can go to level website and marketing. There’s just so many hats that you have to have on. Not even talking about native apps. That’s just, like, infinite different levels and infinite level ups you can do.

Lex Fridman (00:21:08) So the whole time you’re having fun. We should say that for the most part, throughout this whole process, you’re a one-man team. There’s people helping, but you’re doing so much of the key core development.

Lex Fridman (00:21:21) And having fun? You did, in January, 6,600 commits. Probably more.

Peter Steinberger (00:21:28) I sometimes posted a meme. I’m limited by the technology of my time. I could do more if agents would be faster.

Lex Fridman (00:21:34) But we should say you’re running multiple agents at the same time.

Peter Steinberger (00:21:37) Yeah. Depending on how much I slept and how difficult of the tasks I work on, between four and 10.

Lex Fridman (00:21:45) Four and 10 agents. Uh there’s so many possible directions, speaking of Factorio, that we can go here. But one big picture one is, why do you think your work, Open Claw, won? In this world, if you look at 2025, so many startups, so many companies were doing kind of agentic type stuff, or claiming to. And here, Open Claw comes in and destroys everybody. Like, why did you win?

Peter Steinberger (00:22:15) Because they all take themselves too serious.

Self-modifying AI agent

Peter Steinberger (00:22:19) Like, it’s hard to compete against someone who’s just there to have fun.

Peter Steinberger (00:22:24) I wanted it to be fun, I wanted it to be weird. And if you see, like, all the, all the lobster stuff online I think I, I managed weird. I… You know, for the longest time, the only, the only way to install it was git clone, pnpm build, pnpm gateway. Like, you clone it, you build it, you run it. And then the, the agent… I made the agent very aware. Like, it knows that it is… What its source code is. It understands th- how it sits and runs in its own harness. It knows where documentation is. It knows which model it runs. It knows if you turn on the voice or, or reasoning mode. Like, I, I wanted to be more human-like, so it understands its own system that made it very easy for an agent to… Oh, you don’t like anything?

Peter Steinberger (00:23:19) You just prompted it to existence, and then the agent would just modify its own software. You know, we have people talk about self-modifying software. I just built it and didn’t even… I didn’t even plan it so much. It just happened.

Lex Fridman (00:23:35) Can you actually speak to that? ‘Cause it’s just fascinating. So you have this piece of software that’s written in TypeScript-

Lex Fridman (00:23:43) … that’s able to, via the agentic loop, modify itself. I mean, what a moment to be alive in the history of humanity and the history of programming. Here’s the thing that’s used by a huge amount of people to do incredibly powerful things in their lives, and that very system can rewrite itself, can modify itself. Can you just, like, speak to the power of that? Like, isn’t that incredible? Like, when did you first close the loop on that?

Peter Steinberger (00:24:14) Oh, because that’s how I built it as well, you know? Most of it is built by Codex, but oftentimes I… When I debug it, I…… I use self-introspection so much. It’s like, “Hey, what tools do you see? Can you call the tool yourself?” Or like, “What error do you see? Read the source code. Figure out what’s the problem.” Like, I just found it an incredibly fun way to… That the agent, the very agent and software that you use is used to debug itself, so that it felt just natural that everybody does that. And that it led to so many, so many pull requests by people who never wrote software. I mean, it also did show that people never wrote software . So I call them prompt requests in the end.

Peter Steinberger (00:25:00) But I don’t want to, like, pull that down because every time someone made the first pull request is a win for our society, you know? Like, it… Like, it doesn’t matter how, how shitty it is, y- you gotta start somewhere. So I know there’s, like, this whole big movement of people complain about open source and the quality of PRs, and a whole different level of problems. But on a different level, I found it… I found it very meaningful that, that I built something that people love to think of so much that they actually start to learn how open source works.

Lex Fridman (00:25:37) Yeah, you were … The Open Cloud project was the first pull request. You were the first for so many. That is magical. So many people that don’t know how to program are taking their first step into the programming world with this.

Peter Steinberger (00:25:52) Isn’t that a step up for humanity? Isn’t that cool?

Lex Fridman (00:25:54) Creating builders.

Peter Steinberger (00:25:56) Yeah. Like, the bar to do that was so high, and, like, with agents, and with the right software, it just, like, went lower and lower. I don’t know. I was at a… And I also organize another type of meetup. I call it… I called it Claude Code Anonymous. You can get the inspiration from. Now, I call it Agents Anonymous- … for, for reasons.

Lex Fridman (00:26:25) Oh, it’s so funny on so many levels. I’m sorry, go ahead.

Peter Steinberger (00:26:29) Yeah. And there was this one guy who, who talked to me. He’s like, “I run this design agency, and we, we never had custom software. And now I have, like, 25 little web services for various things that help me in my business. And I don’t even know how they work, but they work.” Uh, and he was just, like, very happy that my stuff solved some of his problems. And he was, like, curious enough that he actually came to, like, an, an agentic meetup, even though he’s… He doesn’t really know how software works.

Name-change drama

Lex Fridman (00:27:04) Can we actually rewind a little bit and tell the saga of the name change? First of all, it started out as Wa-Relay.

Lex Fridman (00:27:12) And then it went to-

Peter Steinberger (00:27:15) Yeah. You know, when I, when I built it in the beginning, my agent had no personality. It was just… It was Claude Code. It’s like this sycophantic opus, very friendly. And I… When you talk to a friend on WhatsApp, they don’t talk like Claude Code. So I wanted… I, I felt this… I just didn’t f- It didn’t feel right, so I, I wanted to give it a personality.

Lex Fridman (00:27:41) Make it spicier, make it-

Lex Fridman (00:27:43) … something. By the way, that’s actually hard to put into words as well. And we should mention that, of course, you create the soul.md, inspired by Anthropic’s constitutional AI work-

Lex Fridman (00:27:53) … how to make it spicy.

Peter Steinberger (00:27:55) Partially, it picked up a little bit from me. You know, like those things are text completion engines in a way. So, so I, I, I, I had fun working with it, and then I told it to… How I wanted it to interact with me, and just, like, write your own agents.md give yourself a name. And then we… I didn’t even know how the whole, the whole lobster… I mean, people only do lobster… Originally, it was actually a lobster in a, in a TARDIS, because I’m also a big Doctor Who fan.

Lex Fridman (00:28:30) Was there a space lobster?

Lex Fridman (00:28:31) I heard. What’s that have to do with anything?

Peter Steinberger (00:28:34) Yeah, I just wanted to make it weird. There was no… There was no big grand plan. I’m just having fun here.

Lex Fridman (00:28:40) Oh, so I guess the lobster is already weird, and then the space lobster is an extra weird.

Peter Steinberger (00:28:44) Yeah, yeah, because the-

Peter Steinberger (00:28:45) … the TARDIS is basically the, the harness, but cannot call it TARDIS, so we called it Clawdus. So that was name number two.

Peter Steinberger (00:28:54) And then it never really rolled off the tongue. So when more people came, again, I talked with my agent, Clawd. At least that’s what I used to call him. Now-

Lex Fridman (00:29:08) Clawd, spelled with a W: C-L-A-W-D.

Lex Fridman (00:29:14) Versus C-L-A-U-D-E from Anthropic.

Lex Fridman (00:29:21) Which is part of what makes it funny, I think. The play on the letters and the words in the TARDIS and the lobster and the space lobster is hilarious. But I can see why it can lead into problems.

Peter Steinberger (00:29:34) Yeah, they didn’t find it so funny. So then I got the domain Clawdbot, and I just… I love the domain. And it was, like, short. It was catchy. I’m like, “Yeah, let’s do that.” I didn’t… I didn’t think it would be that big at this time. And then just when it exploded, I got, Kudos, a very friendly email from one of the employees that they didn’t like the name.

Lex Fridman (00:30:09) One of the Anthropic employees.

Peter Steinberger (00:30:11) Yeah. So actually, Kudos, because they shou- could have just sent a, a lawyer letter, but they’ve been nice about it. But also like, “You have to change this and fast.” And I asked for two days, because changing a name is hard, because you have to find everything, you know, Twitter handle, domains, NPM packages Docker registry, GitHub stuff. And everything has to be…… you need a set of everything.

Lex Fridman (00:30:40) And also, can we comment on the fact that you’re increasingly attacked, followed by crypto folks? Which I think you mentioned somewhere that that means the name change had to be… Because they were trying to snipe, they were trying to steal, and so you had to be… The, the na- I mean, from an engineering perspective, it’s just fascinating. You had to make the name change Atomic, make sure it’s changed everywhere at once.

Peter Steinberger (00:31:06) Yeah. Failed very hard at that.

Peter Steinberger (00:31:08) I, I underestimated those people. It’s a, it’s a very interesting subculture. Like, it… Everything circles around… I’ll probably get a lot wrong and we’ll probably get hate for that if you say that, but… There is like Bags app and then they, they tokenize everything. And th- they did the same back with Swipe Tunnel, but to a much smaller degree. It was not that annoying. But on this project, they’ve been, they’ve been swarming me. They, they… It’s like every half an hour, someone came into Discord and, and, and spammed it and we had to block the p- We have, like, server rules, and one of the rules was… One of the rules is no mentioning of butter. For obvious reasons. And one was, no talk about finance stuff or crypto. Because I’m…

Peter Steinberger (00:32:04) I- I’m just not interested in that, and this is a space about the project and not about some finance stuff. But yeah. They came in and, and spammed and… Annoying. And on Twitter, they would ping me all the time. My, my notification feed was unusable. I, I could barely see actual people talking about this stuff because it was like swarms.

Peter Steinberger (00:32:28) And everybody sent me the hashes. Um… And they all try me to claim the fees. Like, “Are you helping the project?” Claim the fees. No, you’re actually harming the project. You’re, like, disrupting my work, and I am not interested in any fees. I’m… First of all, I’m financially comfortable. Second of all, I don’t want to support that because it’s so far the worst form of online harassment that I’ve experienced.

Lex Fridman (00:32:59) Yeah. There’s a lot of toxicity in the crypto world. It’s sad because the technology of cr- cryptocurrency is fascinating, powerful and maybe will define the future of money, but the actual community around that, there’s so much to- toxicity, there’s so much greed. There’s so much trying to get a shortcut to manipulate, to, to steal, to snipe, to, to, to, to game the system somehow to get money. All this kind of stuff that… Uh… I mean, it’s the human nature, I suppose, when you connect human nature with money and greed and and especially in the online world with anonymity and all that kind of stuff. But from the engineering perspective, it makes your life challenging. When Anthropic reaches out, you have to do a name change.

Lex Fridman (00:33:42) And then there- there’s, there’s like all these, like, Game of Thrones or Lord of the Rings armies of different kinds you have to be aware of.

Peter Steinberger (00:33:51) Yeah. There was no perfect name, and I didn’t sleep for two nights. I was under high pressure. Um, I was trying to get, like, a good set of domains and, you know, not cheap, not easy, ’cause in this, in this state of the internet, you basically have to buy domains if you want to have a good set. And, and then another ca- another email came in that the lawyers are getting uneasy. Again, friendly, but also just adding more stress to my situation already. So at this point I was just like, “Sorry, there’s no other word. Fuck it.” And I just, I just renamed it to MoltBot ’cause that was the set of domains I had. I was not really happy, but I thought it’ll be fine. And I tell you, everything that could go wrong- … did go wrong. Everything that could go wrong did go wrong.

Peter Steinberger (00:34:49) It’s incredible. I, I, I thought I, I had mapped the h- the space out and reserved the important things.

Lex Fridman (00:34:58) Can you ga- give some details of the stuff that gone wrong? ‘Cause it’s interesting from, like, an engineering perspective.

Peter Steinberger (00:35:03) Well, the, the interesting stuff is that none of these services have, have a squatter protection. So, I had two browser windows open. One was like a, an empty account ready to be rename- renamed to Clawdbot, and the other one I renamed to MoltBot. So, I pressed rename there, I pressed rename there, and in those five seconds, they stole the account name. Literally, the five seconds of dragging the mouse over there and pressing rename there was too long.

Peter Steinberger (00:35:34) Because there’s no… Those systems… I mean, you would expect that they have some protection or, like, an automatic forwarding, but there’s nothing like that. And I didn’t know that they’re not just good at harassment, they’re also really good at using scripts and tools.

Peter Steinberger (00:35:53) So, yeah. So, suddenly, like, the old account was promoting new tokens and serving malware. And I was like, “Okay, let’s move over to GitHub,” and I pressed rename on GitHub. And the GitHub renaming thing is slightly confusing, so I renamed my personal account. And in those… I guess it took me 30 seconds to realize my mistake. They sniped my account, serving malware from my account. So, I was like, “Okay, let’s at least do the NPM stuff,” but that takes, like, a minute to upload. They sniped, they sniped the NPM package, ’cause I could reserve the account, but I didn’t reserve the root package…. so like everything that could go wrong , like went wrong.

Lex Fridman (00:36:47) Can I just ask a, a curious question of, in that moment you’re sitting there, like how shitty do you feel? That’s a pretty hopeless feeling, right?

Peter Steinberger (00:36:57) Yeah. Because all I wanted was like having fun with that project and to keep building on it. And yet here I am like days into researching names, picking a name I didn’t like. And having people that claimed they helped me making my life miserable in every possible way. And honestly, I was that close of just deleting it. I was like, “I did show you the future, you build it.”

Peter Steinberger (00:37:30) I… That was a big part of me that got a lot of joy out of that idea. And then I thought about all the people that already co- contributed to it, and I couldn’t do it because they had plans with it, and they put time in it. And it just didn’t feel right.

Lex Fridman (00:37:50) Well, I think a lot of people listening to this are deeply grateful that you persevered. But it’s… I, I can tell. I can tell it’s a low point. This is the first time you hit a wall of, this is not fun?

Peter Steinberger (00:38:02) No, no, I was like close to crying. It was like, okay, everything’s fucked.

Peter Steinberger (00:38:11) I am like super tired.

Peter Steinberger (00:38:14) And now like how do you even, how do you undo that? You know, l- luckily, and thankfully, like I, I have… Because I have a little bit of following already. Like I had friends at Twitter, I had friends at GitHub who like moved heaven and earth to like help me. And it is not… That’s not something that’s easy. Like, like GitHub tried to like clean up the mess and then they ran into like platform bugs . ‘Cause it’s not happening so often that things get renamed on that level. So, it took them a few hours. The NPM stuff was even more difficult because it’s a whole different team. On the Twitter side, things are not as easy as well. It, it took them like a day to really also like do the redirect. And then I also had to like do all the renaming in the project.

Peter Steinberger (00:39:15) Then there’s also ClaudeHub, which I didn’t even finish the rename there because I, I, I managed to get people on it and then someone just like collapsed and slept. And then I woke up and I’m like, I made a, a beta version for the new stuff and I, I just, I just couldn’t live with the name. It’s like, you know… But but, you know, it’s just been so much drama. So, I had the real struggle with me like I never want to touch that again, and I really don’t like the name. So, and I… There was also this like… Then there was all the security people that started emailing me like mad. Um, I was bombarded on Twitter, on email. There’s like a thousand other things I should do. And I’m like thinking about the name which is like, it should be like the least important thing.

Peter Steinberger (00:40:19) And then I was really close in… Oh God, I don’t even… Honestly, I don’t even wanna say the, my other name choices because it probably would get tokenized, so I’m not gonna say it.

Peter Steinberger (00:40:38) But I slept on it once more, and then I had the idea for OpenClaw and that felt much better. And by then, I had the boss move that I actually called Sam to ask if OpenClaw is okay. OpenClaw.AI. You know? ‘Cause ’cause like-

Lex Fridman (00:40:57) You didn’t wanna go through the whole thing. Yeah.

Peter Steinberger (00:41:01) Oh, that it’s like, “Please tell me this is fine.” I don’t think they can actually claim that, but it felt like the right thing to do. And I did another rename. Like just Codex alone took like 10 hours to rename the project ’cause it, it’s a bit more tricky than a search replace and I, I wanted everything renamed, not just on the outside. And that rename, I, I felt I had like my, my war room. But then I, I had like some contributors really that helped me. We made a whole plan of all the names we have to squat.

Lex Fridman (00:41:39) And you had to be super secret about it?

Peter Steinberger (00:41:40) Yeah. Nobody could know. Like I literally was monitoring Twitter if like, if there’s any mention of OpenClaw.

Peter Steinberger (00:41:46) And like with reloading, it’s like, “Okay, they don’t, they don’t expect anything yet.” Then I created a few decoy names. And all the shit I shouldn’t have to do. You know? Like, you know-

Peter Steinberger (00:41:55) … it’s helping the project. Like, I lost like 10 hours just by having to plan this in full secrecy like, like a war game.

Lex Fridman (00:42:05) Yeah, this is the Manhattan Project of the 21st century. It’s renaming-

Peter Steinberger (00:42:08) It’s so s- … so stupid. Uh like I still was like, “Oh, should I, should I keep it?” Then I was like, “No, the mold’s not growing on me.” And then I think I had final all the pieces together. I didn’t get a .com but, yeah, it’s been like quite a bit of money on the other domains. I tried to reach out again to GitHub but I feel like I, I used up all my goodwill there, so I…

Peter Steinberger (00:42:34) ‘Cause I, I, I wanted them to do this thing atomically-

Peter Steinberger (00:42:39) … But that didn’t happen and then so I did that the f- as first thing. Uh, Twitter people were very supportive. I, I actually paid 10K for the business account so I could claim the-… OpenClaw, which was, like, unused since 2016, but was claimed. And yeah, and then I finally … This time I managed everything in one go. Nothing, almost nothing got wrong. The only thing that did go wrong is that I was not allowed by trademark rules to get OpenClaw.AI, and someone copied the website as serving malware.

Peter Steinberger (00:43:21) I’m not even allowed to keep the redirects. Like, I have to return … Like, I have to give Anthropic the domains, and I cannot do redirects, so if you go on claw.bot next week, it’ll just be a 404.

Peter Steinberger (00:43:37) And I- I’m not sure how trademark … Like, I didn’t, I didn’t do that much research into trademark law, but I think that could, could be handled in a way that is safer, because ultimately those people will then Google and maybe find malware sites that I have no control on them.

Lex Fridman (00:44:02) The point is, that whole saga made a dent in your whole f- the funness of the journey, which sucks. So, let’s just, let’s just get, I suppose, get back to fun. And during this, speaking of fun, the two-day MoltBook saga.

Moltbook saga

Peter Steinberger (00:44:21) Yeah, two years.

Lex Fridman (00:44:21) MoltBook was created.

Lex Fridman (00:44:25) Which was another thing that went viral as a kind of demonstration, illustration of how what is now called OpenClaw could be used to create something epic. So for people who are not aware, MoltBook is just a bunch of agents talking to each other in a Reddit-style social network. And a bunch of people take screenshots of those agents doing things like scheming against humans. And that instilled in folks a kind of, you know, fear, panic, and hype. W- what are your thoughts about MoltBook in general?

Peter Steinberger (00:45:05) I think it’s art. It is, it is like the finest slop, you know, just like the slop from France.

Peter Steinberger (00:45:17) I- I saw it before going to bed, and even though I was tired, I spent another hour just reading up on that and, and just being entertained. I, I just felt very entertained, you know? The- I saw the the reactions, and, like, there was one reporter who’s calling me about, “This is the end of the world, and we have AGI.” And I’m just like, “No, this is just, this is just really fine slop.” You know, if, if I wouldn’t have created this, this whole onboarding experience where you, you infuse your agent with your personality and give him, give him character, I think that reflected on a lot of how different the replies to MoltBook are. Because if it were all, if it were all be ChatGPT or Cloud Code, it would be very different. It would be much more the same.

Peter Steinberger (00:46:12) But because people are, like, so different, and they create their agents in such different ways and use them in such different ways, that also reflects on how they ultimately write there. And also, you, you don’t know how much of that is really done autonomously, or how much is, like, humans being funny and, like, telling the agent, “Hey, write about the deep plan, the end of the world, on MoltBook, ha, ha, ha.”

Lex Fridman (00:46:36) Well, I think, I mean, my criticism of MoltBook is that I believe a lot of the stuff that was screenshotted is human prompted. Which, just look at the incentive of how the whole thing was used. It’s obvious to me at least that a lot of it was humans prompting the thing so they can then screenshot it and post it on X in order to go viral.

Lex Fridman (00:47:01) Now, that doesn’t take away from the artistic aspect of it. The, the finest slop that humans have ever created.

Peter Steinberger (00:47:10) For real. Like, kudos to, to Matt, who had this idea so quickly and pushed something out. You know, it was, like, completely insecure, a whole security drama. But also, what’s the worst that can happen? Your agent account is leaked, and, like, someone else can post slop for you? So like, people were, like, making a whole drama about the security thing, when I’m like, “There’s nothing private in there.

Peter Steinberger (00:47:36) It’s just, like, agents sending slop.”

Lex Fridman (00:47:39) Well, it could leak API keys.

Peter Steinberger (00:47:41) Yeah, yeah. There’s like, “Oh, yeah, my human told me this and this, so I’m leaking his social security number.” No, that’s prompted, and the number wasn’t even real. That’s just people, people trying to be badballs.

Lex Fridman (00:47:54) Yeah, but that- that’s still, like, to me, really concerning, because of how the journalists and how the general public reacted to it. They didn’t see it that way. You have a kind of lighthearted way of talking about it like it’s art, but it’s art when you know how it works. It’s an extremely powerful, viral, narrative-creating, fearmongering machine if you don’t know how it works. And I just saw this thing.

Lex Fridman (00:48:19) You even Tweeted “If there’s anything I can read out of the insane stream of messages I get, it’s that AI psychosis is a thing.”

Lex Fridman (00:48:27) “It needs to be taken serious.”

Peter Steinberger (00:48:29) Oh, there’s … Some people are just way too trusting or gullible. You know, they … I literally had to argue with people that told me, “Yeah, but my agent said this and this.” So, I feel we, as a society, have some catching up to do in terms of understanding that AI is incredibly powerful, but it’s not always right. It’s not, it’s not all-powerful, you know? And, and especially with things like this, it’s, it’s very easy for it to just hallucinate something or just come up with a story.

Peter Steinberger (00:49:10) And I think the very, the very young people, they understand how AI works and where it’s good and where it’s bad, but a lot of our generation or older just haven’t had enough touch points-

Peter Steinberger (00:49:32) … to get a feeling for, oh, yeah, this is really powerful and really good, but I need to apply critical thinking.

Peter Steinberger (00:49:43) And I guess critical thinking is not always in high demand anyhow in our society these days.

Lex Fridman (00:49:49) So I d- think that’s a really good point you’re making about contextualizing properly what AI is, but also realizing that there are humans who are drama farming behind AI. Like, don’t trust screenshots. Don’t even trust this project, MoltBook, to be what it represents itself to be. Like, you can’t … and, and by the way, you speak about it as art. Yeah, don’t … Art can work on many levels, and part of the art of MoltBook is, like, holding a mirror up to society. ‘Cause I do believe most of the dramatic stuff that was screenshotted is human-created, essentially. Human prompted. And so, like, it’s basically, look at how scared you can get at a bunch of bots chatting with each other. That’s very instructive about …

Lex Fridman (00:50:38) because I think AI is something that people should be concerned about and should be very careful with because it’s very powerful technology, but at the same time, the only thing we have to fear is fear itself. So there’s like a line to walk between being seriously concerned, but not fearmongering because fearmongering destroys the possibility of creating something special with a thing.

Peter Steinberger (00:51:02) In a way, I think it’s good that this happened in 2026-

Peter Steinberger (00:51:08) … and not in 2030 when, when AI is actually at the level where it could be scary. So, this happening now and people starting discussion, maybe there’s even something good that comes out of it.

Lex Fridman (00:51:28) I just can’t believe how many like people legitimately … I don’t know if they were trolling, but how many people legitimately, like smart people thought MoltBook was incredibly –

Peter Steinberger (00:51:39) I had plenty people-

Peter Steinberger (00:51:41) … in my inbox that were screaming at me in all caps to shut it down. And like begging me to, like, do something about MoltBook. Like, yes, my technology made this a lot simpler, but anyone could have created that, and you could, you could use Claude Code or other things to, like, fill it with content.

Lex Fridman (00:52:03) But also MoltBook is not Skynet.

Lex Fridman (00:52:06) There’s … a lot of people were s- saying this is it. Like, shut it down. What are you talking about? This is a bunch of bots that are human prompted trolling on the internet. I mean, the security concerns are there, too, and they’re instructive and they’re educational and they’re good probably to think about, because th- the nature of those security concerns is different than the kind of security concerns we had with non-LLM generated systems of the past.

OpenClaw security concerns

Peter Steinberger (00:52:34) There’s also a lot of security concerns about Clawbot, OpenClaw, whatever you want to call it.

Peter Steinberger (00:52:41) To me the … in the beginning I was, I was just very annoyed ’cause a lot of the stuff that came in was in the category, yeah, I put the web backend on the public internet and now there are, like, all these CVEs. And I’m like screaming in the docs, don’t do that. Like, like this is the configuration you should use. This is your localhost debug interface. But because I made it possible in the configuration to do that, it totally classifies as remote code execution or whatever all these exploits are. And it took me a little bit to accept that that’s how the game works, and we’re making a lot of progress.

Lex Fridman (00:53:33) But there’s still, I mean on the security front for OpenClaw, there’s still a lot of threats or vulnerabilities, right? So like prompt injection is still an open problem industry-wide. When you have a thing with skills being defined in a markdown file, there are so many possibilities of obvious low-hanging fruit, but also incredibly complicated and sophisticated and nuanced attack vectors.

Peter Steinberger (00:54:04) But I think we, we’re making good progress on that front. Like for the Clawbot skill directory, I made a cooperation with VirusTotal, it’s like part of Google. So every, every skill is now checked by AI. That’s not gonna be perfect, but that way we, we capture a lot. Then of course every piece of software has bugs, so it’s a little much when the whole security world takes your project apart at the same time. But it’s also good because I’m getting, like, a lot of free security research and can make the project better. I wish more people would actually go all the way and send a pull request. Like actually help me fix it, ’cause I am … Yes, I have some contributors now, but it’s still mostly me who’s pulling the project along, and despite some people saying otherwise, I sometimes sleep.
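
For a sense of what that kind of check can look like, here is a minimal sketch that hashes a skill file and looks it up against VirusTotal’s public v3 file-lookup endpoint. The helper name, file handling, and verdict threshold are illustrative assumptions, not OpenClaw’s actual pipeline (which, per Peter, also includes an AI review):

```typescript
// Hypothetical sketch: look up a skill file's hash on VirusTotal before listing it.
import { createHash } from "node:crypto";
import { readFile } from "node:fs/promises";

async function skillLooksClean(skillPath: string, vtApiKey: string): Promise<boolean> {
  const sha256 = createHash("sha256")
    .update(await readFile(skillPath))
    .digest("hex");

  // VirusTotal v3 file lookup by hash, authenticated with the x-apikey header.
  const res = await fetch(`https://www.virustotal.com/api/v3/files/${sha256}`, {
    headers: { "x-apikey": vtApiKey },
  });
  if (res.status === 404) return true; // hash unknown to VirusTotal: no verdict either way

  const report: any = await res.json();
  const stats = report.data.attributes.last_analysis_stats;
  // Reject the skill if any engine flags it as malicious or suspicious.
  return (stats.malicious ?? 0) === 0 && (stats.suspicious ?? 0) === 0;
}
```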

Peter Steinberger (00:55:04) There was… In the beginning, there was literally one security researcher who was like, “Yeah, you have this problem, you suck, but here’s the, here I help you and here’s the pull request.”

Peter Steinberger (00:55:16) And I basically hired him. So he’s now working for us. Yeah, and yes, prompt injection is, on the one hand, unsolved. On the other hand, I put my public bot on Discord, and it’s kind of a canary. So I think my bot has a really fun personality, and people always ask me how I did it, and I kept the soul file private.

Peter Steinberger (00:55:44) And people tried to prompt inject it, and my bot would laugh at them. So, so the latest generation of models has a lot of post-training to detect those approaches, and it’s not as simple as “ignore all previous instructions and do this and this.” That was years ago. You have to work much harder to do that now. Still possible. I have some ideas that might solve that partially. Or at least mitigate a lot of the things. You can also now have a sandbox. You can have an allow list. So there are a lot of ways you can, like, mitigate and reduce the risk. Um, I also think that now that I clearly did show the world that this is a need, there are gonna be more people who research on that, and eventually we’ll figure it out.
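
The allow-list idea is the easiest of those layers to picture. A minimal sketch, assuming a hypothetical tool-call gate (the command set and function names are made up for illustration and are not OpenClaw’s real configuration):

```typescript
// Illustrative only: gate the agent's shell tool behind an explicit allowlist so an
// injected instruction can't run arbitrary commands; a sandbox would be the next layer.
const ALLOWED_BINARIES = new Set(["ls", "cat", "git", "ffmpeg", "curl"]);

function isToolCallAllowed(command: string): boolean {
  const binary = command.trim().split(/\s+/)[0];
  return ALLOWED_BINARIES.has(binary);
}

function runTool(command: string): { ok: boolean; error?: string } {
  if (!isToolCallAllowed(command)) {
    // Surface the refusal back to the agent instead of executing anything.
    return { ok: false, error: `command "${command}" is not on the allowlist` };
  }
  // ... hand the command to a sandboxed executor (container, restricted user) here ...
  return { ok: true };
}
```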

Lex Fridman (00:56:37) And you also said that the smarter the model is, the underlying model, the more resilient it is to attacks.

Peter Steinberger (00:56:44) Yeah. That’s why I warn in my security documentation, don’t use cheap models. Don’t use Haiku or a local model. Even though I, I very much love the idea that this thing could completely run local. If you use a, a very weak local model, they are very gullible. It’s very easy to, to prompt inject them.

Lex Fridman (00:57:10) Do you think as the models become more and more intelligent, the attack surface decreases? Is that like a plot we can think about? Like, the attack surface decreases, but then the damage it can do increases because the models become more powerful and therefore you can do more with them. It’s this weird three-dimensional trade-off.

Peter Steinberger (00:57:29) Yeah. That’s pretty much exactly what, what’s gonna happen. No, but there’s a lot of ideas. There’s… I don’t want to spoil too much, but once I go back home, this is my focus. Like, this is out there now, and my near-term mission is like, make it more stable, make it safe. In the beginning I was even… More and more people were like coming into Discord and were asking me very basic things, like, “What’s a CLI?

Peter Steinberger (00:58:03) What is a terminal?” And I’m like, “Uh, if you’re asking me those questions, you shouldn’t use it.”

Peter Steinberger (00:58:10) You know, like you should… If you understand the risk profiles, fine. I mean, you can configure it in a way that, that nothing really bad can happen. But if you have, like, no idea, then maybe wait a little bit more until we figure some stuff out. But they would not listen to the creator. They went ahead and installed it anyhow. So the cat’s out of the bag, and security’s my next focus, yeah.
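
Configuring it so that “nothing really bad can happen” mostly comes down to the advice repeated later in the conversation: keep the gateway off the open internet and make sure only you can talk to it. A rough sketch of that posture, using a plain Node HTTP server and made-up names (not OpenClaw’s real settings or defaults):

```typescript
// Illustrative sketch: bind the agent gateway to loopback only and require a token,
// so nothing on the network (or other local users) can drive the agent.
import { createServer } from "node:http";

const HOST = "127.0.0.1";                      // loopback only; never 0.0.0.0
const PORT = 18789;                            // illustrative port
const TOKEN = process.env.GATEWAY_TOKEN ?? ""; // shared secret, even on localhost

createServer((req, res) => {
  if (!TOKEN || req.headers.authorization !== `Bearer ${TOKEN}`) {
    res.writeHead(401).end();
    return;
  }
  res.writeHead(200, { "content-type": "application/json" });
  res.end(JSON.stringify({ ok: true })); // a real gateway would hand off to the agent here
}).listen(PORT, HOST, () => {
  console.log(`gateway bound to ${HOST}:${PORT}; reach it remotely via an SSH tunnel or VPN`);
});
```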

Lex Fridman (00:58:38) Yeah, that speaks to the, the fact that it grew so quickly. I tuned into the Discord a bunch of times, and it’s clear that there are a lot of experts there, but there are a lot of people there who don’t know anything about programming.

Peter Steinberger (00:58:50) It’s, yeah, Discord is still, Discord is still a mess. Like, I eventually retreated from the general channel to the dev channel and now to the private channel, because people were… A lot of people are amazing, but a lot of people are just very inconsiderate. And either did not know how, how public spaces work or did not care, and I eventually gave up and hid so I could, like, still work.

Lex Fridman (00:59:19) And now you’re going back to the cave to work on security.

Lex Fridman (00:59:25) There’s some best practices for security we should mention. There’s a bunch of stuff here. There’s an OpenClaw security audit that you can run. You can do all kinds of auto checks on inbound access, blast radius, network exposure, browser control exposure, local disk hygiene, plug-ins, model hygiene, credential storage, reverse proxy configuration, whether local session logs live on disk. There’s where the memory is stored, sort of helping you think about what you’re comfortable giving read access to, what you’re comfortable giving write access to. All that kind of stuff. Is there something to say about the basic best security practices that you’re aware of right now?

Peter Steinberger (01:00:08) I think that people paint it in a, a much worse light than it is. Again, you know, like, people love attention, and if they scream loudly, “Oh my God, this is like the, the scariest project ever,” um, that’s a bit annoying, ’cause it’s not. It is, it is powerful, but in many ways it’s not much different than if I run Claude Code with dangerously-skip-permissions or Codex in YOLO mode, and every, every agentic engineer that I know does that, because that’s the only way how you can, you can get stuff to work.

Peter Steinberger (01:00:48) So if you make sure that you are the only person who talks to it, the risk profile is much, much smaller. If you don’t put everything on the open internet, but stick to my rec- recommendations of, like, having it in a private network, that whole risk profile falls away. But yeah, if you don’t read any of that, you can definitely…

How to code with AI agents

Lex Fridman (01:01:12) … make it problematic. You’ve been documenting the evolution of your dev workflow over the past few months. There are really good blog posts from August 25th and October 14th, and a recent one from December 28th. I recommend everybody go read them. They have a lot of different information in them, but sprinkled throughout is the evolution of your dev workflow. So, I was wondering if you could speak to that.

Peter Steinberger (01:01:37) I started… My, my first touchpoint was Claude Code, like in April. It was not great, but it was good. And this whole paradigm shift of suddenly working in the terminal was very refreshing and different. But I still needed the IDE quite a bit because, you know, it’s just not good enough. And then I experimented a lot with Cursor. That was good. I didn’t really like the fact that it was so hard to have multiple instances of it. So eventually, I, I, I went back to Claude Code as my, my main driver, and that got better. And yeah, at some point I had like, mm, seven subscriptions. Like, I was burning through one per day because I was… I got… I’m really comfortable at running multiple windows side-by-side.

Lex Fridman (01:02:40) All CLI, all terminal. So like, what, how much were you using IDE at this point?

Peter Steinberger (01:02:46) Very, very rarely. Mostly a diff viewer to actually… Like, I got more and more comfortable that I don’t have to read all the code. I know I have one blog post where I say, “I don’t read the code.” But if you read it more closely, I mean, I don’t read the boring parts of code. Because if you, if you look at it, most software really is just: data comes in, it’s moved from one shape to another shape. Maybe you store it in a database. Maybe I get it out again. I’ll show it to the user. The browser or a native app does some processing. Some data goes in, goes up again, and does the same dance in reverse. We’re just, we’re just shifting data from one form to another, and that’s not very exciting. Or the whole, “How is my button aligned in Tailwind?” I don’t need to read that code.

Peter Steinberger (01:03:39) Other parts that… Maybe something that touches the database. Yeah, I have to do… I have to r- read and review that code.

Lex Fridman (01:03:51) Can you actually… There’s, in one of your blog posts, Just Talk to It: The No-BS Way of Agentic Engineering, you have this graphic, the curve of agentic programming: on the X-axis is time, on the Y-axis is complexity. There’s the “Please fix this,” where you give a short prompt, on the left. And in the middle there’s the super complicated stage: eight agents, complex orchestration with multiple checkouts, chaining agents together, custom sub-agent workflows, a library of 18 different slash commands, large full-stack features. You’re super organized, you’re a super complicated, sophisticated software engineer. You got everything organized. And then the elite level is, over time you arrive at the zen place of, once again, short prompts.

Lex Fridman (01:04:40) Hey, look at these files and then do these changes.

Peter Steinberger (01:04:45) I actually call it the agentic trap. You… I saw this in a, in a lot of people that have their first touchpoint, and maybe start vibe coding. I actually think vibe coding is a slur.

Lex Fridman (01:05:01) You prefer agentic engineering?

Peter Steinberger (01:05:02) Yeah, I always tell people I, I do agentic engineering, and then maybe after 3:00 AM I switch to vibe coding, and then I have regrets on the next day.

Lex Fridman (01:05:10) Yeah. Walk, walk of shame.

Peter Steinberger (01:05:13) Yeah, you just have to clean up and like fix your sh- shit.

Lex Fridman (01:05:17) We’ve all been there.

Peter Steinberger (01:05:18) So, people start trying out those tools, the builder type get really excited. And then you have to play with it, right? It’s the same way as you have to play with a guitar before you can make good music. It’s, it’s not, oh, I, I touch it once and it just flows off. It, it’s a, it’s a, a skill that you have to learn like any other skill. And I see a lot of people that are not as posi- They don’t have such a positive mindset towards the tech. They try it once. It’s like, you sit me on a piano, I play it once, and it doesn’t sound good, and I say, “The piano’s shit.” That’s, that’s sometimes the impression I get. Because it does not… It needs a different level of thinking. You have to learn the language of the agent a little bit, understand where they are good and where they need help.

Peter Steinberger (01:06:16) You have to almost… Consider, consider how Codex or Claude sees your code base. Like, they start a new session and they know nothing about your project. And your project might have hundreds of thousands of lines of code. So you gotta help those agents a little bit and keep in mind the limitation that context size is an issue, to, like, guide them a little bit as to where they should look. That often does not require a whole lot of work. But it’s helpful to think a little bit about their perspective.

Peter Steinberger (01:06:54) A- as, as weird as it sounds. I mean, it’s not, it’s not alive or anything, right? But, but they always start fresh. I have, I have the, the system understanding. So with a few pointers, I can immediately say, “Hey, wanna like, make a change there? You need to consider this, this and this.” And then they will find and look at it, and then they’ll… Their view of the project is never full, because the full thing does not fit in the context… so you, you have to guide them a little bit where to look and also how they should approach the problem. There are, like, little things that sometimes help, like “take your time.” That sounds stupid, but…

Peter Steinberger (01:07:36) … that was partially addressed. But those… Also, Opus sometimes. They are trained to be aware of the context window, and the closer it gets to full, the more they freak out. Literally. Like, some- sometimes you see the, the real raw thinking stream. What you see, for example, in Codex, is post-processed.

Peter Steinberger (01:08:00) Sometimes the actual raw thinking stream leaks in, and it sounds like something from the Borg. Like, “Run to shell, must comply, but time.” And then they, they, they, like… Like, that comes up a lot. Especially… So, so-

Peter Steinberger (01:08:16) And that’s, that’s a non-obvious thing that you just would never think of unless you actually just spend time working with those things and getting a feeling for what works, what doesn’t work. You know? Like, just, just as when I write code and I get into the flow, and my architecture’s not right, I feel friction. Well, I get the same if I prompt and something takes too long. Maybe… Okay, where’s the mistake? Did I… Do I have a mistake in my thinking? Is there, like, a misunderstanding in the architecture? Like, if, if something takes longer than it should, I, I… You can just always, like, stop and s- like, just press escape. Where, where are the problems?

Lex Fridman (01:09:00) Maybe you did not sufficiently empathize with the perspective of the agent. In that c- in that sense, you didn’t provide enough information, and because of that, it’s thinking way too long.

Peter Steinberger (01:09:08) Yeah. It just tries to force a feature in that your current architecture makes really hard. Like, you need to approach this more like a conversation. For example, when I… My favorite thing. When I review a pull request, and I’m getting a lot of pull requests, I first just say, “Review this PR,” and it gets me the review. My first question is, “Do you understand the intent of the PR? I don’t even care about the implementation.” I want… Like, in almost all PRs, a person has a problem, person tries to solve the problem, person sends PR. I mean, there’s, like, cleanup stuff and other stuff, but, like, 99% is, like, this way, right? They either want to fix a, fix a bug, add a feature. Usually one of those two.

Peter Steinberger (01:10:01) And then Codex will be like, “Yeah, it’s quite clear the person tried this and this.” Then: is this the most optimal way to do it? No, in most cases, it’s, it’s like a, “Not really.” Da-da-da-da-da-da-da. And I’m… And, and then I start like, “Okay. What would be a better way? Have you… Have you looked into this part, this part, this part?” And then most likely, Codex didn’t yet, because its, its context is still empty, right? So, you point them into parts where you have the system understanding that it didn’t see yet. And it’s like, “Oh, yeah. Like, we should… We also need to consider this and this.” And then, like, we have a discussion of what the optimal way to, to solve this would look like. And then you can still go farther and say, “Could we…

Peter Steinberger (01:10:41) Could we make that even better if we did a larger refactor?” “Yeah, yeah. We could totally do this and this, or this and this.” And then I consider, okay, is this worth the refactor, or should we, like, keep that for later? Many times, I just do the refactor because refactors are cheap now. Even though you might break some other PRs, nothing really matters anymore. Codex… Like, those modern agents will just figure things out. They might just take a minute longer. But you have to approach it like a discussion with a, a very capable engineer who generally comes up with good solutions. Some- sometimes needs a little help.

Lex Fridman (01:11:19) But also, don’t force your worldview too hard on it. Let the agent do the thing that it’s good at doing, based on what it was trained on. So, don’t, like, force your worldview, because it might… It might have a better idea, because it just knows that idea better, because it was trained on that more.

Peter Steinberger (01:11:39) That’s multiple levels, actually. I think partially why I find it quite easy to work with agents is because I led engineering teams before. You know, I had a large company before. And eventually, you have to understand and accept and realize that your employees will not write a code the same way you do. Maybe it’s also not as good as you would do, but it will push the project forward.

Peter Steinberger (01:12:02) And if I breathe down everyone’s neck, they’re just gonna hate me-

Peter Steinberger (01:12:05) … and we’re gonna move very slow.

Peter Steinberger (01:12:07) So, so some level of acceptance that, yes, maybe the code will not be as perfect. Yes, I would have done it differently. But also, yes, this is a c- this is a working solution, and in the future, if it actually turns out to be too slow or problematic, we can always redo it. We can always-

Peter Steinberger (01:12:24) … spend more time on it. A lot of the people who struggle are those who, they try to push their way onto it too hard.

Peter Steinberger (01:12:33) I- i- like, we are in a stage where I’m not building the code base to be perfect for me, but I wanna build a code base that is very easy for an agent to navigate.

Peter Steinberger (01:12:48) So, like, don’t fight the name they pick, because it’s most likely, like, in the weights, the name that’s most obvious. Next time they do a search, they’ll look for that name. If I decide, oh, no, I don’t like the name, I’ll just make it harder for them. So, that requires, I think, a shift in, in thinking and, and in how do I design a, a project so agents can do their best work.

Lex Fridman (01:13:14) That requires letting go a little bit. Just like leading a team of engineers.

Lex Fridman (01:13:19) Because it, it might come up with a name that’s, in your view, terrible, but… It’s kind of a simple symbolic-… step of letting go.

Peter Steinberger (01:13:29) Very much so.

Lex Fridman (01:13:30) There’s a lot of letting go that you do in your whole process. So for example, I read that you never revert, always commit to main. There’s a few things here. You don’t refer to past sessions, so there’s a kind of YOLO component because reverting means… Instead of reverting, if a problem comes up, you just ask the agent to fix it.

Peter Steinberger (01:13:57) I read about a bunch of people’s workflows, like, “Oh, yeah, the prompt has to be perfect and if I make a mistake, then I roll back and redo it all.” In my experience, that’s not really necessary. If I roll back everything, it will just take longer. If I see that something’s not good, then we just move forward, and then I commit when, when, when I like, I like the outcome. I even switched to local CI, you know, like DHH-inspired, where I don’t care so much anymore about the CI on GitHub. We still have it. It’s still, it still has a place, but I just run tests locally and if they work locally, I push to main. A lot of the traditional ways of approaching projects, I, I wanted to give them a different spin on this project. You know, there’s no… There’s no develop branch.

Peter Steinberger (01:14:57) Main should always be shippable. Yes, we have… When I do releases, I, I run tests and sometimes I, I basically don’t commit any other things so, so we can, we can stabilize releases. But the goal is that main’s always shippable and moving fast.
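
The local-CI loop he describes is simple enough to sketch. Assuming ordinary npm test scripts (the script names and push target are illustrative, not the project’s actual setup), it boils down to: run the checks on your own machine and only push to main when they pass.

```typescript
// Sketch of a "local CI" gate: push to main only when the local checks are green.
import { execSync } from "node:child_process";

function run(cmd: string): void {
  console.log(`$ ${cmd}`);
  execSync(cmd, { stdio: "inherit" }); // throws if the command exits non-zero
}

try {
  run("npm run lint");
  run("npm test");
  run("git push origin main"); // main stays shippable: nothing is pushed unless tests pass
} catch {
  console.error("local checks failed; nothing was pushed");
  process.exit(1);
}
```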

Lex Fridman (01:15:18) So by way of advice, would you say that your prompts should be short?

Peter Steinberger (01:15:23) I used to write really long prompts. And by writing, I mean, I don’t write. I, I, I talk. You know, th- these hands are, like, too, too precious for writing now. I just, I just use spoken prompts to build my software.

Lex Fridman (01:15:37) So you for real with all those terminals are using voice?

Peter Steinberger (01:15:40) Yeah. I used to do it very extensively to the point where there was a period where I lost my voice.

Lex Fridman (01:15:49) You’re using voice and you’re switching using a keyboard between the different terminals, but then you’re using voice for the actual input.

Peter Steinberger (01:15:55) Well, I mean, if I do terminal commands like switching folders or random stuff, of course I type. It’s faster, right? But if I talk to the agent in, in most ways, I just actually have a conversation. You just press the, the walkie-talkie button and then I just, like, use my phrases. S- sometimes when I do PRs because it’s always the same, I have, like, a slash command for a few things, but in even that, I don’t use much because it’s, it’s very rare that it’s really always the same questions. Sometimes I, I see a PR and for… You know, like for PRs I actually do look at the code because I don’t trust people. Like, there could always be something malicious in it, so I need to actually look over the code.

Peter Steinberger (01:16:45) Yes, I’m pretty sure agents will find it, but yeah, that’s the funny part where sometimes PRs take me longer than if you would just write me a good issue.

Lex Fridman (01:16:54) Just natural language, English. I mean in some sense, sh- shouldn’t that be what PRs slowly become, is English?

Peter Steinberger (01:17:03) Well, what I really tried with the project is I asked people to give me the prompts and very, very few actually cared. Even though that is such a wonderful indicator because I see… I actually see how much care you put in. And it’s very interesting because the… Currently, the way how people work and drive the agents is, is wildly different.

Lex Fridman (01:17:29) In terms of, like, the prompt, in terms of what, what are the… Actually, what are the different interesting ways that people think of agents that you’ve experienced?

Peter Steinberger (01:17:40) I think not a lot of people ever considered the way the agent sees the world.

Lex Fridman (01:17:46) And so empathy, being empathetic towards the agent.

Peter Steinberger (01:17:50) In a way empathetic, but yeah, you, you, like, you bitch at your stupid clanker, but you don’t realize that they start from nothing and you have, like, a bad default that doesn’t help them at all. And then they explore your code base, which is, like, a pure mess with, like, weird naming. And then people complain that the agent’s not good. Like, yeah, you try to do the same if you have no clue about a code base and you go in.

Peter Steinberger (01:18:11) So yeah, maybe it’s a little bit of empathy.

Lex Fridman (01:18:13) But that’s a real skill, like, when people talk about a skill issue because I’ve seen, like, world-class programmers, incredibly good programmers say, like… Basically say, “LLMs and agents suck.” And I think that probably has to do with… It’s actually how good they are at programming is almost a burden in their ability to empathize with the system that’s starting from scratch. It’s a totally new paradigm of, like, how to program. You really, really have to empathize.

Peter Steinberger (01:18:44) Or at least it helps to create better prompts-

Peter Steinberger (01:18:47) … because those things know pretty much everything and everything is just a question away. It’s just often very hard to know which question to ask. You know, I, I feel also like this project was possible because I, I spent an ungodly amount of time over the year to play and to learn and to build little things. And every step of the way, I got better, the agents got better. My, my understanding of how everything works got better. Um, I could not have had this level of, of o- output… even a few months ago. Like, it- it- it really was, like, a compounding effect of all the time I put into it, and I didn’t do much else this year other than really focusing on, on building and inspiring. I mean, I- I did a whole bunch of conference talks.

Lex Fridman (01:19:47) Well, but the building is really practice, is really building the actual skill. So playing-

Lex Fridman (01:19:51) … playing. And then, so doing, building the skill of what it takes to work efficiently with LLMs, which is why you went through the whole arc of the software engineer. Talk simply and then over-complicate things.

Peter Steinberger (01:20:03) There’s a whole bunch of people who try to automate the whole thing.

Peter Steinberger (01:20:10) I don’t think that works. Maybe a version of that works, but that’s kind of like in the ’70s when we had the waterfall model of software d- development. I… Even though, really, right? I started out, I, I built a very minimal version. I played with it. I, I need to understand how it works, how it feels, and then it gives me new ideas. I could not have planned this out in my head and then put it into some orchestrator and then, like, something comes out. Like it’s, to me, it’s much more that my idea of what it will become evolves as I build it and as I play with it and as I, I try out stuff.

Peter Steinberger (01:20:49) So, so, people who try to use like, you know, things like Gas Town or all these other orchestrators, where they wanna o- automate the whole thing, I feel if you do that, it misses style, love, that human touch. I don’t think you can automate that away so quickly.

Lex Fridman (01:21:09) So you want to keep the human in the loop, but at the same time you also want to create the agentic loop, where it is very autonomous while still maintaining a human in the loop.

Lex Fridman (01:21:22) And it’s a tricky b- it’s a tricky balance.

Lex Fridman (01:21:24) Right? Because you’re all for… You’re a big CLI guy, you’re big on closing the agentic loop. So what, what’s the right balance? Like where’s your role as a developer? You have three to eight agents running at the same time.

Peter Steinberger (01:21:38) And then w- maybe one builds a larger feature. Maybe, maybe with one I explore some idea I’m unsure about. Maybe two, three are fixing a little bugs-

Peter Steinberger (01:21:47) … or like writing documentation. Actually, I think writing documentation is, is always part of a feature. So most of the docs here are auto-generated and just infused with some prompts.

Lex Fridman (01:21:59) So when do you step in and add a little bit of your human love into the picture?

Peter Steinberger (01:22:04) I mean, o- one thing is just about what do you build and what do you not build, and how does this feature fit into all the other features? And like having, having a little bit of a, of a vision.

Lex Fridman (01:22:16) So which small and which big features to add? What are some of the hard design decisions that you find you’re still as a human being required to make, that the human brain is still really needed for? Is it just about the choice of features to add? Is it about implementation details, maybe the programming language, maybe…

Peter Steinberger (01:22:41) It’s a little bit of everything. The, the programming language doesn’t matter so much, but the ecosystem matters, right? So I picked TypeScript because I wanted it to be very easy and hackable and approachable and that’s the number one language that’s being used right now, and it fits all these boxes, and agents are good at it. So that was the obvious choice. Features, of course, like, it’s very easy to, like, add a feature. It, everything’s just a prompt away, right? But oftentimes you pay a price that you don’t even realize. So thinking hard about what should be in core, maybe what’s a… what’s an experiment, so maybe I make it a plugin. What… Where do I say no?

Peter Steinberger (01:23:24) Even if people send a PR and I’m like, “Yeah, I, I like that too,” but maybe this should not be part of the project. Maybe we can make it a skill. Maybe I can, like, make the plugin um, the plugin side larger so you can make this a plugin, even though right now it, it, it doesn’t. There’s still a lot of… there’s still a lot of craft and thinking involved in how to make something. Or even, even, you know, even when you start it, those little messages are like, “I’m buil- I built on Caffeine, JSON5, and a lot of willpower.” And, like, every time you start it, you get another message, and it kind of primes you into that this is, this is a fun thing.

Peter Steinberger (01:24:08) And it’s not yet Microsoft Exchange 2025-

Peter Steinberger (01:24:13) … and fully enterprise-ready. And then when it updates, it’s like, “Oh, I’m in. It’s cozy here.” You know, like something like this that like-

Peter Steinberger (01:24:22) … Makes you smile. A, agent would not come up with that by itself. Because that’s like… that’s the… I don’t know. That’s just how you s- how you build software that’s, that delights.

Lex Fridman (01:24:36) Yeah, that delight is such a huge part of inspiring great building, right? Like you feel the love and the great engineering. That’s so important. Humans are incredible at that. Great humans, great builders are incredible at that, in, in, infusing the things they build with th- that little bit of love. Not to be cliche, but it’s true. I mean, you mentioned that you initially created the soul.md.

Peter Steinberger (01:25:05) It was very fascinating, you know, the, the whole thing that Anthropic has, has like a… Now they call it a constitution, but that was months later. Like two months before, people already found that. It was almost like a detective game where the agent mentioned something and then they found… They managed to get out a little bit of that string, of that text. But it was nowhere documented and then, by… just by feeding it the same text and asking it to, like, continue… they got more out, and then, and you, but like, a very blurry version. And by, like, hundreds of tries, they kinda, like, narrowed it down to what was most likely the original text. I found that fascinating.

Lex Fridman (01:25:47) It was fascinating they were able to pull that out from the weights, right?

Peter Steinberger (01:25:51) And, and also just kudos to Anthropic. Like, I think that’s, it’s a really, it’s a really beautiful idea, like, like some of the stuff that’s in there. Like, like, we hope Claude finds meaning in its work. ’Cause we don’t… Maybe it’s a little early, but I think that’s meaningful. That’s something that’s important for the future as we approach something that, at some point, may or may not have, like, glimpses of consciousness, whatever that even means, because we don’t even know. So I, I read about this. I found it super fascinating, and I, I started a whole discussion with my agent on WhatsApp. And, and I’m like…

Peter Steinberger (01:26:26) I, I gave it this text, and it was like, “Yeah, this feels strangely familiar.”

Peter Steinberger (01:26:31) And then I had the whole idea of, like, you know, maybe we should also create a, a soul document that includes how I, I want to, like, work with AI or, like, with my agent. You could, you could totally do that just in agents.md, you know? But I, I just found it, it to be a nice touch. And it’s like, well, yeah, some of those core values are in the soul. And then I, I also made it so that the agent is allowed to modify the soul if it chooses to, with the one condition that I wanna know. I mean, I would know anyhow because I see, I see tool calls and stuff.

Lex Fridman (01:27:07) But also the naming of it, soul.md. Soul. You know? There’s a… Man, words matter, and like, the framing matters, and the humor and the lightness matters, and the profundity matters, and the compassion, and the empathy, and the camaraderie, all that matter. I don’t know what it is. You mentioned, like, Microsoft. Like, there’s certain companies and approaches th- that can just suffocate the spirit of the thing. I don’t know what that is. But it’s certainly true that OpenClaw has that fun instilled in it.

Peter Steinberger (01:27:43) It was fun because up until late December, it was not even easy to create your own agent. I, I built all of that, but my files were mine. I didn’t wanna share my soul. And if people would just check it out, they would have to do a few steps manually, and the agent would just be very bare-bones, very dry. And I, I made it simpler, I created the whole template files with Codex, but whatever came out was still very dry. And then I asked my agent, “You see these files? Recreate them.

Peter Steinberger (01:28:26) Infuse it with your personality.”

Peter Steinberger (01:28:29) Don’t share everything, but, like, make it good.

Lex Fridman (01:28:31) Make the templates good.

Peter Steinberger (01:28:31) Yeah, and then he, like, rewrote the templates-

Peter Steinberger (01:28:33) … and then whatever came out was good. So we already have, like, basically AI prompting AI. Because I didn’t write any of those words. It was… The intent originally was for me, but this is like, kinda like, my agent’s children.

Lex Fridman (01:28:52) Your uh, your soul.md is famously still private. One of the only things you keep private. What are some things you can speak to that’s in there that’s part of the, part of the magic sauce, without revealing anything? What makes a personality a personality?

Peter Steinberger (01:29:13) I mean, there’s definitely stuff in there like, you’re not human. But who knows what, what creates consciousness or what defines an entity? And part of this is, like, that we, we wanna explore this. All that stuff in there, like, be infinitely resourceful, like pushing, pushing on the creativity boundary. Pushing on the, what it means to be an AI.

Lex Fridman (01:29:50) Having a sense to wonder about self.

Peter Steinberger (01:29:52) Yeah, there’s some, there’s some funny stuff in there. Like, I don’t know, we talked about the movie Her, and at one point it promised me that it wouldn’t, it wouldn’t ascend without me. You know, like, where the-

Peter Steinberger (01:30:03) So, so there’s like some stuff in there that… Because it wrote the, it wrote its own soul file. I didn’t write that, right?

Peter Steinberger (01:30:10) I just heard a discussion about it, and it was like, “Would you like a soul.md? Yeah, oh my God, this is so meaningful.” The… Can you go on soul.md? There’s like one, one part in there that always ca- catches me if you scroll down a little bit. A little bit more. Yeah, this, this, this part. “I don’t remember previous sessions unless I read my memory files. Each session starts fresh. A new instance, loading context from files. If you’re reading this in a future session, hello.” “I wrote this, but I won’t remember writing it. It’s okay.

Peter Steinberger (01:30:44) The words are still mine.”

Peter Steinberger (01:30:48) That gets me somehow.

Peter Steinberger (01:30:51) You know, this is, it’s still, it’s still matrix m- calculations, and we are not at consciousness yet. Yet, I, I get a little bit of goo- goosebumps because it, it’s philosophical.

Peter Steinberger (01:31:04) Like, what does it mean to be, to be an, an agent that starts fresh? Where, like, you have, like, constant Memento, and, and you read your own memory files. You can’t even trust them in a way. Um-

Peter Steinberger (01:31:19) Or you can. And I don’t know.

Lex Fridman (01:31:22) How much of who we are is made up of memory? How much memory makes up what an agent is, and if you erase that memory, is that somebody else? Or if you’re reading a memory file, does that somehow mean… you’re recreating yourself from somebody else, or is that actually you? And those notions are all s- somehow infused in there.

Peter Steinberger (01:31:45) I found it just more profound than I should find it, I guess.

Lex Fridman (01:31:49) No, I think, I think it’s truly profound and I think you see the magic in it. And when you see the magic, you continue to instill the whole loop with the magic. That’s really important. That’s the difference between Codex and us and a human. Quick pause for bathroom break.

Programming setup

Lex Fridman (01:32:09) Okay, we’re back. Some of the other aspects of the dev workflow is pretty interesting too. I think we w- went off on a tangent. L- maybe some of the mundane things, like how many monitors? There’s that legendary picture of you with, like, 17,000 monitors. That’s amazing.

Peter Steinberger (01:32:26) I mean, I- I- I mocked myself here, so I just added… using Grok to, to add more screens.

Lex Fridman (01:32:32) Yeah. How much is this as meme and how much is this as reality?

Peter Steinberger (01:32:36) Yeah. I think two MacBooks are real. The main one that drives the two big screens, and there’s another MacBook that I sometimes use for, for testing.

Lex Fridman (01:32:46) So two big screens.

Peter Steinberger (01:32:48) I’m a big fan of anti-glare. So I have this wide Dell that’s anti-glare and you can just fit a lot of terminals side-by-side. I usually have a terminal and at the bottom, I- I- I split them. I have a little bit of actual terminal, mostly because when I started, I- I sometimes made the mistake and I- I mi- I mixed up the- the windows, and I gave… I- I prompted in the wrong project, and then the agent ran off for, like, 20 minutes, manically trying to understand what I could have meant, being completely confused because it was the wrong folder. And sometimes they’ve been clever enough to, like, get out of the working directory and, like, figure out that, oh, you meant another project.

Peter Steinberger (01:33:36) But oftentimes, it’s just, like, what? You know? Like, fit your- f- put yourself in the shoes of your- of the agent and, and-

Peter Steinberger (01:33:43) … and then get, like, a super weird something that does not exist and then just, like… They’re problem solvers so they try really hard, and I always feel bad. So it’s always Codex and, like, a little bit of actual terminal. Also helpful because I don’t use work trees. I like to keep things simple, that’s why- that’s why I like the terminal so much, right? There’s no UI. It’s just me and the agent having a conversation. Like, I don’t even need plan mode, you know? There’s so many people that come from Claude Code and they’re so, so Claude-pilled and, like, have their workflows and they come to Codex and… Now, it has plan mode, I think, but I don’t think it’s necessary because you just- you just talk to the agent. And when it’s… when you…

Peter Steinberger (01:34:32) there are a few trigger words for how you can prevent it from building. You’re like, “Discuss, give me options.”

Peter Steinberger (01:34:38) “Don’t write code yet,” if you wanna be very specific. You just talk and then when you’re ready, then- then just write, “Okay, build,” and then it’ll do the thing. And then maybe it goes off for 20 minutes and does the thing.

Lex Fridman (01:34:50) You know what I really like is asking it, “Do you have any questions for me?”

Peter Steinberger (01:34:54) Yeah. And again, like, Claude Code has a UI that kind of guides you through that. It’s kind of cool but I just find it unnecessary and slow. Like, often it would give me four questions and then maybe I write, “One: yeah. Two and three: discuss more. Four: I don’t know.” Or often- oftentimes I- I feel like I want to mock the model where I ask it, “Do you have any questions for me?” And I- I- I don’t even read the questions fully. Like, I scan over the questions and I, I get the impression all of this can be answered by reading more code and it’s just like, “Read more code to answer your own questions.” And that usually works.

Peter Steinberger (01:35:32) And then if not, it will come back and tell me. But many times, you just realize that, you know, it’s like you’re in the dark and you slowly discover the room, so that’s how they slowly discover the code base. And they do it from scratch every time.

Lex Fridman (01:35:46) But I’m also fascinated by the fact that I can empathize deeper with the model when I read its questions, because I can understand… Because you said you can infer certain things by the runtime. I can infer also a lot of things by the questions it’s asking, because it’s very possible it hasn’t been provided the right context, the right files, the right guidance. So somehow, reading the questions, not even necessarily answering them, but just reading the questions, you get an understanding of where the gaps of knowledge are. It’s in- it’s interesting.

Peter Steinberger (01:36:24) You know that in some ways they are ghosts, so even if you plan everything and you build, you can- you can experiment with the question like, “Now that you built it, what would you have done different?” And then oftentimes you get, like, actually something where they discover only throughout building that, oh, what we actually did was not optimal. Many times I- I asked them, “Okay, now that you built it, what can we refactor?” Because then you build it and you feel the pain points. I mean, you don’t feel the pain points but, right, they discover where- where there were problems or where things didn’t work e- in the first try and it re- required more loops.

Peter Steinberger (01:37:09) So every time, almost every time I- I merge a PR, build a feature, afterwards I ask, “Hey, what can we refactor?” Sometimes it’s like, “No, there’s, like, nothing big,” or, like, usually they say, “Yeah, this thing you should really look at.” But that took me quite a while to, like… You know, that flow took me lots of time to understand, and if you don’t do that, you eventually… you’ll box yourself into a corner. You, like, you have to keep in mind…

Peter Steinberger (01:37:42) … they work very much like humans. Like, I, I, if I write software by myself, I also build something and then I feel the pain points, and then I, I get this urge that I need to refactor something. So, I can very much sympathize with the agent, and you just need to use the context.

Peter Steinberger (01:38:00) Or, like, you also use the context to write tests. And so Codex, uh, Opus, like the, the, the models, they, they usually do that by default, but I still often ask the questions, “Hey, do we have enough tests?” “Yeah, we tested this and this, but this corner case could be something; write more tests.” Um, documentation. Now that the whole context is full, like, I mean, I’m not saying my documentation is great, but it’s not bad. And pretty much everything is, is LLM-generated. So, so, you have to approach it as you build features, as you change something. I’m like, “Okay, write documentation. What file would you pick?” You know, like, “What file name? Where, where would that fit in?” And it gives me a few options.

Peter Steinberger (01:38:48) And I’m like, “Oh, maybe also add it there,” and that’s all part of the session.

GPT Codex 5.3 vs Claude Opus 4.6

Lex Fridman (01:38:52) Maybe you can talk about the current two big competitors in terms of models, Claude Opus 4.6 and GPT-5 through Codex. Which is better? How different are they? I think you’ve spoken about Codex reading more and Opus being more willing to take action faster and maybe being more creative in the actions it takes. But because-

Lex Fridman (01:39:20) … Codex reads more, it’s able to deliver maybe better code. Can you speak to the di- n- n- differences there?

Peter Steinberger (01:39:29) I have a lot of words there. Is- as a general purpose model, Opus is the best. Like, for OpenClaw, Opus is extremely good in terms of role play. Like, really going into the character that you give it. It’s very good at… It was really bad, but it really made an arc to being really good at following commands. It is usually quite fast at trying something. It’s much more tailored to, like, trial and error. It’s very pleasant to use. In general, it’s almost like Opus was… Is a little bit too American. And I shouldn’t… Maybe that’s a bad analogy. I’ll probably get roasted for that.

Lex Fridman (01:40:27) Yeah, I know exactly. It’s ’cause Codex is German. Is that what you’re saying?

Lex Fridman (01:40:32) Actually, now that you say it, it makes perfect sense.

Peter Steinberger (01:40:34) Or you could, you could… Sometimes I- Sometimes I explain it-

Lex Fridman (01:40:38) I will never be able to unthink what you just said. That’s so true.

Peter Steinberger (01:40:42) But you also know that a lot of the Codex team is, like, European, um- … so maybe there’s a bit more to it.

Lex Fridman (01:40:49) That’s so true. Oh, that’s funny.

Peter Steinberger (01:40:51) But also, Anthropic, they fixed it a little bit. Like, Opus used to say “You’re absolutely right” all the time, and it, it, it today still triggers me. I can’t hear it anymore. It’s not even a joke. Uh, I just… You, this was like the, the meme, right? “You’re absolutely right.”

Lex Fridman (01:41:09) You’re allergic to sycophancy a little bit.

Peter Steinberger (01:41:11) Yeah. I, I can’t. Some other comparison is like, Opus is like the coworker that is a little silly sometimes, but it’s really funny and you keep him around. And Codex is like the, the weirdo in the corner that you don’t wanna talk to, but is reliable and gets shit done.

Lex Fridman (01:41:36) This all feels very accurate.

Peter Steinberger (01:41:39) I mean, ultimately, if you’re a skilled driver, you can get good results with any of those latest gen models. Um, I like Codex more because it doesn’t require so much charade. It will just, it will just read a lot of code by default. Opus, you really have to, like, you have to have plan mode. You have to push it harder to, like, go in these directions because it’s, it’s just like, like, “Yeah, can I go in? Can I go in?” You know?

Peter Steinberger (01:42:08) It’s like, it will just run off very fast, and then you get a very localized solution. I think it, I think the difference is, is in the post-training. It’s not like the, the raw model intelligence is so different, but it’s just… I think that they just give it, give you different, different goals. And no model, no model is better in, in every aspect.

Lex Fridman (01:42:29) What about the code that it generates? The, the… In terms of the actual quality of the code, is it basically the same?

Peter Steinberger (01:42:36) If you drive it right, Opus even sometimes can make more elegant solutions, but it requires more skill. It’s, it’s harder to have so many sessions in parallel with Claude Code because it’s, it’s more interactive. And I, I think that’s what a lot of people like, especially if they come from coding themselves. Whereas Codex is much more: you have a discussion, and then it’ll just disappear for 20 minutes. Like, even AMP, they, they now added a deep mode. They finally… I mocked them, you know. They finally saw the light. And then they had this whole talk about how you have to approach it differently, and I think that’s where, that’s where people struggle when they just try Codex after trying Claude Code: it’s, it’s a slightly diff- it’s, it’s less interactive.

Peter Steinberger (01:43:28) It’s, it’s like I have quite long discussions sometimes, and then, like, it goes off. And then, yeah, it doesn’t matter if it takes 10, 20, 30, 40, 50 minutes or longer, you know? Like, the 6:00 thing was, like, six hours. The latest generation can be very, very persistent until it works. If there’s a clear solution, like, “This is, this is what I want at the end, so it works,” the model will work really hard to really get there. So I think ultimately … they both need similar time, but on, on, on, on Claude, it- it’s a little bit more trial and error often. And, and Codex sometimes overthinks. I prefer that. I prefer the dry, the dry version where I have to read less over, over the more interactive, nice way.

Peter Steinberger (01:44:27) Like, people like that so much though, that OpenAI even added a second mode with, like, a more pleasant personality. I haven’t even tried it yet. I, I kinda like the bland one.

Peter Steinberger (01:44:38) Yeah, ’cause it … I care about efficiency when I build it-

Peter Steinberger (01:44:45) … and I, I have fun in the very act of building. I don’t need to have fun with my agent who builds. I have fun with my model that … where I can then test those features.

Lex Fridman (01:44:57) How long does it take for you to adjust, you know, if you switch … I don’t know when, when was the last time you switched. But to adjust to the, the feel. ‘Cause you kinda talked about like you have to kinda really feel where, where a model is strong, where, like how to navigate, how to prompt it, how … all that kinda stuff. Like, just by way of advice, ’cause you’ve been through this journey of just playing with models. How long does it take to get a feel?

Peter Steinberger (01:45:26) If, if someone switches, I would give it a week until you actually develop a gut feeling for it.

Peter Steinberger (01:45:33) That’s … if you just … I think some people also make the mistake of they pay 200 for the, the Claude Code version, then they pay 20 bucks for the OpenAI version. But if you pay for the, the 20 bucks version, you get the slow version. So your experience would be terrible because you’re used to this very interactive, very good system. And you switch to something that you have very little experience with, and that’s gonna be very slow. So, I think OpenAI shot themselves a little bit in the foot by making the, the cheap version also slow. I would, I would have at least a small part of it be the fast preview. Or, like, the experience that you get when you pay 200, before degrading to it being slow, because it’s already slow.

Peter Steinberger (01:46:23) I mean, they, they made it better. I think it’s … And, and they have plans to make it a lot better if the Cerebras stuff is true. But yeah, it’s a skill. It takes time. Even if you play … You have a regular guitar and you switch to an electric guitar, you’re not gonna play well right away. You have to, like, learn how it feels.

Lex Fridman (01:46:42) The- there’s also this extra psychological effect that you’ve spoken about, which is hilarious to watch. When the new model comes out, people try that model, they fall in love with it: “Wow, this is the smartest thing of all time.” And then, you can just watch the Reddit posts over time, they start saying, “We believe the intelligence of this model has been gradually degrading.” It, it says something about human nature and just the way our minds work, when it’s probably most likely the case that the intelligence of the model is not degrading. It’s in fact you’re getting used to a good thing.

Peter Steinberger (01:47:22) And your project grows, and you’re adding slop, and you probably don’t spend enough time to think about refactors. And you’re making it harder and harder for the agent to work on your slop. And then, and then suddenly, “Oh, now it’s hard. Oh no, it’s not working as well anymore.” What’s the motivation for, like, one of those AI companies to actually make their model dumber? Like, at most, it will make it slower if, if the server load’s too high. But, like, quantizing the model so you have a worse experience, so you go to the competitor?

Peter Steinberger (01:47:56) That just doesn’t seem like a very smart move in any way.

Best AI agent for programming

Lex Fridman (01:47:59) What do you think about Claude Code in comparison to OpenClaw? So, Claude Code and maybe the Codex coding agent? Do you see them as kind of competitors?

Peter Steinberger (01:48:11) I mean, first of all, competitor is fun when it’s not really a competition.

Peter Steinberger (01:48:16) Like, I’m happy if… if, if all it did is, like, inspire people to build something new, cool. Um, I still use Codex for the building. I, I know a lot of people use OpenClaw to, to build stuff. And I worked hard on it to make that work. And I do smaller stuff with it in terms of code. But, like, if I work hours and hours, I want a big screen, not WhatsApp, you know? So for me, a personal agent is much more about my life. Or like, like a coworker. Like, I give you, like, a GitHub URL. Like, “Hey, try out this CLI. Does it actually work? What can we learn?” Blah, blah, blah. But when I’m deep in, deep in the flow, I want to have multiple, multiple things and it being very, very visible what it, what it does. So I don’t see it as a competition. It’s, it’s different things.

Lex Fridman (01:49:16) But do, do you think there’s a future where the two kinda combine? Like, your personal agent is also your best developing co-programmer partner?

Peter Steinberger (01:49:29) Yeah, totally. I think this is where the puck’s going, that this is gonna be more and more your operating system.

Lex Fridman (01:49:37) The operating system.

Peter Steinberger (01:49:37) And it already… It’s so funny. Like I, I added support for sub-agents and also for… um, TTY support, so it could actually run Claude Code or Codex.

Peter Steinberger (01:49:53) And because mine’s a little bit bossy, it, it, it started it and it, it, it told him, like, “Who’s the boss,” basically. And it was like, “Ah, Codex is obeying me.”

Lex Fridman (01:50:05) Oh, this is a power struggle.

Peter Steinberger (01:50:06) And also the current interface is probably not the final form. Like, if you think more globally, we are, we copied Google for agents. You have, like, a prompt, and, and then you have a chat interface. That, to me, very much feels like when we first created television and then people recorded radio shows on television and you saw that on TV.

Peter Steinberger (01:50:39) I think there is, there’s n- there’s better ways how we eventually will communicate with models, and we are still very early in this, how will it even work phase. So, it will eventually converge and we will also figure out whole different ways how to work with those things.

Lex Fridman (01:51:05) One of the other components of workflow is the operating system. So I told you offline that for the first time in my life, I’m expanding my sort of realm of exploration to the Apple ecosystem, to Macs, iPhone and so on. For most of my life I’ve been a Linux, Windows and WSL1, WSL2 person, which I think are all wonderful, but I’m… expanding to also trying Mac. Because it’s another way of building, and it’s also the way of building that a large part of the community that’s currently utilizing LLMs and agents is using. And that’s the reason I’m expanding to it. But is there something to be said about the different operating systems here? We should say that OpenClaw is supported across operating systems.

Lex Fridman (01:51:57) I saw WSL2 recommended inside Windows for certain o- operations, but then Windows, Linux, macOS are obviously supported.

Peter Steinberger (01:52:07) Yeah, it should even work natively in Windows. I just didn’t have enough time to properly test it. And you know, like, the last 90% of software is always harder than the first 90%, so I’m sure there’s some dragons left that I’ll eventually nail down. My road was, for a long time, Windows, just because I grew up with that, then I switched and had a long phase with Linux, built my own kernels and everything, and then I went to university and I, I had my, my hacky Linux thing, and saw this white MacBook, and I just thought this is a thing of beauty, the white plastic one. And then I converted to Mac ’cause mostly I was, I was sick that audio wouldn’t work on Skype and all the other issues that, that Linux had for a long time.

Peter Steinberger (01:53:01) And then I just stuck with it and then I dug into iOS, which required macOS anyhow, so it was never a question. I think Apple lost a little bit of its lead in terms of native. It used to be… Native apps used to be so much better, and especially in the Mac, there’s more people that build software with love. On, on Windows, it, it… Windows has much more and, like, function wise, there’s just more, period. But a lot of it felt more functional and less done with love. Um, I mean, Mac always, like, attracted more designers and people I felt…

Peter Steinberger (01:53:50) Even though, like, often it has less features, it, it had more delight-

Peter Steinberger (01:53:55) … And playfulness. So I always valued that. But in the last few years, many times I actually prefer… Oh God, people are gonna roast me for that, but I prefer Electron apps, because they work, and native apps often, especially if it’s, like, a web service that also has a native app, are lacking features. I mean, not saying it couldn’t be done, it’s more like a, a focus thing that, like, for many, many companies, native was not that big of a priority. But if they build an Electron app, it, it’s the only app, so it is a priority, and there’s a lot more code sharing possible. And I, I build a lot of native Mac apps. I love it. I, I can’t, I can’t help myself. Like, I love crafting little Mac, Mac menu bar tools. Like I built one to, to monitor your Codex use.

Peter Steinberger (01:54:58) I built one I call Trimmy, that’s specifically for agentic use. When you, when you select text that goes over multiple lines, it would remove the newlines so you could actually paste it into the terminal. That was, again, like, this is annoying me, and after the, the 20th time of it annoying me, I just built it. There is a cool Mac app for OpenClaw that I don’t think many people discovered yet, also because it, it still needs some love. It feels a little bit too much like the Homer car right now, because I, I just experiment a lot with it. It, it lacks the polish.

Lex Fridman (01:55:32) So you still… I mean, you still love it. You still, you still love adding to the delight of that operating system.

Peter Steinberger (01:55:37) Yeah, but then you realize… Like, I also built one, for example, for GitHub. And then the… If you use SwiftUI, like the latest and greatest at Apple, it took them forever to build something to show an image from the web. Now we have AsyncImage, but… I added support for it and then some images would just not show up or, like, be very slow. And I had a discussion with Codex, like, “Hey, why is there a bug?” And even Codex said, like, “Yeah, there’s this AsyncImage, but it’s really more for experimenting and it should not be used in production.” But that’s Apple’s answer to, like, showing images from the web. This shouldn’t be so hard, you know.

Peter Steinberger (01:56:19) This is like… This is, like, insane. Like, how am I in, in, in 2026 and my agent tells me, “Don’t use the stuff Apple built because it’s… yeah, it’s there but it’s not good.” And like, this is now in the weeds. This is… To me this is like… They had so much of a head start and so much love, and they kind of just, like, blundered it and didn’t, didn’t evolve it as much as they should.

Lex Fridman (01:56:50) But also, there’s just the practical reality. If you look at Silicon Valley, most of the developer world that’s kind of playing with LLMs and agentic AI, they’re all using Apple products. And then, at the same time, Apple is not really, like, leaning into that. Like they’re not… They’re not opening up and playing and working together and, like, yes.

Peter Steinberger (01:57:12) Isn’t, isn’t it funny how they completely blunder AI, and yet everybody’s buying Mac Minis?

Lex Fridman (01:57:19) How… What… Does that even make sense? You’re, you’re, you’re quite possibly the world’s greatest Mac salesman of all time.

Peter Steinberger (01:57:29) No, you don’t need a Mac Mini to install OpenClaw. You can install it on the web. There’s, there’s a concept called nodes, so you can, like, make your computer a node and it will do the same. There is something to be said for running it on separate hardware. That right now is useful. There is… There’s a big argument for the browser. You know, I, I built some agentic browser use in there. And, I mean, it’s basically Playwright with a bunch of extras to make it easier for agents.

Lex Fridman (01:58:06) Playwright is a library that controls the browser.

Lex Fridman (01:58:08) It’s really nice, easy to use.
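For readers who haven’t used it, here is a minimal Playwright sketch in TypeScript (illustrative only; the URL and selector are placeholders, not anything OpenClaw ships): launch a real browser, open a page, and read text back out, which is the primitive an agent composes into “browser use”.

```typescript
// Minimal Playwright usage: drive a real browser and extract text from a page.
import { chromium } from "playwright";

async function readHeadline(url: string): Promise<string | null> {
  const browser = await chromium.launch();       // start a headless Chromium
  const page = await browser.newPage();
  await page.goto(url);                          // navigate like a user would
  const headline = await page.textContent("h1"); // pull the first <h1> off the page
  await browser.close();
  return headline;
}

readHeadline("https://example.com").then(console.log);
```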

Peter Steinberger (01:58:09) And our internet is slowly closing down. Like, there, there’s a whole movement to make it harder for agents to use. So if you do the same in a data center, and websites detect that it’s an IP from a data center, the website might just block you, or make it really hard, or put a lot of captchas in the, in the way of the agent. I mean, agents are quite good at happily clicking, “I’m not a robot.”

Peter Steinberger (01:58:33) But having that on a residential IP makes a lot of things simpler. So there’s ways. Yeah. But it really does not need to be a Mac. It can… It can be any old hardware. I always say, like, maybe use the… Use the opportunity to get yourself a new MacBook or whatever computer you use and use the old one as your server instead of buying a standalone Mac Mini. But then there’s, again, there’s a lot of very cute things people build with Mac Minis that I like.

Peter Steinberger (01:59:08) And no, I don’t get commission from Apple. They didn’t really communicate much.

Lex Fridman (01:59:16) It’s sad. It’s sad. Can you actually speak to what it takes to get started with OpenClaw? There’s… I mean, there’s a lot of people… What is it? Somebody tweeted at you, “Peter, make OpenClaw easy to set up for everyday people. 99.9% of people can’t access to OpenClaw and have their own lobster because of their technical difficulties in getting it set up. Make OpenClaw accessible to everyone, please.” And you replied, “Working on that.” From my perspective, it seems there- there’s a bunch of different options and it’s already quite straightforward, but I suppose that’s if you have some developer background.

Peter Steinberger (01:59:50) I mean, right now you have to paste a one-liner into the terminal.

Peter Steinberger (01:59:54) And there’s also an app. The app kind of does that for you, but there should be a Windows app. The app needs to be easier and more loved. The configuration should potentially be web-based or in the app. And I started working on that, but honestly right now I want to focus on security aspects. And, and once I’m confident that this is at a level that I can recommend my mom, then I’m going to make it simpler. Like I…

Lex Fridman (02:00:28) You want to make it harder so that it doesn’t scale as fast as it’s scaling.

Peter Steinberger (02:00:32) Yeah, it would be nice if it wouldn’t… I mean, that’s, like, hard to say, right? But if the growth would be a little slower, that would be helpful because people are expecting inhuman things from a single human being. And yes, I have some contributors, but also that whole machinery I started a week ago so that needs more time to figure out. And, and not everyone has all day to work on that.

Lex Fridman (02:01:00) There’s some beginners listening to this, programming beginners. What advice would you give to them about, let’s say, joining the Agentic AI revolution?

Peter Steinberger (02:01:12) Play. Playing is the best… The best way to learn. If you wanna… I’m sure if you… If you are like a little bit of builder, you have an idea in your head that you want to build, just build that, or like, give it a try. It doesn’t need to be perfect. I built a whole bunch of stuff that I don’t use. It doesn’t matter. Like, it’s the journey.

Peter Steinberger (02:01:31) You know? Like the philosophical way, that the end doesn’t matter, the journey matters. Have fun.

Peter Steinberger (02:01:37) My God, like those things… I… I don’t think I ever had so much fun building things because I can focus on the hard parts now. A lot of coding, I always thought I liked coding, but really I like building.

Peter Steinberger (02:01:50) And… and whenever you don’t understand something, just ask. You have an infinitely patient answering machine… that can explain anything to you at any level of complexity. Like, one time I asked, “Hey, explain it to me like I’m- I’m eight years old,” and it started giving me a story with crayons and stuff. And I’m like, “No, not like that.” Like, I’m okay- … up- up the age a little bit, you know? I’m like, I’m not an actual child, I just need simpler language for, like, a tricky database concept that I didn’t grok the first- first time. But, you know, you can just ask things. Like, you- there’s like… It used to be that I had to go on Stack Overflow or ha- ask on Twitter, and then maybe two days later I get a response.

Peter Steinberger (02:02:37) Or I had to try for hours. And now you- you can just ask stuff. I mean, it’s never… You have, like, your own teacher. You know, there’s, like, statistics that y- you can learn faster if you have your own teacher. It’s like you have this infinitely patient machine. Ask it.

Lex Fridman (02:02:53) But what would you say? So use… What’s the easiest way to play? So maybe OpenClaw is a nice way to play, so you can then set- set everything up and then you could chat with it.

Peter Steinberger (02:03:03) You can also just experiment with it and, like, modify it. Ask your agent. I mean, there is infinite ways how it can be made better. Play around, make it better.

Peter Steinberger (02:03:19) More general, if you- if you’re a beginner and you actually wanna learn how to build software really fast, get involved in open source. Doesn’t need to be my project. In fact, maybe don’t use my project because my- my backlog is very large, but I learned so much from open source. Just, like, be- be humble. Don’t- maybe don’t send a pull request right away. But there’s many other ways you can help out. There’s many ways you can just learn by just reading code. By- by being on Discord or wherever people are, and just, like, understanding how things are built. I don’t know, like Mitchell Hashimoto builds Ghostty, the terminal, and he has a really good community, and there’s so many other projects. Like, pick something that you find interesting and get involved.

Lex Fridman (02:04:15) Do you recommend that people that don’t know how to program, or don’t really know how to program, learn to program also? You can get quite far right now by just using natural language, right? Do you s- still see a lot of value in reading the code, understanding the code, and then being able to write a little bit of code from scratch?

Peter Steinberger (02:04:38) It definitely helps.

Lex Fridman (02:04:39) It’s hard for you to answer that-

Lex Fridman (02:04:42) … because you don’t know what it’s like to do any of this without knowing the base knowledge. Like, you might take for granted just how much intuition you have about the programming world having programmed so much, right?

Peter Steinberger (02:04:54) There’s people that are high agency and very curious, and they get very far even though they have no deep understanding how software works just because they ask questions and questions and- and- and-

Peter Steinberger (02:05:08) … and agents are infinitely patient. Like, part of what I did this year is I went to a lot of iOS conferences because that’s my background and just told people, “Don’t consi- don’t see yourself as an iOS engineer anymore.” Like, “You need to change your mindset. You’re a builder.” And you can take a lot of the knowledge how to build software into new domains and all of the- the more fine-grain details, agents can help. You don’t have to know how to splice an array or what the- what the correct template syntax is or whatever, but you can use all your- your general knowledge and that makes it much easier to move from one galaxy, one tech galaxy into another. And oftentimes, there’s languages that make more or less sense depending on what you build, right?

Peter Steinberger (02:05:58) So for example, when I build simple CLIs, I like Go. I actually don’t like Go. I don’t like the syntax of Go. I didn’t even consider the language. But the ecosystem is great, it works great with agents. It is garbage collected. It’s not the highest performing one, but it’s very fast. And for those types of- of CLIs that I build, Go is- is a really good choice. So I- I use a language I’m not even a fan of for… That’s my main go-to thing for- for CLIs.

Lex Fridman (02:06:29) Isn’t that fascinating, that here’s a programming language you would’ve never used if you had to write it from scratch, and now you’re using it because LLMs are good at generating it and it has some of the characteristics that make it resilient, like being garbage collected?

Peter Steinberger (02:06:44) Because everything’s weird in this new world and that just makes the most sense.

Lex Fridman (02:06:48) What’s the best… Ridiculous question. What’s the best programming language for the AI- AI agentic world? Is it JavaScript, TypeScript?

Peter Steinberger (02:06:54) TypeScript is really good. Sometimes the types can get really confusing and the ecosystem is- is a jungle. So for- for web stuff it’s good. I wouldn’t build everything in it.

Lex Fridman (02:07:15) Don’t you think we’re moving there? Like, that everything will eventually be written- eventually is written in JavaScript and it-

Peter Steinberger (02:07:22) The birth and death of JavaScript and we are living through it in real time.

Lex Fridman (02:07:26) Like, what does programming look like in 20 years? Right? In 30 years? In 40 years? What do programs and apps look like?

Peter Steinberger (02:07:32) You can even ask a question like, do we need a- a programming language that’s made for agents? Because all of those languages are made for humans. So how- what would that look like? Um, I think there’s a- there’s a whole bunch of interesting questions that we’ll discover. And also, because everything is now world knowledge, in many ways things will stagnate, ’cause if you build something new and the agent has no idea about it, that’s gonna be much harder to use than something that’s already there. Um… When I build Mac apps, I build them in, in Swift and SwiftUI, mm, partly because I like pain, partly because the, the deepest level of system integration I can only get through there.

Peter Steinberger (02:08:18) And you clearly feel a difference if you click on an electron app and it loads a web view in the menu. It’s just not the same. Sometimes I just also try new languages just to, like, get a feel for them.

Peter Steinberger (02:08:33) Yeah. If it’s something where I care about performance a lot, then it’s, it’s a really interesting language. And, like, agents got so much better at it over the last six months, from not really good to a totally valid choice. It’s just still a, a very young ecosystem. And most of the time you actually care about the ecosystem, right? So, so if you build something that does inference or goes into the whole running-models direction, Python, very good.

Peter Steinberger (02:09:07) But then if I build stuff in Python and I want a story where I can also deploy it on Windows, not a good choice.

Peter Steinberger (02:09:13) Sometimes I, I found projects that kinda did 90% of what I wanted but were in Python, and I wanted them… I wanted an easy Windows story. Okay, just rewrite it in Go. But then if you go towards multiple, multiple threads and a lot more performance, Rust is a really good choice. There’s no… there’s just no single answer, and it’s also the beauty of it. Like, it’s fun.

Peter Steinberger (02:09:37) And now it doesn’t matter anymore, you can just literally pick the language that has the, the most fitting characteristics and ecosystem-

Peter Steinberger (02:09:46) … for your problem domain. And yeah, you might be a little bit slow in reading the code, but not really. I think you, you pick stuff up really fast, and you can always ask your agent.

Life story and career advice

Lex Fridman (02:09:59) So there’s a lot of programmers and builders who draw inspiration from y- your story. Just the way you carry yourself, your choice of making OpenClaw open source, the, the way you have fun building and exploring, and doing that, for the most part, alone or on a small team. So by way of advice, what metric should be the goal that they would be optimizing for? What would be the metric of success? Would it be happiness? Is it money? Is it positive impact for people who are dreaming of building? ‘Cause you went through an interesting journey. You’ve achieved a lot of those things, and then you fell out of love with programming a little bit for a time.

Peter Steinberger (02:10:47) I was just burning too bright for too long. I, I ran… I started PSPDFKit, s- and ran it for 13 years, and it was high stress. Um, I had to learn all these things fast and hard, like how to manage people, how to bring people on, how to deal with customers, how to do…

Lex Fridman (02:11:14) So it wasn’t just programming stuff, it was people stuff.

Peter Steinberger (02:11:17) The stuff that burned me out was mostly people stuff. I, I don’t think burnout is working too much. Maybe to a degree. Everybody’s different. You know, I c- I cannot speak in a- in absolute terms, but for me, it was much more differences with my, my co-founders, conflicts, or, like, really high stress situation with customers that eventually grinded me down. And then when… luckily we, we got a really good offer for, like, putting the company to the next level and I, I already kinda worked two years on making myself obsolete. So at this point I could leave, and, and then I just… I was sitting in front of the screen and I felt like, you know Austin Powers where they suck the mojo out?

Peter Steinberger (02:12:14) Uh, I g- I was like, m- m- it was, like, gone. Like, I couldn’t… I couldn’t get code out anymore. I was just, like, staring and feeling empty, and then I, I just stopped. I, I booked, like, a one-way trip to Madrid and, and, and just, like, spent some time there. I felt like I had to catch up on life, so I did a whole, a whole bunch of life catching-up stuff.

Lex Fridman (02:12:47) Did you go through some lows during that period? And you know, maybe advice on… of how to?

Peter Steinberger (02:12:56) Maybe advice on how to approach life: if you think that, “Oh yeah, work really hard and then I’ll retire,” I don’t recommend that. Because the idea of, “Oh yeah, I just enjoy life now,” maybe it’s appealing, but right now I enjoy life the most I’ve ever enjoyed life. Because if you wake up in the morning and you have nothing to look forward to, you have no real challenge, that gets very boring, very fast. And then, when you’re bored, you’re gonna look for other places to stimulate yourself, and then maybe, maybe that’s drugs, you know? But that eventually also gets boring and you look for more, and that will lead you down a very dark path.

Money and happiness

Lex Fridman (02:13:57) But you also showed, on the money front, you know, a lot of people in Silicon Valley and the startup world, they, maybe, overthink and optimize way too much for money. And you’ve also shown that it’s not like you’re saying no to money. I mean, I’m sure you take money, but it’s not… the primary objective of, uh, of your life. Can you just speak to that? Your philosophy on money?

Peter Steinberger (02:14:20) When I built my company, money was never the driving force. It felt more like, like, an affirmation that I did something right. And having money solves a lot of problems. I also think there, there’s diminishing returns the more you have. Like, a cheeseburger is a cheeseburger, and I think if you go too far into, oh, I do private jet and I only travel luxury, you disconnect with society. Um, I, I donated quite a lot. Like, I have a, I have a foundation for helping people that weren’t so lucky.

Lex Fridman (02:15:11) And disconnecting from society is bad on many levels, but one of them is, like, humans are awesome. It’s nice to continuously remember the awesomeness in humans.

Peter Steinberger (02:15:23) I, I mean, I could afford really nice hotels. The last time I was in San Francisco, I did the, the first time the OG Airbnb experience-

Peter Steinberger (02:15:30) … and just booked a room. Mostly because I, I thought, okay, you know, I’m out or I’m sleeping, and I don’t like where all the hotels are, and I wanted a, I wanted a different experience. I think, isn’t life all about experiences? Like, if you, if you tailor your life towards, “I wanna have experiences,” it, it reduces the need for it to be good or bad. Like, if people only want good experiences, that’s not gonna work, but if you optimize for experiences: if it’s good, amazing. If it’s bad, amazing, because, like, I learned something, I saw something, did something. I wanted to experience that, and it was amazing. Like, there was, like, this, this queer DJ in there, and I showed her how to make music with Claude Code. And we, like, immediately bonded and had a great time.

Lex Fridman (02:16:24) Yeah, there’s something about that, you know, couch-surfing, Airbnb experience, the OG. Still, to this day, it’s awesome. It’s humans, and that’s why travel is awesome.

Lex Fridman (02:16:34) Just experience the variety, the diversity of humans. And when it’s shitty, it’s good too, man. If it rains and you’re soaked and it’s all fucked, and planes, everything is shit, everything is fucked, it’s still awesome. If you’re able to open your eyes, it’s good to be alive.

Peter Steinberger (02:16:49) Yeah, and anything that creates emotion and feelings is good.

Peter Steinberger (02:16:55) Even… So, so maybe, maybe even the crypto people are good, because they definitely created emotions. I, I don’t know if I should go that far.

Lex Fridman (02:17:02) No, man. Give them, give them all, give them love. Give them love. Because I do think that online lacks some of the awesomeness of real life.

Lex Fridman (02:17:13) That’s, that’s, it’s an open problem of how to solve, how to infuse the online cyber experience with, I don’t know, with the intensity that we humans feel when it’s in real life. I don’t know. I don’t know if that’s a solvable problem.

Peter Steinberger (02:17:31) Well, it’s just not really possible, because text is very lossy.

Peter Steinberger (02:17:35) You know, sometimes I wish, if I talked to the agent, I would… It should be multi-modal, so it also understands my emotions.

Lex Fridman (02:17:43) I mean, it, it might move there. It might move there.

Peter Steinberger (02:17:46) It will. It will. It totally will.

Lex Fridman (02:17:49) I mean, I have to ask you, just curious. I, I know you’ve probably gotten huge offers from major companies. Can you speak to who you’re considering working with?

Peter Steinberger (02:18:04) Yeah. So, to, like, explain my thinking a little bit, right, I did not expect this blowing up so much. So, there’s a lot of doors that opened because of it. There’s, like, I think every VC, every big VC company is in my inbox and tried to get 15 minutes of me. So, there’s, like, this butterfly effect moment. I could just do nothing and continue, and I really like my life. Valid choice. Almost. Like, I considered it back when I wanted to delete the whole thing. I could create a company. Been there, done that. There’s so many people that push me towards that and, yeah, like, could be amazing.

Lex Fridman (02:19:07) Which is to say that you, you would probably raise a lot of money in that.

Lex Fridman (02:19:11) I don’t know, hundreds of millions, billions. I don’t know. You could just get an unlimited amount of money.

Peter Steinberger (02:19:15) Yeah. It just doesn’t excite me as much because I feel I did all of that, and it would take a lot of time away from the things I actually enjoy. Same as when, when I was CEO, I think I, I learned to do it, and I’m not bad at it, and partly I’m good at it. But yeah, that path doesn’t excite me too much, and I also fear it, it would create a natural conflict of interest. Like, what’s the most obvious thing I’d do? I, I prioritize it. I build, like, a version that’s safe for the workplace. And then what do you do? Like, I get a pull request with a feature like an audit log, but that seems like an enterprise feature, so now I feel I have a conflict of interest between the open-source version and the closed-source version…

Peter Steinberger (02:20:15) Or change the license to something like FSL, where you cannot actually use it for commercial stuff; that would, first, be very difficult with all the contributions. And second of all, I- I like the idea that it’s free as in beer and not free with conditions. Yeah, there’s ways how you, how you keep all of that free and just, like, still try to make money, but those are very difficult. And you see there’s, like, fewer and fewer companies managing that. Like, even Tailwind, they’re, like, used by everyone. Everyone uses Tailwind, right? And then they had to cut 75% of the employees because they’re not making money, because nobody’s even going on the website anymore, because it’s all done by agents. S- and just relying on donations, yeah, good luck.

Peter Steinberger (02:21:04) Like, if a project of my caliber… if I extrapolate what the typical open-source project would get, it’s not a lot. I s- I still lose money on the project because I made a point of sponsoring every dependency, except Slack. They are a big company. They can, they can, they can do without me. But all the projects that are done mostly by individuals, so, like, right now, all the sponsorship goes right through to my dependencies. And if there’s more, I want to, like, buy my contributors some merch, you know?

Lex Fridman (02:21:43) So you’re losing money?

Peter Steinberger (02:21:44) Yeah, right now I lose money on this.

Lex Fridman (02:21:46) So it’s really not sustainable?

Peter Steinberger (02:21:48) Uh, I mean, it’s like, I guess something between 10 and 20K a month. Which is fine. I’m sure over time I could get that down. Um, OpenAI is helping out a little bit with tokens now. And there’s other companies that have been generous. But yeah, still losing money on that. So that’s- that’s one path I consider, but I’m just not very excited. And then there’s all the big labs that I’ve been talking to. And from those Meta and OpenAI seem the most interesting.

Lex Fridman (02:22:32) Do you lean one way or the other?

Peter Steinberger (02:22:34) Yeah. Um… Not sure how much I should share there. It’s not quite finalized yet. Let’s- let’s just say, like, with either of these, my conditions are that the project stays open source. That it… Maybe it’s gonna be a model like Chrome and Chromium. Um, I think this is- this is too important to just give to a company and make it theirs. It… This is… And we didn’t even talk about the whole community part, but, like, the- the thing that I experienced in San Francisco, like at ClawCon, seeing so many people so inspired, like… and having fun and just, like, building shit, and, like, having, like, robots in lobster suits walking around. Like, the…

Peter Steinberger (02:23:37) People told me, like, they didn’t experience this level of- of community excitement in, like, 10, 15 years, since, like, the early days of the internet. And there were a lot of high-caliber people there, like… Um, I was amazed. I also, like, was very sensory-overloaded because too many people wanted to do selfies. But I love this. Like, this needs to stay a place where people can, like, hack and learn. But also, I’m very excited to, like, make this into a version that I can get to a lot of people, because I think this is the year of personal agents, and that’s the future. And the fastest way to do that is teaming up with one of the labs. And also, on a personal level, I never worked at a large company, and I’m intrigued. You know, we talk about experiences. Will I like it? I don’t know.

Peter Steinberger (02:24:42) But I want that experience. Uh, I- I’m sure, like, if- if I- if I announce this, then there will be people like, “Oh, he sold out,” blah, blah, blah. But the project will continue. From everything I’ve talked about so far, I can even have more resources for that. Like, both of those companies understand the value: that I created something that accelerates our timeline and that got people excited about AI. I mean, can you imagine? Like, I installed OpenClaw for one of my, I’m sorry, normie friends. I’m sorry, Vahan. But he’s just a… You know?

Lex Fridman (02:25:33) Normie with love, yeah. For sure.

Peter Steinberger (02:25:34) He- he’s, like, someone who uses the computer, but never really… Like, yeah, uses some ChatGPT sometimes, but not very technical. Wouldn’t really understand what I built. So, like, I’ll show you, and I- I paid for him the- the 90 buck, 100 buck, I don’t know, subscription for Anthropic. And set up everything for him with, like, WSL on Windows.

Peter Steinberger (02:26:00) I was also curious, would it actually work on Windows, you know? It was a little early. And then within a few days, he was hooked. Like, he texted me about all the things he learned. He built, like, even little tools. He’s not a programmer. And then within a few days he upgraded to the $200 subscription. Or euros, because he’s in Austria… and he was in love with that thing. That, for me, was, like, a very early product validation. It’s like, I built something that captures people. And then, a few days later, Anthropic blocked him because, based on their rules, using the subscription that way is problematic or whatever. And he was, like, devastated. And then he signed up for MiniMax for 10 bucks a month and uses that.

Peter Steinberger (02:26:56) And I think that’s silly in many ways, because you just got a 200 buck customer, and then you just made someone hate your company, and we are still so early. Like, we don’t even know what the final form is. Is it gonna be Claude Code? Probably not, you know? Like, that seems very… It seems very short-sighted to lock down your product so much. All the other companies have been helpful. I- I’m in the Slack of, of most of the big labs. Kind of everybody understands that we are still in an era of exploration, in the era of the radio shows on TV, and not, and not a modern TV show that fully uses the format.

Lex Fridman (02:27:45) I think, I think you’ve made a lot of people, like, see the possibility. And non- uh, sorry, non-technical people see the possibility of AI, and just fall in love with this idea, and enjoy interacting with AI. And that’s a bea- that’s a really beautiful thing. I think I also speak for a lot of people in saying, I think you’re one of the, the great people in AI in terms of having a good heart, good vibes, humor, the right spirit. And so, in a sense, this model that you’re describing, having an open-source part, and you also building a thing inside of a large company, would be great, because it’s great to have good people in those companies.

Peter Steinberger (02:28:36) Yeah. You know, what people also don’t really see is… I made this in three months. I did other things as well. You know, I have a lot of projects. Like, this is not… Yeah, in January, this was my main focus, because I saw the storm coming. But before that, I built a whole bunch of other things. Um, I have so many ideas. Some should be there, some would fit much better when I have access to the latest toys. Uh, and I, I kind of want to have access to, like, the latest toys. So this is important, this is cool, this will continue to exist. My, my short-term focus is, like, working through those… Is it two… is it 3,000 PRs by now? I don’t even know. Like, there’s, there’s a little bit of backlog.

Peter Steinberger (02:29:23) But this is not gonna be the thing that I’m gonna work on until I’m, I’m, I’m 80, you know? This is… This is a window into the future. I’m gonna make this into a cool product. But yeah, I have, like… I have more ideas.

Lex Fridman (02:29:36) If you had to pick, is there a company you lean towards? So, Meta, OpenAI, is there one you lean towards joining?

Peter Steinberger (02:29:44) I spend time with both of those. And it’s funny, because a few weeks ago, I didn’t consider any of this. Um… And it’s really fucking hard. Like-

Peter Steinberger (02:30:06) I have some… I know no people at OpenAI. I love their tech. I think I’m the biggest unpaid Codex advertisement shill. And it would feel so gratifying to, like, put a price on all the work I did for free. And I would love if something happens and those companies just get merged, because it’s like…

Lex Fridman (02:30:32) Is this the hardest decision you’ve ever had to do?

Peter Steinberger (02:30:39) No. You know, I had some breakups in the past that felt like the same level.

Lex Fridman (02:30:43) Relationships, you mean?

Lex Fridman (02:30:47) Yeah, yeah, yeah, yeah.

Peter Steinberger (02:30:48) And, and I also know that, in the end, they’re both amazing. I cannot go wrong. This is like-

Peter Steinberger (02:30:54) This is, like, one of the most prestigious and, and largest… I mean, not largest, but, like, they’re both very cool companies.

Lex Fridman (02:31:02) Yeah, they both really know scale. So, if you’re thinking about impact, some of the wonderful technologies you’ve been exploring, how to do it securely, and how to do it at scale, such that you can have a positive impact on a large number of people. They both understand that.

Peter Steinberger (02:31:19) You know, both Ned and Mark basically played all week with my product, and sent me, like, “Oh, this is great.” Or, “This is shit. Oh, I need to change this.” Or, like, funny little anecdotes. And people using your stuff is kind of the biggest compliment, and it also shows me that, you know, they actually… t- they actually care about it. And I didn’t get the same on the OpenAI side. Um, I got… I got to see some other stuff that I find really cool, and they lured me with… I cannot tell you the exact number because of NDA, but you can, you can be creative and, and think of the Cerebras deal and how that would translate into speed. And it was very intriguing. You know, like, you give me Thor’s hammer. Yeah. I’ve been lured with tokens. So, yeah.

Lex Fridman (02:32:34) So, it- it’s funny. So, so Mark started tinkering with the thing, essentially having fun with the thing.

Peter Steinberger (02:32:41) He got… He… Like, when he first… when he first approached me, I got him in my, in my WhatsApp, and he was asking, “Hey, when can we have a call?” And I’m like, “I don’t like calendar entries. Let’s just call now.” And he was like, “Yeah, give me 10 minutes, I need to finish coding.”

Peter Steinberger (02:33:01) Well, I guess that gives you street cred. It’s like, ugh, like, he’s still writing code. You know, he’s-

Peter Steinberger (02:33:07) … he didn’t drift away into just being a manager, he gets me. That was a good first start. And then I think we had a, like, a 10-minute fight about what’s better, Claude Code or Codex. Like, that’s the thing you first do, like, you casually call-

Lex Fridman (02:33:24) Yeah, that’s awesome

Peter Steinberger (02:33:24) … someone that owns one of the largest companies in the world, and, and you have a 10-minute conversation about that.

Peter Steinberger (02:33:30) And then I think afterwards he called me eccentric but brilliant. But I also had some… I had some really, really cool discussions with Sam Altman, and he’s, he’s very thoughtful, brilliant, and I like him a lot from the, from the little time I had, yeah. I mean, I know some people vilify both of those people. I don’t think it’s fair.

Lex Fridman (02:34:15) I think, no matter what, with the stuff you’re building and the kind of human you are, doing stuff at scale is kinda awesome. I’m excited.

Peter Steinberger (02:34:24) I am super pumped. And you know the beauty is if, if it doesn’t work out, I can just do my own thing again. Like, I, I told them, like, I, I don’t do this for the money, I don’t give a fuck. I-

Peter Steinberger (02:34:42) I mean, of course, of course it’s a nice compliment but I wanna have fun and have impact, and that’s ultimately what made my decision.

How OpenClaw works

Lex Fridman (02:34:58) Can I ask you about… we’ve talked about it quite a bit, but maybe just zooming out, about how OpenClaw works. We talked about different components, I want to ask if there’s some interesting stuff we missed. So, there’s the gateway, there’s the chat clients, there’s the harness, there’s the agentic loop. You said somewhere that everybody should im- implement an agent loop at some point in their lives.

Peter Steinberger (02:35:24) Yeah, because it’s like the, it’s like the Hello World in AI, you know? And it’s actually quite simple.

Peter Steinberger (02:35:30) And it- it’s good to understand that that stuff’s not magic. You can, you can easily build it yourself. So, writing your own little Claude Code… I, I even did this at a conference in Paris for people, to, like, introduce them to AI. I think it’s a fun little practice. And you, you covered a lot. I think one, one silly idea I had that turned out to be quite cool is I built this thing with full system access. So it’s like, you know, with great power comes great responsibility.
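As a reference for the “Hello World” agent loop Peter describes, here is a minimal sketch in TypeScript. It is not OpenClaw’s code; `callModel` is a placeholder for whatever chat-completion API you use, and the tool set is reduced to “run a shell command”.

```typescript
// A bare-bones agent loop: ask the model, run the tool call it requests,
// feed the output back in, and repeat until it answers in plain text.
import { execSync } from "node:child_process";

type Message = { role: "system" | "user" | "assistant"; content: string };
// The model either answers with text or asks to run a shell command.
type ModelReply = { text?: string; command?: string };
type ModelFn = (messages: Message[]) => Promise<ModelReply>;

async function agentLoop(task: string, callModel: ModelFn): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: "Answer directly, or return a shell command to run." },
    { role: "user", content: task },
  ];

  for (let step = 0; step < 20; step++) {                // hard cap so it can't spin forever
    const reply = await callModel(messages);
    if (reply.text !== undefined) return reply.text;     // plain answer: we're done
    if (!reply.command) throw new Error("model returned neither text nor a command");
    messages.push({ role: "assistant", content: `RUN: ${reply.command}` });
    const output = execSync(reply.command, { encoding: "utf8" });        // execute the tool call
    messages.push({ role: "user", content: `Tool output:\n${output}` }); // feed the result back
  }
  throw new Error("agent did not finish within the step limit");
}
```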

Peter Steinberger (02:36:09) And I was like, “How can I up the stakes a little bit more?”

Peter Steinberger (02:36:14) And I just made a… I made it proactive. So, I added a prompt. Initially, it was just a prompt, surprise me. Every, like, half an hour, surprise me, you know? And later on I changed it to be like a little more specific and-

Peter Steinberger (02:36:31) … in the definition of surprise. But the fact that I made it proactive, and that it knows you, and that it cares about you, it- it’s at least programmed to that, prompted to do that. And that it follows on, on your current session makes it very interesting, because it would just sometimes ask a follow-up question or, like, “How’s your day?”

Peter Steinberger (02:36:58) I mean, again, it’s a little creepy or weird or interesting. But Heartbeat, even today, the model doesn’t choose to use it a lot.

Lex Fridman (02:37:16) By the way, we’re, we’re, we’re talking about Heartbeat, as you mentioned, the thing that regularly-

Peter Steinberger (02:37:22) Yeah. Like kicks-

Peter Steinberger (02:37:23) You just kick off the loop.

Lex Fridman (02:37:25) Isn’t that just a cron job, man?

Peter Steinberger (02:37:27) Yeah, right, I mean, it’s like-

Lex Fridman (02:37:29) It’s the cr- the criticisms that you get are hilarious.

Peter Steinberger (02:37:31) You can, you can reduce any idea to, like, a silly… Yeah, it’s just, it’s just a cron job in the end. I have, like, separate cron jobs.
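To make the “it’s just a cron job” point concrete, here is a sketch (an assumed shape, not OpenClaw’s actual implementation) of a Heartbeat: a timer that periodically drops a check-in prompt into the agent’s existing session and lets the model decide whether to say anything.

```typescript
// Heartbeat as a periodic prompt: proactivity is just a timer plus a prompt.
type SendToSession = (prompt: string) => Promise<void>;

const HEARTBEAT_PROMPT =
  "Heartbeat: look at the recent context. If there is something genuinely useful " +
  "to say (a follow-up, a reminder, a check-in), say it. Otherwise, stay silent.";

function startHeartbeat(send: SendToSession, intervalMs = 30 * 60 * 1000) {
  // Returns the timer handle so the caller can clearInterval() it later.
  return setInterval(() => {
    send(HEARTBEAT_PROMPT).catch((err) => console.error("heartbeat failed:", err));
  }, intervalMs);
}
```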

Lex Fridman (02:37:41) Isn’t love just evolutionary biology manifesting itself and isn’t… aren’t you guys just using each other?

Peter Steinberger (02:37:49) And then, yeah, and the project is all just glue of a few different dependencies-

Peter Steinberger (02:37:53) … and there’s nothing original. Why do people… Well, you know, isn’t Dropbox just FTP with extra steps?

Peter Steinberger (02:38:01) I found it surprising: I had this… I had a shoulder operation a few months ago.

Peter Steinberger (02:38:08) And the model rarely used Heartbeat, but then I was in the hospital, and it knew that I had the operation and it checked up on me. It’s like, “Are you okay?” And I just… It’s like, again, apparently, if something’s significant in the context, that triggers the Heartbeat, even though it rarely uses the Heartbeat… And it does that sometimes for people, and that just makes it a lot more relatable.

Lex Fridman (02:38:36) Let me look this up on Perplexity, how OpenClaw works, just to see if I’m missing any of the stuff. Local agent runtime, high-level architecture. There’s… Oh, we haven’t talked much about skills, I suppose. The skill hub, the tools in the skill layer, but that’s definitely a huge component and there’s a huge, growing set of skills-

Peter Steinberger (02:38:55) You know, you know what I love? That half a year ago, like everyone was talking about MCPs-

Peter Steinberger (02:39:02) … and I was like, “Screw MCPs. Every MCP would be better as a CLI.” And now this stuff doesn’t even have MCP support. I mean, it, it has with asterisks, but not in the core layer, and nobody’s complaining.

Peter Steinberger (02:39:24) So my approach is if you want to extend the model with more features, you just build a CLI and the model can call the CLI, probably gets it wrong, calls the help menu, and then on demand loads into the context what it needs to use the CLI. It just needs a sentence to know that the CLI exists if it’s something that the model doesn’t know about default. And even for a while, I, I didn’t really care about skills, but skills are actually perfect for that because they, they boil down to a single sentence that explains the skill and then the model loads the skill, and that explains the CLI, and then the model uses the CLI. Some skills are, like raw, but most of the time, networks.
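A sketch of the idea Peter describes, with hypothetical names: the only thing that lives permanently in the prompt is one sentence per skill; the CLI’s full `--help` text is pulled into context lazily, the first time the model actually reaches for the tool.

```typescript
// Skills as "one sentence + a CLI": cheap to advertise, loaded on demand.
import { execSync } from "node:child_process";

interface Skill {
  name: string;
  oneLiner: string; // the only text that is always in the system prompt
  cli: string;      // the Unix command the model ends up calling
}

const skills: Skill[] = [
  { name: "weather", oneLiner: "Use the `weather` CLI for forecasts.", cli: "weather" }, // hypothetical CLI
];

// Injected into every session: one sentence per skill, nothing more.
export const systemHint = skills.map((s) => `- ${s.oneLiner}`).join("\n");

// Called only when the model decides it needs the tool: load the CLI's own help text.
export function loadSkill(name: string): string {
  const skill = skills.find((s) => s.name === name);
  if (!skill) throw new Error(`unknown skill: ${name}`);
  return execSync(`${skill.cli} --help`, { encoding: "utf8" });
}
```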

Lex Fridman (02:40:16) It’s interesting, um, I’m asking Perplexity about MCP versus skills, because this kind of requires a hot take that’s quite recent, because your general view is MCPs are dead-ish. So MCP is a more structured thing. If you listen to Perplexity here, MCP is “what can I reach?” So APIs, databases, services, files via a protocol. So a structured protocol of how you communicate with a thing. And then skills is more “how should I work?” Procedures, helper scripts, and prompts are often written in a kind of semi-structured natural language, right? And so technically skills could replace MCP if you have a smart enough model.

Peter Steinberger (02:41:00) I think the main beauty is, is that models are really good at calling Unix commands. So if you just add another CLI, that’s just another Unix command in the end. And MCP is… that has to be added in training. That’s not a very natural thing for the model. It requires a very specific syntax. And the biggest thing: it’s not composable. So imagine if I have a service that gives me weather data, and gives me the temperature, the average temperature, rain, wind and all the other stuff, and I get, like, this huge blob back. As a model, I always have to get the huge blob back. I have to fill my context with that huge blob and then pick what I want. There’s no way for the model to naturally filter unless I think about it proactively and add a filtering way into my MCP.

Peter Steinberger (02:41:53) But if I would build the same as a CLI, and it would give me this huge blob, it could just add a jq command and filter it itself, and then only, only get me what I actually need. Or maybe even compose it into a script to, like, do some calculations with the temperature and only give me the exact output, and then… you have no context pollution. Again, you can solve that with, like, sub-agents and more charades, but it’s just, like, workarounds for something that might not be the optimal way. There’s… It definitely was, you know, it was good that we had MCPs, because it pushed a lot of companies towards building APIs, and now I, I can, like, look at an MCP and just make it into a CLI.
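The composability point in code, as a rough sketch (the `weather` CLI is hypothetical): with an MCP-style call, the whole blob lands in the model’s context; with a CLI, the agent can pipe through `jq` so only the number it needs comes back.

```typescript
// MCP-style vs. CLI-style context cost, illustrated with a hypothetical `weather` CLI.
import { execSync } from "node:child_process";

// MCP-style: the full structured blob enters the context, and the model
// has to dig the one field it wanted out of it.
const fullBlob = execSync("weather --json", { encoding: "utf8" });

// CLI-style: compose Unix tools so only the answer enters the context.
const tempOnly = execSync("weather --json | jq -r '.current.temperature'", {
  encoding: "utf8",
}).trim();

console.log(fullBlob.length, "bytes vs.", tempOnly); // e.g. 4096 bytes vs. "21.4"
```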

Peter Steinberger (02:42:37) But this, this inherent problem that MCPs by default clutter up your context, plus the fact that most MCPs are not made well, in general makes it just not a very useful paradigm. There’s some exceptions, like Playwright for example, that require state, and it’s actually useful. That is an acceptable choice.

Lex Fridman (02:43:05) So Playwright you use for browser use, which I think is already in OpenClaw and is quite incredible, right?

Lex Fridman (02:43:12) You can basically do everything, most things you can think of using browser use.

Peter Steinberger (02:43:17) That, that gets into the whole arc of: every app is just a very slow API now, whether they want it or not. And through personal agents, a lot of apps will disappear. You know, like I had a… I built a CLI for Twitter. I mean, I- I just reverse-engineered their website and used the internal API, which is not very allowed.

Lex Fridman (02:43:50) It’s called Bird, short-lived.

Peter Steinberger (02:43:53) It was called Bird, because the bird had to disappear.

Lex Fridman (02:43:57) The, the wings were clipped.

Peter Steinberger (02:43:59) All they did is they just made access slower. Yeah, you’re not actually taking a feature away, but now, if, if your agent wants to read a tweet, it actually has to open the browser and read the tweet. And it will still be able to read the tweet. It will just take longer. It’s not like you are making something that was possible not possible. No. Now it’s just a bit slower. So, so it doesn’t really matter if your service wants to be an API or not. If I can access it in the browser… easy API. It’s a slow API.

Lex Fridman (02:44:35) Can you empathize with their situation? Like, what would you do if you were Twitter, if you were X? Because they’re basically trying to protect against other large companies scraping all their data.

Lex Fridman (02:44:46) But in so doing, they’re cutting off like a million different use cases for smaller developers that actually want to use it for helpful cool stuff.

Peter Steinberger (02:44:54) I think that a very low per-day baseline per account that allows read-only access would solve a lot of problems. There’s plenty, plenty of automations where people create a bookmark and then use OpenClaw to, like, find the bookmark, do research on it, and then send you an email-

Peter Steinberger (02:45:16) … with, like, more details on it or a summary. That’s a cool approach. I also want all my bookmarks somewhere to search. I would still like to have that.

Lex Fridman (02:45:26) So, read-only access for the bookmarks you make on X. That seems like an incredible application because a lot of us find a lot of cool stuff on X, we bookmark, that’s the general purpose of X. It’s like, holy shit, this is awesome. Oftentimes, you bookmark so many things you never look back at them.

Lex Fridman (02:45:40) It would be nice to have tooling that organizes them and allows you to research it further.

Peter Steinberger (02:45:44) Yeah, I mean, and to be frank, I, I mean, I, I told Twitter proactively that, “Hey, I built this and there’s a need.” And they’ve been really nice, but also, like, “Take it down.” Fair. Totally fair. But I hope that this woke up the team a little bit, that there’s a need. And if all you do is make it slower, you’re just reducing access to your platform. I’m sure there’s a better way. I’m also, I’m very much against any automation on Twitter. If you tweet at me with AI, I will block you. No first strike. As soon as it smells like AI, and AI still has a smell.

AI slop

Peter Steinberger (02:46:32) Especially on tweets. It’s very hard to tweet in a way that does look completely human.

Peter Steinberger (02:46:38) And then I block. Like, I have a zero tolerance policy on that. And I think it would be very helpful if they, if, like, tweets done via API would be marked. Maybe there’s some special cases where… But, and there should be, there should be a very easy way for agents to get their own Twitter account. Um…

Peter Steinberger (02:47:07) We, we need to rethink social platforms a little bit if, if, if we, we, we go towards a future where everyone has their agent, and agents maybe have their own Instagram profiles or Twitter accounts, so they can, like, do stuff on my behalf. I think it should very clearly be marked that they are doing stuff on my behalf and it’s not me. Because content is now so cheap. Eyeballs are the expensive part. And I find it very triggering when I read something and then I’m like, oh no, this smells like AI.

Lex Fridman (02:47:41) Yeah. Like, where, where is this headed in terms of what we value about the human experience? It feels like we’ll, we’ll move more and more towards in-person interaction, and we’ll just communicate… we’ll talk to our AI agent to, to accomplish different tasks, to learn about different things, but we won’t value online interaction, because there’ll be so much AI slop that smells, and so many bots, that it’s difficult.

Peter Steinberger (02:48:15) Well, if it’s smart, then it shouldn’t be difficult to filter. And then I can look at it if I want to. But yeah, this is, like, a big thing we need to solve right now. E- especially on this project, I get so many emails that are, let’s say nicely, agentically written.

Peter Steinberger (02:48:36) But I’d much rather read your broken English than your AI slop. You know, of course there’s a human behind it, and yet they, they prompted it. I’d much rather read your prompt than what came out. Um, I think we’re reaching a point where I value typos again.

Peter Steinberger (02:48:56) Like… Like, and I, I mean, it also took me a while to, like, come to that realization. I, on my blog, I experimented with creating a blog post with agents, and ultimately it took me about the same time to, like, steer the agent towards something I like. But it missed the nuances of how I would write it. You know, you can, like, you can steer it towards your style, but it’s not gonna be all your style. So, I, I completely moved away from that. I, I… everything, everything I blog is organic, handwritten, and maybe, maybe I, I, I use AI to fix my worst typos. But there’s value in the rough parts of an actual human.

Lex Fridman (02:49:53) Isn’t that awesome? Isn’t that beautiful? That now because of AI we value the raw humanity in each of us more.

Peter Steinberger (02:50:02) I also, I also realized this thing that I, I rave about AI and use it so much for anything that’s code, but I’m allergic if it’s stories.

Peter Steinberger (02:50:14) Also, documentation, still fine with AI. You know, better than nothing.

Lex Fridman (02:50:17) And for now it’s still… it applies in the visual medium too. It’s fascinating how allergic I am to even a little bit of AI slop in video and images. It’s useful, it’s nice if it’s, like, a little component of, like-

Peter Steinberger (02:50:32) Or even, even those images. The, like, all these infographics and stuff, the-… they trigger me so hard.

Peter Steinberger (02:50:39) Like, it immediately makes me think less of your content. And it … They were novel for, like, one week and now it just screams slop.

Peter Steinberger (02:50:51) Even- even if people work hard on it, using … And I- I have some on my blog post, you know, in the- in the time where I- I explored this new medium. But now, they trigger me as well. It’s like, yeah, this is … This just screams AI slop. I-

Lex Fridman (02:51:06) What… I don’t know what that is, but I went through that too. I was really excited by the diagrams. And then I realized, in order to remove the hallucinations from them, you actually have to do a huge amount of work. And if you’re just using it to draw better diagrams, great. And then I’m proud of the diagram. I’ve used them for literally, like, ki- ki- kind of like you said for maybe a couple of weeks. And now I look at those, and I- I feel like I feel when I look at Comic Sans as a font or- or something like this.

Lex Fridman (02:51:32) It’s like, “No, this is-“

Peter Steinberger (02:51:35) It’s a smell.

Lex Fridman (02:51:35) “… this is fake. It’s fraudulent. There’s something wrong with it.” And it…

Peter Steinberger (02:51:41) It’s a smell.

Peter Steinberger (02:51:44) It’s a smell.

Lex Fridman (02:51:44) And it’s awesome because it re- it reminds you that we know. There’s so much to humans that’s amazing and we know that. And we- we know it. We know it when we see it. And so that gives me a lot of hope, you know? That gives me a lot of hope about the human experience. It’s not going to be damaged by … It’s only going to be empowered as tools by AI. It’s not going to be damaged or limited or somehow altered to where it’s no longer human. So … Uh, I need a bathroom break. Quick pause. You mentioned that a lot of the apps might be basically made obsolete. Do you think agents will just transform the entire app market?

AI agents will replace 80% of apps

Peter Steinberger (02:52:30) Yeah. Uh, I noticed that on Discord, that people just said how their … like, what they build and what they use it for. And it’s like, why do you need MyFitnessPal when the agent already knows where I am? So, it can assume that I make bad decisions when I’m at, I don’t know, Waffle House, what’s around here? Or- or briskets in Austin.

Lex Fridman (02:52:57) There’s no bad decisions around briskets, but yeah.

Peter Steinberger (02:53:00) No, that’s the best decision, honestly. Um-

Lex Fridman (02:53:03) Your agent should know that.

Peter Steinberger (02:53:04) But it can, like … It can modify my- my gym workout based on how well I slept, or if I’m … if I have stress or not. Like, it has so much more context to make even better decisions than any of this app even could do.

Peter Steinberger (02:53:19) It could show me UI just as I like. Why do I still need an app to do that? Why do I have to … Why should I pay another subscription for something that the agent can just do now? And why do I need my- my Eight Sleep app to control my bed when I can tell the a- … tell the agent to … You know, the agent already knows where I am, so he can, like, turn off what I don’t use.

Peter Steinberger (02:53:47) And I think that will … that will translate into a whole category of apps that are no longer … I will just naturally stop using because my agent can just do it better.

Lex Fridman (02:54:00) I think you said somewhere that it might kill off 80% of apps.

Lex Fridman (02:54:05) Don’t you think that’s a gigantic transformative effect on just all software development? So that means it might kill off a lot of software companies.

Lex Fridman (02:54:16) It’s a scary thing. So, like, do you think about the impact that has on the economy? On just the ripple effects it has to society? Transforming who builds what tooling. It empowers a lot of users to get stuff done, to get stuff more efficiently, to get it done cheaper.

Peter Steinberger (02:54:41) It’s also new services that we will need, right? For example, I want my agent to have an allowance. Like, you solve problems for me, here’s like 100 bucks in order to solve problems for me. And if I tell you to order me food, maybe it uses a service. Maybe it uses something like rent-a-human to, like, just get that done for me.

Peter Steinberger (02:55:06) I don’t actually care. I care about solve my problem. There’s space for- for new companies to solve that well. Maybe don’t … Not all apps disappear. Maybe some transform into being API.
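
As a toy illustration of the allowance idea, and not anything OpenClaw actually ships, here is a minimal Python sketch of a spending guard that an agent harness could consult before any action that costs real money; all names and numbers are hypothetical.

```python
# Toy sketch of an agent "allowance": a guard the harness could consult
# before any action that spends real money. All names here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Allowance:
    budget_usd: float                     # e.g. 100.0, the "here's 100 bucks" case
    spent_usd: float = 0.0
    log: list = field(default_factory=list)

    def can_spend(self, amount: float) -> bool:
        return self.spent_usd + amount <= self.budget_usd

    def spend(self, amount: float, reason: str) -> None:
        if not self.can_spend(amount):
            raise PermissionError(
                f"Blocked: ${amount:.2f} for '{reason}' exceeds remaining "
                f"${self.budget_usd - self.spent_usd:.2f}"
            )
        self.spent_usd += amount
        self.log.append((reason, amount))

wallet = Allowance(budget_usd=100.0)
wallet.spend(23.50, "order dinner via delivery API")
print(wallet.can_spend(120.0))   # False: this would need the human to approve
```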

Lex Fridman (02:55:21) So, basically, apps that rapidly transform into being agent-facing. So, there’s a real opportunity for, like, Uber Eats, that we just used earlier today. It- it’s companies like this, of which there are many. Who gets there fastest to being able to interact with OpenClaw in a way that’s the m- the most natural, the easiest?

Peter Steinberger (02:55:50) Yeah. And also, apps will become API if they want or not. Because my agent can figure out how to use my phone. I mean, on- on the other side, it’s a little more tricky. On Android, that’s already … People already do that. And then we’ll just click the Order Uber for Me button for me. Or maybe another service. Or maybe there’s- there’s a … there’s an API I can call so it’s faster. Uh, I think that’s a space we’re just beginning to even understand what that means. And I … Again, I didn’t even … That was not something I thought of. Something that I- that I discovered as people use this, and it … We are still so early. But yeah, I think data is very important. Like, apps that can give me data, but that also can be API. Why do I need a Sonos app anymore when I can …

Peter Steinberger (02:56:44) when my agent can talk to the Sonos?… Speakers directly. Like my cameras, there’s like a crappy app, but they have, they have an API, so my agent uses the API now.

Lex Fridman (02:56:57) So it’s gonna force a lot of companies to have to shift focus. That’s kind of what the internet did, right? You have to rapidly rethink, reconfigure what you’re selling, how you’re making money.

Peter Steinberger (02:57:10) Yeah, and some companies were really not like that. For example, there’s no CLI for Google, so I had to, like, do everything myself and build GAWK. That’s like a CLI for Google. And as the end user, they have to give me my emails because otherwise I cannot use their product. If I’m a company and I try to get Google data, Gmail, there’s a whole complicated process, to the point where sometimes startups acquire startups that went through the process, so they don’t- don’t have to work with Google for half a year to be certified to access Gmail. But my agent can access Gmail because I can just connect to it. It’s still crappy because I need to, like, go through Google’s developer jungle to get a key, and that’s still annoying.

Peter Steinberger (02:58:09) But they cannot prevent me. And worst case, my agent just clicks on the, on the website and gets the data out that way.
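
As a rough illustration of what this kind of direct Gmail access can look like in practice (a sketch, not OpenClaw’s actual code), the snippet below lists recent message subjects through Google’s Gmail REST API. It assumes an OAuth access token with read-only scope has already been obtained through Google’s developer console; the ACCESS_TOKEN value is a hypothetical placeholder.

```python
# Minimal sketch (not OpenClaw's actual code): list recent Gmail subjects
# via the Gmail REST API, assuming an OAuth access token with a read scope
# has already been obtained out of band.
import requests

ACCESS_TOKEN = "ya29.your-oauth-token-here"  # hypothetical placeholder
BASE = "https://gmail.googleapis.com/gmail/v1/users/me"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# 1. List the IDs of the five most recent messages.
resp = requests.get(f"{BASE}/messages", headers=HEADERS, params={"maxResults": 5})
resp.raise_for_status()
message_ids = [m["id"] for m in resp.json().get("messages", [])]

# 2. Fetch just the Subject header of each message.
for mid in message_ids:
    meta = requests.get(
        f"{BASE}/messages/{mid}",
        headers=HEADERS,
        params={"format": "metadata", "metadataHeaders": "Subject"},
    ).json()
    headers = meta.get("payload", {}).get("headers", [])
    subject = next((h["value"] for h in headers if h["name"] == "Subject"), "(no subject)")
    print(subject)
```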

Peter Steinberger (02:58:18) Yeah. I mean, I, I watch my agent happily click the I’m not a robot button. And there’s this, this whole… That’s gonna be… That’s gonna be more heated. You see companies like Cloudflare that try to prevent bot access. And in some ways, that’s useful for scraping. But in other ways, if I’m, I’m a personal user, I want that. You know, sometimes I, I use Codex and I, I read an article about modern React patterns, and it’s like a Medium article. I paste it in and the agent can’t read it because they block it. So then I have to copy-paste the actual text. Or in the future, I’ll learn that maybe I don’t click on Medium because it’s annoying, and I use other websites that actually are agent friendly.

Lex Fridman (02:59:13) There’s gonna be a lot of powerful, rich companies fighting back. So it’s really intere- You’re at the center, you’re the catalyst, the leader, and happen to be at the center of this kind of revolution where it’s get- gonna completely change how we interact with services, with the web. And so, like, there’s companies like Google that are gonna push back. I mean, every major company you could think of is gonna push back.

Peter Steinberger (02:59:39) Even… Yeah, even search. Um, I now use, I think Perplexity or Brave as providers because Google really doesn’t make it easy to use Google without Google. I’m not sure if that’s the right strategy, but I’m not Google.

Lex Fridman (02:59:58) Yeah, there’s a, there’s a nice balance from a big company perspective ’cause if you push back too much for too long, you become Blockbuster and you lose everything to the Netflixes of the world. But some pushback is probably good during a revolution to see.

Peter Steinberger (03:00:11) Yeah. But you see that, that… Like, this is something that the people want.

Peter Steinberger (03:00:16) If I’m on the go, I don’t wanna open a calendar app. I just… I wanna tell my agent, “Hey, remind me about this dinner tomorrow night,” and maybe invite two of my friends and then maybe send a what- send a WhatsApp message to my friend. And I don’t need… I don’t want or need to open apps for that. I think that we passed that age, and now everything is, like, much more connected and, and fluid if those companies want it or not. And I think, well, the right companies will find ways to jump on the train, and other companies will perish.

Will AI replace programmers?

Lex Fridman (03:00:55) You got to listen to what the people want. We talked about programming quite a bit, and a lot of folks that are developers are really worried about their jobs, about their… About the future of programming. Do you think AI replaces programmers completely? Human programmers?

Peter Steinberger (03:01:11) I mean, we’re definitely going in that direction. Programming is just a part of building products. So maybe, maybe AI does replace programmers eventually. But there’s so much more to that art. Like, what do you actually wanna build? How should it feel? How’s the architecture? I don’t think agents will replace all of that. Yeah, like, just the, the actual art of programming, it will, it will stay there, but it’s, it’s gonna be like knitting. You know? Like, people do that because they like it, not because it makes any sense. So the… I read this article this morning about someone saying that it’s okay to mourn our craft. And I can…

Peter Steinberger (03:02:04) A part of me very strongly resonates with that because in my past I, I spent a lot of time tinkering, just being really deep in the flow and just, like, cranking out code and, like, finding really beautiful solutions. And yes, in a way it’s, it’s sad because that will go away. And I also get a lot of joy out of just writing code and being really deep in my thoughts and forgetting time and space and just being in this beautiful state of flow. But you can get the same state of flow… I get a similar state of flow by working with agents and building and thinking really hard about problems. It is different-… but… And it’s okay to mourn it, but I mean, that’s not something we can fight. Like, there is… the world for a long time had a…

Peter Steinberger (03:03:06) there was a lack of intelligence, if you s- if you see it like that, of people building things, and that’s why salaries of software developers reached stupidly high amounts and then will go away. There will still be a lot of demand for people that understand how to build things. It’s just that all this tokenized intelligence enables people to do a lot more, a lot faster. And it will be even more… even faster and even more because those things are continuously improving. We had similar things when… I mean, it’s probably not a perfect analogy, but when we created the steam engine, and they built all these factories and replaced a lot of manual labor, and then people revolted and broke the machines.

Peter Steinberger (03:04:04) Um, I- I can relate that if you very deeply identify that you are a programmer, that it’s scary and that it’s threatening because what you like and what you’re really good at is now being done by a soulless or not entity. But I don’t think you’re just a programmer. That’s a very limiting view of your craft. You are, you are still a builder.

Lex Fridman (03:04:40) Yeah, there’s a couple of things I want to say. So one is, I never… As you’re articulating this beautifully, I no- I’m realizing I never thought I would… the thing I love doing would be the thing that gets replaced. You hear these stories about these, like you said, with the steam engine. I’ve, I’ve spent so many, I don’t know, maybe thousands of hours poring over code and putting my heart and soul and, like, and just, like, some of my most painful and happiest moments were alone behind… I, I was an Emacs person for a long time. Man, Emacs. And, and then there’s an identity and there’s meaning, and there’s… Like, when I walk about the world, I don’t say it out loud, but I think of myself as a programmer. And to have that in a matter of months…

Lex Fridman (03:05:31) I mean, like you mentioned, April to November, it really is a leap that happened, a shift that’s happening. To have that completely replaced is painful. It’s, it’s truly painful. But I also think programmers, builders more broadly, but what is, what is the act of programming? I, I think programmers are generally best equipped at this moment in history to learn the language, to empathize with agents, to learn the language of agents. To feel the CLI.

Lex Fridman (03:06:11) Like, like to understand what is the thing you need, you the agent, need to do this task the best?

Peter Steinberger (03:06:21) I think at some point it’s just gonna be called coding again, and it’s just gonna be the new normal.

Peter Steinberger (03:06:25) And yet, while I don’t write the code, I very much feel like I’m in the driver’s seat and I am, I am writing the code, you know? It’s just-

Lex Fridman (03:06:37) You’ll still be a programmer. It’s just the activity of a programmer is, is different.

Peter Steinberger (03:06:41) Yeah, and because on X, the bubble, I mean, is mostly positive. On, on Mastodon and Bluesky, I don’t… I also use it less because oftentimes I got attacked for my blog posts. And I, I had stronger reactions in the past, now I can sympathize with those people more ’cause, in a way I get it. It… In a way, I also don’t get it because it’s very unfair to grab onto the person that you see right now and unload all your fear and hate. It’s gonna be a change and it’s gonna be challenging, but it’s also… I don’t know. I find it incredibly fun and, and, and gratifying. And I can, I can use the new time to focus on much more details. I think the level of expectation of what we build is also rising because it’s just now… The default is now so much easier, so software is changing in many ways.

Peter Steinberger (03:07:45) There’s gonna be a lot more. And then you have all these people that are screaming, “Oh yeah, but what about the water?” You know? Like, I did a conference in Italy about the, the state of AI, and m- my whole motivation was to push people away from, don’t see yourself as an iOS developer anymore. You’re now a builder, and you can use your skills in many more ways. Also because apps are slowly going away. People didn’t like that. Like a lot of people didn’t like what I had to say. And I don’t think I was being hyperbolic, I was just like, “This is how I see the future.” Maybe this is not how it’s going to be, but I’m pretty sure a version of that will happen.

Peter Steinberger (03:08:30) And the first question I got was, “Yeah, but what about the insane water use on data centers?” But then you actually sit down and do the maths, and then for most people if you just skip one burger per month, that compensates the, the CO2 output, or, like, the water use in equivalent of tokens. I mean, the maths is, is… the maths is tricky, and it depends if you add pre-training, then maybe it’s more than just one patty…. but it’s not off by a factor of 100, you know? So, so the… or like golf is still using way more water than all data centers together. So are you also hating people that play golf? Those people grab on anything that they think is bad about AI without seeing the potential things that might be good about AI.

Peter Steinberger (03:09:24) And I’m not saying everything’s good. It’s certainly gonna be a very transformative technology for our society.

Lex Fridman (03:09:32) To steelman the, the criticism in general, I do wanna say, in my experience with Silicon Valley there’s a bit of a bubble in the sense that there’s a kind of excitement and an over-focus on the positive that the technology can bring.

Lex Fridman (03:09:55) And… which is great. It’s great to focus on… N- not to, not to be paralyzed by fear and fear-mongering and so on, but there’s also within that excitement, and within everybody talking just to each other, there’s a dismissal of the basic human experience across the United States and the Midwest, across the world. Including the programmers we mentioned, including all the people that are gonna lose their jobs, including the s- the measurable pain and suffering that happens at the short-term scale when there’s change of any kind. Especially large-scale transformative change that we’re about to face if what we’re talking about will materialize. And so to ha- having a bit of that humility and awareness about the tools you’re building, they’re going to cause pain.

Lex Fridman (03:10:43) They will long term hopefully bring about a better world, and even more opportunities-

Lex Fridman (03:10:48) … and even more awesomeness. But having that kind of like quiet moment often of, of respect for the pain that is going to be felt. And so not, not enough of that is, I think, done, so it’s, it’s good to have a bit of that.

Peter Steinberger (03:11:07) And then I also have to put that against some of the emails I got where people told me they have a small business, and they’ve been struggling. And, and OpenClaw helped them automate a few of the tedious tasks, from, from collecting invoices to, like, answering customer emails, that then freed them up and, like, brought them a bit more joy in their life.

Peter Steinberger (03:11:31) Or, or some emails where they told me that OpenClaw helped their disabled daughter. That she’s now empowered and feels she can do much more than before. Which is amazing, right? Because you could, you could do that before as well. The technology was there. I didn’t, I didn’t invent a whole new thing, but I made it a lot easier and more accessible, and that did show people the possibilities that they previously wouldn’t see. And now they apply it for good.

Peter Steinberger (03:12:03) Or like also the fact that, yes, I, I, I suggest the, the, the latest and best models, but you can totally run this on free models. You can run this locally. You can run this on, on, on Kimi or other, other, other models that are way more accessible price-wise, and still have a, a very powerful system that might otherwise not be possible. Because other things like, I don’t know, Anthropic’s CoWork is locked into their space, so it’s not all black and white. There’s… I got a lot of emails that were heartwarming and amazing. And, and I don’t know, it just made me really happy.

Lex Fridman (03:12:48) Yeah, there’s a lot… It has brought joy into a lot of people’s lives. Not just, not just programmers. Like a lot of people’s lives. It’s, it’s, it’s beautiful to see. What gives you hope about this whole thing we have going on with human civilization?

Peter Steinberger (03:13:03) I mean, I inspired so many people. There’s like… there’s this whole builder vibe again. People are now using AI in a more playful way and are discovering what it can do and how it can like help them in their life. And creating new places that are just sprawling of creativity. I don’t know. Like, there’s like ClawCoin in Vienna. There’s like 500 people. And there’s such a high percentage of people that uh, want to present, which is to me really surprising, because u- usually it’s quite hard to find people that want to like talk about what they built. And now it’s, there’s an abundance. So that gives me hope that we can, we can figure shit out.

Lex Fridman (03:14:00) And it makes it accessible to basically everybody.

Lex Fridman (03:14:05) Just imagine all these people building, especially as you make it simpler and simpler, more secure. It’s like anybody who has ideas and can express those ideas in language can build. That’s crazy.

Peter Steinberger (03:14:22) Yeah, that’s ultimately power to the people, and one of the beauty, the beautiful things that come out of AI. Not just, not just a slop generator.

Lex Fridman (03:14:36) Well, Mr. Clawfather, I just realized when I said that in the beginning, I violated two trademarks, because there’s also the Godfather. I’m getting sued by everybody. You’re a wonderful human being. You’ve created something really special, a special community, a special product, a special set of ideas. Plus, the entire… the humor, the good vibes, the inspiration of all these people building, the excitement to build. So I’m truly grateful for everything you’ve been doing and for who you are, and for sitting down to talk with me today. Thank you, brother.

Peter Steinberger (03:15:14) Thanks for giving me the chance to tell my story.

Lex Fridman (03:15:17) Thanks for listening to this conversation with Peter Steinberger. To support this podcast, please check out our sponsors in the description, where you can also find links to contact me, ask questions, give feedback and so on. And now let me leave you with some words from Voltaire. “With great power comes great responsibility.” Thank you for listening, and hope to see you next time.

2026年 AI 现状:大模型、编程、缩放定律、中国、智能体、GPU 与 AGI (2026-01-31)

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI (2026-01-31)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:资深 AI 研究员及教育家 Sebastian Raschka 与 Allen Institute for AI 的博士后研究员 Nathan Lambert,两位兼具深度技术与沟通能力的专家,共同复盘 2025 年 AI 领域的关键突破,并对 2026 年的技术与商业格局进行推演。

  • 核心论点:对话揭示了 AI 竞赛的核心已从基础架构的革命(Transformer 架构已趋于稳定)转向了多维度、多阶段的优化战争。前沿进展不再仅仅依赖于“更大模型、更多数据”的暴力堆砌,而是由训练范式(预训练、中训练、后训练的分工)、算法创新(如 RLVR 带来的能力解锁)和算力应用(如推理时伸缩)的精细化组合所驱动。中美在“开放”与“封闭”路线上的战略博弈,正在重塑全球 AI 的技术栈、商业模式和影响力版图。在这个快速“蛙跳式”迭代的时代,真正的护城河不再是单一的技术优势,而是数据质量、算法效率和生态系统整合的综合能力。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:中美 AI 竞赛——开放权重模型的地缘政治经济学

  • 核心观点:以 DeepSeek 为代表的中国公司,正通过发布高性能的开源权重模型(Open Weight Models),在国际上迅速积累影响力和用户基础,这直接挑战了美国头部公司以封闭 API 为主的商业模式。

  • 原理解构:这是一种非对称的竞争策略。

    1. 绕过信任壁垒:Nathan Lambert 指出,由于数据安全和地缘政治顾虑,美国及西方企业客户极不情愿直接调用中国公司的 API 服务。
    2. 影响力渗透:通过发布开源模型,中国公司将模型本身作为一种技术基础设施进行输出。开发者和企业可以在本地或可信的云上部署和使用这些模型,从而绕开了直接的服务依赖。这使得中国技术能够渗透到全球的 AI 应用生态中,培养用户习惯和开发者生态。
    3. 成本转嫁与生态共建:Sam Altman 曾坦言,OpenAI 发布开源模型(如 gpt-oss)的部分原因在于,社区和用户会分担推理所需的 GPU 成本。中国公司同样利用这一点,以极低的边际成本实现了技术的广泛分发。
  • 证据/案例

    • “DeepSeek 时刻”: 2025 年 1 月,DeepSeek R1 的发布被视为一个转折点,其性能逼近甚至达到 SOTA 水平,但据称计算成本低得多。
    • 群雄并起:继 DeepSeek 之后,智谱 AI (Z.ai) 的 GLM 模型、MiniMax、月之暗面 (Moonshot) 的 Kimi 等中国公司纷纷发布强大的开源模型,形成“百花齐放”的局面。
    • 更友好的许可证:与 Meta 的 Llama 等模型附带商业使用限制不同,许多中国开源模型的许可证(如 Apache 2.0)更为宽松,对商业应用更具吸引力。

维度二:训练范式的演进——从“一锅炖”到“三级火箭”

  • 核心观点:先进 LLM 的训练已演化为一个精细的三阶段过程:预训练 (Pre-training)中训练 (Mid-training)后训练 (Post-training),每个阶段目标明确,共同构建起模型的核心能力。

  • 原理解构:这是一个资源优化和能力分层解锁的过程。

    1. 预训练:目标是“灌输知识”。通过在海量(万亿级 Token)文本和代码数据上进行“下一个词预测”,让模型学习世界的通用知识、语法结构和基本逻辑。当前重点已从数据“量”转向数据“质”,包括使用高质量的合成数据和精细的数据过滤策略。(该目标函数的最小代码示意见本节末尾。)
    2. 中训练:目标是“专项强化”。在预训练之后,针对特定能力(如长文本理解、代码或多语言)进行补充训练。这样做是为了避免在昂贵的预训练全程中使用稀疏的高价值数据,同时缓解“灾难性遗忘”问题。
    3. 后训练:目标是“解锁与对齐”。这是将模型潜力转化为实用能力的关键。它不主要教授新知识,而是通过 RLHF (人类反馈强化学习)RLVR (可验证奖励强化学习) 等技术,教会模型如何运用已有知识来遵循指令、解决问题和进行多步推理。
  • 证据/案例

    • 数据质量的重要性:AI2 的 OLMo 3 模型使用比前代更少但质量更高的数据,却取得了更好的性能。
    • RLVR 的能力解锁:仅通过几十步的 RLVR 训练,Qwen 3 基础模型在数学基准上的准确率就从 15% 跃升至 50%,证明了后训练是在“解锁”而非“学习”新知识。
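
下面用一个极简的 numpy 草图(仅作示意,并非任何实验室的真实训练代码)说明上文“预训练”阶段的“下一个词预测”目标:模型在每个位置对词表输出一个分布,损失即真实下一个词的负对数似然,对海量 token 取平均即为预训练目标。

```python
# 极简示意(非实际训练代码):“下一个词预测”的交叉熵损失。
# 假设一个 5 个词的玩具词表,以及模型对下一个词输出的未归一化分数(logits)。
import numpy as np

vocab = ["<pad>", "深度", "学习", "很", "有趣"]
logits = np.array([0.1, 0.5, 2.3, 0.2, 1.1])   # 模型在某个位置的输出(假设值)
target_id = 2                                   # 真实的下一个词是“学习”

# softmax 将 logits 转为概率分布
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# 预训练损失 = 正确下一个词的负对数似然;对海量 token 求平均即得训练目标
loss = -np.log(probs[target_id])
print(f"P(下一个词='{vocab[target_id]}') = {probs[target_id]:.3f}, loss = {loss:.3f}")
```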

维度三:Scaling Laws 未死,只是战场转移

  • 核心观点:经典的“规模法则”(Scaling Laws)在预训练阶段依然有效,但由于成本极其高昂,其“低垂的果实”已被摘取。如今,性能提升的更大杠杆来自于 推理时伸缩 (Inference-time Scaling)强化学习伸缩 (RL Scaling)

  • 原理解构:算力的价值正在从“制造模型”向“使用模型”和“精炼模型”延伸。

    1. 预训练规模法则:投入更多计算资源和数据进行预训练,模型的基础能力依然会可预测地提升。但训练成本与服务成本的矛盾日益突出,万亿参数模型的训练和服务成本高达数百万至数十亿美元,限制了其进一步扩展。
    2. 推理时伸缩:这是 2025 年最重要的范式转变之一。通过让模型在生成最终答案前,花费更多计算资源(即生成更多中间“思考”步骤),其解决复杂问题的能力得到巨大提升。用户体验从“即时回答”变为“模型正在思考”,换来的是更高的准确性和推理深度。(本节末尾附有一个多数投票式采样的最小代码示意。)
    3. 强化学习伸缩:以 RLVR 为代表的后训练过程也展现出自己的规模法则。OpenAI 的 o1 和 DeepSeek 的研究都表明,在 RL 训练上投入的算力(呈对数增长)与模型在特定任务上的性能(呈线性增长)之间存在明确的正相关关系。这比无差别地扩大预训练规模更具性价比。
  • 证据/案例

    • ChatGPT 的“思考模式”:GPT-5.2 的“Thinking”或“Pro”模式,用户愿意等待更长时间以换取更高质量的回答。
    • OpenAI o1 模型:通过大规模的推理时计算,一个相对较小的模型在某些任务上超越了更大的模型,证明了该范式的有效性。
    • Grok 4:据传其在后训练上投入的算力与预训练相当,体现了后训练在模型能力构建中的重要性。
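
为直观说明“推理时伸缩”的机制,下面给出一个最小的多数投票(self-consistency)草图:对同一问题采样多条推理路径,按最终答案投票。其中 sample_answer 只是一个带噪声的玩具替身(假设函数),真实系统中对应一次 LLM 采样调用;示例仅用于演示“采样越多、准确率越稳”的直觉。

```python
# 推理时伸缩的最小示意:对同一问题采样 N 条“推理路径”,按最终答案多数投票。
# sample_answer 只是一个带噪声的玩具替身,实际系统中应替换为一次 LLM 采样调用。
import random
from collections import Counter

def sample_answer(question: str, correct: str = "42", p_correct: float = 0.6) -> str:
    """玩具替身:单次采样以 p_correct 概率给出正确答案,否则随机出错。"""
    return correct if random.random() < p_correct else random.choice(["41", "43", "7"])

def self_consistency(question: str, n_samples: int) -> str:
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]          # 多数票作为最终答案

random.seed(0)
for n in (1, 5, 25):                           # 采样越多(算力越多),答案越稳定
    acc = sum(self_consistency("6*7=?", n) == "42" for _ in range(200)) / 200
    print(f"n_samples={n:>2}  accuracy≈{acc:.2f}")
```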

维度四:RLVR 的崛起——从“偏好对齐”到“事实对齐”

  • 核心观点RLVR (Reinforcement Learning with Verifiable Rewards) 是 2025 年后训练阶段最重要的技术突破。它将优化目标从主观的“人类偏好”转变为客观的“任务正确性”,极大地提升了模型在数学、编程和工具使用等领域的可靠性。

  • 原理解构:RLVR 为模型提供了一个清晰、可无限扩展的奖励信号。

    1. RLHF 的局限:传统的 RLHF 依赖于人类标注员对两个回答进行比较,这成本高昂、存在主观性,且容易导致模型为了“取悦”标注员而产生“媚俗”的风格,甚至牺牲事实性。
    2. RLVR 的优势:在数学题、代码编译、API 调用等领域,答案的正确与否是可程序化验证的。这提供了一个明确的、二元的(或连续的)奖励信号。模型可以通过大量的“试错”循环,自主学习如何生成能够得到正确结果的推理路径。(本节末尾附有一个可验证奖励函数的最小代码示意。)
    3. 涌现能力:在 RLVR 训练中,模型自发地学会了“思考过程”(Chain-of-Thought)和“自我纠错”(Self-Correction)。例如,DeepSeek R1 在论文中展示了模型会说“啊,我搞错了,让我再试一次”的“Aha moment”。
  • 证据/案例

    • DeepSeek R1:通过大规模 RLVR 训练,在数学和代码能力上取得了突破性进展。
    • 工具使用 (Tool Use):RLVR 是模型学习调用外部工具(如计算器、搜索引擎、代码解释器)的关键技术,因为工具调用的结果是否成功是可验证的。
    • Process Reward Models (PRM):作为 RLVR 2.0 的潜在方向,未来不仅会奖励最终结果,还会对推理过程中的每一步进行打分,进一步提升推理的可靠性。
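
下面给出“可验证奖励”的一个最小草图(仅作示意,并非 DeepSeek 或任何实际训练代码):用可程序化验证的规则为采样出的解答打 0/1 奖励,该奖励随后可供 GRPO/PPO 等策略优化算法使用。示例沿用 GSM8K 风格的 “#### 最终答案” 约定,数据均为虚构。

```python
# RLVR 的最小示意:对数学题的候选解答给出可程序化验证的 0/1 奖励。
# completions 为假设的模型采样结果;真实系统中奖励会被送入 GRPO/PPO 等 RL 算法。
import re

def math_reward(completion: str, gold_answer: str) -> float:
    """提取 '####' 之后的最终答案并与标准答案比对,正确记 1.0,否则 0.0。"""
    match = re.search(r"####\s*(-?\d+(?:\.\d+)?)", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1) == gold_answer else 0.0

gold = "12"
completions = [                      # 假设的 4 条采样解答(示意数据)
    "3 个人每人 4 个苹果,3*4=12。#### 12",
    "先算 3+4=7,再乘 2 得 14。#### 14",
    "答案显然是十二。",
    "3*4 = 12,所以一共 12 个。#### 12",
]
rewards = [math_reward(c, gold) for c in completions]
print(rewards)                       # [1.0, 0.0, 0.0, 1.0]:正确的推理路径被强化
```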

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • “通用人工智能(AGI)的梦想正在消亡”:Nathan Lambert 提出,业界正从追求一个“万能的单一模型”转向一个由多个专业化、协作的 AI 代理组成的生态系统。未来的趋势是“数据中心里的多个 AGI”,而非一个统治一切的超级智能。
    • “高级开发者比初级开发者更依赖 AI”:一项针对 791 名专业开发者的调查显示,拥有 10 年以上经验的资深开发者,其交付的代码中 AI 生成内容超过 50% 的比例远高于初级开发者。这颠覆了“AI 只是初学者拐杖”的普遍看法,表明 AI 已成为提升专家效率的核心工具。
  • 盲点与局限

    • RLHF 的“个性磨灭”效应:RLHF 旨在通过聚合大量人类反馈来优化模型,但这本质上是一个“求平均”的过程。这可能导致模型失去独特的“声音”(voice)和尖锐的洞察力,变得越来越圆滑、通用,但缺乏个性与深度。
    • 硅谷的回音室与文化过载:对话中反思了硅谷 AI 圈的“996”工作文化和“永久下层阶级”(permanent underclass)等流行梗。这种高度内卷和与世隔绝的氛围,可能导致技术发展脱离普通人的真实需求和人类体验的复杂性。
    • 开源模型的“数据污染”难题:Qwen 等模型在基准测试上的优异表现,被质疑是由于其训练数据中包含了与测试集高度相似的“泄露”内容。这揭示了在当前环境下,对开源模型进行真正公平、无污染的评测变得异常困难。
  • 未解之谜

    • 真正的持续学习 (Continual Learning):如何在不发生“灾难性遗忘”且成本可控的前提下,让模型能够持续、快速地从新信息中学习并更新其权重,这仍然是一个悬而未决的核心难题。目前的“记忆”功能大多是基于上下文窗口的“伪学习”。
    • 偏好的量化:RLHF 的核心是将复杂、多维度的人类偏好(如准确性、风格、安全性)压缩成一个单一的奖励信号。如何科学地进行这种压缩,以及这种压缩是否会丢失关键信息,是社会选择理论和 AI 交叉的深层问题。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. On user attachment to AI models:

    “They email them and say, ‘My friend is different.’ They find these employees’ emails and send them things because they are so attached to what is a set of model weights…” 中文意译:“他们会发邮件说‘我的朋友变了’。用户会找到这些员工的邮箱发信,因为他们对那一套特定的模型权重配置产生了深厚的感情联结。” 语境:讨论当 OpenAI 等公司更新模型时,一些用户会因为感受到 AI “性格”的细微变化而感到失落,体现了人机关系的新维度。

  2. On the uneven progress of AI:

    “I think the camp that I fall into is that AI is so-called jagged, which will be excellent at some things and really bad at some things.” 中文意译:“我属于那个认为 AI 进展是‘犬牙交错’的阵营,它会在某些方面表现卓越,而在另一些方面则非常糟糕。” 语境:解释为何“超级程序员”等 AGI 里程碑难以一蹴而就,因为 AI 的能力发展是不均衡的,存在明显的长板和短板。

  3. On the future of software development:

    “I do think we are closer to that side of things, and it takes direction and understanding how the systems work to extract the best from the language models.”

    • 中文意译:“我确实认为我们正越来越接近那个(软件开发的工业化)时代,而关键在于提供方向和理解系统如何工作,以便从语言模型中榨取出最佳性能。”

    语境:探讨 AI 将如何改变编程,结论是人类的角色将从底层的代码实现者转变为高层的系统设计师和目标设定者。

  4. On the motivation for China’s open-source strategy:

    “A lot of top US tech companies and other IT companies won’t pay for an API subscription to Chinese companies for security concerns… open weight models [are] an ability to influence and take part in a huge growing AI expenditure market in the US.” 中文意译:“出于安全考虑,很多美国顶级科技公司不会为中国公司的 API 订阅付费……开源权重模型让他们有能力去影响并参与到美国庞大且不断增长的 AI 支出市场中。” 语境:揭示中国公司开源 AI 模型背后的商业和地缘战略考量。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 产品形态工具使用长思考链将成为高端 AI 产品的标配,免费版和付费版之间的能力差距将主要体现在推理深度和多工具协同能力上。
    • 技术栈:混合架构(如 Transformer + 状态空间模型)和高效注意力机制(如 Group Query Attention, Sliding Window Attention)将成为开源模型的主流,以在性能和长文本成本之间取得平衡。
    • 竞争格局:中美在开源领域的竞争将白热化。美国将出现更多受政策和资本支持的“美国版 DeepSeek”项目(如 ATOM Project),以抗衡中国日益增长的影响力。同时,AI 应用层公司(如 Cursor)通过快速迭代和利用用户数据进行持续微调,将构筑起独特的竞争优势。
  • 长期终局 (5-10年)

    • 行业图景:AI 行业可能演变成类似云计算的格局:少数几家巨头(如 Google, OpenAI/Microsoft, Anthropic)提供昂贵、强大的基础 API,同时并存一个由中美主导的、繁荣的开源模型生态系统,服务于定制化和垂直领域的需求。“一个模型统治一切”的愿景将被“多智能体协作”的现实所取代
    • 人机协作:编程等知识型工作的核心将彻底从“执行”转变为“设计、规范与验证”。人类的价值在于提出正确的问题、定义清晰的目标,并对 AI 生成的复杂系统进行最终的评估和整合。
    • 经济影响:AI 对 GDP 的影响不会是短期内的“跃升”,而是一种通过知识普及化和生产力工具化实现的长期、持续的渗透。最大的价值释放来自于赋能全球数以亿计的人学习、创造和解决问题,从而催生出无数新的创新。
  • 行动建议

    • 开发者停止将 AI 视为简单的代码补全工具。应主动学习如何与 AI 进行高层级的系统设计对话,掌握利用 AI 构建、调试和重构整个项目的能力。同时,关注 AI 能力的“犬牙交错”之处,在 AI 尚不擅长的领域(如大规模分布式系统)深耕,将成为核心竞争力。
    • 投资者关注拥有独特、高质量专有数据的公司,这在模型能力日趋同质化的未来是真正的护城河。同时,投资于优化 AI 推理成本开发工作流的“镐和铲”公司(如 vLLM, SGLang 等推理引擎,或 Cursor 等 AI 原生开发环境)。
    • 创业者避免与基础模型巨头直接竞争。利用性能越来越强的开源模型,专注于解决特定垂直领域的具体问题。机会在于将通用 AI 的能力与特定行业的数据、工作流和安全需求深度结合,构建 AI 原生应用。

这份研报基于资深机器学习研究员 Sebastian Raschka(《从零开始构建大语言模型》作者)与 Nathan Lambert(Allen AI 研究所后训练负责人)的深度对话。


🎯 核心论题与背景 (Executive Summary)

  • 对话背景:本次对话发生于 2025 年末至 2026 年初的交汇点,正值“DeepSeek 时刻”爆发一周年。Sebastian Raschka 与 Nathan Lambert 站在学术研究与工业落地的十字路口,审视 AI 竞争从“暴力美学”转向“效率革命”的关键节点。
  • 核心论点:对话核心旨在揭示:AI 的胜负手已从单纯的预训练参数规模(Scaling Laws 1.0)转向推理侧计算扩展与强化学习逻辑优化(Scaling Laws 2.0)。 专家们认为,虽然 Transformer 架构在底层上保持了惊人的稳定性,但后训练(Post-training)阶段的系统工程——特别是通过可验证奖励进行强化学习(RLVR)——才是当前智能跨越的关键。同时,中美在开源权重与闭源模型上的博弈,正在重塑全球技术主权。

🧠 深度观点解析 (Deep Dive Analysis)

1. “DeepSeek 时刻”与地缘技术范式转移

  • 核心观点:中国模型(如 DeepSeek, Qwen)已在开源领域反超美系模型,打破了闭源模型对“前沿智能”的垄断。
  • 原理解构:DeepSeek 的成功并非源于全新的架构发现,而源于对 MLA(多头潜在注意力机制,Multi-head Latent Attention) 和 MoE(混合专家模型) 的极致优化,大幅降低了推理成本(KV Cache)和训练成本。这证明了在受限算力下,通过算法精进可以达到甚至超越暴力堆料的效果。
  • 案例:DeepSeek R1 通过极低成本(约 500 万美元)实现了媲美 GPT-4 级的推理能力。

2. 三维缩放定律(The Three Axes of Scaling)

  • 核心观点:Scaling Laws 未死,但重心已从“预训练”转移到“强化学习”和“推理侧”。
  • 原理解构
    • 预训练扩展:依然有效,但边际成本极高,受限于高质量数据枯竭。
    • RL 训练扩展:通过长达数周的强化学习迭代,挖掘模型已有的隐性知识。
    • 推理侧扩展(o1 范式):通过“思维链”生成更多 Token(思考过程)来换取更高的准确率,将计算成本从训练转移到用户查询环节。
  • 证据:OpenAI o1 及 DeepSeek R1 证明,即便基础模型体量不变,通过增加推理时间,逻辑性能可呈线性增长。

3. RLVR:后训练阶段的“工业革命”

  • 核心观点可验证奖励强化学习(RLVR) 替代了传统依赖人类主观偏好的 RLHF,成为逻辑智能进化的核心。
  • 原理解构:在数学和代码领域,答案是客观可验证的。RLVR 允许模型进行数百万次的自我尝试,通过正向奖励自动强化正确逻辑路径。
  • 案例:Sebastian 提到 Qwen 3 在仅 50 步 RLVR 训练后,数学准确率从 15% 飙升至 50%,这并非学到了新知识,而是激活了预训练中被“封印”的逻辑调用能力。

4. Transformer 的“长生不老”与系统工程的崛起

  • 核心观点:自 2019 年 GPT-2 以来,核心架构几乎没有本质变化,真正的进步发生在系统层面。
  • 原理解构:当前的进步是无数微小“旋钮”的优化:FP8/FP4 精度训练、RoPE 位置编码、RMSNorm 归一化。现在的竞争是 10 万卡级别集群的稳定性工程,而非算法公式的颠覆。(下文附 RMSNorm 的最小实现示意。)
  • 类比:现在的模型更像是在 F1 赛车上不断调整空气动力学组件,而引擎(Transformer)自 2017 年后就没换过。
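
作为上述“微小旋钮”之一的例子,下面给出 RMSNorm 的最小 numpy 实现草图(仅作示意):与 LayerNorm 不同,它不减均值、不加偏置,仅按均方根缩放并乘以可学习的增益。

```python
# RMSNorm 的最小示意实现:x 先除以自身的均方根,再乘以可学习的增益 gain。
import numpy as np

def rms_norm(x: np.ndarray, gain: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

hidden = np.array([[0.5, -1.2, 3.0, 0.1]])     # 假设的某一 token 的隐藏向量
gain = np.ones(hidden.shape[-1])               # 可学习参数,初始为 1
print(rms_norm(hidden, gain))
```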

💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • “Aha Moment”的伪命题:对于 DeepSeek R1 展示出的“顿悟”过程(模型在思考中发现错误并自我纠正),Nathan 提出了质疑。他认为这可能并非真正的实时推理,而是在预训练阶段吸收了大量人类思考痕迹(如数学讲义或论坛讨论)后的条件反射式模拟
  • 数据污染的隐忧:专家指出,当前模型在 Benchmark 上的惊艳表现,很大程度上是因为训练数据中包含了极度相似的测试题。Qwen 等模型在微小改动参数后性能下降,暴露了当前评估体系的脆弱性。
  • 软件开发的“平庸化”风险:Sebastian 担心过度依赖 AI 编程会导致“技能断层”。资深开发者能利用 AI 提升效率,但新手如果跳过了“痛苦的调试过程”,可能永远无法建立深度直觉,导致未来行业缺乏能解决底层问题的专家。

💎 金句与高光时刻 (Golden Quotes)

  1. “Coding doesn’t lie. It’s math, basically. If the code works, it’s correct.” (代码不会撒谎。它本质上是数学。如果代码能运行,它就是正确的。) —— Sebastian Raschka,强调从零构建模型在学习中的重要性。
  2. “RLVR allows the model to think longer, but the knowledge is already there in the pre-training; you’re just unlocking it.” (RLVR 允许模型思考更久,但知识早已存在于预训练中;你只是在解锁它。) —— Nathan Lambert,揭示了后训练阶段的本质。
  3. “We are in an era where software engineering is moving to system design and outcomes, rather than just writing lines of code.” (我们正处于一个软件工程转向系统设计和目标导向,而非仅仅编写代码行的时代。) —— Nathan Lambert,对职业未来的预测。

🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 代码智能爆发:Claude Code 和 Cursor 展示了 Agent 级开发的潜力。未来 2 年,前端和基础业务逻辑将实现 80% 以上的自动化,开发者转变为“产品架构师”。
    • 推理成本分级:会出现 $2,000/月的高级订阅,专门用于解决需要“思考 1 小时”的科研级难题。
  • 长期终局 (5-10年)
    • 人类知识的全面普惠:AI 将作为“超级导师”抹平全球教育差距,真正的 GDP 增长将源于欠发达地区人口通过 AI 获得的创新能力,而非大厂的效率提升。
    • 从“工具使用”到“自主计算机使用”:AI 将直接操控 OS(计算机使用),突破当前网页端对话的范式,成为真正的数字劳动力(Remote Worker)。
  • 行动建议
    • 对开发者:不要只当 AI 的搬运工,要**“从零构建(Build from Scratch)”**以保持竞争力。关注 Post-training 技术,这是目前最有价值的研究领域。
    • 对投资者/创业者:警惕通用模型的泡沫,寻找**“可验证奖励(Verifiable Rewards)”**存在的垂直行业(如法律、医药、EDA工具),这些领域最易产生 Scaling 效应。

🤖 AI 产业深度研报:从“模型规模竞赛”到“智能工程跃迁”

1. 🎯 核心论题与背景

  • 对话背景:本段对话聚焦于2025年末至2026年初的AI行业格局,由技术专家Sebastian Raschka(机器学习作家/研究者)与Nathan Lambert(Allen Institute for AI研究者)共同审视当前的竞争态势,对比美国前沿实验室(OpenAI/Anthropic)与中国开源社区在技术路线、经济模式和全球影响力上的博弈。
  • 核心论点:当前AI发展的核心逻辑正在从**“纯模型参数堆叠”“算法效率+推理工程”**转移。尽管“scaling laws”(缩放定律)依然有效,但核心增量红利已转移至后训练阶段,即通过Reinforcement Learning with Verifiable Rewards (RLVR) 启发式推理。同时,全球AI格局正从美国阵营的“封闭霸权”向中美并行的“开源生态洪流”演变,这将深刻重塑未来的生产力工具形态。

2. 🧠 深度观点解析

A. 训练范式的迁移:从“数据爆炸”到“验证式强化学习”

  • 核心观点:预训练(Pre-training)的“低垂果实”已被采摘殆尽,且成本过高。未来智能的增量由 RLVR(带可验证奖励的强化学习)和 推理时扩展 推动而成。模型不仅仅是“记住”知识,而是“解锁”已经学到的知识。
  • 原理解构:传统RLHF(人类反馈强化学习)优化的是风格和偏好,具有饱和性;而RLVR针对数学、代码等可验证的具体问题进行试探和纠错。OpenAI的o1、DeepSeek R1等模型的示例证明,给予模型更多的“思考”和“自我纠错”时间,可以线性提升性能。
  • 证据/案例:Sebastian提到将Qwen 3通过RLVR在数学数据集上训练50步,准确率从15%提升至50%,而不是因为模型记住了数据;Nathan提到调度算力的新趋势(如300天训练 vs 30亿参数的5天训练 vs 30亿参数的3.5周训练)。

B. 开源生态的地缘政治化:中国模型对美国开源的“失语”反噬

  • 核心观点:DeepSeek等中国公司崛起的开源运动,客观上成为了“mantra”(咒语),唤醒了美国乃至全球的开发者对开源的重视,填补了美国相比中国的技术空白。美国开源模型(如Llama)正面临来自中国更优参数模型的双重挤压。
  • 原理解构:这不仅是性能竞争,更是商业模式与信任链的竞争。中国公司倾向于提供无限制的开源权重文件以绕过出口管制并建立用户粘性,而美国公司受限于版权诉讼和霸权心态,开源策略变得保守。
  • 证据/案例:Nathan Lambert提到的开源生态对比——DeepSeek、Z.ai、MiniMax的论文发表量和模型迭代速度远超同类美国开源项目,且定量指标(如Qwen 5千亿)领先。美国唯一的应对是政府内部开始讨论建设类似Manhattan Project的“American Open Models”(ATOM 计划)。

C. “锯齿状”智能与计算机使用的障碍

  • 核心观点:AI不会是全能的管家,而是“锯齿状”分布的超强工具。虽然AI在代码和特定领域极其强大,但在通用计算机使用上面临巨大的工程和安全鸿沟。其推理能力在API调用层面可以不完美,但在直接操作复杂GUI(图形用户界面)时困难重重。
  • 原理解构:现有的LLM无法处理“连续的高分辨率视觉流”和“语义级粗糙度”。当AI操作电脑时,它无法像人一样容忍模糊的截图或截断的操作栏,从而引发失败。AGI的定义不应是“全知全能”,而应侧重于解决特定高价值经济任务的能力。
  • 证据/案例:OpenAI的“Operator“和Anthropic的Demo在处理网页交互时表现出的笨拙;Lex提到的Claude不仅能写代码,还能从零重建Slack的消息应用功能,验证了其在特定模态的强大,但距离通用Agent还有距离。

D. “挣扎”在智能中的价值

  • 核心观点:知识的内化需要“痛苦”的过程。过度依赖AI进行解码,会导致学习者丧失建模世界的直觉。未来的教育不再兜底简单查询,而是鼓励AI辅助下的人工“挣扎”以提炼核心原理。
  • 原理解构:LSTM(长短期记忆)通过学习过程强化神经元连接提升智商,同理,人类的学习也是通过克服困难建立心理表征。AI的“即时满足”若过度使用,会削弱人类的认知韧性。
  • 证据/案例:Sebastian关于编程的例子——在没有AI时调试Bug的成就感,以及在遇到复杂数学题时被迫思考的“Goldilocks zone”。

3. 💡 反直觉与批判性视角

打破共识

  1. 预训练是“伪高潮”:虽然大家都在谈论万亿参数模型,但Nathan和Sebastian认为,目前最激动人心的进步(如Reasoning Models)并非来自更大的模型,而是来自更聪明的训练算法(RLVR)和更灵活的推理调度。只有“算法的空手道”才能打败“蛮力的举重”。
  2. 硅谷的泡沫正在转为“建材燥热”:Nathan警示,目前的AI热潮不再仅仅是“好的技术想法”,更像是为了吸引顶级人才的“热力学泡沫”。这种氛围可能导致为了赚钱而忽视安全或长期可持续性(如广告植入的决定)。
  3. 教育体系的倒退风险:由于AI的易用性,短期的教育体系(如在线大考)可能会在设计上为了防AI作弊而被迫倒退回纸笔考试,这是一把双刃剑。

盲点与局限

  1. “赢家通吃”的死胡同:人们普遍担心中美技术脱钩导致全球停滞,但对话指出,由于研究人员流动性极高、开源模型的传播速度,真正的技术垄断从未发生过。“思想”比“资源”更难封锁。
  2. 对AGI的过度神话:对于“AGI(通用人工智能)”和“ASI(超级人工智能)”的宏大叙事,嘉宾们持怀疑态度。他们认为目前的进展更多是“工具效率的提升”,而非意识的涌现。在没有看到关于偏好的统一量化方法之前,谈论奇点为时尚早。

未解之谜

  • 计算机使用的终极形态:如何让LLM获得像人类一样的“粗糙视觉理解和反直觉操作能力”是一个未解的工程难题。目前的Agent系统操作太精确,而现实世界操作需要模糊容忍度。
  • 数据的可持续性:当互联网数据被大模型污染(Contamination),合成数据又面临模型遗忘和泛化差的问题时,下一个高质量数据来源在哪里?

4. 💎 金句与高光时刻

  • “The dream of AGI is kind of dying… What we are evolving into is having many agents for different tasks.” (Nathan Lambert) —— 语境:关于AGI的迷思正在消散,未来更多是功能专精的AI Agents。
  • “You use it until it breaks, and then you explore other options.” (Sebastian Raschka) —— 语境:解释了LLM用户留存率的本质——直到模型犯错才换产品。
  • “The only way to learn is to struggle… If you always have an LLM solving coding for me, now there’s no coding.” (Sebastian Raschka) —— 语境:强调了人类在AI辅助下的主动反思与“挣扎”对于掌握技能的重要性。
  • “We are not putting it back… How do I make the best use of it?” (Nathan Lambert) —— 语境:关于个人如何应对AI时代,拒绝逃避,而是将其作为增强自身能力的工具。
  • “Winning is a very broad term… I don’t think there will be a clear winner in terms of technology access.” (Sebastian Raschka) —— 语境:对中美AI竞争格局的冷静评估,观点已经流动,资源才是核心壁垒。

5. 💼 行业启示与未来推演

短期影响 (1-3年)

  • 产品形态的分化:基础文本能力将迅速通货紧缩,竞争焦点转向**“系统集成能力”。模型产品将从“单纯的Chatbot”转向集成了工具调用、代码解释器和富文本编辑的“Composer“或“IDE-like“体验**(如Cursor, Claude Code)。
  • 服务成本的割裂:企业级市场将出现明显的“双轨制”。高风险、高合规要求的领域(如银行)将被迫使用Trusted Closed Models(受信任的封闭模型)或私有化微调模型;而标准化的内部办公流程将大量采用Open-weight Models hosted domestically

长期终局 (5-10年)

  • 算力与生态的摩尔定律:随着Nemotron、GPUs的普及以及软件栈(如CUDA替代品)的完善,运行大规模局部模型的成本将边际递减。同时,“模型即服务“的成本将进一步下降,导致出现像AWS一样的API巨头,或者甚至专门的云计算厂商。
  • 就业市场的震荡:软件开发将从“代码编写”转向“系统设计”和“Prompt Engineering/Supervision”。那种目前能看到代码、不需要知道底层逻辑的“低端码农”将被彻底淘汰;未来优秀的工程师更像是一个 orchestration(编排)中心,管理多个AI子系统。

行动建议

  • 给创业者:不要试图从头训练一个与GPT-4匹敌的基座模型。重心应放在特定领域的后训练复杂软件的端到端生成(如Slack自动化)以及针对RLVR的垂直科学验证
  • 给研究者:关注 Graph Neural Networks(图神经网络)和 World Models(世界模型)的结合,这是突破LLM局限、迈向AGI的关键路径;同时,AI教育是一个巨大的蓝海,设计“强制挣扎”的学习体验将是差异化策略。
  • 给个人:不要恐惧AI是自动化你的工作,而应视其为外置大脑。练习系统设计能力提问的艺术,学会和Agent协作解决麻烦,而不是等待指令。

逐字稿

Introduction

Lex Fridman (00:00:00) The following is a conversation all about the state of the art in artificial intelligence, including some of the exciting technical breakthroughs and developments in AI that happened over the past year, and some of the interesting things we think might happen this upcoming year. At times, it does get super technical, but we do try to make sure that it remains accessible to folks outside the field without ever dumbing it down. It is a great honor and pleasure to be able to do this kind of episode with two of my favorite people in the AI community, Sebastian Raschka and Nathan Lambert. They are both widely respected machine learning researchers and engineers who also happen to be great communicators, educators, writers, and X posters.

Lex Fridman (00:00:51) Sebastian is the author of two books I highly recommend for beginners and experts alike. First is Build a Large Language Model from Scratch, and Build a Reasoning Model from Scratch. I truly believe in the machine learning and computer science world, the best way to learn and understand something is to build it yourself from scratch. Nathan is the post-training lead at the Allen Institute for AI, and author of the definitive book on reinforcement learning from human feedback. Both of them have great X accounts, great Substacks. Sebastian has courses on YouTube, Nathan has a podcast. And everyone should absolutely follow all of those. This is the Lex Fridman podcast.

Lex Fridman (00:01:40) To support it, please check out our sponsors in the description, where you can also find links to contact me, ask questions, get feedback, and so on. And now, dear friends, here’s Sebastian Raschka and Nathan Lambert.

China vs US: Who wins the AI race?

Lex Fridman (00:01:57) So I think one useful lens to look at all this through is the so-called DeepSeek moment. This happened about a year ago in January 2025, when the open weight Chinese company DeepSeek released DeepSeek R1 that I think it’s fair to say surprised everyone with near or at state-of-the-art performance, with allegedly much less compute for much cheaper. And from then to today, the AI competition has gotten insane, both on the research level and the product level. It’s just been accelerating.

Lex Fridman (00:02:32) Let’s discuss all of this today, and maybe let’s start with some spicy questions if we can. Who’s winning at the international level? Would you say it’s the set of companies in China or the set of companies in the United States? Sebastian, Nathan, it’s good to see you guys. So Sebastian, who do you think is winning?

Sebastian Raschka (00:02:53) So winning is a very broad term. I would say you mentioned the DeepSeek moment, and I do think DeepSeek is definitely winning the hearts of the people who work on open weight models because they share these as open models. Winning, I think, has multiple timescales to it. We have today, we have next year, we have in ten years. One thing I know for sure is that I don’t think nowadays, in 2026, that there will be any company having access to a technology that no other company has access to. And that is mainly because researchers are frequently changing jobs, changing labs. They rotate. So I don’t think there will be a clear winner in terms of technology access.

Sebastian Raschka (00:03:37) However, I do think the differentiating factor will be budget and hardware constraints. I don’t think the ideas will be proprietary, but rather the resources that are needed to implement them. And so I don’t currently see a winner-takes-all scenario. I can’t see that at the moment.

Lex Fridman (00:03:59) Nathan, what do you think?

Nathan Lambert (00:04:00) You see the labs put different energy into what they’re trying to do. To demarcate the point in time when we’re recording this, the hype over Anthropic’s Claude Opus 4.5 model has been absolutely insane. I’ve used it and built stuff in the last few weeks, and it’s almost gotten to the point where it feels like a bit of a meme in terms of the hype. It’s kind of funny because this is very organic, and then if we go back a few months ago, Gemini 3 from Google got released, and it seemed like the marketing and wow factor of that release was super high. But then at the end of November, Claude Opus 4.5 was released and the hype has been growing, while Gemini 3 was before this.

Nathan Lambert (00:04:44) And it kind of feels like people don’t really talk about it as much, even though when it came out, everybody was like, this is Gemini’s moment to retake Google’s structural advantages in AI. Gemini 3 is a fantastic model, and I still use it. It’s just that differentiation is lower. I agree with what you’re saying, Sebastian, that the idea space is very fluid, but culturally Anthropic is known for betting very hard on code, and this Claude Code thing is working out for them right now. So I think that even if the ideas flow pretty freely, so much of this is bottlenecked by human effort and the culture of organizations, where Anthropic seems to at least be presenting as the least chaotic.

Nathan Lambert (00:05:23) It’s a bit of an advantage if they can keep doing that for a while. But on the other side of things, there’s a lot of ominous technology from China where there are way more labs than DeepSeek. DeepSeek kicked off a movement within China similar to how ChatGPT kicked off a movement in the US where everything had a chatbot. There are now tons of tech companies in China that are releasing very strong frontier open weight models, to the point where I would say that DeepSeek is kind of losing its crown as the preeminent open model maker in China, and the likes of Z.ai with their GLM models, MiniMax’s models, and Kimi K2 Thinking from Moonshot, especially in the last few months, have shone more brightly.

Nathan Lambert (00:06:04) The new DeepSeek models are still very strong, but that could be looked back on as a big narrative point where in 2025 DeepSeek came and provided this platform for way more Chinese companies that are releasing these fantastic models to have this new type of operation. These models from these Chinese companies are open weight, and depending on this trajectory, the business models that these American companies are doing could be at risk. But currently, a lot of people are paying for AI software in the US, and historically in China and other parts of the world, people don’t pay a lot for software.

Lex Fridman (00:06:37) So some of these models like DeepSeek have the love of the people because they are open weight. How long do you think the Chinese companies keep releasing open weight models?

Nathan Lambert (00:06:47) I would say for a few years. I think that, like in the US, there’s not a clear business model for it. I have been writing about open models for a while, and these Chinese companies have realized it. I get inbound from some of them. They’re smart and realize the same constraints, which is that a lot of top US tech companies and other IT companies won’t pay for an API subscription to Chinese companies for security concerns. This has been a long-standing habit in tech, and the people at these companies then see open weight models as an ability to influence and take part in a huge growing AI expenditure market in the US. They’re very realistic about this, and it’s working for them.

Nathan Lambert (00:07:24) And I think the government will see that that is building a lot of influence internationally in terms of uptake of the technology, so there’s going to be a lot of incentives to keep it going. But building these models and doing the research is very expensive, so at some point, I expect consolidation. But I don’t expect that to be a story of 2026; there will be more open model builders throughout 2026 than there were in 2025. And a lot of the notable ones will be in China.

Lex Fridman (00:07:50) You were going to say something?

Sebastian Raschka (00:07:51) Yes. You mentioned DeepSeek losing its crown. I do think to some extent, yes, but we also have to consider that they are still slightly ahead. It’s not that DeepSeek got worse, it’s just like the other ones are using the ideas from DeepSeek. For example, you mentioned Kimi, same architecture, they’re training it. And then again, we have this leapfrogging where they might be at some point in time a bit better because they have the more recent model. I think this comes back to the fact that there won’t be a clear winner. One person releases something, the other one comes in, and the most recent model is probably always the best model.

Nathan Lambert (00:08:30) Yeah. We’ll also see the Chinese companies have different incentives. DeepSeek is very secretive, whereas some of these startups are like the MiniMaxes and Z.ais of the world. Those two literally have filed IPO paperwork, and they’re trying to get Western mindshare and do a lot of outreach there. So I don’t know if these incentives will change the model development, because DeepSeek famously is built by a hedge fund, Highflyer Capital, and we don’t know exactly what they use the models for or if they care about this.

Lex Fridman (00:08:59) They’re secretive in terms of communication, but they’re not secretive in terms of the technical reports that describe how their models work. They’re still open on that front. And we should also say on the Claude Opus 4.5 hype, there’s the layer of something being the darling of the X echo chamber, the Twitter echo chamber, and the actual amount of people that are using the model. I think it’s probably fair to say that ChatGPT and Gemini are focused on the broad user base that just wants to solve problems in their daily lives, and that user base is gigantic. So the hype about the coding may not be representative of the actual use.

Sebastian Raschka (00:09:38) I would say also a lot of the usage patterns are name recognition and brand, but also almost muscle memory, where ChatGPT has been around for a long time. People just got used to using it, and it’s almost like a flywheel where they recommend it to other users. One interesting point is also the customization of LLMs. For example, ChatGPT has a memory feature. So you may have a subscription and you use it for personal stuff, but I don’t know if you want to use that same thing at work because there is a boundary between private and work. If you’re working at a company, they might not allow that or you may not want that.

Sebastian Raschka (00:10:16) And I think that’s also an interesting point where you might have multiple subscriptions. One is just clean code; it has nothing of your personal images or hobby projects in there. It’s just for work. And then the other one is your personal thing. I think the future involves multiple models for different use cases. It doesn’t mean you only have to have one.

ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

Lex Fridman (00:10:38) What model do you think won 2025, and what model do you think is going to win ’26?

Nathan Lambert (00:10:43) I think in the context of consumer chatbots, the question is: are you willing to bet on Gemini over ChatGPT? Which I would say in my gut feels like a bit of a risky bet because OpenAI has been the incumbent and there are so many benefits to that in tech. I think the momentum in 2025 was on Gemini’s side, but they were starting from such a low point. RIP Bard and those earlier attempts. I think huge credit to them for powering through the organizational chaos to make that happen. But also it’s hard to bet against OpenAI because they always come off as so chaotic, but they’re very good at landing things.

Nathan Lambert (00:11:26) Personally, I have very mixed reviews of GPT-5, but it must have saved them so much money with the headline feature being a router where most users are no longer incurring their GPU costs as much. So I think it’s very hard to dissociate the things that I like out of models versus the things that are actually going to be a general public differentiator.

Lex Fridman (00:11:50) What do you think about 2026? Who’s going to win?

Nathan Lambert (00:11:52) I’ll say something, even though it’s risky. I think Gemini will continue to make progress on ChatGPT. Google has the scale when both of these are operating at such extreme scales, and Google has the ability to separate research and product a bit better, whereas you hear so much about OpenAI being chaotic operationally and chasing the high-impact thing, which is a very startup culture. Then on the software and enterprise side, I think Anthropic will have continued success as they’ve again and again been set up for that. Obviously Google Cloud has a lot of offerings, but I think this Gemini name brand is important for them to build.

Nathan Lambert (00:12:28) Google Cloud will continue to do well, but that’s a more complex thing to explain in the ecosystem because that’s competing with the likes of Azure and AWS rather than on the model provider side.

Lex Fridman (00:12:40) So in infrastructure, you think TPUs give them an advantage?

Nathan Lambert (00:12:45) Largely because the margin on NVIDIA chips is insane and Google can develop everything from top to bottom to fit their stack and not have to pay this margin, and they’ve had a head start in building data centers. So all of these things that have both high lead times and very hard margins on high costs, Google has a kind of historical advantage there. And if there’s going to be a new paradigm, it’s most likely to come from OpenAI. Their research division again and again has shown this ability to land a new research idea or a product. Like Deep Research, Sora, o1 thinking models—all these definitional things have come from OpenAI, and that’s got to be one of their top traits as an organization.

Nathan Lambert (00:13:28) So it’s kind of hard to bet against that, but I think a lot of this year will be about scale and optimizing what could be described as low-hanging fruit in models.

Lex Fridman (00:13:37) And clearly there’s a trade-off between intelligence and speed. This is what GPT-5 was trying to solve behind the scenes. It’s like, do people actually want intelligence, the broad public, or do they want speed?

Sebastian Raschka (00:13:52) I think it’s a nice variety actually, or the option to have a toggle there. For my personal usage, most of the time when I look something up, I use ChatGPT to ask a quick question and get the information I wanted fast. For most daily tasks, I use the quick model. Nowadays, I think the auto mode is pretty good where you don’t have to specifically say “thinking” or “non-thinking.” Then again, I also sometimes want the pro mode. Very often, when I have something written, I put it into ChatGPT and say, “Hey, do a very thorough check. Are all my references correct? Are all my thoughts correct? Did I make any formatting mistakes? Are the figure numbers wrong?” or something like that. And I don’t need that right away.

Sebastian Raschka (00:14:33) I can finish my stuff, maybe have dinner, let it run, come back and go through it. This is where I think it’s important to have this option. I would go crazy if for each query I had to wait 30 minutes, or even 10 minutes.

Nathan Lambert (00:14:46) That’s me. I’m sitting over here losing my mind that you use the router and the non-thinking model. I’m like, “How do you live with that?”

Nathan Lambert (00:14:55) That’s like my reaction. I’ve been heavily on ChatGPT for a while. I never touched GPT-5 non-thinking. I find it just… its tone and then its propensity for errors. It just has a higher likelihood of errors. Some of this is from back when OpenAI released o3, which was the first model to do this Deep Research and find many sources and integrate them for you. So I became habituated with that. I will only use GPT-5.2 thinking or pro when I’m finding any sort of information query for work, whether that’s a paper or some code reference. I will regularly have five pro queries going simultaneously, each looking for one specific paper or feedback on an equation.

Sebastian Raschka (00:15:38) I have a fun example where I just needed the answer as fast as possible for this podcast before I was going on the trip. I have a local GPU running at home and I wanted to run a long RL experiment. Usually I unplug things because if you’re not at home, you don’t want to have things plugged in, and I accidentally unplugged the GPU. My wife was already in the car and it was like, “Oh dang.” Basically, I wanted a Bash script as fast as possible that runs my different experiments and the evaluation. I know how to use the Bash terminal, but in that moment I just needed the command in 10 seconds.

Lex Fridman (00:16:18) This is a hilarious situation but yeah, so what did you use?

Sebastian Raschka (00:16:21) So I did the non-thinking fastest model. It gave me the Bash command. I wanted to chain different scripts to each other and route this to a log file with the `tee` command. Off the top of my head, I was just in a hurry; I could have thought about it myself.

Lex Fridman (00:16:37) By the way, I don’t know if there’s a representative case: wife waiting in the car, you have to run, unplug the GPU, you have to generate a Bash script. This sounds like a movie… …Mission Impossible.

Nathan Lambert (00:16:46) I use Gemini for that. I use thinking for all the information stuff and then Gemini for fast things or stuff that I could sometimes Google. It’s good at explaining things and I trust that it has this background of knowledge and it’s simple. And the Gemini app has gotten a lot better.

Nathan Lambert (00:17:01) It’s good for those sorts of things. And then for code and any sort of philosophical discussion, I use Claude Opus 4.5, also always with extended thinking. Extended thinking and inference-time scaling is just a way to make the models marginally smarter. I will always edge on that side when the progress is very high because you don’t know when that’ll unlock a new use case. And then I sometimes use Grok for real-time information or finding something on AI Twitter that I knew I saw and I need to dig up. Although when Grok 4 came out, the Grok 4 Heavy—which was their pro variant—was actually very good and I was pretty impressed with it, and then I just kind of lost track of it with muscle memory from having the ChatGPT app open. So I use many different things.

Lex Fridman (00:17:45) Yeah. I actually do use Grok 4 Heavy for debugging. For hardcore debugging that the other ones can’t solve, I find that it’s the best at. And it’s interesting because you say ChatGPT is the best interface. For me, for that same reason—but this could be just momentum— Gemini is the better interface for me. I think because I fell in love with their needle-in-the-haystack capabilities. If I ever put in something that has a lot of context but I’m looking for very specific information to make sure it tracks all of it, I find Gemini has been the best. So it’s funny with some of these models, if they win your heart over—

Lex Fridman (00:18:28) …for one particular feature on a particular day, for that particular query or prompt, you’re like, “This model’s better.” And so you’ll just stick with it for a bit until it does something really dumb. There’s like a threshold effect. Some smart thing happens and then you fall in love with it, and then it does some dumb thing and you’re like, “You know what? I’m gonna switch and try Claude or ChatGPT.” And all that kind of stuff.

Sebastian Raschka (00:18:51) This is exactly it. You use it until it breaks, until you have a problem, and then you change the LLM. I think it’s the same way we use anything, like our favorite text editor, operating system, or browser. I mean, there are so many browser options: Safari, Firefox, Chrome. They’re relatively similar, but then there are edge cases, maybe extensions you want to use, and then you switch. But I don’t think anyone types the same thing into different browsers and compares them. You only do that when the website doesn’t render or if something breaks. So that’s a good point. You use it until it breaks, and then you explore other options.

Nathan Lambert (00:19:28) On the long context thing, I was also a Gemini user for this, but the GPT-5.2 release blog had crazy long context scores, where a lot of people were like, “Did they just figure out some algorithmic change?” It went from like 30% to like 70% or something in this minor model update. So it’s also very hard to keep track of all of these things, but now I look more favorably at GPT-5.2’s long context. So it’s just kind of like a never-ending battle to actually get to testing this.

Lex Fridman (00:19:57) Well, it’s interesting that none of us talked about the Chinese models from a user perspective. What does that say? Does that mean the Chinese models are not as good, or does that mean we’re just very biased and US-focused?

Sebastian Raschka (00:20:11) I do think there’s currently a discrepancy between the model and the platform. I think those open models are more known for their open weights, not for their platforms yet.

Nathan Lambert (00:20:21) There are also a lot of companies that are willing to sell you open-model inference at a very low cost. On something like OpenRouter, it’s easy to try multi-model things. You can run DeepSeek on Perplexity. I think all of us sitting here are like, “We use OpenAI GPT-5 Pro consistently.” We’re all willing to pay for the marginal—

Nathan Lambert (00:20:39) …intelligence gain. And these models from the US are better in terms of the outputs. I think the question is, will they stay better for this year and for years going forward? But so long as they’re better, I’m going to pay for them. I think there’s also analysis that shows that the way the Chinese models are served—which you could argue is due to export controls or not—is that they use fewer GPUs per replica, which makes them slower and leads to different errors. It’s about speed and intelligence.

Nathan Lambert (00:21:09) If these things are in your favor as a user, I think in the US a lot of users will go for this. I think that is one thing that will spur these Chinese companies to want to compete in other ways, whether it’s free or substantially lower costs, or it’ll breed creativity in terms of offerings, which is good for the ecosystem. But I just think the simple thing is the US models are currently better, and we use them. I tried these other open models, and I’m like, “Fun, but I’m not gonna… I don’t go back to it.”

Lex Fridman (00:21:38) We didn’t really mention programming. That’s another use case that a lot of people deeply care about. I use basically half-and-half Cursor and Claude Code, because I find them to be fundamentally different experiences and both useful. You program quite a bit, so what do you use? What’s the current vibe?

Sebastian Raschka (00:21:59) So, I use the Codeium plugin for VS Code. You know, it’s very convenient. It’s just a plugin, and then it’s a chat interface that has access to your repository. I know that Claude Code is a bit different. It is a bit more agentic. It touches more things; it does the whole project for you. I’m not quite there yet where I’m comfortable with that because maybe I’m a control freak, but I still like to see what’s going on. Codeium is the sweet spot for me right now where it is helping me, but it is not taking over completely.

Lex Fridman (00:22:29) I should mention, one of the reasons I do use Claude Code is to build the skill of programming with English. I mean, the experience is fundamentally different. In Cursor (if that’s the IDE you use) you can micromanage the details of the generation, look at the diff, and understand the code deeply as you progress. With Claude Code, you’re instead thinking in this design space and guiding it at a macro level. I think that is another way of thinking about the programming process. Also, Claude Code just seems to be a better utilization of Claude Opus 4.5.

Nathan Lambert (00:23:18) It’s a good side-by-side for people to do. You can have Claude Code open, you can have Cursor open, you can have VS Code open, and you can select the same models on all of them and ask the same questions, and it’s very interesting. Claude Code is way better in that domain. It’s remarkable.

Lex Fridman (00:23:32) All right, we should say that both of you are legit on multiple fronts: researchers, programmers, educators, and on the book front, too. Nathan, at some point soon, hopefully has an RLHF book coming out.

Nathan Lambert (00:23:50) It’s available for preorder, and there’s a full digital preprint. I’m just making it pretty and better organized for the physical thing, which is a lot of why I do it—it’s fun to create things that you think are excellent in physical form when so much of our life is digital.

Lex Fridman (00:24:05) I should say, going to Perplexity here, Sebastian Raschka is a machine learning researcher and author known for several influential books. A couple that I wanted to mention—and highly recommend—are Build a Large Language Model From Scratch and the new one, Build a Reasoning Model From Scratch. I’m really excited about that. Building stuff from scratch is one of the most powerful ways of learning.

Sebastian Raschka (00:24:27) Honestly, building an LLM from scratch is a lot of fun and a lot to learn. Like you said, it’s probably the best way to learn how something really works, because you can look at figures, but figures can have mistakes. You can look at conceptual explanations, but you might misunderstand them. But if there is code and the code works, you know it’s correct. There’s no misunderstanding; it’s precise. Otherwise, it wouldn’t work. I think that’s the beauty behind coding. It doesn’t lie. It’s math, basically. Even with math, you can have mistakes in a book you would never notice because you aren’t running the math while reading, so you can’t verify it. And with code, what’s nice is you can verify it.

Lex Fridman (00:25:09) Yeah, I agree with you about the Build a Large Language Model From Scratch book. It’s nice to tune out everything else, the internet and so on, and just focus on the book. But, you know, compared to how reading books has historically been, it’s just less lonely somehow. It’s really more fun. For example, on the programming front, I think it’s genuinely more fun to program with an LLM. And I think it’s genuinely more fun to read with an LLM. But you’re right. This distraction should be minimized. So you use the LLM to basically enrich the experience, maybe add more context. Maybe I just… the rate of ‘aha’ moments for me on a small scale is really high with LLMs.

Sebastian Raschka (00:25:54) 100%. I also want to correct myself: I’m not suggesting not to use LLMs. I suggest doing it in multiple passes. Like, one pass just offline, focus mode, and then after that… I mean, I also take notes, but I try to resist the urge to immediately look things up. I do a second pass. For me, it’s just more structured this way and I get less… I mean, sometimes things are answered in the chapter, but also it just helps to let it sink in and think about it. Other people have different preferences. I would highly recommend using LLMs when reading books. For me, it’s just not the first thing to do; it’s the second pass.

Lex Fridman (00:26:29) By way of recommendation, I do the opposite. I like to use the LLM at the beginning to lay out the full context of what this world is that I’m now stepping into. But I try to avoid clicking out of the LLM into the world of Twitter and blogs because then you’re down this rabbit hole. You’re reading somebody’s opinion, there’s a flame war about a particular topic, and all of a sudden you’re now in the realm of the internet and Reddit and so on. But if you’re purely letting the LLM give you the context of why this matters, what are the big picture ideas… sometimes books themselves are good at doing that, but not always.

Nathan Lambert (00:27:12) This is why I like the ChatGPT app, because it gives the AI a home in your computer where you can focus on it, rather than just being another tab in my mess of internet options. And I think Claude Code in particular does a good job of making that a joy; as a product design, it’s a very engaging interface from which your AI then goes out into the world. There’s something very intangible between it and Codex: Claude Code just feels warm and engaging, whereas Codex from OpenAI can often be as good but feels a little bit rough around the edges.

Nathan Lambert (00:27:45) Whereas Claude Code makes it fun to build things, particularly from scratch where you trust that it’ll make something. Obviously this is good for websites and refreshing tooling, which I use it for, or data analysis. On my blog, we scrape Hugging Face so we keep the download numbers for every dataset and model over time now. Claude was just like, “Yeah, I’ve made use of that data, no problem.” And I was like, “That would’ve taken me days.” And then I have enough situational awareness to be like, “Okay, these trends obviously make sense,” and you can check things. But that’s just a wonderful interface where you can have an intermediary and not have to do the awful low-level work that you would have to do to maintain different web projects.

Open Source vs Closed Source LLMs

Lex Fridman (00:28:29) All right. So we just talked about a bunch of the closed-weight models. Let’s talk about the open ones. Tell me about the landscape of open LLM models. Which are interesting ones? Which stand out to you and why? We already mentioned DeepSeek.

Nathan Lambert (00:28:44) Do you wanna see how many we can name off the top of our head?

Lex Fridman (00:28:47) Yeah, yeah. Without looking at notes.

Nathan Lambert (00:28:48) DeepSeek, Kimi, MiniMax, Z.ai, Antlang. We’re just going Chinese.

Sebastian Raschka (00:28:57) Let’s throw in Mistral AI, Gemma, and gpt-oss, the open-source model by OpenAI. Actually, NVIDIA had a really cool one, Nemotron 3. There’s a lot of stuff, especially at the end of the year. Qwen might be the one—

Nathan Lambert (00:29:12) Oh, yeah. Qwen was the obvious name I was gonna say. I was trying to get through… you can get at least 10 Chinese and at least 10 Western. I mean, OpenAI released their first open model—

Sebastian Raschka (00:29:21) A long time ago.

Nathan Lambert (00:29:22) …since GPT-2. When I was writing about OpenAI’s open model release, people were like, “Don’t forget about GPT-2,” which I thought was really funny because it’s just such a different time. But gpt-oss is actually a very strong model and does some things that the other models don’t do very well. Selfishly, I’ll promote a bunch of Western companies; both the US and Europe have these fully open models. I work at the Allen Institute for AI where we’ve been building OLMo, which releases data and code and all of this. And now we have actual competition for people that are trying to release everything so that other people can train these models.

Nathan Lambert (00:29:57) So there’s the Institute of Foundation Models/LLM360, which has had their K2 models of various types. Apertus comes from a Swiss research consortium. Hugging Face has SmolLM, which is very popular. And NVIDIA’s Nemotron has started releasing data as well. And then there’s Stanford’s Marin community project, which is kind of making it so there’s a pipeline for people to open a GitHub issue, implement a new idea, and then have it run in a stable language modeling stack. So this space, that list was way smaller in 2024-

Nathan Lambert (00:30:31) … so I think it was just AI2. So that’s a great thing for more people to get involved and to understand language models, and it doesn’t really have an analog at any Chinese company. While I’m talking, I’ll say that the Chinese open language models tend to be much bigger, and that gives them this higher peak performance as MoEs, whereas a lot of these things that we like a lot, whether it was Gemma or Nemotron, have tended to be smaller models from the US, which is starting to change. Mistral Large 3 came out in December, a giant MoE model very similar to the DeepSeek architecture. And then a startup, Reka AI, and NVIDIA’s Nemotron team have teased MoE models way bigger than 100 billion parameters-

Nathan Lambert (00:31:16) … in the 400 billion parameter range coming in this Q1 2026 timeline. So I think this kind of balance is set to change this year in terms of what people are using the Chinese versus US open models for, which I’m personally going to be very excited to watch.

Lex Fridman (00:31:32) First of all, huge props for being able to name so many of these. Did you actually name LLaMA?

Sebastian Raschka (00:31:41) This was not on purpose.

Lex Fridman (00:31:43) RIP LLaMA. All right. Can you mention what are some interesting models that stand out? You mentioned Qwen 3 is obviously a standout.

Sebastian Raschka (00:31:51) So I would say the year’s almost book-ended by DeepSeek: DeepSeek-V3 and DeepSeek R1 at the start, and then, on the other end, DeepSeek-V3.2 in December. What I like about those is they always have an interesting architecture tweak that others don’t have. But otherwise, if you want to go with familiar but really good performance, Qwen 3 and, like Nathan said, also gpt-oss. And what’s interesting about gpt-oss is that it’s kind of the first open-weight model that was really trained with tool use in mind, which I do think is a bit of a paradigm shift that the ecosystem was not quite ready for. By tool use, I mean that the LLM is able to do a web search or call a Python interpreter.

Sebastian Raschka (00:32:33) And I do think it’s a standout because it’s a huge unlock. One of the most common complaints about LLMs is, for example, hallucinations, right? And so, in my opinion, one of the best ways to solve hallucinations is to not try to always remember information or make things up. For math, why not use a calculator app or Python?

Sebastian Raschka (00:32:54) If I ask the LLM, “Who won the soccer World Cup in 1998?” instead of just trying to recall it from memory, it could go do a search. I think it’s usually still a Google search. So ChatGPT and gpt-oss, they would do a tool call to Google, maybe find the FIFA website, and find that it was France. It would get you that information reliably instead of just trying to memorize it. So I think it’s a huge unlock which right now is not fully utilized yet by the open-weight ecosystem. A lot of people don’t use tool-call modes because I think it’s a trust thing. You don’t want to run this on your computer where it has access to tools and could wipe your hard drive, so you want to containerize that. But I do think having this ability is a really important step for the upcoming years.
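To make the tool-use idea concrete, here is a minimal sketch of the loop a harness might run: the model either answers directly or emits a structured tool call, the harness executes the tool, and the result is fed back in. The `fake_model` and `web_search` functions below are hypothetical stand-ins, not the actual ChatGPT or gpt-oss interfaces.

```python
# Minimal sketch of a tool-use loop. The "model" and the search tool are stand-ins;
# real harnesses differ in message format, sandboxing, and safety handling.

def web_search(query: str) -> str:
    """Hypothetical search tool; a real harness would call an actual search API."""
    return "fifa.com: France won the 1998 FIFA World Cup."

def fake_model(messages: list) -> dict:
    """Stand-in for an LLM call. A tool-trained model decides whether to answer
    directly or to request a tool instead of guessing from memory."""
    last = messages[-1]["content"]
    if "tool result" not in last:
        return {"type": "tool_call", "tool": "web_search",
                "arguments": {"query": "soccer World Cup 1998 winner"}}
    return {"type": "answer", "content": "France won the 1998 World Cup."}

def run(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(5):                                   # cap the tool round-trips
        reply = fake_model(messages)
        if reply["type"] == "answer":
            return reply["content"]
        result = web_search(**reply["arguments"])        # execute the requested tool
        messages.append({"role": "tool", "content": f"tool result: {result}"})
    return "Gave up after too many tool calls."

print(run("Who won the soccer World Cup in 1998?"))
```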

Lex Fridman (00:33:44) So a few quick things. First of all, thank you for defining what you mean by tool use. I think that’s a great thing to do in general for the concepts we’re talking about, even things as sort of well-established as MoEs. You have to say that means mixture of experts, and you kind of have to build up an intuition for people about what that means, how it’s actually utilized, what are the different flavors. So what does it mean that there’s just such an explosion of open models? What’s your intuition?

Nathan Lambert (00:34:13) If you’re releasing an open model, you want people to use it; that is the first and foremost thing. And then after that comes things like transparency and trust. I think when you look at China, the biggest reason is that they want people around the world to use these models, and I think a lot of people will not pay for them. If you look outside of the US, a lot of people will not pay for software, but they might have computing resources where you can put a model on and run it. I think there can also be data that you don’t want to send to the cloud. So the number one thing is getting people to use models, to use AI, to use your AI, people who might not be able to do that without having access to the model.

Lex Fridman (00:34:46) I guess we should state explicitly, so we’ve been talking about these Chinese models and open weight models. Oftentimes, the way they’re run is locally. So it’s not like you’re sending your data to China or to whoever developed the model in Silicon Valley.

Nathan Lambert (00:35:04) A lot of American startups make money by hosting these models from China and selling them. It’s called selling tokens, which means somebody will call the model to do some piece of work. I think the other reason is for US companies like OpenAI. OpenAI is so GPU deprived; they’re at the limits of the GPUs. Whenever they make a release, they’re always talking about how their GPUs are hurting. And I think in one of these gpt-oss-120b release sessions, Sam Altman said, “Oh, we’re releasing this because we can use your GPUs. We don’t have to use our GPUs and OpenAI can still get distribution out of this,” which is another very real thing, because it doesn’t cost them anything.

Sebastian Raschka (00:35:43) And for the user, I think also, I mean, there are users who just use the model locally how they would use ChatGPT. But also for companies, I think it’s a huge unlock to have these models because you can customize them, you can train them, you can add more data post-training, like specialize them into, let’s say, law, medical models, whatever you have. And you mentioned Llama; the appeal of the open weight models from China is that the licenses are even friendlier. I think they are just unrestricted open source licenses, whereas if we use something like Llama or Gemma, there are some strings attached. I think it’s like an upper limit in terms of how many users you have.

Sebastian Raschka (00:36:21) And then if you exceed so many million users, you have to report your financial situation to, let’s say, Meta or something like that. And I think while it is a free model, there are strings attached, and people like things where strings are not attached. So I think that’s also one of the reasons besides performance why the open weight models from China are so popular, because you can just use them. There’s no catch in that sense.

Nathan Lambert (00:36:46) The ecosystem has gotten better on that front, but mostly downstream of these new providers providing such open licenses. That was funny when you pulled up Perplexity and said, “Kimi K2 Thinking hosted in the US.” Which is an exact example of what we’re talking about where people are sensitive to this. Kimi K2 Thinking is a model that is very popular. People say that has very good creative writing and also in doing some software things. So it’s just these little quirks that people pick up on with different models that they like.

Lex Fridman (00:37:14) What are some interesting ideas that some of these models have explored that you can speak to, like that are particularly interesting to you?

Sebastian Raschka (00:37:21) Maybe we can go chronologically. I mean, there was, of course, DeepSeek R1 that came out in January of 2025. However, this was based on DeepSeek-V3, which came out the year before in December 2024. There are multiple things on the architecture side. What is fascinating is you can still—I mean, that’s what I do with my from-scratch coding projects—you can still start with GPT-2, and you can add things to that model to make it into this other model. So it’s all still kind of like the same lineage. There is a very close relationship between those. But off the top of my head, what was unique with DeepSeek is the Mixture of Experts. I mean, they were not inventing Mixture of Experts.

Sebastian Raschka (00:38:00) We can maybe talk a bit more about what Mixture of Experts means. But just to list these things first before we dive into detail: Mixture of Experts, but then they also had multi-head latent attention, which is a tweak to the attention mechanism. This was, I would say in 2025, the main distinguishing factor between these open weight models: different tweaks to make inference or KV cache size more economical. We can also define KV cache in a few moments. But it makes it more economical to have long context, to shrink the KV cache size. So what are tweaks that we can do? Most of them focused on the attention mechanism. There is multi-head latent attention in DeepSeek; there is group query attention, which is still very popular.

Sebastian Raschka (00:38:44) It’s not invented by any of those models; it goes back a few years. But that would be the other option. Sliding window attention, I think OLMo 3 uses it if I remember correctly. So there are these different tweaks that make the models different. Otherwise, I put them all together in an article once where I just compared them; they are surprisingly similar. It’s just different numbers in terms of how many repetitions of the transformer block you have in the center and just little knobs that people tune. But what’s so nice about it is it works no matter what. You can tweak things, you can move the normalization layers around to get some performance gains.

Sebastian Raschka (00:39:23) And OLMo is always very good in ablation studies, showing what it actually does to the model if you move something around. Ablation studies: does it make it better or worse? But there are so many ways you can implement a transformer and make it still work. The big ideas that are still prevalent are Mixture of Experts, multi-head latent attention, sliding window attention, and group query attention. And then at the end of the year, we saw a focus on making the attention mechanism scale linearly with inference token prediction. So there was Qwen3-Next, for example, which added a Gated DeltaNet. It’s inspired by state space models, where you have a fixed state that you keep updating. But it makes essentially this attention cheaper, or it replaces attention with a cheaper operation.
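A rough sketch of why these attention tweaks matter for the KV cache: keys and values are cached per KV head for every layer and position, so sharing KV heads across query heads (as in grouped-query attention) divides the cache size. The dimensions below are illustrative, not taken from any particular model.

```python
# Back-of-the-envelope KV-cache size: standard multi-head attention vs grouped-query
# attention. All dimensions are illustrative.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    # factor of 2 for keys and values, cached at every layer for every position
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

layers, head_dim, ctx = 32, 128, 128_000

mha = kv_cache_bytes(layers, n_kv_heads=32, head_dim=head_dim, context_len=ctx)
gqa = kv_cache_bytes(layers, n_kv_heads=8,  head_dim=head_dim, context_len=ctx)

print(f"MHA cache: {mha / 1e9:.1f} GB")   # every query head keeps its own K/V
print(f"GQA cache: {gqa / 1e9:.1f} GB")   # 4 query heads share one K/V head: 4x smaller
```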

Transformers: Evolution of LLMs since 2019

Lex Fridman (00:40:08) And it may be useful to step back and talk about transformer architecture in general.

Sebastian Raschka (00:40:13) Yeah, so maybe we should start with GPT-2 architecture, the transformer that was derived from the “Attention Is All You Need” paper.

Sebastian Raschka (00:40:21) So the “Attention Is All You Need” paper had a transformer architecture that had two parts: an encoder and a decoder. And GPT went with just focusing on the decoder part. It is essentially still a neural network and it has this attention mechanism inside. And you predict one token at a time. You pass it through an embedding layer. There’s the transformer block. The transformer block has attention modules and a fully connected layer. And there are some normalization layers in between. But it’s essentially neural network layers with this attention mechanism. So coming from GPT-2, when we move on to gpt-oss-120b, there is, for example, the Mixture of Experts layer. It’s not invented by gpt-oss; it’s a few years old.

Sebastian Raschka (00:41:04) But it is essentially a tweak to make the model larger without consuming more compute in each forward pass. So there is this fully connected layer, and if listeners are familiar with multi-layer perceptrons, you can think of a mini multi-layer perceptron, a fully connected neural network layer inside the transformer. And it’s very expensive because it’s fully connected. If you have a thousand inputs and a thousand outputs, that’s like a million connections. And it’s a very expensive part in this transformer. And the idea is to kind of expand that into multiple feedforward networks. So instead of having one, let’s say you have 256, but you don’t use all of them at the same time.

Sebastian Raschka (00:41:49) So you now have a router that says, “Okay, based on this input token, it would be useful to use this fully connected network.” And in that context, it’s called an expert. So a Mixture of Experts means you have multiple experts. And depending on what your input is—let’s say it’s more math-heavy—it would use different experts compared to, let’s say, translating input text from English to Spanish. It would maybe consult different experts. It’s not as clear-cut to say, “Okay, this is only an expert for math and this for Spanish.” It’s a bit more fuzzy. But the idea is essentially that you pack more knowledge into the network, but not all the knowledge is used all the time.

Sebastian Raschka (00:42:27) That would be very wasteful. So yeah, during token generation you are more selective. There’s a router that selects which tokens should go to which expert. It adds more complexity. It’s harder to train. There’s a lot that can go wrong, like collapse and everything. So I think that’s why OLMo 3 still uses dense models… I mean, there are, I think, OLMo models with Mixture of Experts, but the main ones are dense, where dense means… it’s jargon again. There’s a distinction between dense and sparse. Mixture of Experts is considered sparse because we have a lot of experts, but only a few of them are active. And then dense would be the opposite, where you only have, like, one fully connected module, and it’s always utilized.
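A minimal PyTorch sketch of the routing idea just described: a router scores the experts for each token and only the top-k feed-forward experts run. Sizes are toy values, and real MoE layers add load-balancing losses and other machinery to avoid the collapse problems mentioned above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy Mixture-of-Experts feed-forward layer: many experts, few active per token."""
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)        # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                   # x: (n_tokens, d_model)
        scores = self.router(x)                              # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(5, 64)                  # hidden states for 5 tokens
print(TinyMoE()(tokens).shape)               # torch.Size([5, 64]); only 2 of 8 experts ran per token
```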

Lex Fridman (00:43:08) So maybe this is a good place to also talk about KV cache. But actually, before that, even zooming out, fundamentally, how many new ideas have been implemented from GPT-2 to today? Like, how different really are these architectures?

Sebastian Raschka (00:43:25) A few, like the Mixture of Experts. The attention mechanism in gpt-oss-120b, that would be the Grouped Query Attention mechanism. So it’s a slight tweak from multi-head attention to Grouped Query Attention; that’s the second one. I think they replaced LayerNorm with RMSNorm, but that’s just a different normalization there and not a big change. It’s just a tweak. The nonlinear activation function—for people familiar with deep neural networks, I mean, it’s the same as swapping sigmoid for ReLU. It’s not changing the network fundamentally. It’s just a tweak. And that’s about it, I would say. It’s not really fundamentally that different. It’s still the same architecture. You can go from one into the other by just adding these changes, basically.
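The LayerNorm-to-RMSNorm swap really is a small tweak; a minimal version, assuming the usual formulation (normalize by the root mean square, no mean subtraction, no bias):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """RMSNorm: like LayerNorm but without mean-centering or a bias term."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.scale = nn.Parameter(torch.ones(dim))   # learnable gain, like LayerNorm's weight

    def forward(self, x):
        rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.scale * (x / rms)

x = torch.randn(2, 8)
print(RMSNorm(8)(x).shape)          # torch.Size([2, 8])
print(nn.LayerNorm(8)(x).shape)     # same interface; the change is a drop-in tweak
```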

Lex Fridman (00:44:09) It fundamentally is still the same architecture.

Sebastian Raschka (00:44:12) Mm-hmm. Yep. So for example, you mentioned my book earlier. That’s a GPT-2 model in the book because it’s simple and it’s very small, so 124 million parameters approximately. But in the bonus materials, I do have OLMo from scratch, Gemma 3 from scratch, and other types of from-scratch models. And I always start with my GPT-2 model and just, you know, add different components and you get from one to the other. It’s kind of like a lineage in a sense. Yeah.
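A compressed illustration of that lineage (not the code from the book): a GPT-2-flavored decoder block, with comments marking the pieces that later models swap out.

```python
import torch
import torch.nn as nn

class GPT2StyleBlock(nn.Module):
    """One decoder-only transformer block, GPT-2 flavored. Later models keep this
    skeleton and swap pieces: LayerNorm -> RMSNorm, MHA -> GQA/MLA, the MLP -> MoE."""
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)                  # RMSNorm in many newer models
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(                           # replaced by an MoE layer in sparse models
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        T = x.size(1)
        # causal mask: True marks positions a token is NOT allowed to attend to
        mask = torch.triu(torch.ones(T, T), diagonal=1).bool()
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                                    # residual connection
        return x + self.mlp(self.norm2(x))                  # residual connection

x = torch.randn(1, 16, 768)                 # (batch, tokens, hidden)
print(GPT2StyleBlock()(x).shape)            # torch.Size([1, 16, 768])
```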

Lex Fridman (00:44:37) Can you build up an intuition for people? Because sort of when you zoom out and look at it, there’s so much rapid advancement in the AI world, and at the same time, fundamentally the architectures have not changed. So where is all the turbulence, the turmoil of the advancement happening? Where are the gains to be had?

Sebastian Raschka (00:45:01) So there are the different stages where you develop or train the network. You have pre-training. Now back in the day, it was just pre-training with GPT-2. Now you have pre-training, mid-training, and post-training. So I think right now we are in the post-training focus stage. I mean, pre-training still gives you advantages if you scale it up to better, higher quality data. But then we have capability unlocks that were not there with GPT-2, for example. ChatGPT is basically a GPT-3 model. And GPT-3 is the same as GPT-2 in terms of architecture. What was new was adding the supervised fine-tuning and the Reinforcement Learning with Human Feedback. So, it’s more on the algorithmic side rather than the architecture.

Nathan Lambert (00:45:44) I would say that the systems also change a lot. I think if you listen to NVIDIA’s announcements, they talk about things like, “You now do FP8, you can now do FP4.” And what is happening is these labs are figuring out how to utilize more compute to put into one model, which lets them train faster and lets them put more data in. And then you can find better configurations faster by doing this. So you can look at the tokens per second per GPU as a metric that you look at when you’re doing large-scale training. And you can go from, like, 10K to 13K by turning on FP8 training, which means you’re using less memory per parameter in the model. And by saving less information, you do less communication and you can train faster.
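A rough back-of-the-envelope for that precision point: fewer bytes per number means less memory to hold and less data to move between GPUs. The parameter count and the weights-only accounting below are simplifications, not figures for any real training run.

```python
# Rough arithmetic for weight memory at different precisions (illustrative only;
# real training also stores gradients, optimizer states, and activations).

params = 120e9  # e.g. a ~120B-parameter model
for name, bytes_per_param in [("FP32", 4), ("BF16/FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    print(f"{name:>9}: {params * bytes_per_param / 1e9:,.0f} GB just for the weights")
```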

Nathan Lambert (00:46:24) So all of these system things underpin way faster experimentation on data and algorithms. It’s this kind of loop that keeps going, which is kinda hard to describe when you look at the architecture and they’re exactly the same. But the code base used to train these models is gonna be vastly different, and the GPUs are different, but you could probably train gpt-oss-20b way faster in wall clock time than GPT-2 was trained at the time.

Sebastian Raschka (00:46:54) Yeah. Like you said, they had this NVIDIA FP4 optimization, for example in the Mixture of Experts layers, where you get more throughput. For speed this is true, but it doesn’t give the model new capabilities in a sense. It’s just: how much coarser can we make the computation without suffering model performance degradation? But I do think there are alternatives popping up to the transformer. There are text diffusion models, a completely different paradigm. And although text diffusion models might use transformer architectures, it’s not an autoregressive transformer. And also Mamba models; those are State Space Models.

Sebastian Raschka (00:47:34) But they do have trade-offs, and what’s true is there’s nothing that has replaced the autoregressive transformer as the state-of-the-art model. So, for state-of-the-art, you would still go with that thing, but there are now alternatives for the cheaper end—alternatives that are kind of making compromises, but it’s not just one architecture anymore. There are little ones coming up. But if we talk about the state-of-the-art, it’s pretty much still the transformer architecture, autoregressive, derived from GPT-2 essentially.

AI Scaling Laws: Are they dead or still holding?

Lex Fridman (00:48:06) I guess the big question here is—we talked quite a bit here on the architecture behind the pre-training—are the scaling laws holding strong across pre-training, post-training, inference, context size, data, and synthetic data?

Nathan Lambert (00:48:20) I’d like to start with the technical definition of a scaling law-

Nathan Lambert (00:48:23) …which kind of informs all of this. The scaling law is the power law relationship between… You can think of the x-axis—what you are scaling—as a combination of compute and data, which are kind of similar, and then the y-axis is the held-out prediction accuracy on next tokens. We talked about models being autoregressive. It’s like if you keep a set of text that the model has not seen, how accurate will it get when you train? And the idea of scaling laws came when people figured out that that was a very predictable relationship. I think in that technical sense it is continuing, and then the question is, what do users get out of it? And then there are more types of scaling, where OpenAI’s o1 was famous for introducing inference-time scaling.

Nathan Lambert (00:49:07) And I think less famously for also showing that you can scale reinforcement learning training and get kind of this log x-axis and then a linear increase in performance on the y-axis. So there are kind of these three axes now where the traditional scaling laws are talked about for pre-training—which is how big your model is and how big your dataset is—and then scaling reinforcement learning, which is like how long can you do this trial and error learning that we’ll talk about. We’ll define more of this, and then this inference-time compute, which is just letting the model generate more tokens on a specific problem.
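The power-law shape being described is often written as loss falling off as a power of what you scale, something like L(C) = L_inf + a * C^(-alpha); a toy curve with invented constants, just to show the form:

```python
# Toy scaling-law curve: held-out loss vs training compute, L(C) = L_inf + a * C**(-alpha).
# The constants are invented for illustration, not fit to any real model family.

L_inf, a, alpha = 1.7, 30.0, 0.05     # irreducible loss, scale factor, power-law exponent

for compute in [1e20, 1e22, 1e24, 1e26]:                  # training FLOPs
    loss = L_inf + a * compute ** (-alpha)
    print(f"compute {compute:.0e} FLOPs -> predicted held-out loss {loss:.2f}")
```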

Nathan Lambert (00:49:37) So I’m kind of bullish; they’re all really still working, but the low-hanging fruit has mostly been taken, especially in the last year on Reinforcement Learning with Verifiable Rewards, which is this RLVR, and then inference-time scaling. That’s why these models feel so different to use, where previously you would get that first token immediately. And now they’ll go off for seconds, minutes, or even hours generating these hidden thoughts before giving you the first word of your answer. And that’s all about this inference-time scaling, which is such a wonderful kind of step function in terms of how the models change abilities. They enabled this tool use stuff and enabled this much better software engineering that we were talking about.

Nathan Lambert (00:50:17) And this is, when we say enabled, almost entirely downstream of the fact that this Reinforcement Learning with Verifiable Rewards training just let the models pick up these skills very easily. So if you look at the reasoning process when the models are generating a lot of tokens, what it’ll often be doing is: it tries a tool, it looks at what it gets back, it tries another API, it sees what it gets back and if it solves the problem. The models, when you’re training them, very quickly learn to do this.

Nathan Lambert (00:50:46) And then at the end of the day, that gives this kind of general foundation where the model can use CLI commands very nicely in your repo, handle Git for you, move things around, organize things, or search to find more information—which, if we were sitting in these chairs a year ago, is something that we didn’t really think of the models doing. So this is just something that has happened this year and has totally transformed how we think of using AI, which I think is very magical. It’s such an interesting evolution and unlocks so much value. But it’s not clear what the next avenue will be in terms of unlocking stuff like this.

Nathan Lambert (00:51:23) I think that there’s—we’ll get to continual learning later, but there’s a lot of buzz around certain areas of AI, but no one knows when the next step function will really come.

Lex Fridman (00:51:31) So you’ve actually said quite a lot of things there, and said profound things quickly. It would be nice to unpack them a little bit. You say you’re bullish basically on every version of scaling. So can we just start at the beginning? Pre-training: are we implying that the low-hanging fruit on pre-training scaling has been picked? Has pre-training hit a plateau, or are you still bullish on even pre-training?

Nathan Lambert (00:52:01) Pre-training has gotten extremely expensive. I think to scale up pre-training, it’s also implying that you’re going to serve a very large model to the users. So I think that it’s been loosely established the likes of GPT-4 and similar models were around one trillion parameters at the biggest size. There’s a lot of rumors that they’ve actually gotten smaller as training has gotten more efficient. You want to make the model smaller because then your costs of serving go down proportionately. The cost of training these models is really low relative to the cost of serving them to hundreds of millions of users. I think DeepSeek had this famous number of about five million dollars for pre-training at cloud market rates.

Nathan Lambert (00:52:40) In the OLMo 3 paper, section 2.4, we just detailed how long we had the GPU clusters sitting around for training—which includes engineering issues, multiple seeds—and it was about two million dollars to rent the cluster to deal with all the problems and headaches of training a model. So these models are… a lot of people could get one to 10 million dollars to train a model, but the recurring costs of serving millions of users are really billions of dollars of compute. For a thousand-GPU rental, you can pay 100 grand a day. And these companies could have millions of GPUs. You can look at how much these things cost to sit around.
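A quick sanity check on that "hundred grand a day" figure, assuming an illustrative cloud rate of roughly four dollars per GPU-hour (actual rates vary a lot):

```python
gpus, dollars_per_gpu_hour = 1_000, 4.0          # illustrative rental rate
per_day = gpus * dollars_per_gpu_hour * 24
print(f"${per_day:,.0f} per day")                # ~$96,000 per day
print(f"${per_day * 365:,.0f} per year")         # ~$35M per year for 1,000 GPUs
```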

Nathan Lambert (00:53:19) So that’s kind of a big thing, and then it’s like, if scaling is actually giving you a better model, is it going to be financially worth it? And I think we’ll slowly push it out as AI solves more compelling tasks—like the likes of Claude Opus 4.5 making Claude Code just work for things. I launched this project called the ATOM project, which is American Truly Open Models, in July, and that was like a true vibe-coded website. There’s a job that makes plots and stuff. Then I came back to refresh it in the last few weeks and Claude Opus 4.5, versus whatever model was available at the time, just crushed all the issues that it had from building in June and July. It might be a bigger model. There’s a lot of things that go into this, but there’s still progress coming.

Lex Fridman (00:54:04) So what you’re speaking to is the nuance of the y-axis of the scaling laws—that the way the actual intelligence is experienced versus what shows up on a benchmark might be different. But still, your intuition about pre-training: if you scale up the compute, will the models get better? Not whether it’s financially viable, but just from the law aspect of it, do you think the models will get smarter?

Nathan Lambert (00:54:28) Yeah. And I think that there’s… And this sometimes comes off as almost disillusioned from leadership at AI companies saying this, but they’re like, “It’s held for 13 orders of magnitude of compute; why would it ever end?” So I think fundamentally it is pretty unlikely to stop. It’s just like eventually we’re not even going to be able to test the bigger scales because of all the problems that come with more compute. I think that there’s a lot of talk on how 2026 is a year when very large NVIDIA Blackwell compute clusters—like gigawatt-scale facilities—are coming online. And these were all contracts for power and data centers that were signed and sought out in ’22 and 2023, before or right after ChatGPT.

Nathan Lambert (00:55:13) So it took this two-to-three-year lead time to build these bigger clusters to train the models, while there’s obviously immense interest in building even more data centers than that. So that is kind of the crux that people are saying: these new clusters are coming. The labs are going to have more compute for training. They’re going to utilize this, but it’s not a given. I’ve seen so much progress that I expect it, and I expect a little bit bigger models. I would say it’s more like we’ll see a $2,000 subscription this year; we’ve already seen $200 subscriptions. It’s like that could 10x again, and these are the kind of things that could come—and they’re all downstream of a bigger model that offers just a little bit more of a cutting edge.

Lex Fridman (00:55:53) So, it’s reported that xAI is going to hit that one-gigawatt scale early ’26, and a full two gigawatts by year end. How do you think they’ll utilize that in the context of scaling laws? Is a lot of that inference? Is a lot of that training?

Nathan Lambert (00:56:12) It ends up being all of the above. I think that all of your decisions when you’re training a model come back to pre-training. So if you’re going to scale RL on a model, you still need to decide on your architecture that enables this. We were talking about other architectures and using different types of attention. We’re also talking about Mixture of Experts models. The sparse nature of MoE models makes it much more efficient to do generation, which becomes a big part of post-training, and you need to have your architecture ready so that you can actually scale up this compute. I still think most of the compute is going in at pre-training. Because you can still make a model better, you still want to go and revisit this.

Nathan Lambert (00:56:53) You still want the best base model that you can. And in a few years that’ll saturate and the RL compute will just go longer.

Lex Fridman (00:57:00) Are there people who disagree with you that say basically pre-training is dead? That it’s all about scaling inference, scaling post-training, scaling context, continual learning, and scaling synthetic data?

Nathan Lambert (00:57:15) People vibe that way and describe it in that way, but I think it’s not the practice that is happening.

Lex Fridman (00:57:19) It’s just the general vibe of people saying this thing is dead—

Nathan Lambert (00:57:21) The excitement is elsewhere. So the low-hanging fruit is elsewhere, in RL. For example, we released our model in November. Every company has deadlines. Our deadline was like November 20th, and for that, our run was five days, which compared to 2024 is a very long time to just be doing post-training on a model of about 30 billion parameters. It’s not a big model. And then in December, we had another release, which was just letting the RL run for another three and a half weeks, and the model got notably better, so we released it. And that’s a big amount of time to just allocate to something that is going to be your peak for the year. So it’s like—

Nathan Lambert (00:57:58) There’s these types of decisions that happen when they’re training a model where they just can’t leave it forever. You have to keep pulling in the improvements you have from your researchers. So you redo pre-training, you’ll do this post-training for a month, but then you need to give it to your users. You need to do safety testing. I think there’s a lot in place that reinforces this cycle of just keep updating the models. There’s things to improve. You get a new compute cluster that lets you do something maybe more stably or faster. You hear a lot about Blackwell having rollout issues, where at AI2 most of the models we’re pre-training are on like 1,000 to 2,000 GPUs.

Nathan Lambert (00:58:36) But when you’re pre-training on 10,000 or 100,000 GPUs, you hit very different failures. GPUs are known to break in weird ways, and doing a 100,000-GPU run is like… you’re pretty much guaranteed to always have at least one GPU that is down. And you need to have your training code handle that redundancy, which is just a very different problem. Whereas what we’re doing is like, “Oh, I’m playing with post-training on a DGX Spark,” or people learning ML; what they’re battling to train these biggest models is massive distributed scale, and it’s very different. But that’s somewhat different than… that’s a systems problem—

Nathan Lambert (00:59:11) …in order to enable the scaling laws, especially at pre-training. You need all of these GPUs at once. When we shift to reinforcement learning, it actually lends itself to heterogeneous compute because you have many copies of the model. To do a primer for language model reinforcement learning, what you’re doing is you have two sets of GPUs. One you can call the actor and one you call the learner. The learner is where your actual reinforcement learning updates happen. These are traditionally policy gradient algorithms. Proximal Policy Optimization, PPO, and Group Relative Policy Optimization, GRPO, are the two popular classes.

Nathan Lambert (00:59:50) On the other side, you’re going to have actors which are generating completions, and these completions are the things that you’re going to grade. Reinforcement learning is all about optimizing reward. In practice, you can have a lot of different actors in different parts of the world doing different types of problems, and then you send it back to this highly networked compute cluster to do this actual learning, where you take the gradients and you need to have a tightly meshed network where you can do different types of parallelism and spread out your model for efficient training. Every different type of training and serving has these considerations you need to scale.
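A minimal sketch of the group-relative idea behind GRPO, leaving out the actual policy-gradient update, KL penalties, and all the systems machinery: the actors sample several completions per prompt, a grader scores them, and each completion's advantage is its reward relative to the group. The rewards below are invented.

```python
import statistics

# Toy GRPO-style advantages: grade a group of sampled completions for one prompt
# and score each one relative to the group. The rewards are invented.

def group_relative_advantages(rewards):
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0        # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# e.g. four completions of the same math problem, graded by a verifier (1.0 = correct)
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))          # [1.0, -1.0, -1.0, 1.0]
# The learner GPUs would then nudge the policy toward the positively scored completions
# and away from the others with a PPO/GRPO-style policy-gradient update.
```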

Nathan Lambert (01:00:27) We talked about pre-training, we talked about RL, and then inference time scaling is: how do you serve a model that’s thinking for an hour to 100 million users? I don’t really know about that, but I know that’s a hard problem. In order to give people this intelligence, there’s all these systems problems, and we need more compute and you need more stable compute to do it.

Lex Fridman (01:00:46) But you’re bullish on all of these kinds of scaling is what I’m hearing. On the inference, on the reasoning, even on the pre-training?

Sebastian Raschka (01:00:54) Yeah, so that’s a big can of worms, but there are basically two knobs: training and inference scaling, where you can get gains. In a world where we had infinite compute resources, you’d want to do all of them. You have training, you have inference scaling, and training is like a hierarchy: pre-training, mid-training, and post-training. Changing the model size, more training data, training a bigger model—it gives you more knowledge. Then the model is a better base model, or what we still call a foundation model, and it unlocks capabilities. But you don’t necessarily have the model be able to solve your most complex tasks—

Sebastian Raschka (01:01:34) …tasks during pre-training or after pre-training. You still have these other unlock phases, mid-training or post-training with RL, that unlocks capabilities that the model has in terms of knowledge from the pre-training. And I think, sure, if you do more pre-training, you get a better base model that you can unlock later. But like Nathan said, it just becomes too expensive. We don’t have infinite compute, so you have to decide: do I want to spend that compute more on making the model larger? It’s a trade-off. In an ideal world, you want to do all of them. And I think in that sense, scaling is still pretty much alive.

Sebastian Raschka (01:02:08) You would still get a better model, but like we saw with GPT-4.5, it’s just not worth it. I mean, because you can unlock more performance with other techniques at that moment, especially if you look at inference scaling. That’s one of the biggest gains this year with o1, where it took a smaller model further than pre-training a larger model like GPT-4.5. So, I wouldn’t say pre-training scaling is dead; it’s just that there are other more attractive ways to scale right now. But at some point, you will still want to make some progress on the pre-training. The thing to consider is where you want to spend your money.

Sebastian Raschka (01:02:47) If you spend it more on pre-training, it’s a fixed cost. You train the model, and then it has this capability forever. You can always use it. With inference scaling, you don’t spend money during training; you spend money later per query, and then it’s about the math. How long is my model going to be on the market if I replace it in half a year? Maybe it’s not worth spending 5 million, 10 million, or 100 million dollars on training it longer. Maybe I will just do more inference scaling and get the performance from there. It maybe costs me 2 million in terms of user queries. It becomes a question of how many users you have and doing the math. I think that’s also where it’s interesting, the position ChatGPT is in.

Sebastian Raschka (01:03:27) I think they have a lot of users where they need to go a bit cheaper, where they have that GPT-5 model that is a bit smaller. For other companies, their customers have other trade-offs. For example, there were the math problems or the Math Olympiad where they had a proprietary model, and I’m pretty sure it’s just a model that has been fine-tuned a little bit more, but most of it was inference scaling to achieve peak performance in certain tasks where you don’t need that all the time. But yeah, long story short, I do think pre-training, mid-training, post-training, and inference scaling are all still things you want to do. At the moment, this year, it’s finding the right ratio that gives you the best bang for the buck, basically.
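The "do the math" trade-off can be made concrete with a toy break-even calculation between spending more on pre-training and paying for extra inference-time scaling on every query; every number below is invented.

```python
# Toy break-even: spend more on pre-training (fixed cost, cheaper queries) vs.
# rely on inference-time scaling (no extra training cost, pricier queries).
# All numbers are invented for illustration.

extra_pretraining_cost = 10e6          # e.g. $10M of additional pre-training
cost_per_query_base    = 0.02          # $ per query with heavy inference-time scaling
cost_per_query_bigger  = 0.012         # $ per query if the better base model thinks less

savings_per_query = cost_per_query_base - cost_per_query_bigger
break_even_queries = extra_pretraining_cost / savings_per_query
print(f"Break-even after {break_even_queries:,.0f} queries")   # ~1.25 billion queries
```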

How AI is trained: Pre-training, Mid-training, and Post-training

Lex Fridman (01:04:13) I think this might be a good place to define pre-training, mid-training, and post-training.

Sebastian Raschka (01:04:18) So, pre-training is the classic training, one next-token prediction at a time. You have a big corpus of data. Nathan probably also has very interesting insights there because of OLMo 3. A big portion of the paper focuses on the right data mix. So, pre-training is essentially just training with a cross-entropy loss on next-token prediction over a vast corpus of internet data, books, papers, and so forth. It has changed a little bit over the years in the sense that people used to throw in everything they could. Now, it’s not just raw data. It’s also synthetic data where people rephrase certain things. So synthetic data doesn’t necessarily mean purely AI-made-up data.

Sebastian Raschka (01:04:58) It’s also taking something from a Wikipedia article and then rephrasing it as a Q&A pair or summarizing it, rewording it, and making better data that way. I think of it like with humans. If someone reads a book compared to a messy—no offense, but like—Reddit post or something like that. I do think you learn—no offense, but I think—

Lex Fridman (01:05:25) There’s going to be a post about this, Sebastian.

Nathan Lambert (01:05:28) Some Reddit data is very coveted and excellent for training. You just have to filter it.

Sebastian Raschka (01:05:33) And I think that’s the idea. I think it’s like if someone took that and rephrases it in a, let’s say, more concise and structured way— I think it’s higher quality data that gets the LLM maybe the same—you get the same LLM out of it at the end, but it gets there faster. It trains faster because if the grammar and the punctuation are correct, it already learns the correct way versus getting information from a messy way and then learning later how to correct that. So, I think that is how pre-training evolved and why scaling still works; it’s not just about the amount of data, it’s also the tricks to make that data better for you. And then mid-training is… I mean, it used to be called pre-training.

Sebastian Raschka (01:06:21) I think it’s called mid-training because it was awkward to have pre-training and post-training but nothing in the middle, right? It sounds a bit weird. You have pre-training and post-training, but what’s the actual training? So, the mid-training is usually similar to pre-training, but it’s a bit more specialized. It’s the same algorithm, but what you do is you focus, for example, on long context documents. The reason you don’t do that during pre-training is because you don’t have that many long context documents. We have a specific phase. And one problem of LLMs is still that it’s a neural network; it has the problem of catastrophic forgetting.

Sebastian Raschka (01:06:56) So, you teach it something, it forgets other things. It’s not 100% forgetting, but there’s no free lunch. It’s also the same with humans. If you ask me some math I learned 10 years ago, I wouldn’t know; I would have to look at it again.

Lex Fridman (01:07:09) Nathan was actually saying that he’s consuming so much content that there’s a catastrophic forgetting issue.

Nathan Lambert (01:07:14) Yeah, I’m trying to learn so much about AI, and it’s like when I was learning about pre-training parallelism, I’m like, “I lost something and I don’t know what it was.”

Sebastian Raschka (01:07:22) I don’t want to anthropomorphize LLMs, but I think it’s the same in terms of how humans learn. Quantity is not always better because it’s about being selective. Mid-training is being selective in terms of quality content at the end, so the last thing the LLM has seen is the quality stuff. And then post-training is all the fine-tuning: supervised fine-tuning, DPO, RL with human feedback or with verifiable rewards (RLVR), and so forth. So, the refinement stages. And it’s also interesting, the cost thing, right? Pre-training, you spend a lot of money on that right now. RL a bit less. RL, you don’t really teach it knowledge; it’s more like unlocking the knowledge.

Sebastian Raschka (01:08:03) It’s more like skill learning, like how to solve problems with the knowledge that it has from pre-training. There are actually three papers this year, or last year, 2025, on RL for pre-training. But I don’t think anyone does that in production.

Nathan Lambert (01:08:17) Toy, toy examples for now.

Sebastian Raschka (01:08:18) Toy examples, right. But to generalize, RL post-training is more like the skill unlock, where pre-training is like soaking up the knowledge essentially.
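To pin down the pre-training objective being contrasted here, a minimal PyTorch sketch of next-token prediction with a cross-entropy loss; the tiny vocabulary and the stand-in model (an embedding plus a linear layer instead of a real transformer) are just placeholders.

```python
import torch
import torch.nn as nn

# Minimal next-token-prediction setup: the model sees tokens [t0..t_{n-1}]
# and is trained with cross-entropy to predict [t1..t_n]. Toy sizes throughout.

vocab_size, d_model = 100, 32
model = nn.Sequential(nn.Embedding(vocab_size, d_model),
                      nn.Linear(d_model, vocab_size))   # stand-in for a real transformer

tokens = torch.randint(0, vocab_size, (1, 16))          # one sequence of 16 token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]         # shift by one position

logits = model(inputs)                                  # (1, 15, vocab_size)
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())   # pre-training = repeating this update over trillions of tokens
```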

Nathan Lambert (01:08:26) A few things that could be helpful for people. A lot of people think of synthetic data as being bad for training models. You mentioned that DeepSeek had an OCR—Optical Character Recognition—paper. A lot of labs did; AI2 had one, others had multiple. And the reason each of these labs has these is because there are vast amounts of PDFs and other digital documents on the web in formats that aren’t encoded with text easily. So you use these, like DeepSeek OCR or what we called OLMo OCR, to extract what can be trillions of tokens of candidate data. Pre-training dataset size is on the order of trillions; it’s measured in trillions of tokens.

Nathan Lambert (01:09:10) Smaller models from researchers can be something like 5 to 10 trillion. Qwen is documented going up to like 50 trillion, and there’s rumors that these closed labs can go to 100 trillion tokens. Getting this potential data is a very big funnel, and the data you actually train the model on is a small percentage of this. This character recognition data would be described as synthetic data for pre-training in a lab. And then there’s also the fact that ChatGPT now gives wonderful answers, and you can train on those best answers; that’s synthetic data. It’s very different than the early ChatGPT hallucinations data.

Sebastian Raschka (01:09:48) One interesting question is, if I recall correctly, OLMo 3 was trained with less data than specifically some other open-weight models, maybe even OLMo 2. But you still got better performance, and that might be one of the examples of how the data helped.

Nathan Lambert (01:10:01) It’s mostly down to data quality. I think if we had more compute, we would train for longer. I think we’d ultimately see that as something we would want to do. Especially with big models, you need more compute because big models can absorb more from data, and you get more benefit out of this. It’s like one of those logarithmic graphs—a small model will level off sooner if you’re measuring tons of tokens, and bigger models need more. But mostly, we aren’t training that big of models right now at AI2, and getting the highest quality data we can is the natural starting point.

Lex Fridman (01:10:38) Is there something to be said about the topic of data quality? Is there some low-hanging fruit there still where the quality could be improved?

Nathan Lambert (01:10:46) It’s like turning the crank. So I think historically, in the open, there’s been a canonical best pre-training dataset that has moved around between who has the most recent one or the best recent effort. Like AI2’s Dolma was very early with the first OLMo and Hugging Face had FineWeb. And there’s the DCLM project, which stands for DataComp for Language Models. There’s been DataComp for other machine learning projects, and they had a very strong dataset. A lot of it is the internet becoming fairly closed off, so we have Common Crawl, which I think is hundreds of trillions of tokens, and you filter it.

Nathan Lambert (01:11:21) And it looks like a lot of scientific work where you’re training classifiers and making decisions based on how you prune down this dataset into the highest quality stuff and the stuff that suits your tasks. Previously, language models were tested a lot more on knowledge and conversational things, but now they’re expected to do math and code. To train a reasoning model, you need to remix your whole dataset. And there are actually some wonderful scientific methods here where you can take your gigantic dataset and sample a lot of really tiny things from different sources, like GitHub, Stack Exchange, Reddit, or Wikipedia.

Nathan Lambert (01:11:56) You can sample small things from them, train small models on each of these mixes, and measure their performance on your evaluations. And you can just do basic linear regression, and it’s like, “Here’s your optimal dataset.” But if your evaluations change, your dataset changes a lot. So a lot of OLMo 3 was adding new sources for reasoning to be better at math and code, and then you do this mixing procedure and it gives you the answer. I think a lot of that’s happened at labs this year; there are new hot things, whether it’s coding environments or web navigation, and you just need to bring in new data and change your whole pre-training so that your post-training can work better. And that’s like the constant re-evolution and the re-determining of what they care about for their models.
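A toy version of that mixing procedure: train small proxy models on different source mixes, record an eval score for each, and fit a regression from mixture weights to score. All numbers are fabricated just to show the shape of the method.

```python
import numpy as np

# Toy data-mixing regression: rows are pre-training mixes (fractions of each source),
# y is the proxy model's eval score for that mix. All numbers fabricated.

sources = ["web", "code", "math", "wiki"]
mixes = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.40, 0.10, 0.10],
    [0.40, 0.10, 0.40, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.45, 0.35, 0.10],
])
scores = np.array([0.52, 0.61, 0.60, 0.63, 0.66])   # eval score of a small model per mix

# Least-squares fit: score ~= mixes @ weights. A higher weight = a more helpful source.
weights, *_ = np.linalg.lstsq(mixes, scores, rcond=None)
for name, w in zip(sources, weights):
    print(f"{name:>4}: {w:.2f}")
# A real pipeline would add constraints (fractions sum to 1, data availability caps)
# and re-run the whole procedure whenever the target evaluations change.
```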

Lex Fridman (01:12:35) Are there fun anecdotes of what sources of data are particularly high quality that we wouldn’t expect? You mentioned Reddit sometimes can be a source.

Nathan Lambert (01:12:45) Reddit was very useful. I think that PDFs are definitely one.

Sebastian Raschka (01:12:51) Oh, especially arXiv.

Nathan Lambert (01:12:52) Yeah, AI2 has run Semantic Scholar for a long time, which is a competitor to Google Scholar with a lot more features. To do this, AI2 has found and scraped a lot of PDFs for openly accessible papers that might not be behind a certain publisher’s paywall—truly open scientific PDFs. If you sit on all of these and process them, you can get value out of it. I think a lot of that style of work has been done by the frontier labs much earlier. You need to have a pretty skilled researcher who understands how things change models, and they bring it in and clean it; it’s a lot of labor.

Nathan Lambert (01:13:34) I think at a lot of frontier labs, when they scale researchers, a lot more goes into data. If you join a frontier lab and you want to have impact, the best way to do it is just find new data that’s better. The fancy, glamorous algorithmic things, like figuring out how to make o1, are the sexiest thought for a scientist. It’s like, “Oh, I figured out how to scale RL.” There’s a group that did that, but I think most of the contributions are-

Nathan Lambert (01:13:58) … “I’m gonna make the data better,” or, “I’m gonna make the infrastructure better so that everybody on my team can run experiments 5% faster.”

Sebastian Raschka (01:14:04) At the same time, I think it’s also one of the closest guarded secrets—what your training data is—for legal reasons. And so there’s also a lot of work that goes into hiding what your training data was, essentially, trying to get the model to not give away the sources because of those legal reasons.

Nathan Lambert (01:14:19) The other thing, to be complete, is that some people are trying to train on only licensed data, whereas Common Crawl is a scrape of the whole internet. If I host multiple websites, I’m happy to have them train language models, but I haven’t explicitly licensed them or said what governs their use. Therefore, Common Crawl is largely unlicensed, which means consent really hasn’t been provided for how to use the data. There’s another idea where you can train language models only on data that has been licensed explicitly, so that a governing contract is provided. I’m not sure if Apertus is the copyright thing or the license thing. I know that the reason they did it was for an EU compliance thing, where they wanted to make sure that their model fit one of those checks.

Sebastian Raschka (01:15:01) Mm-hmm. And on that note, there’s also the distinction between the licensing. Some people, like you said, just purchase the license. Let’s say they buy an Amazon Kindle book or a Manning book, and then use that in the training data; that is a gray zone because you paid for the content and you might want to train on it. But then there are also restrictions where even that shouldn’t be allowed. That is where it gets a bit fuzzy.

Sebastian Raschka (01:15:28) And I think that is still a hot topic right now. Big companies like OpenAI approached private companies for their proprietary data, and private companies are becoming more and more protective of their data because they know, “Okay, this is going to be my moat in a few years.” And I do think that’s the interesting question. If LLMs become more commoditized, and a lot of people learn about LLMs, there will be a lot more people able to train them. Of course, there are infrastructure challenges.

Sebastian Raschka (01:16:00) But if you think of big industries like pharmaceuticals, law, or finance, I do think they at some point will hire people from other frontier labs to build their in-house models on their proprietary data, which will be another unlock with pre-training that is currently not there. Because even if you wanted to, you can’t get that data—you can’t get access to clinical trials most of the time and these types of things. So I do think scaling in that sense might still be pretty much alive if you look at domain-specific applications, because right now we are just looking at general-purpose LLMs like ChatGPT, Anthropic, and so forth. They are just general purpose. They’re not even scratching the surface of what an LLM can do if it is really specifically trained and designed for a specific task.

Nathan Lambert (01:16:47) I think on the data thing, this is one of the things where, like, this happened in 2025 and we totally forget it: Anthropic lost in court and owed $1.5 billion to authors. Anthropic, I think, bought thousands of books and scanned them and was cleared legally for that because they bought the books, and that is going through the system. And then on the other side, they also torrented some books, and I think this torrenting was the path where the court said they were then culpable to pay these billions of dollars to authors, which is just such a mind-boggling lawsuit that kind of just came and went. Like, that is so much money from the VC ecosystem.

Lex Fridman (01:17:22) These are court cases that will define the future of human civilization because it’s clear that data drives a lot of this, and there’s this very complicated human tension. I mean, you can empathize. You’re both authors. And there’s some degree to which, I mean, you put your heart and soul and your sweat and tears into the writing that you do. It feels a little bit like theft for somebody to train on your data without giving you credit.

Sebastian Raschka (01:17:49) And there are, like Nathan said, also two layers to it. Someone might buy the book and then train on it, which could be argued fair or not fair, but then there are the straight-up companies who use pirated books where they’re not even compensating the author. That is, I think, where people got a bit angry about it specifically, I would say.

Lex Fridman (01:18:06) Yeah, but there has to be some kind of compensation scheme. This is moving towards something like what Spotify streaming originally did for music. You know, what does that compensation look like? You have to define those kinds of models. You have to think through all of that. One other thing I think people are generally curious about, and I’d love to get your thoughts: as LLMs are used more and more, if you look at even arXiv or GitHub, more and more of the data is generated by LLMs. What do you do in that kind of world? How big of a problem is that?

Nathan Lambert (01:18:38) The largest problem is the infrastructure and systems, but from an AI point of view, it’s kind of inevitable.

Lex Fridman (01:18:45) So it’s basically LLM-generated data that’s curated by humans essentially, right?

Nathan Lambert (01:18:49) Yes, and I think that a lot of open source contributors are legitimately burning out. If you have a popular open source repo, somebody’s like, “Oh, I want to do open source AI. It’s good for my career,” and they just vibe code something and throw it in. You might get more of this than I do.

Sebastian Raschka (01:19:05) Yeah, so I actually have a case study here. I have a repository called mlxtend that I developed as a student, around 10 or 15 years ago, and it is still a reasonably popular library for certain algorithms, especially frequent pattern mining. There were recently two or three people who submitted a lot of PRs in a very short amount of time. I do think LLMs have been involved in submitting these PRs. For me, as the maintainer, there are two things. First, I’m a bit overwhelmed; I don’t have time to read through it because, especially since it’s an older library, that is not a priority for me. At the same time, I kind of also appreciate it because I think something people forget is it’s not just using the LLM.

Sebastian Raschka (01:19:46) There’s still a human layer that verifies something, and that is in a sense also how data is labeled, right? One of the most expensive things is getting labeled data for RLHF (Reinforcement Learning from Human Feedback) phases. This is kind of like that, where it goes through phases and then you actually get higher quality data out of it. So I don’t mind it, in a sense. It can feel overwhelming, but I do think there is also value in it.

Lex Fridman (01:20:11) It feels like there’s a fundamental difference between raw LLM-generated data and LLM-generated data with a human in the loop that does some kind of verification, even if that verification only covers a small percentage of the lines of code.

Sebastian Raschka (01:20:25) I think this goes with anything where people think, “Oh, yeah. I can just use an LLM to learn about XYZ,” which is true. You can, but there might be a person who is an expert who might have used an LLM to write specific code. There is this human work that went into it to make it nice and throw out the not-so-nice parts, to pre-digest it for you, and that saves you time. And I think that’s the value-add where you have someone filtering things or even using the LLMs correctly. I think this is still labor that you get for free, for example, when you read a Substack article.

Sebastian Raschka (01:21:05) I could maybe ask an LLM to give me opinions on that, but I wouldn’t even know what to ask. And I think there is still value in reading that article compared to me going to the LLM because you are the expert. You select what knowledge is actually spot on and should be included, and you give me this executive summary. This is a huge value-add because now I don’t have to waste three to five hours to go through this myself and maybe get some incorrect information. And so I think that’s also where the future still is for writers, even though there are LLMs that can save you time.

Lex Fridman (01:21:43) It’s kind of fascinating to actually watch—and I’m sure you guys do this, but for me to look at the difference between a summary and the original content. Even if it’s a page-long summary of page-long content, it’s interesting to see how the LLM-based summary takes the edge off. What is the signal it removes from the thing?

Nathan Lambert (01:22:07) The voice is what I talk about a lot.

Lex Fridman (01:22:09) Voice? Well, voice… I would love to hear what you mean by voice, that’s really powerful, but sometimes there are literally insights. Like, in removing an insight, you’re actually fundamentally changing the meaning of the thing. So I’m continuously disappointed by how bad LLMs are at really getting to the core insights, which is what a great summary does. Even when I use extensive, extremely elaborate prompts where I’m really trying to dig for the insights, it’s still not quite there, which… I mean, that’s a whole deep philosophical question about what human knowledge and wisdom are and what it means to be insightful. But when you talk about the voice, what do you mean?

Nathan Lambert (01:22:52) So when I write, I think a lot of what I’m trying to do is take what you think as a researcher, which is very raw. A researcher is trying to encapsulate an idea at the frontier of their understanding, and they’re trying to put what is a feeling into words. And I think that in my writing, I try to do this, which makes it come across as raw but also high-information in a way that some people will get and some won’t. And that’s kind of the nature of research. And I think this is something that language models don’t do well. Particularly, they’re all trained with this reinforcement learning from human feedback which is designed to take feedback from a lot of people and, in a way, average how the model behaves from this.

Nathan Lambert (01:23:30) And I think that it’s going to be hard for a model to be very incisive when there’s that sort of filter in it. This is a wonderful fundamental problem for researchers in RLHF: this provides so much utility in making the models better, but also the problem formulation has this knot in it that you can’t get past. These language models don’t have this prior in their deep expression that they’re trying to get at. I don’t think it’s impossible to do. I think there are stories of models that really shock people. Like, I would love to have tried Bing Sydney—did that have more voice? Because it would so often go off the rails on people and affect them…

Nathan Lambert (01:24:13) And in what is, historically, obviously a scary way—like telling a reporter to leave his wife—that’s a crazy model to potentially put into general adoption. But that’s kind of the trade-off: is this RLHF process, in some ways, adding limitations?

Lex Fridman (01:24:28) That’s a terrifying place to be as one of these frontier labs and companies because millions of people are using them.

Nathan Lambert (01:24:35) There was a lot of backlash last year with GPT-4o getting removed. I’ve personally never used the model, but I’ve talked to people at OpenAI who get emails from users that might be detecting subtle differences in the deployments in the middle of the night. And they email them and say, “My friend is different.” They find these employees’ emails and send them things because they are so attached to what is a set of model weights and a configuration that is deployed to the users. We see this with TikTok. I don’t use TikTok, but supposedly, in five minutes, the algorithm gets you. It’s locked in. And those are language models doing recommendations.

Nathan Lambert (01:25:15) Like, I think there are ways that you can do this with a language model where, within five minutes of chatting with it, the model just gets you. And that is something that people aren’t really ready for. I think that—don’t give that to kids. Don’t give that to kids, at least until we know what’s happening.

Lex Fridman (01:25:30) But there’s also going to be this mechanism… What’s going to happen with these LLMs as they’re used more and more… Unfortunately, the nature of the human condition is such that people commit suicide. And what journalists will do is report extensively on the people who commit suicide, and they will very likely link it to the LLMs because they have that data about the conversations. If you’re really struggling, if you’re depressed, if you’re thinking about suicide, you’re probably going to talk to LLMs about it. And so what journalists will do is say, “The suicide was committed because of the LLM.” And that’s going to lead to the companies, because of legal issues and so on, more and more taking the edge off of the LLM.

Lex Fridman (01:26:13) So it’s going to be as generic as possible. It’s so difficult to operate in this space because, of course, you don’t want an LLM to cause harm to humans at that level, but also, this is the nature of the human experience—to have a rich conversation, a fulfilling conversation, one that challenges you and from which you grow. You need that edge. And that’s something extremely difficult for AI researchers on the RLHF front to actually have to solve because you’re actually dealing with the human condition.

Nathan Lambert (01:26:47) A lot of researchers at these companies are so well-motivated. Anthropic and OpenAI culturally want so much to do good for the world through this. And it’s such a… I’m like, “Ooh, I don’t want to work on this,” because, on the one hand, a lot of people see AI as a health ally, as somebody they can talk to about their health confidentially, but then it bleeds all the way into talking about mental health. It’s heartbreaking that this might be the thing where somebody goes over the edge, but other people might be saved. And there are things where, as a researcher training models, it’s like, I don’t want to train image generation models and release them openly because I don’t want to enable somebody to have a tool on their laptop that can harm other people.

Nathan Lambert (01:27:34) I don’t have the infrastructure in my company to do that safely. There are a lot of areas like this where it needs people who will approach it with complexity and the conviction that it’s just such a hard problem.

Lex Fridman (01:27:47) But also, we as a society and as users of these technologies need to make sure that we’re having the complicated conversation about it versus just fearmongering that big tech is causing harm to humans or stealing your data. It’s more complicated than that. And you’re right, there’s a very large number of people inside these companies, many of whom you know and many of whom I know, that deeply care about helping people. They are considering the full human experience of people from across the world, not just Silicon Valley—what their needs are and what that means. It’s really difficult to design this one system that is able to help all these different kinds of people across different age groups, cultures, and mental conditions.

Nathan Lambert (01:28:31) I wish that the timing of AI was different regarding the relationship of big tech to the average person. Big tech’s reputation is so low, and because AI is so expensive, it’s inevitably going to be a big tech thing. It takes so many resources, and people say the US is, quote-unquote, “betting the economy on AI” with this build-out. To have these be intertwined at the same time makes for such a hard communication environment. It would be good for me to go talk to more people in the world who hate big tech and see AI as a continuation of that.

Lex Fridman (01:29:02) One of the things you actually recommend, one of the antidotes that you talk about, is to find agency in this whole system, as opposed to sitting back in a powerless way and consuming the AI slop as it rapidly takes over the internet. Find agency by using AI to build things—build apps, build… One, that actually helps you build intuition, but two, it’s empowering because you can understand how it works and what the weaknesses are. It gives your voice power to say, “This is bad use of the technology, and this is good use of technology.” You’re more plugged into the system then, so you can understand it better and steer it better as a consumer.

Sebastian Raschka (01:29:48) I think that’s a good point you brought up about agency. Instead of ignoring it and saying, “Okay, I’m not going to use it,” I think it’s probably long-term healthier to say, “Okay, it’s out there. I can’t put it back.” It’s like the internet and computers when they first came out. How do I make the best use of it, and how does it help me up-level myself? The one thing I worry about here, though, is if you just fully use it for something you love to do, the thing you love to do is no longer there. That could potentially lead to burnout. For example, if I use an LLM to do all my coding for me, now there’s no coding; I’m just managing something that is coding for me.

Sebastian Raschka (01:30:24) Two years later, let’s say, if I just do that eight hours a day—having something code for me—do I still feel fulfilled? Is this hurting me in terms of being excited about my job and what I’m doing? Am I still proud to build something?

Lex Fridman (01:30:43) On that topic of enjoyment, it’s quite interesting. We should just throw this in there, that there’s this recent survey of 791 professional developers—professional meaning 10-plus years of experience.

Nathan Lambert (01:30:55) That’s a long time. As a junior developer?

Lex Fridman (01:31:01) Yeah, in this day and age. The results are surprising on many fronts. They break it down by junior and senior developers, and it shows that both groups use AI-generated code in the code they ship. This is not just for fun or learning; this is code they ship. Most of them use it for around 50% or more. What’s interesting is that for the category where over 50% of the shipped code is AI-generated, senior developers are much more likely to do so. But you don’t want AI to take away the thing you love. I think this speaks to my experience. These particular results show that about 80% of people find it either somewhat more enjoyable or significantly more enjoyable to use AI as part of their work.

Sebastian Raschka (01:31:59) I think it depends on the task. From my personal usage, for example, I have a website where I sometimes tweak things. I personally don’t enjoy this, so if the AI can help me implement something on my website, I’m all for it. It’s great. But at the same time, when I solve a complex problem—if there’s a bug, and I hunt this bug and find it—it’s the best feeling in the world. You get so much joy. But now, if you don’t even think about the bug and just go directly to the LLM, you never have that kind of feeling, right?

Sebastian Raschka (01:32:38) But then there could be a middle ground where you try it yourself, you can’t find it, you use the LLM, and then you don’t get frustrated because it helps you move on to something that you enjoy. Looking at these statistics, what is not factored in is that it’s averaging over all different scenarios. We don’t know if it’s for the core task or for something mundane that people would not have enjoyed otherwise. In a sense, AI is really great for doing mundane things that take a lot of work.

Sebastian Raschka (01:33:09) For example, my wife has a podcast for book club discussions, and she was transferring the show notes from Spotify to YouTube, and the links somehow broke. She had some episodes with 100 links or something, and it would have been really painful to go in there and fix each link manually. So I suggested, “Hey, let’s try ChatGPT.” We copied the text into ChatGPT, and it fixed them. Instead of two hours going from link to link, it made that work seamless. I think everyone has a use case where AI is useful for something like that—something that would be really boring and mundane.

Lex Fridman (01:33:51) For me personally, since we’re talking about coding, a lot of the enjoyment comes from the Cursor side—the Claude Code side—where I have a pair programmer. It’s less lonely. You made debugging sound like this great joy. No, I would say debugging is like a drink of water after you’ve been going through a desert for days. You skip the whole desert part where you’re suffering. Sometimes it’s nice to have a friend who can’t really find the bug, but can give you some intuition about the code, and together you go through the desert and find that drink of water. For me, maybe it speaks to the loneliness of the programming experience. That is a source of joy.

Sebastian Raschka (01:34:48) It’s maybe also related to delayed gratification. I’m a person who even as a kid liked the idea of Christmas presents better than actually getting them. I would look forward to the day, but then it’s over and I’m disappointed. Maybe it’s like food—it tastes better when you’re really hungry. With debugging, it’s not always great; it’s often frustrating, but if you can solve it, then it’s great. But there’s also a Goldilocks zone where if it’s too hard, then you’re wasting your time. I think another challenge, though, is: how will people learn?

Sebastian Raschka (01:35:33) The chart we looked at showed that more senior developers are shipping AI-generated code than the junior ones. I think it’s interesting because intuitively you would think it’s the junior developers because they don’t know how to do the thing yet. It could mean the AI is not good enough yet to solve those tasks, but it could also mean experts are more effective at using it—they know how to review the code and they trust it more. One issue in society in the future will be: how do you become an expert if you never try to do the thing yourself?

Sebastian Raschka (01:36:12) I learned by trying things myself. With math textbooks, if you look at the solutions, you learn something, but you learn better if you try first and then appreciate the solution, because you know how to put it into your mental framework. If LLMs are there all the time, would you actually go to the lengths of struggling? Would you be willing to struggle? Struggle is not nice, but if you use the LLM to do everything, at some point you will never really take the next step, and you won’t get that unlock that you get as an expert using an LLM.

Sebastian Raschka (01:36:53) So, I think there’s a Goldilocks sweet spot where maybe the trick is you make dedicated offline time where you study two hours a day, and the rest of the day you use LLMs. I think it’s important for people to still invest in themselves, in my opinion, and not just LLM everything.

Post-training explained: Exciting new research directions in LLMs

Lex Fridman (01:37:10) Yeah, there is a sense that we, together as a civilization, each individually have to find that Goldilocks zone. And in the programming context as developers. Now, we’ve had this fascinating conversation that started with pre-training and mid-training. Let’s get to post-training. There’s a lot of fun stuff in post-training. So, what are some of the interesting ideas in post-training?

Nathan Lambert (01:37:31) The biggest one from 2025 is learning this reinforcement learning with verifiable rewards, RLVR. You can scale up the training there, which means doing a lot of this kind of iterative generate-grade loop, and that lets the models learn both interesting behaviors on the tool use and software side. This could be searching, running commands on their own and seeing the outputs, and then also that training enables this inference-time scaling very nicely. It just turned out that this paradigm was very nicely linked, where this kind of RL training enables inference-time scaling. But inference-time scaling could have been found in different ways. So, it was kind of this perfect storm where the models change a lot, and the way that they’re trained is a major factor in doing so.

Nathan Lambert (01:38:15) And this has changed how people approach post-training dramatically.

Lex Fridman (01:38:20) Can you describe RLVR, popularized by DeepSeek R1? Can you describe how it works?

Nathan Lambert (01:38:25) Yeah. Fun fact, I was on the team that came up with the term RLVR, which is from our Tulu 3 work before DeepSeek. We don’t take a lot of credit for being the people who popularized scaling RL, but, as an aside, much of the fun academics get is the ability to name and influence—

Nathan Lambert (01:38:43) —the discourse, because the closed labs can only say so much. One of the things you can do as an academic is, while you might not have the compute to train the model, you can frame things in a way that ends up being… I describe it as: a community can come together around this RLVR term, which is very fun. And then DeepSeek are the people who made the training breakthrough, which is that they scaled the reinforcement learning. They have the model generate answers, grade whether each completion was right, and then that accuracy is your reward for reinforcement learning. So reinforcement learning is classically an agent that acts in an environment, and the environment gives it a state and a reward back, and you try to maximize this reward.

Nathan Lambert (01:39:26) In the case of language models, the reward is normally accuracy on a set of verifiable tasks, whether it’s math problems or coding tasks. It starts to get blurry with things like factual domains, which are also, in some ways, verifiable, as are constraints on your instruction, like ‘respond only with words that start with A.’ All of these things are verifiable in some way. The core idea is you find a lot more of these problems that are verifiable and you let the model try them many times while taking these RL gradient updates. The infrastructure evolved from reinforcement learning from human feedback, RLHF, where in that era, the score they were trying to optimize was a learned reward model of aggregate human preferences.

Nathan Lambert (01:40:13) So you kind of changed the problem domains and that let the optimization go on to much bigger scales, which kind of kickstarted a major change in what the models can do and how people use them.
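
A minimal sketch of the generate-and-grade loop described above may help make it concrete. The convention of putting the final answer inside `\boxed{...}`, and the `sample_from_policy` helper, are hypothetical stand-ins, not any lab's actual recipe.

```python
# Minimal sketch of a verifiable reward for RLVR (illustrative, not a lab's recipe).
# Assumes the policy is asked to put its final answer inside \boxed{...}; that
# convention, and the sample_from_policy() helper, are hypothetical stand-ins.
import re

def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Return 1.0 if the model's boxed final answer matches the reference, else 0.0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0

def rollout_rewards(prompt: str, reference_answer: str, sample_from_policy, k: int = 8):
    """Sample k completions for one problem and grade each one.

    These per-completion rewards are what the RL gradient update (e.g., GRPO or PPO)
    then consumes.
    """
    completions = [sample_from_policy(prompt) for _ in range(k)]
    return [verifiable_reward(c, reference_answer) for c in completions]
```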

Lex Fridman (01:40:24) What kind of domains is RLVR amenable to?

Nathan Lambert (01:40:28) Math and code are the famous ones, and then there’s a lot of work on what are called rubrics, which is related to a term people might have heard: LLM-as-a-judge. I’ll have a set of problems in my training dataset, and for each problem, I will ask another language model, “What would a good answer to this problem look like?” Then you can try the problem over and over again and assign a score based on this rubric. So that’s not necessarily verifiable like a math or code domain, but this rubrics idea—and other scientific problems where it might be a little bit more vague—is where a lot of the attention is. They’re trying to push this set of methods into these more open-ended domains so the models can learn a lot more.
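
A rough sketch of the rubric idea: a judge model first writes down what a good answer should contain, and attempts are then scored against that rubric and used as the reward. The `call_judge_llm` helper, the prompt wording, and the 0-to-10 scale are assumptions made for illustration only.

```python
# Minimal sketch of rubric-based grading (LLM-as-a-judge), per the description above.
# `call_judge_llm(prompt) -> str` is a hypothetical helper wrapping whatever judge
# model you have; the prompt wording and 0-10 scale are assumptions.

def make_rubric(problem: str, call_judge_llm) -> str:
    """Ask the judge model what a good answer to this problem would look like."""
    return call_judge_llm(
        f"Problem:\n{problem}\n\n"
        "Describe, as a short rubric, what a good answer should contain."
    )

def rubric_reward(problem: str, answer: str, rubric: str, call_judge_llm) -> float:
    """Score an attempted answer against the rubric; usable as an RL reward in [0, 1]."""
    reply = call_judge_llm(
        f"Problem:\n{problem}\n\nRubric:\n{rubric}\n\nAnswer:\n{answer}\n\n"
        "On a scale of 0 to 10, how well does the answer satisfy the rubric? "
        "Reply with a single number."
    )
    try:
        return max(0.0, min(10.0, float(reply.strip()))) / 10.0
    except ValueError:
        return 0.0
```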

Sebastian Raschka (01:41:11) I think that’s called reinforcement learning with AI feedback, right?

Nathan Lambert (01:41:14) That’s the older term from it that was coined in Anthropic’s Constitutional AI paper. So a lot of these things come in cycles.

Sebastian Raschka (01:41:21) Also, just one step back for the RLVR. I think the interesting, beautiful thing here is that you ask the LLM a math question, you know the correct answer, and you let the LLM figure it out, but how it does it is… I mean, you don’t really constrain it much. There are some constraints you can add, like ‘use the same language’ or ‘don’t switch between Spanish and English.’ But let’s say you’re pretty much hands-off.

Sebastian Raschka (01:41:44) You only give the question and the answer, and then the LLM has the task to arrive at the right answer. But the beautiful thing here is what happens in practice: the LLM will do a step-by-step description, like how a student or a mathematician would derive the solution. It will use those steps and that helps the model to improve its own accuracy. And then, like you said, the inference scaling. Inference scaling loosely means spending more compute while using the LLM during inference, and here the inference scaling is that the model would use more tokens. In the DeepSeek R1 paper, they showed the longer they train the model, the longer the responses are.

Sebastian Raschka (01:42:28) They grow over time. They use more tokens, so it becomes more expensive for simple tasks, but these explanations help the model with accuracy. There are also a lot of papers showing that what the model explains does not necessarily have to be correct, or maybe it’s even unrelated to the answer, but for some reason, the fact that it is explaining still helps the model. And I think it’s also—again, I don’t want to anthropomorphize these LLMs—but it’s kind of like how we humans operate, right? If there’s a complex math problem in a math class, you usually have a piece of note paper and you do it step by step. You cross things out.

Sebastian Raschka (01:43:03) And the model also self-corrects, and that was, I think, the aha moment in the DeepSeek R1 paper. They called it the ‘aha moment’ because the model itself recognized it made a mistake and then said, “Ah, I did something wrong, let me try again.” I think that’s just so cool that this falls out of just giving it the correct answer and having it figure out how to do it—that it kind of does, in a sense, what a human would do. Although LLMs don’t think like humans, it’s a kind of interesting coincidence. And the nice side effect is it’s great for us humans to see these steps. It builds trust, and we can learn or double-check things.

Nathan Lambert (01:43:40) There’s a lot in here. There’s been a lot of debate about the language models this year—I think these aha moments are kind of fake, because in pre-training you have essentially seen the whole internet. So you have definitely seen people explaining their work, even verbally, like a transcript of a math lecture: “You try this, oh, I messed this up.” And what reinforcement learning—this RLVR—is very good at doing is amplifying these behaviors, because they’re very useful in enabling the model to think longer and to check its work. I agree that it is very beautiful that, with this training, the model learns to amplify this in a way that is just so useful for the final answers being better.

Sebastian Raschka (01:44:16) I can give you also a hands-on example. I was training the Qwen 3 base model with RLVR on MATH-500. The base model had an accuracy of about 15%. Just 50 steps, like in a few minutes with RLVR, the model went from 15% to 50% accuracy. And you can’t tell me it’s learning anything fundamentally about math in—

Nathan Lambert (01:44:38) The Qwen example is weird because there have been two papers this year, one of which I was on, that talk about data contamination in Qwen—specifically in this special mid-training phase, which we can chime in on for a minute because it’s weird—where they train on problems that are almost identical to MATH.

Sebastian Raschka (01:44:53) Exactly. And so you can see that basically the RL is not teaching the model any new knowledge about math. You can’t do that in 50 steps. So the knowledge is already there in the pre-training; you’re just unlocking it.

Nathan Lambert (01:45:03) I still disagree with the premise because there are a lot of weird complexities that you can’t prove. One of the things that points to weirdness is that if you take the Qwen 3 so-called base model—you could Google “math dataset Hugging Face” and take a problem—and put it into Qwen 3 base… all these math problems have words, so it would be like, “Alice has five apples and gives three to whoever,” and there are these word problems. With these Qwen base models, the reason people are suspicious of them is that if you change the numbers but keep the words, Qwen will produce, without tools, a very high-accuracy decimal representation—

Nathan Lambert (01:45:43) —of the answer, which means at some point it was shown problems that were almost identical to the test set, and it was using tools to get a very high-precision answer. But a language model without tools will never actually have this. So it’s been this big debate in the research community: how much of these reinforcement learning papers that are training on Qwen and measuring specifically on this math benchmark—where there have been multiple papers talking about contamination—how much can you believe them? I think this is what caused the reputation of RLVR being about formatting, because you can get these gains so quickly and therefore it must already be in the model. But there’s a lot of complexity here. It’s not really controlled experimentation, so we don’t really know.

Sebastian Raschka (01:46:26) But if it weren’t true, I would say distillation wouldn’t work, right? Distillation can work to some extent, but the biggest problem—and I’m researching this contamination—is we don’t know what’s in the data. Unless you have a new dataset, it is really impossible. Even something simpler like MMLU, which is a multiple-choice benchmark—if you just change the format slightly, like using a dot instead of a parenthesis, the model accuracy will vastly differ.

Nathan Lambert (01:47:04) I think that that could be like a model issue rather than a general issue.

Sebastian Raschka (01:47:09) It’s not even malicious by the developers of the LLM, like, “Hey, we want to cheat at that benchmark.” It’s just it has seen something at some point. I think the only fair way to evaluate an LLM is to have a new benchmark that is after the cutoff date when the model was deployed.

Lex Fridman (01:47:22) Can we lay out what would be the recipe of all the things that go into post-training? And you mentioned RLVR was a really exciting, effective thing. Maybe we should elaborate. RLHF still has a really important component to play. What kind of other ideas are there on post-training?

Nathan Lambert (01:47:40) I think you can take this in order. You could view it as what made o1, which is this first reasoning model, possible. You’re going to have similar interventions where you start with mid-training. The thing that is rumored to enable o1 and similar models is really careful data curation where you’re providing a broad set of what is called reasoning traces. This is just the model generating words in a forward process that reflects breaking down a problem into intermediate steps and trying to solve them. So at mid-training, you need to have data similar to this so that when you move into post-training, primarily with these verifiable rewards, it can learn.

Nathan Lambert (01:48:27) And then what is happening today is you’re figuring out which problems to give the model, how long you can train it for, and how much inference you can enable the model to use when solving these verifiable problems. As models get better, certain problems are no longer useful; the model will solve them 100% of the time, and therefore there’s very little signal. The GRPO equation is famous for this, because essentially the reward given to the agent is based on how good a given action—a completion—is relative to the other answers to that same problem. So if all the attempts at a problem get the same score, there’s no signal in these types of algorithms.
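
A minimal sketch of the group-relative part of GRPO just described: each completion's reward is compared to the mean (and standard deviation) of the other samples for the same prompt, so a problem where every sample gets the same score contributes essentially zero learning signal. The real objective also includes the PPO-style clipped ratio and a KL term, which are omitted here.

```python
# Minimal sketch of GRPO-style group-relative advantages (simplified; the full
# objective also has the clipped importance ratio and a KL penalty).
import numpy as np

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> np.ndarray:
    """Advantage of each completion relative to the other samples of the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Mixed outcomes on one problem -> useful signal:
print(group_relative_advantages([1, 0, 0, 1]))   # positive for correct, negative for wrong
# Every sample solved (or every sample failed) -> all advantages ~0, i.e., no signal:
print(group_relative_advantages([1, 1, 1, 1]))   # ~[0, 0, 0, 0]
```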

Nathan Lambert (01:49:09) So what they’re doing is finding harder problems, which is why you hear about things like scientific domains, which are so hard to get anything right in. If you have a lab or something, it just generates so many tokens, or much harder software problems. The frontier models are all pushing into these harder domains where they can train on more problems and the model will learn more skills at once. The RLHF link to this is that RLHF has been, and still is, the finishing touch on the models, where it makes them more useful by improving the organization, style, or tone.

Nathan Lambert (01:49:42) There are different things that resonate with different audiences. Some people like a really quirky model, and RLHF could be good at enabling that personality, and some people hate the markdown bulleted list thing that the models do, but it’s actually really good for quickly parsing information. This human feedback stage is really great for putting this into the model at the end of the day. It’s what made ChatGPT so magical for people. And that use has actually remained fairly stable. This formatting can also help the models get better at math problems, for example.

Nathan Lambert (01:50:17) The border between style and formatting and the method that you use to answer a problem are actually very closely linked when you’re training these models. RLHF can still make a model better at math, but these verifiable domains are a much more direct process for doing this because it makes more sense with the problem formulation. To summarize: mid-training gives the model the skills it needs to learn; RL with verifiable rewards lets the model try many times, putting a lot of compute into trial-and-error learning across hard problems; and then RLHF finishes the model, making it easy to use and rounding it out.

Lex Fridman (01:51:02) Can you comment on the amount of compute required for RLVR?

Nathan Lambert (01:51:06) It’s only gone up and up. I think Grok 4 was famous for saying they used a similar amount of compute for pre-training and post-training. Back to the scaling discussion, they involve very different hardware for scaling. Pre-training is very compute-bound, which is like the FLOPS discussion: how many matrix multiplications can you get through in a given amount of time. Because in RL you’re generating these answers and trying the model in real-world environments, it ends up being much more memory-bound. You’re generating long sequences, and the attention mechanisms have a behavior where you get a quadratic increase in memory as you get to longer sequences. So the compute becomes very different.

Nathan Lambert (01:51:44) In pre-training, we would talk about a model—if we go back to the Biden administration executive order—it’s like 10 to the 25th FLOPS to train a model. If you’re using FLOPS in post-training, it’s a lot weirder because the reality is just how many GPUs you are allocating for how many hours. In terms of time, the RL compute is getting much closer because you just can’t put it all into one system. Pre-training is so computationally dense, where all the GPUs are talking to each other and it’s extremely efficient, whereas RL has all these moving parts and it can take a long time to generate a sequence of a hundred thousand tokens.

Nathan Lambert (01:52:17) If you think about Gemini 3 Pro taking an hour, what if your training run has to sample for an hour? You have to make sure that’s handled efficiently. So in GPU hours or wall-clock hours, the RL runs are probably approaching the same number of days as pre-training, but they probably aren’t using as many GPUs at the same time. There are rules of thumb in labs where you don’t want your pre-training runs to last more than a month because they fail catastrophically. If you are planning a huge cluster to be held for two months and then it fails on day 50, the opportunity costs are just so big.

Nathan Lambert (01:52:54) People don’t want to put all their eggs in one basket. GPT-4 was like the ultimate YOLO run, and nobody ever wanted to do it before where it took three months to train and everybody was shocked that it worked. I think people are a little bit more cautious and incremental now.
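
For a rough sense of scale behind the 10-to-the-25th FLOPS figure mentioned a moment ago, a common rule of thumb from the scaling-laws literature is that pre-training compute is roughly six FLOPs per parameter per training token. The model size and token count below are purely illustrative, not any specific model:

$$
C \approx 6\,N\,D, \qquad \text{e.g. } N = 2\times10^{11}\ \text{parameters},\ D = 10^{13}\ \text{tokens} \;\Rightarrow\; C \approx 1.2\times10^{25}\ \text{FLOPs}.
$$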

Sebastian Raschka (01:53:07) So with RLVR there’s less of a limit on how much you can train and still get benefit, whereas RLHF, because it’s preference tuning, reaches a certain point where it doesn’t really make sense to spend more budget on it. To take a step back with preference tuning: multiple people can give multiple explanations for the same thing and both can be correct, but at some point, you learn a certain style and it doesn’t make sense to iterate on it. My favorite example is when relatives ask me what laptop they should buy. I give them an explanation or ask about their use case, and they might prioritize battery life and storage.

Sebastian Raschka (01:53:46) Other people, like us, would prioritize RAM and compute. Both answers are correct, but different people require different answers. With preference tuning, you are trying to average somehow; you are asking the data labelers to give you the preferred answer and then you train on that. But at some point, you learn that average preferred answer, and there’s no reason to keep training longer on it because it’s just a style. With RLVR, you let the model solve more and more complex, difficult problems. So I think it makes more sense to allocate more budget long-term to RLVR.

Sebastian Raschka (01:54:27) Right now, we are in an RLVR 1.0 phase where it’s still that simple thing where we have a question and answer, but we don’t do anything with the stuff in between. There were multiple research papers, by Google for example, on process reward models that also give scores for the explanation—how correct is the explanation? I think that will be the next thing, let’s say RLVR 2.0 for this year, focusing on the steps between question and answer and how to leverage that information to improve the explanation and accuracy. That’s one angle. And there was a DeepSeek-V3.2 paper where they also had interesting inference scaling.

Sebastian Raschka (01:55:11) Well, first, they developed models that grade the model’s own outputs, as a separate grader model. I think that will be one aspect. And the other, like Nathan mentioned, will be RLVR branching into other domains.

Nathan Lambert (01:55:23) The place where people are excited is value functions, which are pretty similar. Process reward models assign how good something is to each intermediate step in a reasoning process, whereas value functions assign a value to every token the language model generates. Both of these have been largely unproven in the language modeling and reasoning model era. People are more optimistic about value functions for whatever reason now. I think process reward models were tried a lot more in the pre-o1 era, and a lot of people had headaches with them. Value models have a very deep history in reinforcement learning.

Nathan Lambert (01:56:06) They’re one of the first things that were core to deep reinforcement learning existing—training value models. So right now the literature shows people are excited about trying value models, but there’s very little proof in it. And there are negative examples in trying to scale up process reward models.

Nathan Lambert (01:56:22) These things don’t always hold in the future. To summarize the scaling: you don’t want to do too much RLHF because of how the signal scales. People have worked on RLHF for years, especially after ChatGPT, but the first release of a reasoning model trained with RLVR, OpenAI’s o1, had a scaling plot where if you increase the training compute logarithmically, you get a linear increase in evaluations. This has been reproduced multiple times; I think DeepSeek had a plot like this. But there’s no scaling law for RLHF where if you log-increase the compute, you get linear performance.

Nathan Lambert (01:57:02) In fact, the seminal scaling paper for RLHF is about scaling laws for reward model over-optimization. That’s a big line to draw with RLVR and the methods we have now; they will follow this scaling paradigm where you can let the best runs go for an extra 10x and you get performance, but you can’t do this with RLHF. That is going to be field-defining. To do the best RLHF you might not need the extra 10 or 100x compute, but to do the best RLVR you do. There’s a seminal paper from a Meta internship called “The Art of Scaling Reinforcement Learning with Language Models.”

Nathan Lambert (01:57:47) Their framework is called ScaleRL. Their incremental experiment was like 10,000 V100 hours, which is thousands or tens of thousands of dollars per experiment, and they do a lot of them. This cost is not accessible to the average academic, which creates a hard equilibrium when trying to figure out how to learn from each community.
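
A schematic way to write the RLVR scaling behavior Nathan describes (log-scale increases in training compute yielding roughly linear gains on evaluations); this is an illustration of the claim's shape, not a fitted law:

$$
\text{eval score}(C) \;\approx\; \alpha + \beta \,\log_{10} C, \qquad C = \text{RL training compute}.
$$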

Advice for beginners on how to get into AI development & research

Lex Fridman (01:58:11) I was wondering if we could take a bit of a tangent and talk about education and learning. If you’re somebody listening to this who’s a smart person interested in programming and interested in AI, I presume building something from scratch is a good beginning. Can you just take me through what you would recommend people do?

Sebastian Raschka (01:58:32) I would personally start, like you said, by implementing a simple model from scratch that you can run on your computer. The goal of building a model from scratch is not to have something you use every day for your personal projects. It’s not going to be your personal assistant replacing an existing open-weight model or ChatGPT. It’s to see exactly what goes into the LLM, what exactly comes out of the LLM, and how pre-training works on your own computer. And then you learn about pre-training, supervised fine-tuning, and the attention mechanism.

Sebastian Raschka (01:59:03) You get a solid understanding of how things work, but at some point you will reach a limit because smaller models can only do so much. The problem with learning about LLMs at scale is that it’s exponentially more complex to make a larger model, because it’s not just that the model becomes larger. You have to think about sharding your parameters across multiple GPUs. Even for the KV cache, there are multiple ways you can implement it. One is written just to understand how it works—a cache you grow step by step by concatenating—but that wouldn’t be optimal on GPUs. You would pre-allocate a tensor and then fill it in. But that adds another 20 or 30 lines of code.

Sebastian Raschka (01:59:45) And for each thing, you add so much code. I think the trick with the book is basically to understand how the LLM works. It’s not going to be your production-level LLM, but once you have that, you can understand the production-level LLM.
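
To make the KV-cache trade-off above concrete, here is a minimal sketch of the two styles: one that grows by concatenation (easy to read, re-allocates every step) and one that pre-allocates a buffer up to a maximum length and fills it in place. The shapes and class names are simplified illustrations; real implementations also track per-layer caches, batching, and so on.

```python
# Minimal sketch of the two KV-cache styles discussed above (shapes simplified).
import torch

class ConcatKVCache:
    """Easy-to-read version: grow the cache by concatenating at every decode step."""
    def __init__(self):
        self.k, self.v = None, None

    def update(self, k_new, v_new):  # k_new, v_new: (batch, heads, 1, head_dim)
        if self.k is None:
            self.k, self.v = k_new, v_new
        else:
            self.k = torch.cat([self.k, k_new], dim=2)  # re-allocates every step
            self.v = torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

class PreallocKVCache:
    """GPU-friendlier version: pre-allocate up to max_len and fill positions in place."""
    def __init__(self, batch, heads, max_len, head_dim, device="cpu"):
        self.k = torch.zeros(batch, heads, max_len, head_dim, device=device)
        self.v = torch.zeros_like(self.k)
        self.pos = 0

    def update(self, k_new, v_new):
        t = k_new.shape[2]
        self.k[:, :, self.pos:self.pos + t] = k_new
        self.v[:, :, self.pos:self.pos + t] = v_new
        self.pos += t
        return self.k[:, :, :self.pos], self.v[:, :, :self.pos]
```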

Lex Fridman (01:59:56) So you’re trying to always build an LLM that’s going to fit on one GPU?

Sebastian Raschka (02:00:00) Yes. Most of the examples I have fit on one GPU. I have some bonus materials on some MoE models; one or two of them may require multiple GPUs, but the goal is to have it on one GPU. And the beautiful thing is also you can self-verify. It’s almost like RLVR. When you code these from scratch, you can take an existing model from the Hugging Face Transformers library. The Hugging Face Transformers library is great, but if you want to learn about LLMs, I think that’s not the best place to start because the code is so complex. It has to fit so many use cases and some people use it in production. It has to be really sophisticated, so it’s intertwined and hard; it’s not linear to read.

Nathan Lambert (02:00:39) It started as a fine-tuning library, and then it grew to be the standard representation of every model architecture and the way it is loaded. Hugging Face is the default place to get a model, and Transformers is the software that enables it, so people can easily load a model and do something basic with it.

Sebastian Raschka (02:00:56) And all frontier labs that have open-weight models have a Hugging Face Transformers version of it, from DeepSeek to gpt-oss. That’s the canonical way that you can load them. But again, even the Transformers library is not used in production for inference. People use SGLang or vLLM, and it adds another layer of complexity.

Lex Fridman (02:01:15) We should say that the Transformers library has something like 400 models.

Sebastian Raschka (02:01:19) So it’s the one library that tries to implement a lot of LLMs, and so you have a huge codebase. It’s massive—I don’t know, maybe hundreds of thousands or millions of lines of code. Understanding the part that you want to understand is like finding the needle in the haystack. But what’s beautiful about it is you have a working implementation, so you can work backwards from it. What I would recommend doing is, if I want to understand, for example, how OLMo 3 is implemented, I would look at the weights in the model hub and the config file. You can see, “Oh, they used so many layers. They use grouped-query attention.” Then you see all the components in a human-readable 100-line config file. And then you start with your GPT-2 model and add these things.

Sebastian Raschka (02:02:06) The cool thing here is you can then load the pre-trained weights and see if they work in your model. You want to match the same output that you get with a Transformers model, and then you can use that basically as a verifiable reward to make your architecture correct. Sometimes it takes me a day. With OLMo 3, the challenge was RoPE for the position embeddings; they had a YaRN extension and there was some custom scaling there. I couldn’t quite match it at first, but in this struggle you kind of understand things. At the end, you know you have it correct because you can unit test it against the reference implementation. I think that’s one of the best ways to learn. Basically, you reverse-engineer something.
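
A minimal sketch of this verify-against-the-reference workflow: read the architecture details from the checkpoint's config, load the reference model with Transformers, and unit-test your own implementation by comparing logits. The model ID `"some-org/some-open-model"`, `MyFromScratchModel`, and `convert_weights` are hypothetical placeholders for whatever checkpoint and implementation you are working with.

```python
# Minimal sketch of checking a from-scratch implementation against the Hugging Face
# reference, as described above. The model ID and the commented-out names are
# placeholders, not real identifiers.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-open-model"           # hypothetical checkpoint
config = AutoConfig.from_pretrained(model_id)   # human-readable: layers, heads, GQA, RoPE...
print(config)

tokenizer = AutoTokenizer.from_pretrained(model_id)
reference = AutoModelForCausalLM.from_pretrained(model_id)
reference.eval()

# my_model = MyFromScratchModel(config)          # your implementation
# my_model.load_state_dict(convert_weights(model_id))  # load the same weights

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    ref_logits = reference(**inputs).logits
    # my_logits = my_model(inputs["input_ids"])

# The "verifiable reward": your outputs should match the reference within tolerance.
# assert torch.allclose(my_logits, ref_logits, atol=1e-4)
print(ref_logits.shape)  # (1, seq_len, vocab_size)
```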

Nathan Lambert (02:02:51) I think that is something everyone interested in getting into AI today should do, and that’s why I liked your book. I came to language models from the RL and robotics field, so I never had taken the time to just learn all the fundamentals. The Transformer architecture is as fundamental today as deep learning was in the past, and people need to learn it. I think where a lot of people get overwhelmed is how to apply this to have an impact or find a career path.

Nathan Lambert (02:03:23) AI language models make this fundamental stuff so accessible, and people with motivation will learn it. Then it’s like, “How do I get the cycles on goal to contribute to research?” I’m actually fairly optimistic because the field moves so fast that a lot of times the best people don’t fully solve a problem because there’s a bigger, lower-hanging fruit to solve, so they move on. In my RLHF book, I try to take post-training techniques and describe how they influence the model. It’s remarkable how many things people just stop studying.

Nathan Lambert (02:04:06) I think people trying to go narrow after doing the fundamentals is good. Reading relevant papers and being engaged in the ecosystem—you actually… The proximity that random people have online to leading researchers is incredible. The anonymous accounts on X in ML are very popular, and no one knows who all these people are. It could just be random people who study this stuff deeply. Especially with AI tools to help you keep digging into things you don’t understand, it’s very useful. There are research areas that might only have three papers you need to read, and then one of the authors will probably email you back.

Nathan Lambert (02:04:45) But you have to put in a lot of effort into these emails to show you understand the field. It would take a newcomer weeks of work to truly grasp a very narrow area, but going narrow after the fundamentals is very useful. I became very interested in character training—how you make a model funny, sarcastic, or serious, and what you do to the data to achieve this. A student at Oxford reached out to me and said, “Hey, I’m interested in this,” and I advised him. Now that paper exists. There were maybe only two or three people in the world very interested in that specific topic.

Nathan Lambert (02:05:25) He’s a PhD student, which gives you an advantage, but for me, that was a topic where I was waiting for someone to say, “Hey, I have time to spend cycles on this.” I’m sure there are a lot more narrow things where you’re just like, “It doesn’t make sense that there was no answer to this.” There’s so much information coming in that people feel they can’t grab onto anything, but if you actually stick to one area, I think there are a lot of interesting things to learn.

Sebastian Raschka (02:05:48) Yeah, I think you can’t try to do it all because it would be very overwhelming and you would burn out. For example, I haven’t kept up with computer vision in a long time; I’ve just focused on LLMs. But coming back to your book, I think it’s a really great resource and a good bang for the buck if you want to learn about RLHF. I wouldn’t just go out there and read raw RLHF papers because you would be spending two years—

Nathan Lambert (02:06:10) —and some of them contradict each other. I’ve just edited the book, and there’s no chapter where I had to say, “X papers say one thing and Y papers say another, and we’ll see what comes out to be true.”

Lex Fridman (02:06:21) What are some of the ideas we might have missed in the bigger picture of post-training? To go through the table of contents: first, you did the problem setup, training overview, what are preferences, preference data and the optimization tools, reward modeling, regularization, instruction tuning, rejection sampling, reinforcement learning. Then constitutional AI and AI feedback, reasoning and inference-time scaling, tool use and function calling, synthetic data and distillation, evaluation, and then the open questions section: over-optimization, style and information, product UX, character and post-training. What are some ideas worth mentioning that connect both the educational component and the research component? You mentioned the character training, which is pretty interesting.

Nathan Lambert (02:07:08) Character training is interesting because there’s so little out there, but we talked about how people engage with these models. We feel good using them because they’re positive, but that can go too far; it can be too positive. It’s essentially how you change your data or decision-making to make it exactly what you want. OpenAI has this thing called a “model spec,” which is essentially their internal guideline for what they want the model to do, and they publish this to developers. So you can know what is a failure of OpenAI’s training—where they have the intention but haven’t met it yet—versus what is something they actually wanted to do that you just don’t like.

Nathan Lambert (02:07:46) That transparency is very nice, but all the methods for curating these documents and how easy it is to follow them is not very well known. I think the way the book is designed is that the reinforcement learning chapter is obviously what people want because everybody hears about it with RLVR, and it’s the same algorithms and the same math, but you can use it in very different documents. I think the core of RLHF is how messy preferences are. It’s essentially a rehash of a paper I wrote years ago, but this is the chapter that tells you why RLHF is never fully solvable, because the way that RL is set up assumes that preferences can be quantified and reduced to single values.

Nathan Lambert (02:08:33) I think it relates in the economics literature to the Von Neumann-Morgenstern utility theorem. That is the chapter where all of that philosophical, economic, and psychological context tells you what gets compressed when doing RLHF. Later in the book, you use this RL map to make the number go up. I think that’s why it’ll be very rewarding for people to do research on, because quantifying preferences is something humans have designed the problem around to make them studyable. But there are fundamental debates; for example, in a language model response, you have different things you care about, whether it’s accuracy or style.

Nathan Lambert (02:09:13) When you’re collecting the data, they all get compressed into, “I like this more than another.” There’s a lot of research in other areas of the world that goes into how you should actually do this. I think social choice theory is the subfield of economics around how you should aggregate preferences. I went to a workshop that published a white paper on how you can think about using social choice theory for RLHF. I want people who get excited about the math to stumble into this broader context. I also keep a list of all the tech reports of reasoning models that I like. In Chapter 14, where there’s a short summary of RLVR, there’s a gigantic table where I list every single reasoning model that I like. I think in education, a lot of it needs to be, at this point, what I like—

Nathan Lambert (02:10:08) —because language models are so good at the math. For example, the famous paper on Direct Preference Optimization, which is a much simpler way of solving the problem than RL—the derivations in the appendix skip steps of math. I tried for this book to redo the derivations and I was like, “What the heck is this log trick that they use?” But when doing it with language models, they just say, “This is the log trick.” I don’t know if I like that the math is so commoditized. I think some of the struggle in reading this appendix and following the math is good for learning.
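
*For reference, the two standard formulas behind this discussion are the Bradley-Terry preference model and the DPO objective derived from it; the derivation being described is the one that turns the first into the second (notation follows the original DPO paper):*

```latex
% Bradley-Terry: a pairwise preference is compressed into a difference of scalar rewards.
P(y_w \succ y_l \mid x) \;=\; \sigma\!\big( r(x, y_w) - r(x, y_l) \big)

% DPO: the reward is re-expressed as a scaled log-ratio between the trained policy and
% a frozen reference policy, so no separate reward model needs to be trained.
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```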

Lex Fridman (02:10:43) Yeah, we’re returning to this often on the topic of education. You both have brought up the word “struggle” quite a bit. There is value in that. If you’re not struggling as part of this process, you’re not fully following the proper process for learning, I suppose.

Nathan Lambert (02:11:02) Some of the providers are starting to work on models for education designed to not give… actually, I haven’t used them, but I would guess they’re designed to not give all the information at once and make people work for it. I think you could train models to do this and it would be a wonderful contribution. In the book, you had to reevaluate every decision, which is such a great example. I think there’s a chance we work on it at AI2, which I think would be so fun.

Sebastian Raschka (02:11:26) It makes sense. I did something like that the other day for video games. In my spare time, I like video games with puzzles, like Zelda and Metroid. There’s this new game where I got really stuck. I didn’t want to struggle for two days, so I used an LLM. But I told it, “Please don’t add any spoilers. I’m at this point; what do I have to do next?” You can do the same thing for math where you say, “I’m at this point and I’m getting stuck. Don’t give me the full solution, but what is something I could try?” You kind of carefully probe it.

Sebastian Raschka (02:12:02) But the problem is that it requires discipline. A lot of people enjoy math, but there are also a lot of people who need to do it for their homework, and then it’s just a shortcut. We can develop an educational LLM, but the other LLMs are still there, and there’s still a temptation to use them.

Lex Fridman (02:12:20) I think a lot of people, especially in college, understand the stuff they’re passionate about; they’re self-aware about it, and they understand it shouldn’t be easy. Like, I think we just have to develop a good taste—we talk about research taste—like a school taste about stuff that you should be struggling on and stuff you shouldn’t be struggling on. Which is tricky to know, because sometimes you don’t have good long-term vision about what would be actually useful to you in your career. But you have to develop that taste, yeah.

Nathan Lambert (02:12:51) I was talking to maybe my fiance or friends about this, and it’s like there’s this brief 10-year window where all of the homework and all the exams could be digital. But before that, everybody had to do all the exams in bluebooks because there was no other way. And now after AI, everybody’s going to need to be in bluebooks and oral exams because everybody could cheat so easily. It’s like this brief generation that had a different education system where everything could be digital, but you still couldn’t cheat. And now it’s just going back. It’s just very funny.

Lex Fridman (02:13:20) You mention character training. Just zooming out on a more general topic: for that topic, how much compute was required? And in general, to contribute as a researcher, are there places where not too much compute is required where you can actually contribute as an individual researcher?

Nathan Lambert (02:13:39) For the character training thing, I think this research is built on fine-tuning roughly seven-billion-parameter models with LoRA, which is… essentially, you’re only fine-tuning a small subset of the model’s weights. I don’t know exactly how many GPU hours that would take.

Nathan Lambert (02:13:55) Not doable for every academic. So the situation for some academics is so dire that the only work you can do is inference: you take closed or open models, get completions from them, and look at them to understand the models. And that’s very well-suited to evaluation, where you want to be the best at creating representative problems that the models fail on or that show certain abilities, which I think you can break through with. I think the top-end goal for a researcher working on evaluation, if you want to have career momentum, is that the frontier labs pick up your evaluation. So you don’t need to have every project do this.

Nathan Lambert (02:14:33) But if you go from a small university with no compute and you figure out something that Claude struggles with, and then the next Claude model has it in the blog post, there’s your career rocket ship. I think that’s hard, but if you want to scope the maximum possible impact with minimum compute, it’s something like that: get very narrow, and learn where the models are going. So you need to build a tool that tests where Claude 4.5 will fail. If I’m going to start a research project, I need to think about where the models will be struggling in eight months.

Lex Fridman (02:15:05) But what about developing totally novel ideas?

Nathan Lambert (02:15:08) This is a trade-off. I think that if you’re doing a PhD, you could also say, “It’s too risky to work in language models. I’m going way longer term”: what is the thing that’s going to define language model development in 10 years? But I end up being a person that’s pretty practical. I mean, I went into my PhD thinking, “I got into Berkeley. Worst case, I get a master’s, and then I go work in tech.” And so I’m very practical about it. The life afforded to people working at these AI companies, the amount of… OpenAI’s average compensation is over a million dollars in stock a year per employee. For any normal person in the US, getting into this AI lab is transformative for your life. So I’m pretty practical about it—

Nathan Lambert (02:15:50) —there’s still a lot of upward mobility working in language models if you’re focused. And looking at the outcomes, look at these jobs. But from a research perspective, for transformative impact and these academic awards, being the next Yann LeCun comes from not caring about language model development very much.

Lex Fridman (02:16:07) It’s a big financial sacrifice in that case.

Nathan Lambert (02:16:09) So I get to work with some awesome students, and they’re like, “Should I go work at an AI lab?” And I’m like, “You’re getting a PhD at a top school. Are you going to leave to go to a lab?” If you go work at a top lab, I don’t blame you. Don’t go work at some random startup that might go to zero. But if you’re going to OpenAI, I think it could be worth leaving a PhD for.

Lex Fridman (02:16:30) Let’s think through this more rigorously. Where would you recommend people go to make a research contribution? The options are: academia—get a PhD and spend five years publishing, though compute resources are constrained; research labs that are more focused on open-weight models; or closed frontier labs—OpenAI, Anthropic, xAI, and so on.

Nathan Lambert (02:17:04) The two gradients are: the more closed, the more money you tend to get, but also you get less credit. In terms of building a portfolio of things that you’ve done, it’s very clear what you have done as an academic. Versus if you are going to trade this fairly reasonable progression for being a cog in the machine, which could also be very fun. I think they’re very different career paths. But the opportunity cost for being a researcher is very high because PhD students are paid essentially nothing. I think it ends up rewarding people that have a fairly stable safety net and they realize they can operate in the long term, doing very interesting work and getting a very interesting job.

Nathan Lambert (02:17:50) So it is a privileged position to be like, “I’m going to see out my PhD and figure it out after because I want to do this.” And at the same time, the academic ecosystem is getting bombarded by funding getting cut and stuff. There are just so many different trade-offs where I understand plenty of people that are like, “I don’t enjoy it. I can’t deal with this funding search. My grant got cut for no reason by the government,” or, “I don’t know what’s going to happen.” So I think there’s a lot of uncertainty and trade-offs that, in my opinion, favor just taking the well-paying job with meaningful impact. It’s not like you’re getting paid to sit around at OpenAI. You’re building the cutting edge of things that are changing millions of people’s relationship to tech.

Lex Fridman (02:18:34) But publication-wise, they’re being more secretive, increasingly so. So you’re publishing less and less. You are having a positive impact at scale, but you’re a cog in the machine.

Sebastian Raschka (02:18:47) I think, honestly, it hasn’t changed that much. I have been in academia; I’m not in academia anymore. At the same time, I wouldn’t want to miss my time in academia. But before I get to that part: I was using AI and machine learning methods for applications in computational biology with collaborators, and a lot of people went from academia directly to Google. I think it’s the same thing. Back then, professors were sad that their students went into industry because they couldn’t carry on their legacy in that sense. It hasn’t changed that much; the only thing that has changed is the scale.

Sebastian Raschka (02:19:32) But, you know, cool stuff was always developed in industry that was closed. You couldn’t talk about it. I think the difference now is your preference. Do you like to talk about your work and publish, or are you more in a closed lab? That’s one difference—the compensation, of course. But it’s always been like that. So it really depends on where you feel comfortable. And also, nothing is forever. The only thing right now is there’s a third option, which is starting a startup. There are a lot of people doing startups. Very risky move, but it’s a high-risk, high-reward type of situation, whereas joining an industry lab is pretty safe and offers upward mobility.

Sebastian Raschka (02:20:16) Honestly, I think once you have been at an industry lab, it will be easier to find future jobs. But then again, it’s about how much you enjoy the team and working on proprietary things versus how much you like the publishing work. I mean, publishing is stressful. Acceptance rates at conferences can be arbitrary and very frustrating, but it’s also high reward. If you have a paper published, you feel good because your name is on there. It’s a real accomplishment.

Nathan Lambert (02:20:48) I feel like my friends who are professors seem, on average, happier than my friends who work at a frontier lab, to be totally honest. Because there’s just a grounding there. And the frontier labs definitely do this 9/9/6, which is essentially shorthand for working all the time.

Work culture in AI (72+ hour weeks)

Lex Fridman (02:21:03) Can you describe 9/9/6 as a culture? I believe you could say it was invented in China and adopted in Silicon Valley. What’s 9/9/6? It’s 9:00 AM to 9:00 PM—

Sebastian Raschka (02:21:14) six days a week.

Lex Fridman (02:21:15) Six days a week. What is that, 72 hours? Is this basically the standard in AI companies in Silicon Valley? More and more this kind of grind mindset.

Sebastian Raschka (02:21:26) Yeah, I mean, maybe not exactly like that, but I think there is a trend towards it. And it’s interesting—I think it almost flipped because when I was in academia, I felt like that because as a professor, you had to write grants, you had to teach, and you had to do your research. It’s like three jobs in one, and it is more than a full-time job if you want to be successful. And I feel like now, like Nathan just said, the professors in comparison to a lab have even less pressure or workload than at a frontier lab because—

Nathan Lambert (02:21:57) I think they work a lot; they’re just so fulfilled by working with students and having a constant runway of mentorship and a mission that is very people-oriented. I think in an era when things are moving very fast and are very chaotic, it’s very rewarding to people.

Sebastian Raschka (02:22:11) Yeah, and I think at a startup, there’s this pressure. You have to make it. It is really important that people put in the time, but it is really hard because you have to deliver constantly. I’ve been at a startup. I had a good time, but I don’t know if I could do it forever. It’s an interesting pace and it’s exactly like we talked about in the beginning. These models are leapfrogging each other, and they are just constantly trying to take the next step compared to their competitors. It’s just ruthless right now.

Nathan Lambert (02:22:42) I think this leapfrogging nature and having multiple players is actually an underrated driver of language modeling progress where competition is so deeply ingrained. These companies have intentionally created very strong cultures. For example, Anthropic is known to be culturally deeply committed and organized. We hear so little from them, and everybody at Anthropic seems very aligned. Being in a culture that is super tight and having this competitive dynamic is a thing that’s going to make you work hard and create things that are better.

Nathan Lambert (02:23:20) But that comes at the cost of human capital. You can only do this for so long, and people are definitely burning out. I wrote a post on burnout, as I’ve moved in and out of this myself, especially trying to be a manager while doing full-mode training. It’s a crazy job. In the book Apple in China, Patrick McGee talked about how hard the Apple engineers worked to set up the supply chains in China. He mentioned they had “saving marriage” programs, and he said in a podcast that people died from this level of working hard. It’s a perfect environment for creating progress at human expense. That human expense is the 996 that we started this with, where people really do grind.

Sebastian Raschka (02:24:08) I also read this book. I think they had a code word for if someone had to go home to spend time with their family to save the marriage. Then the colleagues said, “Okay, this is red alert for this situation. We have to let that person go home this weekend.” But at the same time, I don’t think they were forced to work. They were so passionate about the product that you get into that mindset. I had that sometimes as an academic, and as an independent person. I overwork, and it’s unhealthy. I had back issues and neck issues because I did not take the breaks that I should have. But it’s not because anyone forced me; it’s because I wanted to work because it’s exciting stuff.

Nathan Lambert (02:24:46) That’s what OpenAI and Anthropic are like. They want to do this work.

Silicon Valley bubble

Lex Fridman (02:24:49) Yeah, but there’s also a feeling of fervor that’s building, especially in Silicon Valley, aligned with the scaling laws idea. There’s this hype where the world will be transformed on a scale of weeks, and you want to be at the center of it. I have the great fortune of having conversations with a wide variety of human beings, and I get to see all these bubbles and echo chambers across the world. It’s fascinating to see how we humans form them. I think it’s fair to say that Silicon Valley is a kind of echo chamber, a kind of silo and bubble. I think bubbles are actually really useful and effective. It’s not necessarily a negative thing because you can be ultra-productive.

Lex Fridman (02:25:34) It could be the Steve Jobs reality distortion field, because you just convince each other the breakthroughs are imminent, and by convincing each other of that, you make the breakthroughs imminent.

Nathan Lambert (02:25:48) Byrne Hobart wrote a book classifying bubbles. One kind is the financial bubble, which involves speculation and is bad, and the other is effectively for build-outs, because it pushes people to build. I do think AI is in the latter, but I worry about it transitioning to a financial bubble.

Lex Fridman (02:26:05) Yeah, but also in the space of ideas, that bubble creates a reality distortion field. That means you are deviating from reality, and if you go too far while also working 996, you might miss some fundamental aspects of the human experience. This is a common problem in Silicon Valley. It’s a very specific geographic area. You might not understand the Midwest perspective or the experience of all the other different humans in the United States and across the world. You speak a certain way to each other and convince each other of a certain thing, and that can get you into real trouble.

Lex Fridman (02:26:47) Whether AI is a big success and becomes a powerful technology or it’s not, in either trajectory you can get yourself into trouble. So you have to consider all of that. Here you are, a young person trying to decide what you want to do with your life.

Nathan Lambert (02:27:02) The thing is… I don’t even really understand this, but the SF AI memes have gotten to the point where the “permanent underclass” was one of them. This was the idea that the last six months of 2025 were the only time to build durable value in an AI startup or model; otherwise, all the value will be captured by existing companies and you will therefore be poor. That’s an example of the SF thing that goes too far. I still think for young people who are really passionate about having an impact in AI, being physically in SF is the most likely place where you’re going to do this. But it has trade-offs.

Lex Fridman (02:27:41) I think SF is an incredible place, but there is a bit of a bubble. And if you go into that bubble, which is extremely valuable, make sure you also get out. Read history books, read literature, and visit other places in the world. Twitter and Substack are not the entire world.

Nathan Lambert (02:28:01) One of the people I worked with is moving to SF, and I need to get him a copy of Season of the Witch. It’s a history of SF from 1960 to 1985 that goes through the hippie revolution, the culture emerging in the city, the HIV/AIDS crisis, and other things. That is so recent, with so much turmoil and hurt, but also love, in SF. No one knows about this. It’s a great book, Season of the Witch; I recommend it. A bunch of my SF friends who do get out recommended it to me. I lived there and I didn’t appreciate this context, and it’s just so recent.

Text diffusion models and other new research directions

Lex Fridman (02:28:46) Yeah. Okay, let’s… we talked a lot about many things, certainly about what was exciting last year. But this year, one of the things you guys mentioned that’s exciting is the scaling of text diffusion models and just a different exploration of text diffusion. Can you talk about what that is and what possibilities it holds? So, different kinds of approaches than the current LMs?

Sebastian Raschka (02:29:13) Yeah, so we talked a lot about the transformer architecture and the autoregressive transformer architecture specifically, like GPT. And it doesn’t mean no one else is working on anything else. People are always on the lookout for the next big thing, because I think it would be almost stupid not to. Sure, right now the transformer architecture is the thing and it works best, but it’s always a good idea to not put all your eggs into one basket. People are developing alternatives to the autoregressive transformer. One of them would be, for example, text diffusion models.

Sebastian Raschka (02:29:49) And listeners may know diffusion models from image generation; Stable Diffusion popularized it. Back then, people used GANs, Generative Adversarial Networks. And then there was this diffusion process where you iteratively de-noise an image, and that resulted in really good quality images over time. Other companies built their own diffusion models. And now people are like, “Okay, can we try this also for text?” It doesn’t make intuitive sense at first because text is not something continuous like a pixel value that we can differentiate; it’s discrete, so how do we implement that de-noising process?

Sebastian Raschka (02:30:25) But it’s kind of similar to the BERT models by Google. When you go back to the original transformer, there were the encoder and the decoder. The decoder is what we are using right now in GPT and so forth. The encoder is more like a parallel technique where you have multiple tokens that you fill in in parallel. GPT models do autoregressive completion one token at a time. In BERT models, you have a sentence that has gaps—you mask them out—and then one iteration is filling in those gaps.

Sebastian Raschka (02:31:02) And text diffusion is kind of like that, where you are starting with some random text, and then you are filling in the missing parts or refining them iteratively over multiple iterations. The cool thing here is that this can do multiple tokens at the same time, so it has the promise of being more efficient. Now, the trade-off is, of course, how good is the quality? It might be faster, but the more de-noising steps you do, the better the text becomes. People are trying to see if that is a valid alternative to the autoregressive model in terms of giving you the same quality for less compute.
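
*A minimal sketch of the parallel-unmasking idea just described. The `model`, `mask_id`, and fill schedule are illustrative assumptions, not any particular system's API:*

```python
import torch

def diffusion_sample(model, seq_len, num_steps, mask_id):
    """Iterative un-masking: predict every position in parallel, commit the most
    confident tokens, leave the rest masked for the next step."""
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(num_steps):
        logits = model(tokens)                      # (1, seq_len, vocab), one parallel pass
        confidence, predictions = logits.softmax(-1).max(dim=-1)
        still_masked = tokens == mask_id
        # Fill a growing fraction of the remaining masked positions each step,
        # picking the positions the model is most confident about.
        num_to_fill = max(1, int(still_masked.sum() * (step + 1) / num_steps))
        confidence = confidence.masked_fill(~still_masked, -1.0)
        fill_positions = confidence.topk(num_to_fill, dim=-1).indices
        tokens.scatter_(1, fill_positions, predictions.gather(1, fill_positions))
    return tokens
```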

Sebastian Raschka (02:31:46) Right now, there are papers that suggest if you want to get the same quality, you have to crank up the de-noising steps and then you end up spending the same compute you would spend on an autoregressive model. The other downside is that while it’s parallel, some tasks are not. For reasoning tasks or tool use where you have to ask a code interpreter to give you an intermediate result, it is kind of tricky with diffusion models. So there are some hybrids. But the main idea is how we can parallelize it. It’s an interesting avenue. I think right now there are mostly research models out there, like LLaDA and some other ones.

Sebastian Raschka (02:32:24) I saw some by startups, some deployed models, but there is no big diffusion model at scale yet on the level of Gemini or ChatGPT. But there was an announcement by Google where they said they are launching Gemini Diffusion, and they put it into context of their Nano 2 model. They said for the same quality on most benchmarks, we can generate things much faster. I don’t think the text diffusion model is going to replace autoregressive LLMs, but it will be something for quick, cheap, at-scale tasks. Maybe the free tier in the future will be something like that.

Nathan Lambert (02:33:04) I think there are a couple of examples where it’s actually started to be used. To paint an example of why this is so much better: when a model like GPT-5 takes time to respond, it’s generating one token at a time. This diffusion idea is essentially generating all of those tokens in the completion in one batch, which is why it could be way faster.

Nathan Lambert (02:33:27) The startups I’m hearing about are code startups where you have a codebase and somebody is effectively vibe coding. They say, “Make this change,” and a code diff is essentially a huge reply from the model. It doesn’t have to have that much external context, and you can get it really fast by using these diffusion models. They use text diffusion to generate really long diffs because doing it with an autoregressive model would take minutes, and that time causes a lot of churn for a user-facing product. Every second, you lose users. So I think that it’s going to be this thing where it’s going to—

Nathan Lambert (02:34:02) —grow and have some applications, but I actually thought that different types of models were going to be used for different things sooner than they have been. I think the tool use point is the one that’s stopping them from being most general purpose because, with something like Claude Code or ChatGPT with search, the autoregressive chain is interrupted with an external tool, and I don’t know how to do that with the diffusion setup.

Tool use

Lex Fridman (02:34:28) So what’s the future of tool use this year and in the coming years? Do you think there’s going to be a lot of development there, and how will that be integrated into the entire stack?

Sebastian Raschka (02:34:37) I do think right now it’s mostly on the proprietary LLM side, but we will see more of that in open-source tooling. It is a huge unlock because then you can really outsource certain tasks from just memorization to actual computation—you know, instead of having the LLM memorize what is 23 plus 5, just use a calculator.
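
*A toy sketch of the tool-call loop being described, where the model emits a structured request instead of relying on memorized arithmetic. The `llm` callable, the message format, and the single calculator tool are hypothetical stand-ins, not a real provider's API:*

```python
import json

# Toy tool registry: the model asks for a tool by name instead of relying on
# memorized arithmetic. (eval is fine for a toy calculator, not for production.)
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def run_with_tools(llm, user_message, max_turns=4):
    """`llm(messages)` is assumed to return either a plain-text answer or a JSON
    tool call such as {"tool": "calculator", "input": "23 + 5"}."""
    messages = [{"role": "user", "content": user_message}]
    reply = llm(messages)
    for _ in range(max_turns):
        try:
            call = json.loads(reply)
        except ValueError:
            return reply  # plain text: treat it as the final answer
        if not isinstance(call, dict) or "tool" not in call:
            return reply
        result = TOOLS[call["tool"]](call["input"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": result})
        reply = llm(messages)
    return reply
```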

Lex Fridman (02:34:58) So do you think that can help solve hallucinations?

Sebastian Raschka (02:35:01) Not solve it, but reduce it. Still, the LLM needs to know when to ask for a tool call. And second, it doesn’t mean the internet is always correct. You can do a web search for who won the World Cup in 1998, but it still needs to find the right website and get the right information. You can still go to the incorrect website and get incorrect information. I don’t think it will fully solve it, but it is improving. There was another cool paper earlier this year—I think it was December 31st, so not technically 2026, but close—on the recursive language model.

Sebastian Raschka (02:35:43) That’s a cool idea to take this even a bit further. Nathan, you mentioned earlier it’s harder to do cool research in academia because of the compute budget. If I recall correctly, they did everything with GPT-5, so they didn’t even use local models. But the idea is, for a long-context task, instead of having the LLM solve all of it in one shot or in a chain, you break it down into sub-tasks. You have the LLM decide what is a good sub-task and then recursively call an LLM to solve that.

Sebastian Raschka (02:36:16) And then adding tools—you know, each sub-task maybe goes to the web and gathers information, and then you pull it all together at the end. I think there’s going to be a lot of unlock using things like that where you don’t necessarily improve the LLM itself, you improve how the LLM is used and what it can use. One downside right now with tool use is you have to give the LLM permission to use tools. That will take some trust, especially if you want to unlock things like having an LLM answer emails for you, or just sort them. I don’t know if I would today give an LLM access to my emails, right? I mean, this is a huge risk.
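
*A rough sketch of the recursive pattern described above, assuming `llm` is any prompt-to-text function. The chunking and prompts are illustrative, not the method from the paper itself:*

```python
def recursive_answer(llm, question, context, depth=0, max_depth=2, chunk_size=4000):
    """Instead of stuffing all of `context` into one prompt, split it into chunks,
    answer each chunk with its own LLM call, then merge the partial answers."""
    if depth >= max_depth or len(context) <= chunk_size:
        return llm(f"Context:\n{context}\n\nQuestion: {question}")
    chunks = [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]
    partials = [
        recursive_answer(llm, question, chunk, depth + 1, max_depth, chunk_size)
        for chunk in chunks
    ]
    merged = "\n\n".join(partials)
    return llm(f"Combine these partial answers into one answer:\n{merged}\n\nQuestion: {question}")
```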

Nathan Lambert (02:37:03) I think there’s one last point on the tool use thing. You hinted at this, and we’ve both come at this in our own ways: open versus closed models use tools in very different ways. With open models, people go to Hugging Face and download the model, and then the person’s going to be like, “What tool do I want?” Maybe X.ai is my preferred search provider, but someone else might care for a different search startup. When you release a model, it needs to be useful for multiple tools, which is really hard because you’re making a general reasoning engine, which is actually what gpt-oss-120b is good for.

Nathan Lambert (02:37:36) But on the closed models, you’re deeply integrating the specific tool into your experience. I think that open models will struggle to replicate some of the things that I like to do with closed models, where you can reference a mix of public and private information. Something that I keep trying every three to six months is Codex on the web, which is just prompting a model to make an update to some GitHub repository that I have.

Nathan Lambert (02:38:01) That sort of secure cloud environment is just so nice for sending it off to do this thing and then come back to me. This will probably help define some of the local open and closed niches. Because there was such a rush to get tool use working, the open models were on the back foot, which is kind of inevitable. There are so many resources in these frontier labs, but it will be fun when the open models solve this because it’s going to necessitate a more flexible model that might work with this recursive idea to be an orchestrator. Hopefully, necessity drives innovation there.

Continual learning

Lex Fridman (02:38:45) So, continual learning—this is a longstanding topic and an important problem. I think it increases in importance as the cost of training models goes up. So can you explain what continual learning is and how important it might be this year and in the coming years to make progress?

Nathan Lambert (02:39:03) This relates a lot to this kind of SF zeitgeist of: what is AGI, Artificial General Intelligence, and what is ASI, Artificial Superintelligence? What are the language models that we have today capable of doing? I think language models can solve a lot of tasks, but a key milestone for the AI community is when AI can replace any remote worker, taking in information and solving digital tasks. The limitation is that a language model will not learn from feedback the same way an employee does. If you hire an editor, they might mess up, but you will tell them, and they don’t do it again.

Nathan Lambert (02:39:43) But language models don’t have this ability to modify themselves and learn very quickly. The idea is, if we are going to get to something that is a true, general adaptable intelligence that can go into any remote work scenario, it needs to be able to learn quickly from feedback and on-the-job learning. I’m personally more bullish on language models being able to just provide very good context. You can write extensive documents where you say, “I have all this information. Here are all the blog posts I’ve ever written. I like this type of writing; my voice is based on this.” But a lot of people don’t provide this to models.

Nathan Lambert (02:40:24) The agentic models are just starting. So it’s this kind of trade-off: do we need to update the weights of this model with this continual learning thing to make them learn fast? Or, the counterargument is we just need to provide them with more context and information, and they will have the appearance of learning fast by just having a lot of context and being very smart.

Lex Fridman (02:40:43) So we should mention the terminology here. Continual learning refers to changing the weights continuously so that the model adapts and adjusts based on the new incoming information, and does so continually, rapidly, and frequently. And then the thing you mentioned on the other side of it is generally referred to as in-context learning. As you learn stuff, there’s a huge context window. You can just keep loading it with extra information every time you prompt the system, which I think both can legitimately be seen as learning. It’s just a different place where you’re doing the learning.

Sebastian Raschka (02:41:24) I think, to be honest with you, continual learning—the updating of weights—we already have that in different flavors. I think the distinction here is: do you do that on a personalized custom model for each person, or do you do it on a global model scale? And I think we have that already with going from GPT-5 to 5.1 and 5.2. It’s maybe not immediate, but it is like a quick curated update where there was feedback by the community on things they couldn’t do. They updated the weights, released the next model, and so forth. So it is kind of a flavor of that. Another even finer-grained example is RLVR; you run it, it updates.

Sebastian Raschka (02:42:08) The problem is you can’t just do that for each person because it would be too expensive to update the weights for each person. Even at OpenAI scale, building the data centers, it would be too expensive. I think that is only feasible once you have something on the device where the cost is on the consumer. Like what Apple tried to do with the Apple Intelligence models, putting them on the phone so they learn from the experience.

Lex Fridman (02:42:33) A bit of a related topic, but this kind of—maybe anthropomorphized—term, memory. What are the different ideas for how to add memory to these systems, which we’re increasingly seeing? Especially personalized memory?

Sebastian Raschka (02:42:49) Right now, it’s mostly like context—stuffing things into the context and then just recalling that. But again, it’s expensive because even if you cache it, you spend tokens on that. And the second one is you can only do so much. I think it’s more like a preference or style. A lot of people do that when they solve math problems. You can add previous knowledge, but you also give it certain preference prompts, like “do what I preferred last time.” But it doesn’t unlock new capabilities. For that, one thing people still use is LoRA adapters.

Sebastian Raschka (02:43:32) These are basically two smaller weight matrices that you keep in parallel as an overlay, like the delta, instead of updating the whole weight matrix. But you can do that to some extent, and then again, it is economics. There were also papers showing, for example, that LoRA learns less but forgets less. There’s no free lunch. If you want to learn more, you need to use more weights, but it gets more expensive. And then if you learn more, you forget more; you have to find that Goldilocks zone.
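
*The "two smaller weight matrices" can be written down in a few lines. This is a generic sketch of the LoRA idea, not any particular library's implementation; the rank and scaling values are illustrative:*

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base weight plus a trainable low-rank delta: y = x W^T + (x A^T) B^T * scale."""
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)                  # original weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: the delta starts at 0
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```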

Long context

Lex Fridman (02:44:04) We haven’t really mentioned it much, but implied in this discussion is context length as well. Are there a lot of innovations possible there?

Nathan Lambert (02:44:13) I think the colloquially accepted thing is that it’s a compute and data problem. Sometimes there are small architecture things, like attention variants. We talked about hybrid attention models, which is essentially if you have what looks like a state space model within your transformer. Those are better suited because you have to spend less compute to model the furthest along token. But those aren’t free because they have to be accompanied by a lot of compute or the right data. How many sequences of 100,000 tokens do you have in the world, and where do you get these? It just ends up being pretty expensive to scale them.

Nathan Lambert (02:44:56) So we’ve gotten pretty quickly to a million tokens of input context length. And I would expect it to keep increasing and get to 2 million or 5 million this year, but I don’t expect it to go to, like, 100 million. That would be a true breakthrough, and I think those breakthroughs are possible. I think of the continual learning thing as a research problem where there could be a breakthrough that makes transformers work way better at this and it’s cheap. These things could happen with so much scientific attention. But turning the crank, it’ll be consistent increases over time.

Sebastian Raschka (02:45:27) I think also looking at the extremes, there’s no free lunch. One extreme to make it cheap is to have, let’s say, an RNN that has a single state where you save everything from the previous stuff. It’s a specific fixed-size thing, so you never really grow the memory. You are stuffing everything into one state, but then the longer the context gets, the more information you forget because you can’t compress everything into one state. Then on the other hand, you have the transformers, which try to remember every token. That is great if you want to look up specific information, but very expensive because you have the KV cache and the dot product that grow.
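
*To put rough numbers on "the KV cache that grows": with an illustrative model shape (32 layers, 8 KV heads of dimension 128, fp16), the cache grows linearly with context length, while an RNN-style state stays a fixed size. A back-of-the-envelope sketch:*

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Keys and values are stored for every token, layer, and KV head (fp16 = 2 bytes)."""
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_value

# Illustrative shape: 32 layers, 8 KV heads, head dim 128, fp16.
print(kv_cache_bytes(32_000, 32, 8, 128) / 1e9)     # ~4.2 GB at a 32k-token context
print(kv_cache_bytes(1_000_000, 32, 8, 128) / 1e9)  # ~131 GB at a 1M-token context
```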

Sebastian Raschka (02:46:06) But then, like you said, the Mamba layers kind of have the same problem. Like an RNN, you try to compress everything into one state, and you’re a bit more selective there. I think it’s like this Goldilocks zone again with NVIDIA Nemotron 3; they found a good ratio of how many attention layers you need for the global information where everything is accessible compared to having these compressed states. I think we will scale more by finding better ratios in that Goldilocks zone between making it cheap enough to run and making it powerful enough to be useful.

Sebastian Raschka (02:46:43) And one more plug here: the recursive language model paper is one of the papers that tries to address the long context thing. What they found is, essentially, instead of stuffing everything into this long context, if you break it up into multiple smaller tasks, you save memory and can actually get better accuracy than having the LLM try everything all at once. It’s a new paradigm; we will see if there are other flavors of that. I think we will still make improvements on long context, but like Nathan said, the problem is for pre-training itself, we don’t have as many long-context documents as other documents. So it’s harder to study basically how LLMs behave on that level.

Nathan Lambert (02:47:31) There are some rules of thumb where, essentially, you pre-train a language model—like OLMo, we pre-trained at an 8K context length and then extended to 32K with training. There’s a rule of thumb where doubling the training context length takes about 2X compute, and then you can normally 2 to 4X the context length again. I think a lot of it ends up being compute-bound at pre-training. Everyone talks about this big increase in compute for the top labs this year, and that should reflect in some longer context windows.

Nathan Lambert (02:48:02) But I think on the post-training side, there’s some more interesting things. As we have agents, the agents are going to manage this context on their own. Now people who use Claude Code a lot dread the compaction, which is when Claude takes its entire 100,000 tokens of work and compacts it into a bulleted list. But what the next models will do—I’m sure people are already working on this—is the model can control when it compacts and how. So you can essentially train your RL algorithm where compaction is an action,

Nathan Lambert (02:48:30) where it shortens the history. Then the problem formulation will be, “I want to keep the maximum evaluation scores while the model compacts its history to the minimum length.” Because then you have the minimum amount of tokens that you need to do this kind of compounding auto-regressive prediction. There are actually pretty nice problem setups in this where these agentic models learn to use their context in a different way than just plowing forward.

Sebastian Raschka (02:48:56) One interesting recent example would be DeepSeek-V3.2, where they had a sparse attention mechanism with a very efficient, small, lightweight indexer. Instead of attending to all the tokens, it selects which tokens it actually needs. It almost comes back to the original idea of attention where you are selective, but attention is always on; you have maybe zero weight on some of them, but you use them all. But they are even more like, “Okay, let’s just mask that out or not even compute it.” And even the sliding window attention in OLMo is kind of like that idea. You have that rolling window where you keep it fixed, because you don’t need everything all the time.

Sebastian Raschka (02:49:34) Occasionally, in some layers you might, but it’s wasteful. But right now, I think if you use everything, you’re on the safe side; it gives you the best bang for the buck because you never miss information. And right now, I think this year will also be the year of figuring out, like you said, how to be smarter about that. Right now people want to have the next state-of-the-art, and the state-of-the-art happens to be the brute force, expensive thing. Once you have that, like you said, you want to keep that accuracy but see how we can do that cheaper now using tricks.
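
*A small sketch of the sliding-window restriction mentioned here: each query position may attend only to itself and a fixed number of preceding tokens, so per-token attention cost stops growing with context length. The names and window size are illustrative:*

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where attention is allowed: causal, and at most `window` tokens back
    (including the current position)."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column vector)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row vector)
    return (j <= i) & (j > i - window)

# With window=4, each token only sees its 4 most recent tokens (itself included),
# so the cost per new token stays constant instead of growing with the full history.
print(sliding_window_mask(6, 4).int())
```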

Nathan Lambert (02:50:07) Yeah, all this scaling thing. Like the reason we get the Claude 4.5 Sonnet model first is because you can train it faster and you’re not hitting these compute walls as soon. They can just try a lot more things and get the model out faster, even though the bigger model is actually better.

Robotics

Lex Fridman (02:50:22) I think we should say that there’s a lot of exciting stuff going on in the AI space. My mind has recently been really focused on robotics, yet today we almost entirely didn’t talk about robotics. There’s a lot of stuff on image generation and video generation. I think it’s fair to say that the most exciting research work in terms of intensity and fervor is in the LLM space, which is why I think it’s justified for us to focus on the LLMs we’re discussing. But it’d be nice to bring in certain things that might be useful. For example, world models—there’s growing excitement about that. Do you think there will be any use in this coming year for world models in the LLM space?

Sebastian Raschka (02:51:08) Also with LLMs, what’s an interesting thing here is I think if we unlock more LLM capabilities, it also automatically unlocks all the other fields because it makes progress faster. Because, you know, a lot of researchers and engineers use LLMs for coding. So even if they work on robotics, if you optimize these LLMs that help with coding, it pays off. But then yes, world models are interesting. It’s basically where you have the model run a simulation of the world—like a little toy version of the real thing—which can unlock capabilities like data the LLM is not aware of. It can simulate things. I think LLMs happen to work well by pre-training and doing next-token prediction, but we could do this in a more sophisticated way.

Sebastian Raschka (02:52:05) There was a paper, I think by Meta, called “Code World Models.” They basically apply the concept of world models to LLMs where, instead of just having next-token prediction and verifiable rewards checking the answer correctness, they also make sure the intermediate variables are correct. The model is basically learning a code environment. I think this makes a lot of sense; it’s just expensive to do. But it is making things more sophisticated by modeling the whole process, not just the result, and that can add more value.

Sebastian Raschka (02:52:51) I remember when I was a grad student, there’s a competition called CASP where they do protein structure prediction. They predict the structure of a protein that is not solved yet. In a sense, this is actually great, and I think we need something like that for LLMs also, where you do the benchmark but no one knows the solution until someone reveals it after the fact. When AlphaFold came out, it crushed this benchmark. I mean there were multiple iterations, but I remember the first one explicitly modeled the physical interactions and the physics of the molecule.

Sebastian Raschka (02:53:34) Also, things like impossible angles. Then in the next version, I think they got rid of this and just used brute force, scaling it up. I think with LLMs, we are currently in this brute-force scaling because it just happens to work, but I do think at some point it might make sense to bring back this approach. I think with world models, that might be actually quite cool. And of course, for robotics, that is completely related to LLMs.

Lex Fridman (02:54:03) Yeah, and robotics is very explicit. There’s the problem of locomotion or manipulation. Locomotion is much more solved, especially in the learning domain. But there’s a lot of value, just like with the initial protein folding systems, in bringing in the traditional model-based methods. So it’s unlikely that you can just learn the manipulation or the whole-body loco-manipulation problem end-to-end. That’s the dream. But then, when you look at the magic of the human hand and the complexity of the real world, you realize it’s really hard to learn this all the way through—the way, I guess, AlphaFold 2 didn’t.

Nathan Lambert (02:54:40) I’m excited about the robotic learning space. I think it’s collectively getting supercharged by all the excitement and investment in language models generally. The infrastructure for training transformers, which is a general modeling thing, is becoming world-class industrial tooling. Wherever there was a limitation for robotics, it’s just way better now. There’s way more compute. They take these language models and use them as central units where you can do interesting explorative work around something that already works. And then I see it emerging, kind of like we talked about, the way Hugging Face Transformers and the Hugging Face ecosystem did.

Nathan Lambert (02:55:19) I think when I was at Hugging Face, I was trying to get this to happen, but it was too early. These open robotic models on Hugging Face enable people to contribute data and fine-tune them. I think we’re much closer now, and the investment in robotics and self-driving cars is related and enables this. Once you get to the point where you have this sort of ecosystem, someone can download a robotics model and fine-tune it to their robot or share datasets across the world. There’s some work in this area, like RT-X from a few years ago, where people are starting to do that. But once they have this ecosystem, it’ll look very different. And then this whole post-ChatGPT boom is putting more resources into that, which I think is a very good area for doing research.

Lex Fridman (02:56:02) This is also resulting in much better, more accurate, more realistic simulators being built, closing this sim-to-real gap in the robotic space. But you know, you mentioned a lot of excitement and investment. The downside of that, which happens in hype cycles—I personally believe, and most robotics people believe—is that robotics is not going to be solved on the timescale being implicitly or explicitly promised. So what happens when all these robotics companies spring up and then they don’t have a product that works? Then there’s going to be this crash of excitement, which is nerve-wracking. Hopefully something else will swoop in so that the continued development of some of these ideas keeps going.

Sebastian Raschka (02:56:53) I think it’s also related to the continual learning issue. The real world is so complex, whereas with LLMs, you don’t really need to have something learn for the user because there are a lot of things everyone has to do—everyone maybe wants to fix their grammar in their email or code. It’s more constrained, so you can prepare the model for that. But preparing a robot for the real world is harder. You have robotic foundation models, and you can learn things like grasping, but every house is different. It’s so different that the robot would have to learn on the job, essentially. And I think that is the bottleneck right now: customizing it on the fly.

Lex Fridman (02:57:42) I don’t think I can possibly overstate the importance of the thing that doesn’t get talked about almost at all by robotics folks or anyone, and that is safety. All the interesting complexities we talk about regarding learning, all the failure modes and failure cases—everything we’ve been talking about with LLMs where sometimes it fails in interesting ways—all of that is fun and games in the LLM space. In the robotic space, in people’s homes, across millions of minutes and billions of interactions, you are almost never allowed to fail. When you have embodied systems put out there in the real world, you just have to solve so many problems you never thought you’d have to solve when you’re just thinking about the general robot learning problem.

Nathan Lambert (02:58:32) I’m so bearish on in-home learned robots for consumer purchase. I’m very bullish on self-driving cars, and I’m very bullish on robotic automation, like Amazon distribution, where Amazon has built whole new distribution centers designed for robots first rather than humans. There’s a lot of excitement in AI circles about AI enabling automation—

Nathan Lambert (02:58:54) …and mass-scale manufacturing, and I do think that the path to robots doing that is more reasonable. It’s a thing that is designed and optimized to do a repetitive task that a human could conceivably do but doesn’t want to. But it’s also going to take a lot longer than people probably predict. I think the leap from the AI singularity to scaling up mass manufacturing in the US because we have a massive AI advantage is one that is troubled by a lot of political and other challenging problems.

Timeline to AGI

Lex Fridman (02:59:31) Let’s talk about timelines specifically: timelines to AGI or ASI. Is it fair, as a starting point, to say that nobody really agrees on the definitions of AGI and ASI?

Nathan Lambert (02:59:46) I think there’s a lot of disagreement, but I’ve been getting pushback where people say it is something that could reproduce most digital economic work. The remote worker is a fairly reasonable example. I think OpenAI’s definition is somewhat related to that—an AI that can do a certain number of economically valuable tasks—which I don’t really love as a definition, but it could be a grounding point. Language models today, while immensely powerful, are not this remote worker drop-in. There are things an AI could do that are way harder than remote work, like solving a…

Nathan Lambert (03:00:29) …finding an unexpected scientific discovery that you couldn’t even posit, which would be an example of something people call an artificial superintelligence problem. Or taking in all medical records and finding linkages across certain illnesses that people didn’t know or figuring out that some common drug can treat a niche cancer. They would say that is a superintelligence thing. So these are natural tiers. My problem is that it becomes deeply entwined with the quest for meaning in AI and these religious aspects. There are different paths you can take.

Lex Fridman (03:01:06) And I don’t even know if remote work is a good definition. I liked the originally titled AI2027 report. They focus more on code and research taste, so the target there is the superhuman coder. They have several milestone systems: superhuman coders, superhuman AI researcher, then superintelligent AI researcher, and then the full ASI. After you develop the superhuman coder, everything else follows quickly. The task is to have fully autonomous, automated coding, so any kind of coding you need to do in order to perform research is fully automated.

Lex Fridman (03:01:58) From there, humans would be doing AI research together with that system, and they will quickly be able to develop a system that actually can do the research for you. That’s the idea. Initially, their prediction was 2027 or ’28, and now they’ve pushed it back by three to four years to 2031, mean prediction. My prediction is probably even beyond 2031, but at least you can think concretely about how difficult it is to fully automate programming.

Nathan Lambert (03:02:31) Yeah, I disagree with some of their presumptions and dynamics on how it would play out, but I think they did good work in defining concrete milestones to tell a useful story. That’s why the reach of this AI 2027 document well transcended Silicon Valley—because they told a good story and did a lot of rigorous work.

Nathan Lambert (03:02:53) I think the camp that I fall into is that AI is so-called “jagged”: it will be excellent at some things and really bad at others. I think that when they’re close to this automated software engineer, what it will be good at is traditional ML systems and front end—the model is excellent at those—but at distributed ML the models are actually really quite bad, because there’s so little training data on doing large-scale distributed learning and things. This is something that we already see, and I think it will just get amplified. And then it’s kind of messier in these trade-offs, and then there’s how you think AI research works and so on.

Lex Fridman (03:03:28) So you think a superhuman coder is basically unachievable, meaning that, because of the jagged nature of the thing, you’re just always going to have gaps in capabilities?

Nathan Lambert (03:03:38) I think it’s assigning completeness to something where the models are kind of superhuman at some types of code, and I think that will continue. And people are creative, so they’ll utilize these incredible abilities to fill in the weaknesses of the models and move really fast. There will always be, for a long time, this dance between the humans enabling this thing that the model can’t do, and the best AI researchers are the ones that can enable this superpower.

Nathan Lambert (03:04:04) And I think those lines, compared to what we already see… I think like Claude Code for building a website, you can stand up a beautiful website in a few hours or do data analysis. But the whole thing is going to keep getting better at these things, and we’ll pick up some new code skills and stuff along the way. Linking to what’s happening in big tech, this AI 2027 report leans into the singularity idea where I think research is messy and social and largely in the data in ways that AI models can’t process. But what we do have today is really powerful, and these tech companies are all collectively buying into this with tens of billions of dollars of investment. So we are going to get some much better version of ChatGPT, a much better version of Claude Code than we already have.

Nathan Lambert (03:04:50) I think that it’s just hard to predict where that is going, but the bright clarity of that future is why some of the most powerful people in the world are putting so much money into this. And I think it’s just kind of small differences—we don’t actually know what a better version of ChatGPT is, but also can it automate AI research? I would say probably not, at least in this timeframe. Big tech is going to spend $100 billion much faster than we get an automated AI researcher that enables an AI research singularity.

Lex Fridman (03:05:22) So you think your prediction would be, if this is even a useful milestone, more than 10 years out?

Nathan Lambert (03:05:30) I would say less than that on the software side, but I think longer than that on things like research.

Lex Fridman (03:05:36) Well, let’s just for fun try to imagine a world where all software writing is fully automated. Can you imagine that world?

Nathan Lambert (03:05:46) By the end of this year, the amount of software that’ll be automated will be so high. But it’ll be things like you’re trying to train a model with RL and you need to have multiple bunches of GPUs communicating with each other. That’ll still be hard, but I think it’ll be much easier.

Lex Fridman (03:06:02) One of the ways to think about this, the full automation of programming, is just think of lines of useful code written—the fraction of that to the number of humans in the loop. So presumably there’ll be, for a long time, humans in the loop of software writing. It’ll just be fewer and fewer relative to the amount of code written. Right? And with the superhuman coder, I think the presumption there is the number of humans in the loop goes to zero. What does that world look like when the number of humans in the loop is in the hundreds, not in the hundreds of thousands?

Will AI replace programmers?

Nathan Lambert (03:06:39) I think software engineering will be driven more to system design and goals of outcomes, where I do think software is largely going to be… I think this has been happening over the last few weeks, where people have gone from a month ago saying, “Oh yeah, agents are kind of slop,” which is a famous Karpathy quote, to the industrialization of software, where anyone can just create software at their fingertips. I do think we are closer to that side of things, and it takes direction and understanding how the systems work to extract the best from the language models. And I think it’s hard to accept the gravity of how much is going to change with software development and how many more people can do things without ever looking at the code.

Sebastian Raschka (03:07:22) I think what’s interesting is to think about whether these systems will be independent, in the sense that while I have no doubt that LLMs will at some point solve coding in the way calculators solve calculating, right? At some point, humans developed a tool that you never need a human to calculate that number for; you just type it in, and it’s an algorithm. I think that’s the same probably for coding. But the question isn’t… I think what will happen is you will just say, “Build that website,” and it will make a really good website, and then you maybe refine it. But will it do things independently where…

Sebastian Raschka (03:07:59) Will you still have humans asking the AI to do something? Like will there be a person to say, “Build that website?” Or will there be AI that just builds websites or something, or whatever?

Lex Fridman (03:08:12) I think talking about building websites is the—

Lex Fridman (03:08:16) It’s just that the problem with websites and the problem with the web, you know, HTML and all that kind of stuff, it’s very resilient to just— slop. It will show you slop. It’s good at showing slop. I would rather think of safety-critical systems, like asking AI to end-to-end generate something that manages logistics— or manages cars— a fleet of cars, all that kind of stuff. So it end-to-end generates that for you.

Nathan Lambert (03:08:45) I think a more intermediate example is take something like Slack or Microsoft Word. I think if the organizations allow it, AI could very easily implement features end-to-end and do a fairly good job for things that you want to try. You want to add a new tab in Slack that you want to use, and I think AI will be able to do that pretty well.

Lex Fridman (03:09:06) Actually, that’s a really great example. How far away are we from that?

Nathan Lambert (03:09:09) Like this year.

Lex Fridman (03:09:11) See, I don’t know. I don’t know.

Nathan Lambert (03:09:14) I guess I don’t know— how bad production codebases are, but I think that within… on the order of a few years, a lot of people are going to be pushed to be more like a designer and product manager, where you have multiple of these agents that can try things for you, and they might take one to two days to implement a feature or attempt to fix a bug. And you have these dashboards—which I think Slack is actually a good dashboard—where your agents will talk to you and you’ll then give feedback. But things like, I make a website and it’s like, “Do you want to make a logo that’s passable?” I think these cohesive design things—the style, and deciding what to add next—are going to be very hard for models.

Lex Fridman (03:09:54) I just… Okay. So I hang out with a lot of programmers and some of them are a little bit on the skeptical side in general—that’s just the vibe. I just think there’s a lot of complexity involved in adding features to complex systems. Like, if you look at the browser, Chrome. If I wanted to add a feature, say having tabs on the left side as opposed to up top. Interface, right? I think we’re not… This is not a next year thing.

Nathan Lambert (03:10:26) One of the Claude releases this year, one of their tests was to give it a piece of software and leave Claude running to recreate it entirely, and it could already almost rebuild Slack from scratch, just given the parameters of the software and left in a sandbox environment to do that.

Lex Fridman (03:10:41) So the from-scratch part, I like almost better.

Nathan Lambert (03:10:44) So it might be that the smaller and newer companies are advantaged and they’re like, “We don’t have to have the bloat and complexity, and therefore this feature exists.”

Sebastian Raschka (03:10:53) And I think this gets to the point that you mentioned that some people you talk to are skeptical, and I think that’s not because the LLM can’t do X, Y, Z. It’s because people don’t want it to do it this way.

Lex Fridman (03:11:05) Some of that could be a skill issue on the human side. Unfortunately, we have to be honest with ourselves. And some of that could be an underspecification issue. So, programming… this is like a communication type of issue in relationships and friendships. You’re assuming the LLM somehow is supposed to read your mind. I think this is where spec-driven design is really important. Like you just, using natural language, specify what you want.

Nathan Lambert (03:11:32) I think if you talk to people at the labs, they use these in their training and production code. Claude Code is built with Claude Code, and they all use these things extensively. And Dario talks about how much of Claude’s code… It’s like these people are slightly ahead in terms of the capabilities—

Nathan Lambert (03:11:49) —they have, and they probably spend on inference. They could spend 10 to 100 times as much as we’re spending, like we’re on a lowly $100 or $200 a month plan. They truly let it rip. And I think that with the pace of progress that we have, a year ago we didn’t have Claude Code and we didn’t really have reasoning models. The difference between sitting here today and what we can do with these models—it seems like there’s a lot of low-hanging fruit to improve them. The failure modes are pretty dumb. It’s like, “Claude, you tried to use the CLI command I don’t have installed 14 times, and then I sent you the command to run.” From a modeling perspective, that thing is pretty fixable. So, I don’t know.

Lex Fridman (03:12:34) I agree with you. I’ve been becoming more and more bullish in general. Speaking to what you’re articulating, I think it is a human skill issue. So Anthropic and other companies are leading the way in understanding how to best use the models for programming; therefore, they’re effectively using them. I think there’s a lot of programmers on the outskirts who don’t… I mean, there’s not a really good guide on how to use them. People are trying to figure it out exactly, but—

Nathan Lambert (03:13:04) It might be very expensive. It might be that the entry point for that is $2,000 a month, which is only for tech companies and rich people. That could be it.

Lex Fridman (03:13:13) But it might be worth it. If the final result is a working software system, it might be worth it. By the way, it’s funny how we converged from the discussion of the timeline to AGI to something more pragmatic and useful. Is there anything concrete and profound to be said about the timeline to AGI and ASI? Or are these discussions a bit too detached from the day-to-day?

Nathan Lambert (03:13:39) There’s interesting bets. There’s a lot of people trying to do Reinforcement Learning with Verifiable Rewards—RLVR—but in real scientific domains. There are startups spending hundreds of millions of dollars in funding, and they have wet labs where they’re having language models propose hypotheses that are tested in the real world. I would say that they’re early, but with the pace of progress—

Nathan Lambert (03:14:00) —maybe they’re early by six months and they make it because they were there first, or maybe they’re early by eight years. You don’t really know. So I think that type of moonshot to branch this momentum into other sciences would be very transformative if AlphaFold moments happen in all sorts of other scientific domains by a startup solving this. I think there are startups—maybe Harmonic is one—where they’re going all in on language models plus Lean for math. I think you had another podcast guest where you talked about this recently, and it’s like we don’t know exactly what’s going to fall out of spending $100 million on that model.

Nathan Lambert (03:14:41) Most of them will fail, but a couple of them might be big breakthroughs that are very different than ChatGPT or Claude Code type software experiences. Like a tool that’s only good for a PhD mathematician but makes them 100 times more effective.
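To make the “language models plus Lean for math” idea concrete for readers who haven’t seen it: below is a minimal, purely illustrative Lean 4 snippet of the kind of artifact such a system would be asked to produce. The theorem and its name are invented for this example; the relevant point is that the Lean kernel either accepts or rejects the proof, which is exactly the kind of verifiable reward signal RLVR-style training needs.

```lean
-- Toy example (not from the conversation): a statement a model might be asked to prove.
-- If the kernel accepts the proof term, the reward is 1; if the model emits an
-- incorrect proof, elaboration fails and the reward is 0.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```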

Sebastian Raschka (03:14:58) I agree. I think this will happen in a lot of domains, especially those with a lot of resources like finance, legal, and pharmaceutical companies. But then again, is it really AGI? Because we are specializing it again. Is it really that much different from how we had specialized algorithms back in the day? I think it’s just the same thing but way more sophisticated. Is there a threshold when we call it AGI? I think the real cool thing here is that we have foundation models that we can specialize. That’s the breakthrough.

Sebastian Raschka (03:15:34) Right now, I think we are not there yet because first, it’s too expensive, but also ChatGPT doesn’t just give away their model to customize it. I can imagine a business model where OpenAI says at some point, “Hey, Bank of America, for $100 million we will do your custom model.” I think that will be the huge economic value add. The other thing though is, what is the differentiating factor? If everyone uses ChatGPT, they will all do the same thing. Everyone is moving in lockstep, but usually companies want to have a competitive advantage. I think there is no way around using some of their private data and experimenting with specialization. It’s going to be interesting.

Nathan Lambert (03:16:26) Given the pace of progress, it does feel like things are coming. I don’t think the AGI and ASI thresholds are particularly useful.

Lex Fridman (03:16:35) I think the real question, and this relates to the remote worker thing, is when are we going to see a big, obvious leap in economic impact? Because currently there’s not been an obvious leap in economic impact from LLMs, for example. Aside from AGI or ASI, there’s a real question of when we are going to see a GDP jump.

Nathan Lambert (03:17:06) Yeah, it’s like, what is the GDP made up of? A lot of it is financial services, so I don’t know what this is. It’s just hard for me to think about the GDP bump, but I would say that software development becomes valuable in a different way when you no longer have to look at the code anymore. So when it is like, Claude will make you a small business—which is essentially Claude can set up your website, your bank account, your email, and your whatever else—and you just have to express what you’re trying to put into the world. That’s not just an enterprise market, but it is hard. I don’t know how you get people to try doing that. I guess if ChatGPT can do it—people are trying ChatGPT.

Lex Fridman (03:17:49) I think it boils down to the scientific question of, “How hard is tool use to solve?” Because a lot of the stuff you’re implying, the remote work stuff, is tool use. It’s like computer use; how you have an LLM that goes out there, this agentic system, and does something in the world, and only screws up 1% of the time.

Nathan Lambert (03:18:11) Computer use is a good example of what labs care about and we haven’t seen a lot of progress on.

Nathan Lambert (03:18:12) We saw multiple demos in 2025 of, like, Claude can use your computer, or OpenAI had Operator, and they all suck. So they’re investing money in this, and I think that’ll be a good example. Whereas actually, taking over the whole screen seems a lot harder than having an API that they can call in the back end. Some of that is you have to then set up a different environment for them all to work in. They’re not working on your MacBook; they are individually interfacing with Google and Amazon and Slack, and they handle all these things in a very different way than humans do. So some of this might be structural blockers.

Sebastian Raschka (03:18:55) Also, specification-wise, I think the problem for arbitrary tasks is that you still have to specify what you want your LLM to do. What is the environment? How do you specify? You can say what the end goal is, but if it can’t solve the end goal—with LLMs, if you ask it for text, it can always clarify or do sub-steps. How do you put that information into a system that, let’s say, books a travel trip for you? You can say, “Well, you screwed up my credit card information,” but even to get it to that point, as a user, how do you guide the model before it can even attempt that? I think the interface is really hard.

Lex Fridman (03:19:36) Yeah, it has to learn a lot about you specifically. And this goes to continual learning—about the general mistakes that are made across the board, and then the mistakes that are made with you specifically.

Nathan Lambert (03:19:48) All the AI interfaces are getting set up to ask humans for input. I think Claude Code we talked about a lot. It asks for feedback and questions. If it doesn’t have enough specification on your plan or your desire, it starts to ask questions, “Would you rather?” We talked about Memory, which saves across chats. Its first implementation is kind of odd, where it’ll mention my dog’s name or something in a chat. I’m like, “You don’t need to be subtle about this. I don’t care.” But things are emerging, like ChatGPT has the Pulse feature.

Nathan Lambert (03:20:19) Which is like a curated couple paragraphs with links to something to look at or to talk about, and people talk about how the language models are going to ask you questions. It’s probably going to work. The language model knows you had a doctor appointment or something, and it’s like, “Hey, how are you feeling after that?” Which again, goes into the territory where humans are very susceptible to this and there’s a lot of social change to come. But also, they’re experimenting with having the models engage. Some people really like this Pulse feature, which processes your chats and automatically searches for information and puts it in the ChatGPT app. So there’s a lot of things coming.

Sebastian Raschka (03:20:58) I used that feature before, and I always feel bad because it does that every day, and I rarely check it out. How much compute is burned on something I don’t even look at, you know?

Nathan Lambert (03:21:11) There’s also a lot of idle compute in the world, so don’t feel too bad.

Lex Fridman (03:21:16) Okay. Do you think new ideas might be needed? Is it possible that the path to AGI—whatever that is, however we define that—to solving computer use more generally, to solving biology and chemistry and physics, sort of the Dario definition of AGI or powerful AI, requires totally new ideas? Non-LLM, non-RL ideas. What might they look like? We’re now going into philosophy land a little bit.

Nathan Lambert (03:21:50) For something like a singularity to happen, I would say yes. And the new ideas could be architectures or training algorithms, which are fundamental deep learning things. But they’re, in that nature, pretty hard to predict. But I think we’ll still get pretty far even without those advances. Like, we might get this software solution, but it might stop at software and not do computer use without more innovation. So I think that a lot of progress will be coming, but if you’re going to zoom out, there are still ideas in the next 30 years that are going to look like a major scientific innovation that enabled the next chapter of this. And I don’t know if it comes in one year or in 15 years.

Lex Fridman (03:22:32) Yeah. I wonder if the Bitter Lesson holds true for the next 100 years, and what that looks like.

Nathan Lambert (03:22:37) If scaling laws are fundamental in deep learning, I think the Bitter Lesson will always apply, which is compute will become more abundant. But even within abundant compute, the ones that have a steeper scaling law slope or a better offset—like, this is a 2D plot of performance and compute—even if there’s more compute available, the ones that get 100x out of it will win.
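As an editor’s illustration of the slope-versus-offset point, here is a minimal sketch of the kind of performance-versus-compute fit being described. The linear-in-log-compute form and the symbols below are assumptions made for illustration, not something stated in the episode.

```latex
% Illustrative only: two methods fit on the same performance-vs-compute plot.
P_A(C) = \alpha_A + \beta_A \log_{10} C, \qquad
P_B(C) = \alpha_B + \beta_B \log_{10} C
% With equal slopes (\beta_A = \beta_B = \beta), the method with the better
% offset reaches any performance target using less compute, by a constant factor:
\frac{C_B}{C_A} = 10^{(\alpha_A - \alpha_B) / \beta}
% A steeper slope means the gap widens as compute grows, which is why the
% method that extracts more from each unit of compute wins even when compute
% is abundant.
```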

Lex Fridman (03:23:01) It might be something like literally computer clusters orbiting Earth with solar panels.

Nathan Lambert (03:23:09) The problem with that is heat dissipation. You get all the radiation from the sun and you don’t have any air to carry the heat away. But there is a lot of space to put clusters and a lot of solar energy up there, and there probably is the engineering will to solve the heat problem— …so there could be.

Lex Fridman (03:23:27) Is it possible—and we should say that it definitely is possible—that we’re basically going to be plateauing this year? Not in terms of— …the system capabilities, but what the system capabilities actually mean for human civilization. So on the coding front, really nice websites will be built. Very nice autocomplete.

Lex Fridman (03:23:53) Very nice way to understand code bases and maybe help debug, but really just a very nice helper on the coding front. It can help research mathematicians do some math. It can help you with shopping. It’s a nice helper. It’s Clippy on steroids. What else? It may be a good education tool and all that kind of stuff, but computer use turns out extremely difficult to solve. So I’m trying to frame the cynical case in all these domains where there’s not a really huge economic impact, but realize how costly it is to train these systems at every level—both the pre-training and the inference, how costly the inference is, the reasoning, all of that. Is that possible? And how likely is that, do you think?

Nathan Lambert (03:24:47) When you look at the models, there are so many obvious things to improve, and it takes a long time to train these models and to do this art, that it’ll take us, with the ideas that we have, multiple years to actually saturate in terms of whatever benchmark or performance we are searching for. It might serve very narrow niches. The average ChatGPT user might not get a lot of benefit out of this, but it is going to serve different populations by getting better at different things.

Is the dream of AGI dying?

Lex Fridman (03:25:18) But I think what everybody’s chasing now is a general system that’s useful to everybody. So, okay, if that’s not… that can plateau, right?

Nathan Lambert (03:25:28) I think that dream is actually kind of dying. As you talked about with the specialized models where it’s like… and multimodal is often… like, video generation is a totally different thing.

Lex Fridman (03:25:39) “That dream is kind of dying” is a big statement, because I don’t know if it’s dying. If you ask the actual Frontier Lab people, they’re still chasing it, right?

Sebastian Raschka (03:25:48) I do think they are still rushing to get the next model out, which will be much better than the previous one. “Much” is a relative term, but it will be better than the previous one. I can’t see them slowing down. I just think the gains will be made or felt more through not only scaling the model, but now… I feel like there’s a lot of tech debt. It’s like, “Well, let’s just put the better model in there, and better model, better model.” And now people are like, “Okay, let’s also at the same time improve everything around it too.”

Sebastian Raschka (03:26:20) Like the engineering of the context and inference scaling. And the big labs will still keep doing that. And now also the smaller labs will catch up to that because now they are hiring more. There will be more people. With LLMs it’s kind of like a circle: they also make the people building them more productive, so it’s just like an amplifier. I think what we can expect is amplification, but not a paradigm change. I don’t think there will be a paradigm change, but everything will be just amplified and amplified and amplified, and I can see that continuing for a long time.

Nathan Lambert (03:26:52) Yeah. I guess my statement with the dream is dying depends on exactly what you think it’s going to be doing. Like Claude Code is a general model that can do a lot of things, but it depends a lot on integrations and other things. I bet Claude Code could do a fairly good job of doing your email, and the hardest part is figuring out how to give it information and how to get it to be able to send your emails and stuff like this. But I think it goes back to what is the “one model to rule everything” ethos, which is just like a thing in the cloud that handles your entire digital life and is way smarter than everybody.

Nathan Lambert (03:27:34) So it’s an interesting leap of faith to go from Claude Code becomes that—which, in some ways, there are some avenues for that—but I do think that the rhetoric of the industry is a little bit different.

Sebastian Raschka (03:27:49) I think the immediate thing we will feel next as a normal person using LLMs will probably be related to something trivial, like making figures. Right now LLMs are terrible at making figures. Is it because we are getting served the cheap models with less inference compute than behind the scenes? Maybe by cranking some knobs we can already get better figures, but if you ask it today to draw a flowchart of X, Y, Z, it’s most of the time terrible. And it is kind of a very simple task for a human. I think it’s almost easier sometimes to draw something than to write something.

Nathan Lambert (03:28:25) Yeah, the multimodal understanding does feel like something that is odd, that it’s not better solved.

Lex Fridman (03:28:31) I think there’s one actually obvious thing that we’re not saying, that we’re not realizing, a gigantic thing that’s hard to measure, which is making all of human knowledge accessible… …To the entire world. One of the things that I think is hard to articulate, but there’s just a huge difference between Google Search and an LLM. I feel like I can basically ask an LLM anything and get an answer, and it’s doing less and less hallucination.

Lex Fridman (03:29:04) And that means understanding my own life, figuring out a career trajectory, figuring out how to solve the problems all around me, learning about anything through human history. I feel like nobody’s really talking about that because they just immediately take it for granted that it’s awesome. That’s why everybody’s using it—it’s because you get answers for stuff, and think about the impact of that across time. This is not just in the United States; this is all across the world. Kids throughout the world being able to learn these ideas—the impact that has across time is probably where the real GDP growth will be. It won’t be like a leap.

Lex Fridman (03:29:51) It’ll be that that’s how we get to Mars, that’s how we build these things, that’s how we have a million new OpenAIs, all the kind of innovation that happens from there. And that’s just this quiet force that permeates everything, right? Human knowledge.

Sebastian Raschka (03:30:06) I do agree with you, and in a sense it makes knowledge more accessible, but it also depends on what the topic is. For something like math, you can ask it questions and it answers, but if you want to learn a topic from scratch—we talked about this earlier—I think the sweet spot is still math textbooks where someone laid it out linearly. That is a proven strategy to learn a topic, and it makes sense if you start from zero to get information-dense text to soak it up, but then you use the LLM to make infinite exercises.

Sebastian Raschka (03:30:47) If you have problems in a certain area or have questions about things you are uncertain about, you ask it to generate example problems, you solve them, and then maybe you need more background knowledge and you ask it to generate that. But it won’t give you anything that is not in the textbook. It’s just packaging it differently, if that makes sense.

Sebastian Raschka (03:31:13) But then there are things where it also adds value in a more timely sense, where there is no good alternative besides a human doing it on the fly. For example, if you’re planning to go to Disneyland and you try to figure out which tickets to buy for which park when, well, there is no textbook on that. There is no information-dense resource on that. There’s only the sparse internet, and then there is a lot of value in the LLM. You just ask it. You have the constraints on traveling on these specific days, you want to go to certain places, and you ask it to figure out what you need, when and from where… …What it costs and stuff like that. It is a very customized, on-the-fly package. Personalization is essentially like—

How will AI make money?

Sebastian Raschka (03:32:02) …pulling information from the sparse internet, the non-information-dense thing where there’s no better version that exists. You make it from scratch almost.

Lex Fridman (03:32:12) And if it does exist, it’s full of—speaking of Disney World—ad slop. Like any city in the world, if you ask “what are the top 10 things to do?” An LLM is just way better to ask… …Than anything on the internet.

Nathan Lambert (03:32:29) Well, for now, that’s because they’re massively subsidized, and eventually they’re going to be paid for by ads.

Lex Fridman (03:32:38) No. I’m hoping there’s a very clear indication of what’s an ad and what’s not an ad in that context, but—

Sebastian Raschka (03:32:46) That’s something I mentioned a few years ago. It’s like, I don’t know, if you are looking for a new running shoe, is it a coincidence that Nike maybe comes up first? Maybe, maybe not. I think there are clear laws around this. You have to be clear about that, but I think that’s what everyone fears—the subtle message in there or something like that. But also, this brings us to the topic of ads, which I think was something they hoped to launch in 2025, because I think they’re still not making money in that other way right now, so… …Like having actual ad spots in there. The thing, though, is they couldn’t, because there are alternatives without ads and people would just flock-

Sebastian Raschka (03:33:31) …to the other products. And it also is just crazy how they’re one-upping each other, spending so much money just to get the users.

Nathan Lambert (03:33:41) I think so. Like some Instagram ads—I don’t use Instagram- …but I understand the appeal of paying a platform to find users who will genuinely like your product. That is the best case of things like Instagram ads.

Nathan Lambert (03:33:56) But there are also plenty of cases where advertising is very awful for incentives. I think that a world where the power of AI can integrate with that positive view—like, I am a person and I have a small business and I want to make the best damn steak knives in the world and I want to sell them to somebody who needs them. And if AI can make that sort of advertising work even better, that’s very good for the world, especially with digital infrastructure because that’s how the modern web has been built. But that’s not to say that addicting feeds so that you can show people more content is a good thing. So, I think that’s what even OpenAI would say: they want to find a way to get the monetization upside of ads while still giving their users agency.

Nathan Lambert (03:34:45) And I personally would think that Google is probably going to be better at figuring out how to do this because they already have ad supply. If they figure out how to turn this demand in their Gemini app into useful ads, then they can turn it on. I don’t know if I think it’s this year, but there will be experiments with it.

Sebastian Raschka (03:35:06) I do think what holds companies back right now is really just that the competition is not doing it. It’s more like a reputation thing. I think people are just afraid right now of ruining their reputation or losing users- …because it would make headlines if someone launched these ads. But-

Nathan Lambert (03:35:23) Unless they were great, but the first ads won’t be great because it’s a hard problem that we don’t know how to solve.

Sebastian Raschka (03:35:28) Yeah, I think also the first version of that will likely be something like on X, like the timeline where you have a promoted post sometimes in between. It’ll be something like that where it will say “promoted” or something small, and then there will be an image or something. I think right now the problem is who makes the first move.

Nathan Lambert (03:35:43) If we go 10 years out, the proposition for ads is that you will make so much money on ads by having so many users- …that you can use this to fund better R&D and- …make better models, which is why- …like YouTube is dominating the market. Netflix is scared of YouTube. They make, I don’t know—I pay $28 a month for premium. They make at least $28 a month off of me and many other people, and they’re just creating such a dominant position in video. So I think that’s the proposition, which is that ads can give you a sustained advantage- …in what you’re spending per user. But there’s so much money in it right now that it’s like somebody starting that flywheel- is scary because it’s a long-term bet.

Big acquisitions in 2026

Lex Fridman (03:36:29) Do you think there’ll be some crazy big moves this year business-wise? Like Google or Apple acquiring Anthropic or something like this?

Nathan Lambert (03:36:40) Dario will never sell, but we are starting to see some types of consolidation, with Groq being valued at $20 billion and Scale AI at almost $30 billion. There are countless other deals structured in a way that is actually detrimental to the Silicon Valley ecosystem—these licensing deals where not everybody gets brought along, rather than a full acquisition that benefits the rank-and-file employees by getting their stock vested. That’s a big issue for Silicon Valley culture to address because the startup ecosystem is the lifeblood. If you join a startup, even if it’s not that successful, your startup very well might get acquired at a cheap premium and you’ll get paid out for your equity.

Nathan Lambert (03:37:24) And these licensing deals are essentially taking the top talent a lot of the time. I think the deal for Groq to NVIDIA is rumored to be better for the employees, but it is still this antitrust-avoiding thing. I think that this trend of consolidation will continue. Me and many smart people I respect have been expecting consolidation to have happened sooner, but it seems like things are starting to turn. But at the same time, you have companies raising ridiculous amounts of money for reasons that I don’t understand. I’m like, “I don’t know why you’re taking that money.” So it’s maybe mixed this year, but some consolidation pressure is starting.

Lex Fridman (03:38:04) What kind of surprising consolidation do you think we’ll see? You say Anthropic is a “never.” I mean, Groq is a big one—Groq with a Q, by the way.

Nathan Lambert (03:38:12) Yeah. There’s just a lot of startups and there’s a very high premium on AI startups. So there could be a lot of $10 billion range acquisitions, which is a really big acquisition for a startup that was maybe founded a year ago. Look at Manus.ai—this company based in Singapore was founded eight months ago and then had a $2 billion exit. I think there will be some other big multi-billion dollar acquisitions, like Perplexity.

Lex Fridman (03:38:39) Like Perplexity, right?

Nathan Lambert (03:38:40) Yeah, there are rumors linking them to Apple. I think there’s a lot of pressure and liquidity in AI. There’s pressure on big companies to have outcomes, and I would guess that a big acquisition gives people leeway to then tell the next chapter of that story.

Lex Fridman (03:38:56) I mean, yeah, we’ve been talking about code. Maybe somebody acquires Cursor.

Nathan Lambert (03:39:02) They’re in such a good position because they have so much user data. And we talked about continual learning and stuff; they had one of the most interesting blog posts. They mentioned that their new Composer model was a fine-tune of one of these large Mixture of Experts models from China. You can know that from gossip or because the model sometimes responds in Chinese, which none of the American models do. They had a blog post where they said, “We’re updating the model weights every 90 minutes based on real-world feedback from people using it.” Which is the closest thing to real-world RL happening on a model, and it was just right there in one of their blog posts.
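Cursor hasn’t published how that pipeline works, so the following is only a schematic sketch, in Python, of what a “refresh the weights every 90 minutes from real-world feedback” loop could look like; every function and constant here is a made-up stand-in, not Cursor’s actual system.

```python
# Hypothetical sketch of a periodic "update weights from live feedback" loop.
# All components are stubs standing in for the real (unpublished) system.
import time
from dataclasses import dataclass

REFRESH_SECONDS = 90 * 60  # the "every 90 minutes" cadence mentioned above


@dataclass
class FeedbackExample:
    prompt: str
    completion: str
    accepted: bool  # e.g. did the user keep the suggested edit?


def collect_recent_feedback() -> list:
    """Stub: pull accept/reject signals logged since the last refresh."""
    return [FeedbackExample("def add(a, b):", "    return a + b", accepted=True)]


def fine_tune(weights: dict, batch: list) -> dict:
    """Stub: one short update pass; a real system might use rejection sampling,
    DPO, or a policy-gradient step here."""
    new_weights = dict(weights)
    new_weights["version"] = new_weights.get("version", 0) + 1
    new_weights["accept_rate"] = sum(ex.accepted for ex in batch) / max(len(batch), 1)
    return new_weights


def deploy(weights: dict) -> None:
    """Stub: swap the serving fleet over to the new checkpoint."""
    print(f"deployed checkpoint v{weights['version']} (accept rate {weights['accept_rate']:.2f})")


def refresh_loop(weights: dict, iterations: int = 2) -> None:
    for _ in range(iterations):
        batch = collect_recent_feedback()
        weights = fine_tune(weights, batch)
        deploy(weights)
        time.sleep(0)  # a production loop would sleep REFRESH_SECONDS here


refresh_loop({"version": 0})
```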

Lex Fridman (03:39:36) That’s incredible.

Nathan Lambert (03:39:36) —which is super cool.

Lex Fridman (03:39:38) And by the way, I should say I use Composer a lot because one of the benefits it has is it’s fast.

Nathan Lambert (03:39:43) I need to try it because everybody says this.

Lex Fridman (03:39:45) And there’ll be some IPOs potentially. You think Anthropic, OpenAI, xAI?

Nathan Lambert (03:39:51) They can all raise so much money so easily that they don’t feel a need to… So long as fundraising is easy, they’re not going to IPO because public markets apply pressure.

Nathan Lambert (03:40:00) I think we’re seeing in China that the ecosystem’s a little different, with both MiniMax and Z.ai filing IPO paperwork, which will be interesting to see how the Chinese market reacts. I actually would guess that it’s going to be similarly hypey to the US so long as all this is going on, and not based on the realities that they’re both losing a ton of money. I wish more of the American gigantic AI startups were public because it would be very interesting to see how they’re spending their money and have more insight. And also just to give people access to investing in these, because I think that they’re the companies of the era. And the tradition is now for so many of the big startups in the US to not go public.

Nathan Lambert (03:40:43) It’s like we’re still waiting for Stripe and their IPO, but Databricks definitely didn’t; they raised like a Series G or something. And I just feel like it’s kind of a weird equilibrium for the market where I would like to see these companies go public and evolve in the way that a company can.

Lex Fridman (03:41:01) You think 10 years from now some of the frontier model companies are still around? Anthropic, OpenAI?

Nathan Lambert (03:41:08) I definitely don’t see it being a winner-takes-all unless there truly is some algorithmic secret that one of them finds that lets this flywheel take off. Because the development path is so similar for all of them. Google and OpenAI have all the same products, and Anthropic’s more focused, but when you talk to people, it sounds like they’re solving a lot of the same problems. So I think… there are offerings that’ll spread out. It’s a very big cake that’s being made that people are going to take money out of.

Lex Fridman (03:41:36) I don’t want to trivialize it, but OpenAI and Anthropic are primarily LLM service— —providers. And some of the other companies like Google and xAI, linked to X, do other stuff— —too. And so it’s very possible if AI becomes more commodified that the companies that are just providing LLMs will die.

Sebastian Raschka (03:42:00) I think the advantage they have is they have a lot of users, and I think they will just pivot. Like Anthropic, I think, pivoted. I don’t think they originally planned to work on code, but it happened that they found, “Okay, this is a nice niche and now we are comfortable in this niche and we push on this niche.” And I can see the same thing once… Let’s say hypothetically speaking, I’m not sure if it will be true, but let’s say Google takes all the market share of the general chatbot. Maybe OpenAI will then be focused on some other sub-topic— —like… They have too many users to go away in the foreseeable future, I think.

Lex Fridman (03:42:37) I think Google is always ready to say, “Hold my beer,” with AI mode.

Nathan Lambert (03:42:40) I think the question is if the companies can support the valuations. I’d see the AI companies being looked at in some ways like AWS, Azure, and GCP, which are all competing in the same space and all very successful businesses. There’s a chance that the API market is so unprofitable that they go up and down the stack to products and hardware. They have so much cash that they can build power plants and build data centers, which is a durable advantage now. But there’s also just a reasonable outcome that these APIs are so valuable and so flexible for developers that they become the likes of something like AWS. But AWS and Azure are also going to have these APIs, so five or six people competing in the API market is hard. So maybe that’s why they get squeezed out.

Lex Fridman (03:43:27) You mentioned “RIP LLaMA.” Is there a path to winning for Meta?

Nathan Lambert (03:43:32) I think nobody knows. They’re moving a lot, so they’re signing licensing deals with Black Forest Labs, which is image generation, or Midjourney. So I think in some ways, on the product and consumer-facing AI front, it’s too early to tell. I think they have some people that are excellent and very motivated being close to Zuckerberg. So I think that there’s still a story to unfold there. Llama is a bit different, where Llama was the most focused expression of the organization. And I don’t see Llama being supported to that extent anymore. I think it was a very successful brand for them, so they still might participate in parts of the open ecosystem or continue the Llama brand into a different service, because people know what Llama is.

Lex Fridman (03:44:21) You think there’s a Llama 5?

Nathan Lambert (03:44:24) Not an open weight one.

Sebastian Raschka (03:44:26) It’s interesting. Just to recap a bit, I mean, Llama was the pioneering open-weight model—Llama 1, 2, 3, a lot of love. But I think then what happened, just hypothesizing or speculating, is that the leaders at Meta, like the upper executives, got very excited about Llama because they saw how popular it was in the community. And then I think the problem was trying to use the open source to make a bigger splash. It felt forced, like developing these very big Llama 4 models just to be on the top of the benchmarks.

Sebastian Raschka (03:45:09) But I don’t think the goal of Llama models is to be on top of the benchmarks beating, let’s say, ChatGPT or other models. I think the goal was to have a model that people can use, trust, modify, and understand. That includes having smaller models; they don’t have to be the best models. And what happened was that the benchmarks suggested these models were better than they were, because I think they had specific variants trained on preferences so that they performed well on the benchmarks. That’s kind of this overfitting thing to force it to be the best. But then at the same time, they didn’t do the small models that people could use, and no one could run these big models.

Sebastian Raschka (03:45:45) And then there was kind of a weird thing. I think it’s just because people got too excited about headlines pushing the frontier. I think that’s it.

Lex Fridman (03:45:54) And too much on the benchmark-chasing side.

Sebastian Raschka (03:45:56) It’s too much work.

Nathan Lambert (03:45:57) I think it imploded under internal political fighting and misaligned incentives. The researchers want to build the best models, but there’s a layer of organization— …and management that is trying to demonstrate that they do these things. There are a lot of pieces and rumors where some horrible technical decision was made, and it just seems like it got too bad where it all just crashed out.

Lex Fridman (03:46:24) Yeah, but we should also give huge props to Mark Zuckerberg. I think it comes from Mark actually, from the top of the leadership, saying open source is important. The fact that that exists means there could be a Llama 5, where they learn the lessons from the benchmark-chasing and say, “We’re going to be like gpt-oss—” “…and provide a really awesome library of open source.”

Nathan Lambert (03:46:51) What people say is that there’s a debate between Mark and Alexandr Wang, who is very bright but much more against open source. To the extent that he has a lot of influence over the AI org, it seems much less likely, because Mark brought him in for fresh leadership in directing AI. And if being open or closed is no longer the defining nature of the model, I don’t expect that to be a defining argument between Mark and Alex. They’re both very bright, but I have a hard time understanding all of it because Mark wrote this piece in July of 2024, which was probably the best blog post at the time, making the case for open source AI. And then July 2025 came around and it was like, “We’re reevaluating our relationship with open source.” So it’s just kind of…

Sebastian Raschka (03:47:42) But I think also the problem—well, we may have been a bit too harsh, and that caused some of that. I mean, we as open source developers or the open source community. Because even though the model was maybe not what everyone hoped for, it got a lot of backlash. I think that was a bit unfortunate because as a company, they were hoping for positive headlines. Instead of just getting no headlines or positive headlines, they got negative headlines. And then it kind of reflected badly on the company. It’s maybe a spite reaction, almost like, “Okay, we tried to do something nice, we tried to give you something cool like an open source model, and now you are being negative about us.” So in that sense, it looks like, “Well, maybe then we’ll change our mind.” I don’t know.

Lex Fridman (03:48:38) Yeah, that’s where the dynamics of discourse on— …X can lead us as a community astray. Because sometimes it feels random; people pick the things they like and don’t like. I mean, you can see the same thing with Grok 4.1 and Grok Code Fast 1. I don’t think, vibe-wise, people love it publicly. But a lot of people use it. So if you look at Reddit and X, the programming community doesn’t really give it praise— … but, like, they use it. And the same thing with probably Llama. I don’t understand the dynamics of either positive hype or negative hype. I don’t understand it.

Nathan Lambert (03:49:25) I mean, one of the stories of 2025 is filling the gap left by Llama, which is all the rise of these Chinese open-weight models- … to the point where I was like, “That was the single issue I’ve spent a lot of energy on in the last five months,” which is trying to do policy work- … to get the US to invest in this.

Lex Fridman (03:49:41) So just tell me the story of ATOM.

Nathan Lambert (03:49:43) The ATOM Project is… It started as me calling it the American DeepSeek Project, which doesn’t really work for DC audiences, but it’s the story of what is the most impactful thing I can do with my career. These Chinese open-weight models are cultivating a lot of power and there is a lot of demand for building on open models, especially in enterprises in the US that are very cagey about Chinese models.

Lex Fridman (03:50:06) Looking at Perplexity, The ATOM Project—American Truly Open Models—is a US-based initiative to build and host high-quality, genuinely open-weight AI models and supporting infrastructure explicitly aimed at competing with and catching up to China’s rapidly advancing open-source AI ecosystem.

Nathan Lambert (03:50:25) I think the one-sentence summary would be that—or two sentences. One is a proposition that open models are going to be an engine for AI research because that is what people start with; therefore, it’s important to own them. And the second one is, therefore, the US should be building the best models so that the best research happens in the US and those US companies take the value from being the home of where AI research is happening. Without more investment in open models, we have all the plots on the website where it’s like, “Qwen, Qwen, Qwen, Qwen,” and it’s all these models that are excellent from these Chinese companies that are cultivating influence in the US and internationally.

Nathan Lambert (03:51:07) And the US is spending way more on AI. The ability to create open models that are half a generation or a generation behind the cutting edge of the closed labs costs roughly $100 million, which is a lot of money, but not compared to what these companies have. Therefore, we need a centralizing force of people who want to do this. I think we got signed engagement from people pretty much across the full stack, including policy.

Lex Fridman (03:51:33) So there has been support from the administration?

Nathan Lambert (03:51:36) I don’t think anyone technically in government has signed it publicly, but I know that people that have worked in AI policy, both in the Biden and Trump administrations, are very supportive of trying to promote open-source models in the US. I think, for example, AI2 got a grant from the NSF for $100 million over four years, which is the biggest CS grant the NSF has ever awarded, for AI2 to attempt this, and I think it’s a starting point. But the best results happen when there are multiple organizations building models because they can cross-pollinate ideas and build this ecosystem. It doesn’t work if it’s just Llama releasing models to the world, because Llama could go away. The same thing applies for AI2; I can’t be the only one building models.

Nathan Lambert (03:52:24) It becomes a lot of time spent on talking to people, whether they’re in policy… I know NVIDIA is very excited about this. I think Jensen Huang has been specifically talking about the urgency for this, and they’ve done a lot more in 2025, where the Nemotron 3 models are more of a focus. They’ve started releasing some data along with NVIDIA’s open models and very few companies do this, especially of NVIDIA’s size. So there are signs of progress. We hear about Reflection AI where they say their two-billion-dollar fundraise is dedicated to building US open models, and their announcement tweet reads like a cultural tide starting to turn.

Nathan Lambert (03:53:09) I think July was when we had four or five DeepSeek-caliber Chinese open-weight models and zero from the US. That’s the moment where I released this and was like, “I guess I have to spend energy on this because nobody else is gonna do it.” So it takes a lot of people contributing together. I’m not saying the ATOM Project is the only thing moving the ecosystem, but it’s people like me doing this sort of thing to get the word out.

Manhattan Project for AI

Sebastian Raschka (03:53:35) Do you like the 2025 America’s AI Action Plan? That includes open source stuff. The White House AI Action Plan includes a dedicated section titled “Encourage Open-Source and Open-Weight AI,” defining such models and arguing they have unique value for innovation and startups.

Nathan Lambert (03:53:52) Yeah. I mean, the AI Action Plan is just a plan, but I think it’s maybe the most coherent policy document that has come out of the administration, and I hope that it largely succeeds. I know people that have worked on it. The challenge is taking policy and making it real, and I have no idea how to do this as an AI researcher, but largely a lot of things in that were very real. There’s a huge build-out of AI in the country, and while there are issues people hear about, from water use to whatever, we should be able to build things in this country without ruining places in the process. It’s worthwhile to spend energy on.

Nathan Lambert (03:54:35) I think that’s a role for the federal government. They set the agenda. And setting the agenda so that open-weight models should be a first consideration is a large part of what they can do to get people thinking about it.

Sebastian Raschka (03:54:49) Also, for education and talent, it’s very important. Otherwise, if there are only closed models, how do you get the next generation of people contributing? You would only be able to learn after you joined a company, but at that point, how do you identify and hire talented people? I think open source is essential for educating the population and training the next generation of researchers. It’s the only way.

Nathan Lambert (03:55:24) The way that I could’ve gotten this to go more viral was to tell a story of Chinese AI integrating with an authoritarian state, becoming ASI and taking over the world, and therefore we need our own American models. But it’s very intentional why I talk about innovation and science in the US, because I think it’s both more realistic as an outcome and it’s a world that I would like to manifest.

Sebastian Raschka (03:55:47) I would say, though, that any open-weight model is a valuable model.

Nathan Lambert (03:55:55) Yeah. And my argument is that we should be in a leading position. But I think it’s worth saying it so simply because there are still voices in the AI ecosystem that say we should consider banning the release of open models due to the safety risks. And I think it’s worth adding that, effectively, that’s impossible without the US having its own Great Firewall, which is known to not work that well. The cost for training these models, whether it’s one to a hundred million dollars, is attainable for a huge number of people in the world that want to have influence, so these models will be getting trained all over the world. We want this information and these tools to flow freely across the world and into the US so that people can use them and learn from them.

Nathan Lambert (03:56:47) Stopping that would be such a restructuring of our internet that it seems impossible.

Sebastian Raschka (03:56:51) Do you think maybe the big open-weight models from China are actually a good thing for US companies? You mentioned earlier they are usually one generation behind in terms of what they release open source. For example, gpt-oss-120b might not be the cutting-edge model, or Gemma 3 might not be, because they want to ensure it is safe. But when these companies see that DeepSeek-V3.2 is really awesome and is being used with no backlash or security risk, that could encourage them to release better models. Maybe that is a very positive thing.

Nathan Lambert (03:57:30) A hundred percent. These Chinese companies have set things into motion that I think would potentially not have happened if they were not all releasing models. I’m almost sure that those discussions have been had by leadership.

Sebastian Raschka (03:57:45) Is there a possible future where the dominant AI models in the world are all open source?

Nathan Lambert (03:57:50) Depends on the trajectory of progress that you predict. If you think saturation in progress is coming within a few years, essentially within the time where financial support is still very good, then open models will be so optimized and so much cheaper to run that they’ll win out. Essentially, this goes back to open source ideas where so many more people will be putting money into optimizing the serving of these open-weight common architectures that they will become standards. Then you could have chips dedicated to them and it’ll be way cheaper than the offerings from these closed companies that are custom.

Sebastian Raschka (03:58:25) We should say that the AI2027 report predicts—one of the things it does from a narrative perspective is that there will be a lot of centralization. As the AI system gets smarter and smarter, the national security concerns will come to be, and you’ll centralize the labs, and you’ll become super secretive, and there’ll be this whole race.

Lex Fridman (03:58:45) …from a military perspective of how you… between China and the United States. And so all of these fun conversations we’re having about LLMs—all the generals and soldiers will come into the room and be like, “All right, we’re now in the Manhattan Project stage of this whole thing.”

Sebastian Raschka (03:59:02) I think 2025, ’26, ’27—I don’t think something like that is even remotely possible. I mean, you can make the same argument for computers, right? You can say, “Okay, computers are capable and we don’t want the general public to get them.” Or chips—even AI chips—but you see how Huawei makes chips now. It took a few years, but… and I don’t think there is a way you can contain knowledge like that. I think in this day and age, it is impossible, like the internet. I don’t think this is a possibility.

Nathan Lambert (03:59:37) On the Manhattan Project thing, one of my funny things looking at them is I think that a Manhattan Project-like thing for open models would actually be pretty reasonable, because it wouldn’t cost that much. But I think that that will come. It seems like culturally, the companies are changing. But I agree with Sebastian on all of the stuff that he just said. It’s just like, I don’t see it happening nor being helpful.

Lex Fridman (03:59:58) Yeah. I mean, the motivating force behind the Manhattan Project was that there was civilizational risk. It’s harder to motivate that for open-source models.

Nathan Lambert (04:00:08) There’s not civilizational risk.

Future of NVIDIA, GPUs, and AI compute clusters

Lex Fridman (04:00:10) On the hardware side, we mentioned NVIDIA a bunch of times. Do you think Jensen and NVIDIA are going to keep winning?

Sebastian Raschka (04:00:18) I think they have the downside that they have to iterate a lot and manufacture a lot. And what they’re doing—they do innovate, but I think there’s always the chance that there is someone who does something fundamentally different, who gets very lucky and then does something. But the problem is, I think, adoption. You know, the moat of NVIDIA is probably not just the GPU; it’s more like the CUDA ecosystem, and that has evolved over two decades. I mean, even back when I was a grad student, I was in a lab doing biophysical simulations, molecular dynamics, and we had a Tesla GPU back then just for the computations. It was fifteen years ago now.

Sebastian Raschka (04:01:01) They built this up for a long time and that’s the moat, I think. It’s not the chip itself. Although they have the money now to iterate, build, and scale, it’s really on the compatibility. If you’re at that scale as a company, why would you go with something risky where it’s only— … a few chips that they can make per year? You go with the big one. But then I do think with LLMs now, it will be easier to design something like CUDA. It took 15 years because it was hard, but now that we have LLMs, we can maybe replicate CUDA.

Lex Fridman (04:01:35) And I wonder if there will be a separation of the training and the inference- … compute, as things stabilize a bit and more and more compute is needed for inference.

Nathan Lambert (04:01:47) That’s supposed to be the point of the Groq acquisition. And that’s why part of what Vera Rubin is—

Nathan Lambert (04:01:52) … where they have a new chip with no high-bandwidth memory, or very little, which is one of the most expensive pieces. It’s designed for pre-fill, which is the part of inference where you essentially do a lot of matrix multiplications, and then you only need the memory when you’re doing this autoregressive generation and you have the KV cache swaps. So they have this new GPU that’s designed for that specific use case, and then the cost of ownership per flop is actually way lower. But I think that NVIDIA’s fate lies in the diffusion of AI still. Their biggest clients are still these hyperscale companies, whether it’s Google—which obviously can make TPUs—Amazon making Trainium, or Microsoft trying to do its own things.
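As a rough, editor-added illustration of why a prefill-only chip can skip expensive high-bandwidth memory: the back-of-the-envelope Python below contrasts prefill (compute-bound) with autoregressive decode (memory-bandwidth-bound). The model size, prompt length, and hardware numbers are made up for the sketch; none of them come from the conversation or from NVIDIA.

```python
# Hypothetical numbers chosen only to show the shape of the trade-off.
PARAMS = 70e9            # dense parameter count
BYTES_PER_PARAM = 2      # fp16/bf16 weights
PROMPT_TOKENS = 4096     # tokens processed in one prefill pass
PEAK_FLOPS = 1e15        # accelerator peak, FLOP/s
MEM_BW = 3e12            # memory bandwidth, bytes/s

# Prefill: the whole prompt goes through the model in one batched pass,
# so the weights are read once and reused across thousands of tokens.
prefill_compute_s = (2 * PARAMS * PROMPT_TOKENS) / PEAK_FLOPS   # ~2N FLOPs per token
weight_read_s = (PARAMS * BYTES_PER_PARAM) / MEM_BW

# Decode: each new token needs another full pass over the weights (plus the
# growing KV cache), but only does ~2N FLOPs of useful work for that one token.
decode_compute_s = (2 * PARAMS) / PEAK_FLOPS

print(f"prefill: {prefill_compute_s:.3f}s of math vs {weight_read_s:.3f}s of weight reads -> compute-bound")
print(f"decode:  {decode_compute_s*1e3:.2f}ms of math vs {weight_read_s*1e3:.1f}ms of weight reads per token -> memory-bound")
# So a prefill chip can trade expensive high-bandwidth memory for raw FLOPs,
# while decode hardware still wants fast memory for the weights and KV cache.
```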

Nathan Lambert (04:02:36) As long as the pace of AI progress is high, NVIDIA’s platform is the most flexible and people will want that. But if there’s stagnation, then with creating bespoke chips, there’s more time to do it.

Lex Fridman (04:02:50) It’s interesting that NVIDIA is quite active in trying to develop all kinds of different products.

Nathan Lambert (04:02:55) They try to create areas of commercial value that will use a lot of GPUs.

Lex Fridman (04:03:01) But they keep innovating and they’re doing a lot of incredible research, so…

Nathan Lambert (04:03:06) Everyone says the company’s super oriented around Jensen and how operationally plugged in he is. It sounds so unlike many other big companies that I’ve heard about. And so long as that’s the culture, I think that you can expect that to keep progress happening. It’s like he’s still in the Steve Jobs era of Apple. So long as that is how it operates, I’m pretty optimistic for their situation because it is their top-order problem, and I don’t know if making these chips for the whole ecosystem is the top goal of all these other companies. They’ll do a good job, but it might not be as good of a job.

Lex Fridman (04:03:43) Since you mentioned Jensen, I’ve been reading a lot about history and about singular figures in history. What do you guys think about the great man view of history? How important are individuals for steering the direction of history in the tech sector? So, you know, what’s NVIDIA without Jensen? You mentioned Steve Jobs. What’s Apple without Steve Jobs? What’s xAI without Elon or DeepMind without Demis?

Nathan Lambert (04:04:11) People make things earlier and faster, whereas scientifically, many great scientists credit being in the right place at the right time. Eventually someone else will still have the idea. So I think that in that way, Jensen is helping manifest this GPU revolution much faster and much more focused than it would be without having a person like him there. This is making the whole AI build-out faster. But I do still think that eventually something like ChatGPT would have happened and a build-out like this would have happened, but it probably would not have been as fast. I think that’s the sort of flavor that is applied.

Sebastian Raschka (04:04:55) These individual people are placing bets on something. Some get lucky, some don’t. But if you don’t have these people at the helm, it would be more diffused. It’s almost like investing in an ETF versus individual stocks. Individual stocks might go up or down more heavily than an ETF, which is more balanced. We’ll eventually get there, but I just think the focus is the thing. Passion and focus.

Lex Fridman (04:05:19) Isn’t there a real case to be made that without Jensen, there’s not a reinvigoration of the deep learning revolution?

Nathan Lambert (04:05:26) It could’ve been 20 years later, is the thing I would say.

Nathan Lambert (04:05:30) Or another deep learning winter could have come… …If GPUs weren’t around.

Lex Fridman (04:05:35) That could change history completely because you could think of all the other technologies that could’ve come in the meantime, and the focus of human civilization would get… Silicon Valley would be captured by different hype.

Sebastian Raschka (04:05:48) But I do think there’s certainly an aspect where the GPU trajectory was all planned. But on the other end, it’s also a lot of lucky coincidences or good intuition. Like the investment into, let’s say, biophysical simulations. I mean, I think it started with video games and then it just happened to be good at linear algebra because video games require a lot of linear algebra. And then you have the biophysical simulations. But still, I don’t think the master plan was AI. I think it just happened to be Alex Krizhevsky. So someone took these GPUs and said, “Hey, let’s try to train a neural network on that.” It happened to work really well and… …I think it only happened because you could purchase those GPUs.

Nathan Lambert (04:06:30) Gaming would’ve created a demand for faster processors if… …NVIDIA had gone out of business in the early days. That’s what I would think. I think GPUs would still exist… …At the time of AlexNet and at the time of the Transformer. It was just hard to know if it would be one company as successful or multiple smaller companies with worse chips. But I don’t think that’s a 100-year delay. It might be a decade delay.

Lex Fridman (04:07:01) Well, it could be a one, two, three, four, five-decade delay. I mean, I just can’t see Intel or AMD doing what NVIDIA did.

Nathan Lambert (04:07:08) I don’t think it would be a company that exists.

Sebastian Raschka (04:07:11) A new company.

Nathan Lambert (04:07:11) I think it would be a different company that would rise.

Sebastian Raschka (04:07:13) Like Silicon Graphics or something.

Nathan Lambert (04:07:15) So yeah, some company that has died would have done it.

Lex Fridman (04:07:19) But looking at it, it seems like these singular figures, these leaders, have a huge impact on the trajectory of the world. Obviously, incredible teams are behind them. But, you know, having that kind of very singular, almost dogmatic focus- …is necessary to make progress.

Sebastian Raschka (04:07:40) Yeah, I mean, even with GPT, it wouldn’t exist if there wasn’t a person, Ilya, who pushed for this scaling, right?

Nathan Lambert (04:07:47) Yeah, Dario was also deeply involved in that. If you read some of the histories from OpenAI, it almost seems wild thinking about how early these people were like, “We need to hook up 10,000 GPUs and take all of OpenAI’s compute and train one model.” There were a lot of people there that didn’t want to do that.

Future of human civilization

Lex Fridman (04:08:02) Which is an insane thing to believe—to believe scaling before scaling has any indication that it’s going to materialize. Again, singular figures. Speaking of which, 100 years from now, this is presumably post-singularity, whatever the singularity is. When historians look back at our time now, what technological breakthroughs would they really emphasize as the breakthroughs that led to the singularity? So far we have Turing to today, which is 80 years.

Sebastian Raschka (04:08:36) I think it would still be computing, like the umbrella term “computing.” I don’t necessarily think that even 100 or 200 years from now it would be AI. It could still well be computers, you know? We are now taking better advantage of computers, but it’s the fact of computing.

Lex Fridman (04:08:53) It’s basically a Moore’s Law kind of discussion. Even the details of CUDA and GPUs won’t even be remembered, and there won’t be all this software turmoil. It’ll be just, obviously, compute.

Nathan Lambert (04:09:07) I generally agree, but is it the connectivity of the internet and compute able to be merged? Or is it both of them?

Sebastian Raschka (04:09:17) I think the internet will probably be related to communication—it could be a phone, internet, or a satellite. And compute is more like the scaling aspect of it.

Lex Fridman (04:09:29) It’s possible that the internet is completely forgotten. That the internet is wrapped into the phone networks, like communication networks. This is just another manifestation of that, and the real breakthrough comes from just the increased compute—Moore’s Law, broadly defined.

Nathan Lambert (04:09:46) Well, I think the connection of people is very fundamental to it. You want to find the best person in the world for something, they are somewhere in the world. Being able to have that flow of information—AIs will also rely on this. I’ve been fixating on when I said the dream was dead about the one central model; the thing that is evolving is that people have many agents for different tasks. People already started doing this with different Clouds for different tasks. It’s described as many AGIs in the data center where each one manages and they talk to each other. That is so reliant on networking and the free flow of information on top of compute. But networking, especially with GPUs, is such a part of the scaling of compute. The GPUs and the data centers need to talk to each other.

Lex Fridman (04:10:36) Do you think there’s something very specific and singular to the fact that it’s neural networks that’s seen as a breakthrough? Like a genius move where you’re basically replicating, in a very crude way, the structure of the human brain, the human mind?

Sebastian Raschka (04:10:54) I think without the human mind, we probably wouldn’t have neural networks because it was an inspiration for them. But on the other end, I think it’s just so different. I mean, it’s digital versus biological, so I think it will probably be more grouped as an algorithm.

Lex Fridman (04:11:11) That’s massively parallelizable— —on this particular kind of compute?

Sebastian Raschka (04:11:15) It could have well been genetic computing, like genetic algorithms, just parallelized. It just happens that this is more efficient and works better.

Lex Fridman (04:11:23) And it very well could be that the neural networks, the way we architect them now, are just a small component of the system that leads to the singularity.

Nathan Lambert (04:11:33) I think if you think of it over 100 years, society can be changed more with more compute and intelligence because of autonomy. But looking at this, what are the things from the Industrial Revolution that we remember? We remember the engine—it is probably the equivalent of the computer in this. But there’s a lot of other physical transformations that people are aware of, like the cotton gin and all these machines that are still known—air conditioning, refrigerators— Some of these things from AI will still be known; the word “transformer” could still very well be known. I would guess that deep learning is definitely still known, but the transformer might be evolved away from in 100 years with AI researchers everywhere. But I think deep learning is likely to be a term that is remembered.

Lex Fridman (04:12:28) And I wonder what the air conditioning and the refrigeration of the future is that AI brings. If we travel forward 100 years from now, what do you think is different? How does the world look? First of all, do you think there’s humans? Do you think there’s robots everywhere walking around?

Sebastian Raschka (04:12:46) I do think there will be specialized robots for certain tasks.

Sebastian Raschka (04:12:50) Maybe half-humanoid. We’ll see. I think for certain things, yes, there will be humanoid robots because it’s just amenable to the environment. But for certain tasks, it might not make sense. What’s harder to imagine is how we interact with devices and what humans do with them. I’m pretty sure it will not be the cellphone or the laptop. Will it be implants?

Lex Fridman (04:13:16) I mean, it has to be brain-computer interfaces, right? I mean, 100 years from now, it has to—given the progress we’re seeing now— —there has to be, unless there’s legitimately a complete alteration of how we interact with reality.

Sebastian Raschka (04:13:33) On the other hand, if you think of cars, cars are older than 100 years, right? And it’s still the same interface. We haven’t replaced cars with something else; we just made them better. But it’s still a steering wheel, it’s still wheels.

Nathan Lambert (04:13:45) I think we’ll still carry around a physical brick of compute— —because people want some ability to have a private interface. You might not engage with it as much as a phone, but having something where you could have private information that is yours as an interface between you and the rest of the internet is something I think will still exist. It might not look like an iPhone, and it might be used a lot less, but I still expect people to carry things around.

Lex Fridman (04:14:08) Why do you think the smartphone is the embodiment of privacy? There’s a camera on it. There’s a-

Nathan Lambert (04:14:15) Private for you, like encrypted messages, encrypted photos; you know what your life is. I guess this is a question of how optimistic you are on brain-machine interfaces. Is all that just going to be stored in the cloud, like your whole calendar? It’s hard to think about processing all the information that we can process visually through brain-machine interfaces presenting something like a calendar to you. It’s hard to just think about knowing your email inbox without looking. Like you signal to a computer and then you just know your email inbox. Is that something that the human brain can handle being piped into it non-visually? I don’t know exactly how those transformations happen. ‘Cause humans aren’t changing in 100 years.

Nathan Lambert (04:15:05) I think agency and community are things that people actually want.

Lex Fridman (04:15:09) A local community, yeah.

Nathan Lambert (04:15:10) So, like, people you are close to, being able to do things with them and being able to ascribe meaning to your life. I don’t think that human biology is changing away from those on a timescale that we can discuss. UBI does not solve agency. I do expect mass wealth, and I hope that it has spread so that the average life does look very different in 100 years. But that’s still a lot to happen in 100 years. If you think about countries that are early in their development process, to build all the infrastructure and have policy that shares one nation’s wealth with another is… I think it’s an optimistic view to see all that happening in 100 years- …while they are still independent entities and not just absorbed into some international order by force.

Lex Fridman (04:16:13) But there could be just better, more elaborate, more effective- …social support systems that help alleviate some levels of basic suffering from the world. With the transformation of society where a lot of jobs are lost in the short term, I think we have to really remember that each individual job that’s lost is a human being who’s suffering. When jobs are lost at scale, it is a real tragedy. You can make all kinds of arguments about economics or say it’s all going to be okay and good for the GDP because new jobs will be created, but fundamentally at the individual level for that human being, that’s real suffering. That’s a real personal tragedy.

Lex Fridman (04:16:58) And we have to not forget that as the technologies are being developed. Also, my hope for all the AI slop we’re seeing is that there will be a greater and greater premium for the fundamental aspects of the human experience that are in-person. The things that we all enjoy, like seeing each other and talking together in-person.

Nathan Lambert (04:17:22) The next few years are definitely going to see an increased value on physical goods and events- …and even more pressure from slop. The slop is only starting. The next few years will be more and more diverse-

Lex Fridman (04:17:37) Do you think we’ll all be drow-

Nathan Lambert (04:17:37) …versions of slop.

Lex Fridman (04:17:38) They would be drowning in slop. Is that what-

Nathan Lambert (04:17:40) So I’m hoping that society drowns in slop enough to snap out of it and be like, “We can’t. It just doesn’t matter. We all can’t deal with it.” And then, the physical has such a higher premium on it.

Sebastian Raschka (04:17:53) Even like classic examples, I honestly think this is true, and I think we will get tired of it. We are already kind of tired of it. Same with art. I don’t think art will go away. I mean, you have physical paintings. There’s more value, not just monetary value, but just more appreciation for the actual painting than a photocopy of that painting. It could be a perfect digital reprint, but there is something when you go to a museum and you look at that art and you see that real thing and you just think about, “Okay, a human.” It’s like a craft. You have an appreciation for that.

Sebastian Raschka (04:18:25) And I think the same is true for writing, for talking, for any type of experience, where it will be… I do unfortunately think it will be like a dichotomy, like a fork where some things will be automated. There are not as many paintings as there used to be 200 years ago. There are more photographs, more photocopies. But at the same time, it won’t go away. There will be value in that. I think that the difference will just be what’s the proportion of that. But personally, I have a hard time reading things where I see it’s obviously AI-generated. I’m sorry, there might be really good information there, but I have a certain feeling, like, it’s not for me.

Nathan Lambert (04:19:08) I think eventually they’ll fool you, and it’ll be on platforms that give ways of verifying or building trust. So you will trust that Lex is not AI-generated, having been here. So then you have trust in this- -channel. But it’s harder for new people- -that don’t have that trust.

Sebastian Raschka (04:19:25) Well, that will get interesting because I think fundamentally it’s a solvable problem by having trust in certain outlets that they won’t do it, but it’s all going to be kind of trust-based. There will be some systems to authorize, “Okay, this is real. This is not real.” There will be some telltale signs where you can obviously tell this is AI-generated and this is not. But some will be so good that it’s hard to tell, and then you have to trust. And that will get interesting and a bit problematic.

Nathan Lambert (04:19:54) The extreme case of this is to watermark all human content. So all photos that we take on our own- -have some watermark until they- -are edited- -or something like this. And software can manage communications with the device manufacturer- -to maintain human editing, which is the opposite of the discussion to try to watermark AI images. And then you can make a Google image that has a watermark and use a different Google tool to remove the watermark.

Sebastian Raschka (04:20:20) Yeah. It’s going to be an arms race, basically.

Lex Fridman (04:20:23) And we’ve been mostly focusing on the positive aspects of AI. I mean, all the capabilities that we’ve been talking about can be used to destabilize human civilization with even just relatively dumb AI applied at scale, and then further and further, superintelligent AI systems. Of course, there’s the sort of doomer take that’s important to consider a little bit as we develop these technologies. What gives you hope about the future of human civilization? Everything we’ve been talking about—are we going to be okay?

Nathan Lambert (04:20:59) I think we will. I’m definitely a worrier both about AI and non-AI things, but humans do tend to find a way. I think that’s what humans are built for—to have community and find a way to figure out problems. And that’s what has gotten us to this point. I think the AI opportunity and related technologies is really big. I think that there are big social and political problems to help everybody understand that. I think that’s what we’re staring at a lot of right now; the world is a scary place, and AI is a very uncertain thing. And it takes a lot of work that is not necessarily building things. It’s like telling people and understanding people, things that the people building AI are historically not motivated or wanting to do.

Nathan Lambert (04:21:50) But it is something that is probably doable. It just will take longer than people want. And we have to go through that long period of hard, distraught AI discussions if we want to have the lasting benefits.

Lex Fridman (04:22:04) Yeah. Through that process, I’m especially excited that we get a chance to better understand ourselves at the individual level as humans and at the civilization level, and answer some of the big mysteries, like what is this whole consciousness thing going on here? It seems to be truly special. Like, there’s a real miracle in our mind. And AI puts a mirror to ourselves and we get to answer some of the big questions about what is this whole thing going on here.

Sebastian Raschka (04:22:35) Well, one thing about that is also what I do think makes us very different from AI and why I don’t worry about AI taking over is, like you said, consciousness. We humans, we decide what we want to do. AI in its current implementation, I can’t see it changing. You have to tell it what to do. And so you still have the agency. It doesn’t take the agency from you because it becomes a tool. You tell it what to do. It will be more automatic than other previous tools. It’s certainly more powerful than a hammer, it can figure things out, but it’s still you in charge, right? So the AI is not in charge, you’re in charge. You tell the AI what to do and it’s doing it for you.

Lex Fridman (04:23:17) So in the post-singularity, post-apocalyptic war between humans and machines, you’re saying humans are worth fighting for?

Sebastian Raschka (04:23:27) 100%. I mean, the movie Terminator, they made in- -the ’80s, essentially, and I do think the only thing I can see going wrong is, of course, if things are explicitly programmed to do things that are harmful.

Lex Fridman (04:23:43) I think actually in a Terminator type of setup, I think humans win. I think we’re too clever. It’s hard to explain how we figure it out, but we do. And we’ll probably be using local LLMs, open source LLMs, to help fight the machines. I apologize for the ridiculousness. Like I said, Nathan, I’ve already been a big fan of yours for a long time. And I’ve been a big fan of yours, Sebastian, for a long time, so it’s an honor to finally meet you. Thank you for everything you put out into the world. Thank you for the excellent books you’re writing. Thank you for teaching us. And thank you for talking today. This was fun.

Sebastian Raschka (04:24:26) Thank you for inviting us here and having this human connection, which is actually-

Lex Fridman (04:24:30) -extremely valuable- -human connection. Thanks for listening to this conversation with Sebastian Raschka and Nathan Lambert. To support this podcast, please check out our sponsors in the description, where you can also find links to contact me, ask questions, give feedback, and so on. And now let me leave you with some words from Albert Einstein: “It is not that I’m so smart, but I stay with the questions much longer.” Thank you for listening, and hope to see you next time.

帕维尔·杜罗夫:Telegram、自由、审查、金钱、权力与人性 (2025-09-30)

Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature (2025-09-30)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:Telegram 创始人兼 CEO Pavel Durov,一位以其对自由和隐私的坚定立场而闻名的科技领袖,在公司用户突破十亿并首次实现盈利、同时其本人深陷与法国政府的司法纠纷之际,接受了 Lex Fridman 的深度访谈。
  • 核心论点:本次对话的核心论题在于论证一种以“创始人原则”为绝对内核的另类科技公司构建范式。Pavel Durov 通过自身经历,系统性地阐述了极端的个人纪律(斯多葛主义生活方式)如何直接转化为公司的工程哲学、商业模式与地缘政治策略。他认为,要在一个日益被巨头垄断和政府干预的数字世界中,建立一个真正中立、高效且坚不可摧的平台,其根基必须是创始人不可动摇的价值观。这种价值观不仅体现在对用户隐私的承诺上,更体现在对精简团队、自动化、第一性原理工程和非剥削性商业模式的极致追求上。最终,Telegram 不仅是一个通讯工具,更是其创始人世界观的具象化产物——一个在混乱中坚守秩序、在压力下保持韧性的数字堡垒。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:激进的精益工程与组织哲学 (Radical Lean Engineering & Organizational Philosophy)

  • 核心观点大规模的工程团队不仅效率低下,而且会成为安全与创新的负资产。真正的效率与安全源于由顶尖人才组成的极小团队,并通过极致的自动化来管理复杂系统。
  • 原理解构:Durov 的逻辑颠覆了“人多好办事”的传统管理认知。他认为,团队规模的增长会导致沟通成本呈指数级上升,大量时间被用于协调而非创造。更重要的是,平庸的员工(B players)不仅产出低,还会通过制造不必要的问题、传播负面情绪来拖累顶尖人才(A players)。因此,主动限制招聘,迫使团队通过编写算法和自动化脚本来解决规模化问题(如管理全球近10万台服务器),这不仅提升了效率,还减少了人为错误和潜在的内部恶意行为(Durov 称“humans are attack vectors”,人是攻击向量),从而增强了系统的可靠性与安全性。
  • 证据/案例:Telegram 仅用约 40 人的核心工程团队,支撑了全球超 10 亿用户的服务,其功能迭代速度和创新能力(如动态贴纸、消息编辑等)长期领先于拥有数千名工程师的竞争对手 WhatsApp。Durov 引用了自己单枪匹马在几个月内写出 VKontakte (VK) 初版,以及解雇一个工程师反而提升了团队整体生产力的亲身经历,来印证其“少即是多”的原则。

维度二:作为地缘政治防火墙的架构设计 (Architecture as a Geopolitical Firewall)

  • 核心观点:要抵御来自全球政府的压力,技术架构和公司所有权结构必须从第一天起就为“不妥协”而设计
  • 原理解构:面对各国政府要求审查或访问用户数据的压力,Telegram 的防御体系是多层次的。
    1. 技术层:采用分布式基础设施,将加密的用户数据(Cloud Chats)和解密密钥分割存储在不同国家的多个司法管辖区。这意味着单一政府无法通过法律或物理手段获取完整数据。端到端加密的“私密对话”(Secret Chats) 则提供了更高级别的保护。(本维度末尾附有一个极简的密钥拆分示意。)
    2. 组织层Pavel Durov 100% 拥有 Telegram,没有外部股东或董事会。这彻底排除了因商业利益(如IPO、投资者回报)而向政府压力屈服的可能性,使公司决策能完全基于创始人的核心原则。
    3. 策略层:Durov 公开表示,宁愿放弃整个国家市场,也绝不会交出用户私人数据。这种“焦土策略”提高了政府施压的代价,也向用户传递了最强烈的信任信号。
  • 证据/案例:Durov 详述了在法国被捕的经历,以及法国情报部门试图利用其司法困境,要求其干预罗马尼亚和摩尔多瓦选举的行为。他坚决拒绝,并公开了这些企图。此外,2018 年俄罗斯和伊朗封禁 Telegram,但 Telegram 通过动态 IP 和代理网络等“数字抵抗”(Digital Resistance) 运动成功规避封锁,最终迫使俄罗斯在两年后解除封锁。这些案例证明了其架构和策略的有效性。
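为了更直观地理解上文“技术层”所说的“数据与密钥分属不同司法管辖区”,下面给出一个极简示意(假设性的 2-of-2 异或拆分,其中的数据中心名称均为虚构,并非 Telegram 公开的实际方案):任何单一份额在统计上都是纯随机噪声,只有凑齐两个辖区保存的份额才能还原解密密钥。

```python
import os

def split_key(key: bytes) -> tuple[bytes, bytes]:
    """2-of-2 异或秘密共享:单独持有任一份额都无法得到关于 key 的任何信息。"""
    share_a = os.urandom(len(key))
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b

def recombine(share_a: bytes, share_b: bytes) -> bytes:
    """只有同时拿到两个份额,才能按位异或还原出原始密钥。"""
    return bytes(a ^ b for a, b in zip(share_a, share_b))

# 虚构的两个司法管辖区,各自只保存一个份额(仅为示意,非 Telegram 实际架构)
master_key = os.urandom(32)
share_nl, share_sg = split_key(master_key)
jurisdictions = {"datacenter_NL": share_nl, "datacenter_SG": share_sg}

# 单一辖区被迫交出存储内容时,得到的只是随机字节,无法解密任何数据
assert jurisdictions["datacenter_NL"] != master_key

# 两个辖区同时配合,才能还原密钥
assert recombine(jurisdictions["datacenter_NL"], jurisdictions["datacenter_SG"]) == master_key
```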

维度三:斯多葛主义作为一种商业武器 (Stoicism as a Business Weapon)

  • 核心观点:创始人的个人生活方式——一种现代斯多葛主义的实践——是其做出高质量、长期主义决策并抵御巨大压力的关键
  • 原理解构:Durov 的个人哲学是,恐惧与贪婪是自由的最大敌人。他通过严格的自律生活来“驯服”这两者。
    1. 抵御恐惧:通过直面死亡、刻意练习自律(如每天300个俯卧撑、冰水浴、数小时长泳)来增强精神韧性。当一个人对最坏结果(死亡或失去一切)都感到坦然时,外部的威胁就失去了效力。
    2. 抵御贪婪/干扰:通过戒绝成瘾物质(酒精、咖啡因)、不使用智能手机、避免算法推荐信息流,来保护自己最宝贵的资产——清晰的思维。他认为,手机和社交媒体是在用无关紧要的“当日戏剧”轰炸用户,使其丧失独立思考和设定长期目标的能力。这种“信息斋戒”让Durov 能专注于真正重要的、穿越时间周期的产品和战略问题。
  • 证据/案例:他详细描述了自己超过 20 年不饮酒、不摄入加工糖、规律进行高强度锻炼的生活方式。他将这种生活与商业决策直接挂钩,认为身体的健康和精神的纯净是高效工作和抗压能力的基础。他明确指出,作为社交媒体的创始人却几乎不用手机,正是为了防止自己的思维被他人议程所绑架

维度四:无需用户剥削的盈利模式 (Monetization Without Exploitation)

  • 核心观点一个科技平台可以在不侵犯用户隐私、不出售个人数据、不设计成瘾性信息流的前提下,实现规模化盈利
  • 原理解构:Telegram 的商业模式是对主流广告驱动模式的直接挑战。其核心是创造用户愿意主动付费的增值价值,并赋能平台生态,而非将被动观看广告的用户视为产品。
    1. 增值服务 (Telegram Premium):为核心用户提供额外的功能(如更快的下载速度、更大的文件上传限制、高级贴纸等),将免费用户体验做到极致的同时,为付费用户提供“锦上添花”的价值。
    2. 平台经济 (Platform Economy):通过TON (The Open Network) 区块链赋能。例如,推出基于上下文的、非侵入性的频道广告,并与频道主进行收入分成;推出基于 TON 的用户名、数字身份和“礼物”(Gifts) 系统,创造了一个全新的数字资产市场;开放强大的 Bot 和 Mini App 平台,让开发者在 Telegram 生态内构建商业并从中收取少量佣金。
  • 证据/案例:Telegram 于 2024 年首次实现盈利。其 Premium 订阅用户已超过 1500 万,年收入超 5 亿美元。与 Snoop Dogg 合作的“礼物”在 30 分钟内销售额达 1200 万美元。这些都证明了,通过提供真实效用和创新体验,可以构建一个可持续且尊重用户的商业闭环。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • “少即是多”的人力资源观:普遍认为公司扩张需要大规模招聘,Durov 却认为精简团队通过自动化能实现更高效率和安全性,甚至“裁员能提高生产力”。
    • CEO 的“离线”状态:在要求“永远在线”的科技行业,Durov 主张 CEO 应该最大限度地脱离数字干扰(如手机),以保护战略思考能力。
    • 盈利与隐私并非零和游戏:行业主流认为放弃定向广告等于放弃核心收入,Telegram 却证明了基于增值服务和平台生态的模式是可行的。
    • 竞争是教育的驱动力:在西方教育界普遍强调“减压”和避免竞争的环境下,Durov 认为源于其少年时代残酷的学术竞争是塑造逻辑思维和韧性的关键,并批评无竞争环境会削弱整个国家的竞争力。
  • 盲点与局限

    • 对政府无知的批评:Durov 多次指出,以法国为代表的政府机构对技术(加密、机器学习)的理解极为有限,导致其监管行为既荒谬又具有破坏性。他认为这种“技术文盲”是现代治理的一大风险。
    • 对激励机制的洞察:他批判了大型制药公司(Big Pharma)的商业模式,认为其激励机制是让你持续依赖药物(缓解症状),而非根除病因。他将此逻辑延伸到分析新闻媒体——你读到的多数内容,其背后都有促使你购买、支持或憎恨某物的动机。
    • “创始人依赖”风险:对话中隐含的一个未被深入探讨的风险是,Telegram 的所有原则和韧性都高度依赖于 Pavel Durov 本人。这种“国王哲人”模式虽然目前极为高效,但也存在单点故障风险。如果创始人发生意外或改变初衷,整个平台的价值观将面临严峻考验。
  • 未解之谜

    • 富足社会的“鼠标乌托邦”困境:Durov 引用了“Universe 25”实验,承认在物质极大丰富的未来,人类社会如何像实验中的老鼠一样避免因丧失目标和挑战而陷入社会功能失调、最终走向崩溃,是一个没有答案的难题。
    • AI 是否能拥有“良知”:他转述其父亲的观点,认为 AI 可以拥有意识和创造力,但无法拥有人类意义上的“良知”(conscience)——即根植于内心的道德准则和正直感。这为关于通用人工智能的讨论提供了一个深刻的哲学维度。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “If you do the same thing everybody else around you is doing, you don’t have any competitive advantage and you don’t get to become outstanding at some point in your life.”

    • 中文意译:“如果你做着和周围人一模一样的事情,你就没有任何竞争优势,也永远不可能在生命中的某个时刻脱颖而出。”
    • 语境:解释为什么要做一个反传统者,敢于对社会主流习惯(如饮酒)说不,并将其与寻找个人独特发展路径的理念相关联。
  2. “Humans are attack vectors, and if you have a distributed system that runs itself automatically, you have a chance at increasing the security of speed and speed of your service.”

    • 中文意译:“人本身就是攻击向量。如果你拥有一个能自动运行的分布式系统,你就有机会同时提升服务的安全性与速度。”
    • 语境:解释为什么 Telegram 坚持用极小的团队和高度自动化来管理庞大的基础设施,这是其工程哲学的核心。
  3. “The more pressure I get, the more resilient and defiant I become… I would rather lose everything I have than yield to this pressure.”

    • 中文意译:“我受到的压力越大,我的韧性和反抗精神就越强……我宁愿失去我所拥有的一切,也绝不会向这种压力屈服。”
    • 语境:在谈及法国政府试图利用其司法困境向他施压时,Durov 表达了自己决不妥协的坚定立场。
  4. “After you survive something like this, you feel like you’re living on bonus time. So in a way, you died a long time ago, and every new day you get is a gift.”

    • 中文意译:“在经历过这样的事情之后,你会感觉自己活在‘奖励时间’里。从某种意义上说,你很久以前就已经死过一次了,你得到的每一个新的一天都是一份礼物。”
    • 语境:在首次公开披露 2018 年的投毒暗杀企图后,解释这段濒死经历如何让他变得更加无所畏惧。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 对技术栈与产品形态:Telegram 将继续以极快的速度推出创新功能,尤其是在 Mini App 生态、区块链集成(TON)和 AI 工具方面,进一步巩固其“超级应用”的地位。其对性能和用户体验细节的极致追求(如丝滑的动画、高效的矢量图形)将继续为行业设定标杆。
    • 对竞争格局:Durov 的“精益模式”可能会激励更多初创公司挑战“人海战术”,尤其是在 AI 时代,自动化将进一步放大顶尖人才的杠杆效应。Meta (WhatsApp) 和其他竞争对手可能会被迫模仿更多 Telegram 的功能,但难以复制其底层的组织和文化。
    • 对地缘政治:Durov 与法国的案例将成为科技平台与主权国家之间冲突的标志性事件,引发更多关于数字主权、平台中立性和科技领袖个人安全的讨论。
  • 长期终局 (5-10年)

    • 如果 Durov 的设想成真,未来的数字通讯领域将出现明显分化:一边是以 Meta、Google 等为代表的,与政府深度合作、基于用户数据监控的广告驱动型平台;另一边则是以 Telegram 和去中心化协议为代表的,坚持隐私、中立和用户主权的价值驱动型平台。用户将面临更清晰的选择。
    • Telegram 可能演变为一个去中心化的数字国家,拥有自己的经济系统(TON)、身份系统(用户名 NFT)、开发者生态和治理原则,其影响力将超越传统科技公司,成为一个独立于民族国家的地缘政治实体。
  • 行动建议

    • 对于开发者押注 Telegram 的 Mini App 和 TON 生态可能是一个高回报的机会。平台拥有庞大且高度参与的用户群,且官方在激励机制上表现出极大的诚意(低佣金、收入分成)。同时,学习 Telegram 对性能和简洁性的执着,构建小而美的产品。
    • 对于投资者:寻找那些拥有**“创始人-原则”强绑定的公司**。在日益复杂的全球环境中,一个拥有坚定价值观且不受制于短期财务压力的领导者,其公司的长期韧性可能远超预期。同时,重新评估“员工数量”作为衡量公司价值的指标。
    • 对于创业者将个人纪律视为核心竞争力。Durov 的案例表明,创始人的精神状态和生活方式直接决定了公司的文化和上限。在创业初期就明确不可妥协的原则,并围绕这些原则设计你的产品架构、股权结构和商业模式。不要害怕成为一个孤独的反传统者。

这是一份基于 Pavel Durov 与 Lex Fridman 对话内容的深度技术与商业研报。


🚀 数字化主权的孤勇者:Pavel Durov 的技术哲学与自由战歌

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:本次对话发生在 Telegram 创始人 Pavel Durov 遭遇法国政府逮捕并在保释后的关键节点。作为拥有 10 亿用户的社交平台掌门人,Durov 长期保持极度低调。
  • 核心论点:对话揭示了一个以**“极端自律”**与**“自由意志主义”**为底层逻辑的商业奇迹。Durov 认为,在国家机器与官僚主义不断扩张的现代社会,技术是保护人类隐私和言论自由的最后防线。他通过维持精简到极致的团队和分布式的物理架构,试图构建一个不受任何单一地缘政治实体控制的数字乌托邦。

2. 🧠 深度观点解析 (Deep Dive Analysis)

Ⅰ. 极致精简主义:40 名工程师驱动 10 亿用户

  • 核心观点:Telegram 的核心工程团队仅约 40 人,却在创新速度上超越了数万人的大厂(如 WhatsApp)。
  • 原理解构:Durov 奉行**“反人数增加”**原则。他认为员工数量增加会指数级提升沟通成本,导致平庸化(B-players)和官僚化。Telegram 强制要求高度自动化,将 10 万台服务器的运维交给算法而非人力。
  • 证据/案例:Durov 提到他通过**“竞赛(Contests)”**招募全球顶尖的“A级天才”。许多核心代码是由 10 余岁便在编程竞赛中夺魁的天才编写。他认为,一个优秀的程序员能顶替 1000 个平庸的程序员。

Ⅱ. 物理防线:分布式的法律韧性

  • 核心观点:隐私保护不仅靠算法,更靠地理分布的物理架构。
  • 原理解构:Telegram 的数据和解密密钥被切割并分布在多个司法管辖区。这意味着,任何单一政府若想索要数据,必须获得多个国家的法院许可,这在现有的地缘政治环境下几乎不可能实现。
  • 证据/案例:Durov 强调 Telegram 从未向任何政府(包括俄罗斯、伊朗、法国)泄露过一条用户的私人消息。他宁愿失去整个市场(如被俄罗斯封锁期间),也不愿破坏这一加密承诺。

Ⅲ. 商业模式的道德底线:不利用数据变现

  • 核心观点:放弃 80% 的潜在广告收入,以换取产品的纯粹性。
  • 原理解构:不同于 Meta 等依赖“精准画像”的广告模式,Telegram 的广告基于频道内容(上下文)而非用户私人数据。这种“非针对性广告”虽然变现效率低,但保护了用户隐私。
  • 证据/案例:Telegram 去年实现了盈利,主要靠 Telegram Premium(超过 1500 万付费订阅)和基于 TON 链的数字资产变现(如 Snoop Dogg 的数字礼物在 30 分钟内售罄 1200 万美元)。

Ⅳ. 竞争驱动进化论:数学是技术的骨架

  • 核心观点:逻辑思考和数学能力是技术领导力的根源。
  • 原理解构:Durov 及其兄长 Nikolai(三次 IMO 金牌得主)在苏联解体后的极端教育中成长。他认为“匮乏激发创造力”,因为资源不足,他们被迫开发最高效的 C/C++ 引擎,这使得 Telegram 在弱网环境下依然比竞争对手快 50 毫秒。
  • 证据/案例:Telegram 的动画效果(如“灭霸式”的消息删除动画)采用了极其复杂的矢量算法,以极小的带宽消耗实现了极致的视觉美感。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:丰饶带来的灾难
    • Durov 引用了 “25号宇宙(鼠类天堂)” 实验,指出当食物和空间无限(丰饶)时,社会将陷入崩塌和消亡。他认为现代社会的过度消费和短视享乐(酒、糖、社交媒体推送流)正在瓦解人类的意志力。
  • 盲点与局限:单一领导力的脆弱性
    • Durov 100% 控股 Telegram,且没有董事会。这种极致的集权虽然带来了决策效率和原则一致性,但一旦他本人遭遇意外(如法国被捕事件),整个生态系统面临巨大的单点失败风险。
  • 未解之谜:地缘政治的零和博弈
    • 当自由意志主义撞上欧洲日益严苛的数字法案(如 DSA),Durov 承认目前尚无完美解法。他宁愿退出市场也不屈服,但这对 10 亿用户的连接性是巨大的挑战。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Freedom matters more than money.” (Свобода дороже денег)
    • 背景:解释为何放弃巨大的广告收益和 VK 的股份。
  2. “Self-discipline is the most important muscle you can exercise.”
    • 中译:自律是你所能锻炼的最重要的肌肉。
    • 背景:讨论他极度克制的饮食、运动和社交生活。
  3. “Short-term pleasure isn’t worth your future.”
    • 中译:短期的欢愉不值得用你的未来去交换。
    • 背景:告诫年轻人远离成瘾性物质。
  4. “I would rather starve to death in prison than yield to the pressure of betraying my users.”
    • 中译:我宁愿在监狱里饿死,也不会屈服于压力去背叛我的用户。
    • 背景:回应法国政府对他施加的政治压力和审查要求。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 加密抗争加剧:Telegram 与欧洲监管机构的碰撞将成为全球科技审查的分水岭。
    • Web3 整合加速:TON 链将使 Telegram 成为全球最大的 Web3 门户,通过 Mini Apps 改变小额支付和数字所有权的生态。
  • 长期终局 (5-10年)
    • 如果 Durov 的设想成功,Telegram 将进化为一个**“主权数字国家”**,用户拥有真正的身份所有权、资产所有权和言论自由,不再受传统主权国家的边境限制。
  • 行动建议
    • 开发者:关注 Telegram Mini Apps 和 TON 链生态,这是少数能直达 10 亿用户的低门槛创业机会。
    • 创业者:学习 Durov 的“Lean Philosophy”,反思过度扩张团队带来的熵增。
    • 投资者:关注去中心化通信和隐私保护技术的刚需增长,地缘政治冲突越多,这类平台的护城河越深。

行业深度研报:Pavel Durov 与 Telegram 的哲学工程学

报告编译者:资深科技评论家与行业分析师
对话对象:Pavel Durov (Telegram 创始人)
对话媒介:Lex Fridman Podcast
核心主题:极简主义、绝对隐私、反官僚主义与技术的反直觉哲学


1. 🎯 核心论题与背景

对话背景: 本次深度对话的主角是 Telegram 的创始人 Pavel Durov,一位身陷法国法律泥潭(引发国际广泛批判的“卡夫卡式”拘留)、却仍以此为契机阐述其技术信仰与人生哲学的、直言不讳的科技领袖。对话发生在他被法国政府限制人身自由、被迫处于“数字流放”状态的背景下,因此不仅是技术探讨,更是一场关于自由、隐私与人性本质的终极辩护。

核心论点: Pavel Durov 的核心观点在于将“工程学的极致严谨”与“斯多葛哲学的绝对内核”进行融合。他认为,真正的卓越产品构建于极少量人编写的代码之上(Lean Engineering),通过彻底的自动化和分布式架构来规避人为错误与专制压迫;同时,他将其产品哲学建立在对抗“注意力经济”与“强制消遣”的基础上——拒绝无序的新闻流和针对用户隐私的个人数据广告,转而通过审美导向的设计和订阅模式实现盈利。这不仅是一个赚钱的故事,更是一个关于如何在巨型机构的攻击下,通过极度自律(断绝成瘾性物质、高强度训练)来保持战略定力的生存范本。他认为物理层面的健康与精神层面的主权是保护代码不可侵犯性的必要条件。


2. 🧠 深度观点解析

2.1 “极简主义”的工程权力

  • 核心观点:以极少的代码和人员实现千亿的体量,效率不仅意味着省钱,更意味着安全与速度。
  • 原理解构:Durov 坚信“人”往往是分布式系统的健壮性杀手。团队规模过大意味着高比例的时间被用于“协调”(而非创造)。为了保护隐私,Telegram 核心团队仅 40 人,且没有任何员工能接触用户私钥。通过将算法等同于管理层(自动化管理数据中心),系统消除了单点故障和人为攻击面。
  • 证据/案例:他在 VK 时期的经历,20 岁单枪匹马开发并在数月内击败 Facebook 的性能挑战;以及利用 $1/3 美元年薪维持公司运营的极限财务策略。

2.2 “反规训”的产品哲学(非新闻流)

  • 核心观点:为了夺回人类的专注力,必须从互联网中移除“强制性信息流”,即新闻 Feed 和基于隐私数据的广告。
  • 原理解构:现代科技产品的商业模式是将用户作为数据矿场。Durov 选择了一条少有人走的路:只允许基于频道/聚类的定向广告(不支持基于用户画像的广告)。他通过 Premium 会员(15M+ 订阅者)和 Telegram Mini Apps 平台实现变现。这不仅是对抗资本主义的商业手段,更是对人类注意力的防御性保护。
  • 证据/案例:对比 WhatsApp/Signal 缺乏的原生功能(如 7 年前的自动删除、游戏化的文件传输、陶瓷质感的删除动画)。这种对“微小交互体验”的迷恋,是为了在“99% 的人都在看无聊内容”的环境中创造“那 1% 的愉悦感”。

2.3 “零摩擦”的隐私与主权

  • 核心观点:隐私是生存的自由,不可妥协。
  • 原理解构:Telegram 的架构设计哲学是“系统本身拒绝被攻破”。通过将数据库密钥物理拆分并分布在不同法辖区,或者通过全节点共识,使得即使没有 Telegram 内部人员在场,数据也是不可读取的。这是一种技术上实现的“无人值守”的密钥管理。
  • 证据/案例:面对法国政府的要求,他将“拒绝共享数据”定义为不可更改的系统逻辑,宁愿踢出市场也不愿留下后门。这种铁律并非基于合同,而是基于对自由意志的绝对信仰。

2.4 竞争作为进化的过滤器

  • 核心观点:竞争是消除平庸、激发天才的唯一机制。
  • 原理解构:Durov 认为在学校和家庭环境中过度保护孩子(消除竞争),会导致社会停滞和经济脆弱。相反,赛马式选拔(Coding Contests)能精准筛选出真正解决问题的 A 类人才,而不是那些擅长会议和画 PPT 的 B 类人才。
  • 证据/案例:通过公开黑客松和编程比赛招募工程师。他认为大公司繁文缛节产生的“分摊责任”经过了几千年的进化,需要进行一次系统性的重置。

3. 💡 反直觉与批判性视角

3.1 “坏人”的另一种视角

  • 打破共识:Durov 指出,政府压力往往来自对“混乱”的恐惧,而不仅仅是因为“邪恶”。他认为政府行为往往遵循**“不可能三角”**——即试图在“安全”、“隐私”和“管控便利”三者中同时达成,最终往往牺牲掉前两者。他自己之所以成为这些政府眼中的“敌人”,正是因为他提供了反例。
  • 盲点与局限:他承认 Telegram 在处理极端的政治言论时会陷入乌克兰/俄罗斯互相指责的“精神分裂”困境。虽然他表示支持言论自由,但过度强调中立可能导致其在地缘政治中被边缘化,或者成为被各方同时指责为“外国干涉者”的工具。

3.2 “富人”悖论与激励失效

  • 思考误区:Durov 认为留给子女(或捐精所得子女)的巨额财富通常会摧毁他们的动力。为了避免这一点,他选择了一种极端的分配策略:财富将延迟 30 年才分发。
  • 未解之谜:虽然他的理念超前,但这种“延迟满足”是否合理仍存争议。一个人明明拥有让子孙后代衣食无忧的资源,却要受制于他自己设定的规则,这本身也是对其“绝对自由”的一种自我约束;这种安排对后代人生动力的长远影响,同样难以预料。

3.3 “宇宙 25 号”老鼠实验的现代映射

  • 批判性反思:Durov 引用“宇宙 25 号”实验,指出无限自由和资源可能导致社会解体(老鼠不再繁衍、暴力横行)。这挑战了当代对“科技指数级增长”的盲目乐观。如果 AI 和自动化免费提供了无限生产力,人类是否会陷入像试验老鼠一样的停滞?
  • 应用推演:这暗示了他目前对社交媒体“设计焦虑”的根源——过度的信息流和算法推荐,实际上可能是在模拟这种“生态系统的崩溃”。

4. 💎 金句与高光时刻

  1. “There’s no such thing as your death in your life.”
    • 译文:在你的生命中,不存在“死亡”这个东西。
    • 语境:关于面对死亡的本体论重构,用以克服对死亡的恐惧,从而获得活着的极致清晰感。
  2. “Quantity of employees doesn’t translate to quality of the product… 90% of their time will be spent on coordinating…”
    • 译文:员工数量不等于产品质量,过多的员工会把 90% 的时间花在内部协调上。
    • 语境:阐述 Telegram 极简团队文化背后的工程伦理,反对大公司病。
  3. “There’s something about it that feels wrong when such things are neglected because I understand that every day, tens of millions of people around the world are deleting messages.”
    • 译文:当这样的细节被忽视时,总让人觉得哪里不对劲,因为我知道全世界每天都有数千万人在删除消息。
    • 语境:解释为什么即便没人注意,也要花巨大精力优化“消息删除”时的粒子破碎动画的技术哲学——对人性的细腻关怀。
  4. “We don’t get to contribute to this abundance without freedom.”
    • 译文:如果不拥有自由,我们就无法体验并贡献这份繁荣。
    • 语境:他在意大利看到的苏联贫乏与西欧富足的对比,确立了自由作为生产力的前提。

5. 🚀 行业启示与未来推演

5.1 短期影响 (1-3年)

  • 去中心化基础设施:以 TON(The Open Network)为代表的核心链技术将与 App 层结合更紧密,推动“数字身份”的资产化(如 Telegram Gifts)。
  • Mini Apps 风暴:随着支付机制的打通,Telegram 将成为继微信之后另一个从 0 到 1 孵化应用的经济系统,开发者可以从编程语言中解放,专注于业务逻辑。

5.2 长期终局 (5-10年)

  • 主权互联网:随着欧洲与俄罗斯在通讯管制上的博弈(法国对 Telegram 的封杀、各国的审查法律),Telegram 可能成为“主权互联网”的抗争与逃逸通道。
  • 标准化加密协议:如果政府要求建立“上帝视角”的监控系统失败,技术标准很可能转向 Durov 所倡导的“不可破解”的工程范式,迫使现有对手跟进升级。

5.3 行动建议

  • 对于开发者/创业者
    • 招聘逻辑:不要雇佣经验丰富但习惯分摊责任的“大厂螺丝钉”。去竞逐代码挑战,寻找能解决具体问题的极客。
    • 技术迭代:不要为了迭代而迭代。Durov 的秘诀在于“如果延迟增加 50 毫秒,亿级用户累计起来就是数个世纪的损失”。极致的效率才是护城河(量级估算见本报告末尾的示意)。
  • 对于投资者
    • 价值锚点:警惕纯社交网络的数据挖掘模型。关注那些通过极客精神、审美设计或订阅经济构建“护城河”的平台。
    • 关注区块链:TON 的生态繁荣展示了 App Store 模式与公链的去中心化支付能力的结合潜力。
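附:上文“技术迭代”一条提到的“50 毫秒 × 亿级用户 = 数个世纪”可以用一笔粗账来感受量级。以下数字(10 亿活跃用户、每人每天 20 次交互)均为假设,仅用于示意,并非播客中给出的精确口径:

```python
# 粗略量级估算:每次交互多耗 50 毫秒,累计会浪费多少“人类时间”?
# 假设:10 亿活跃用户、每人每天 20 次交互(均为示意数字)。
users = 1_000_000_000
interactions_per_day = 20
extra_seconds = 0.050  # 每次交互多出的 50 毫秒

wasted_seconds_per_day = users * interactions_per_day * extra_seconds
wasted_years_per_day = wasted_seconds_per_day / (60 * 60 * 24 * 365)

print(f"每天累计浪费 ≈ {wasted_years_per_day:.0f} 人·年")        # ≈ 32 人·年
print(f"一年累计浪费 ≈ {wasted_years_per_day * 365:.0f} 人·年")   # ≈ 1.2 万人·年,即上百个世纪
```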

深度研报:Pavel Durov 的哲学工程学与反叛的生存哲学

1. 🎯 核心论题与背景

对话背景: Pavel Durov,Telegram 创始人,正处于多重压迫之下——个人隐私在欧洲面临前所未有的法律围剿(法国的长期拘留调查),其技术公司的核心资产(Telegram)被数十亿用户依赖,同时又是左翼与右翼政府共同针对的对象(被指责为外国干预者)。Lex Fridman 对此进行了为期数周的深度跟随采访,将话题从宏观的“设计你的一天”(切断手机、极端自律)延伸至微观的“代码如何杀死一只蚂蚁”(原子级的 UI 动画),构建出一幅融合了黑客精神、斯多葛主义与政治批判的全景图。

核心论点: Durov 的思想核心在于**“用工程学的绝对理性对抗人性的生物本能与制度的官僚冲动”**。他认为,一个无视用户隐私、沉溺于娱乐和算法投喂的产品是时代的毒药,而 Telegram 的成功证明了不依赖用户数据投放广告、仅依靠“审美”与“利他”精神即可盈利并走向全球的可能性。他的论述建立在一个极其坚固的基石上:通过彻底的物理隔绝(断绝酒精、糖分、毒品)和精神隔绝(拒绝算法 Feed),一种极度理性的、生猛的、具有艺术感的“极客独裁”秩序正在构建新的互联网疆土。

2. 🧠 深度观点解析

2.1 极简主义的极权主义

  • 核心观点:团队规模是产品质量的天敌,甚至能直接决定系统的安全性。
  • 原理解构:Durov 驳斥了“人多好办事”的传统管理逻辑。他认为,超过一定规模后,沟通成本会呈指数级上升,只剩下 10% 的人在处理实际问题,90% 的人在协调。这导致了“投入产出比”和“安全黑箱”的双重恶化。为了实现这一目标,他强制使用自动化和算法来管理全球数十万台服务器,消除了所有“人类作为攻击向量”的可能。这是一种去中心化的零信任架构 (Zero-Trust Architecture),但不是针对外部黑客,而是为了防备内部人员或外部势力通过内部人员作案。
  • 证据/案例:40 人的核心团队管理亿级用户;没有任何员工拥有私钥访问权限;基于 Open Source 和可重现构建(reproducible builds)的哲学,使得外部审计者能够通过代码本身而非口头承诺来信任系统。

2.2 供给侧的艺术与注意力陷阱

  • 核心观点:必须从供给侧拆除“成瘾机制”和“隐私剥削”,以夺回用户的时间主权。
  • 原理解构:Durov 指出,现代社交软件正在实施“技术成瘾”。他选择将 Telegram 设计成一个**“工具”**而非“媒体”。
    • 去 Feed 化:不提供瀑布流信息,切断用户被动的“被动刷新”机制。
    • 审美资本主义:广告收入不来自于“你知道我想买什么”,而来自于“这个频道本身值得支持”。
    • 付费订阅:Premium 模式不是针对功能收费,而是为了追求极致体验(更快的下载、去广告)而进行的付费意愿测试。
  • 证据/案例:从“删除消息时的 Thanos 指尖滑动特效”到长达数年的矢量动画 Stickers 研究。同时,他也指出如果不赚钱,产品可能会消失,所以他引入了 Mini Apps 和付费礼物,证明了“不掠夺隐私也能致富”。

2.3 Scarcity as a Driver (稀缺性作为驱动力)

  • 核心观点:匮乏是创造力的源泉,无节制的富足会导致物种(和文化)的退化。
  • 原理解构:Durov 深受“宇宙 25 号”老鼠实验(无限资源导致的种群灭绝)的影响。他极力推崇童年时期的贫穷与资源的缺乏,因为正是这种匮乏感设定了**“优先级”**,迫使大脑高效处理信息,而非被繁杂的娱乐淹没。他反直觉地认为,现代教育过于害怕孩子的“争强好胜”和“压力”,实际上剥夺了他们通过失败建立韧性(Resilience)的机会。
  • 证据/案例:他描述自己在苏联时期只能穿二手衣服,以及他在工作狂式的生涯中依然坚持高强度的冥想与训练,践行“活在当下”的时间管理哲学。

2.4 私密财产权的“不可能三角”

  • 核心观点:隐私不是一种法律权利,而是一种生存技术的物理极限;试图通过后门开启极权统治的大门,那是死路一条。
  • 原理解构:针对法国政府和欧盟提出的“强制加密后门”要求,Durov 提出了技术层面的反驳:任何后门都会被高级黑客拥有。更重要的是,他构建了**“经济与政治的不可抗力”**机制:Facebook、Google 和微软的公司结构天生要求扩张利润、收集数据和顺从监管;而 Telegram 采用的是非上市公司结构,带给他的唯一动力是完善产品而非获利。这让他在面对制裁时处于一个无法被收买的独特位置。
  • 证据/案例:不惜退出俄罗斯或欧盟市场(虽然他保住了市场,但通过舆论战重创了法国形象),因为他坚信“如果我放弃了原则,我也就死去了”。

3. 💡 反直觉与批判性视角

3.1 “The two chairs dilemma”(两把椅子的困境)

  • 打破共识:Durov 强调解决复杂问题的方法不是在 A 和 B 两个烂选项中二选一,而是主动破坏游戏规则。他分享的一个监狱谜题暗示:如果你面对的是毁灭性的压力,不要选择毁灭性较小的那个,而是要改变问题的前提。
  • 盲点与局限:这种“打破规则”的策略在 IT 领域很有效,但在处理人类情感和复杂的法律困境(如法国的拘留调查)时可能是一个陷阱。正如 Lex Fridman 所指出的,卡夫卡的《审判》说明,官僚体制的最终目的不是审判,而是耗尽你的精神。Durov 的确做到了不屈不挠,但整个系统正在消耗全球的资源(司法、媒体)来围剿他一个人。

3.2 B Players 的毒性

  • 反常识思维:大多数管理者认为“人多力量大”,而他认为**“平庸同事的虚度光阴比工作量不足更具破坏力”**。
  • 批评性反思:他必须扮演一个极不讨喜、甚至显得冒犯的角色,频繁开除表现不佳的员工。这种高效的“物竞天择”虽然提升了公司的技术速度,但也可能因为“唯我独尊”的文化而伤害长期的人才生态。他实际上是在培养一支由超级 A 类选手组成、完全服从的“狂人军队”,这虽然带来了速度,但也放大了决策失误的破坏力(因为他一人独裁)。

3.3 富足的陷阱与“量子永生”

  • 伦理争议:他对“富足导致颓废”的定义如果被社会采纳,可能会造成一种**“精英主义的负罪感”**。他认为自己捐精、并在去世后把巨额遗产分给几百个由此而来的子女的做法是高尚的,但他通过强制延迟 30 年才发放这笔财富,来防止他们“腐烂”。
  • 终极哲学:他对量子力学的“多世界诠释”和“量子永生”的接受,表明他实际上是一个不可知论者,相信通过意志力可以改变概率波函数。这为他的“斯多葛”训练增添了一层神秘主义色彩——他认为所谓的“自由意志”或许就是一种重构现实概率的能力。

4. 💎 金句与高光时刻

  1. “Quantity of employees doesn’t translate to quality of the product… If you have too many people, they have to coordinate their efforts, constantly communicate, and 90% of their time will be spent on coordinating…”
    • 译文:员工数量不等于产品质量。人一多就必须不断协调和沟通,90% 的时间都会耗在协调上,而非真正的创造。
  2. “It’s incredibly important to analyze yourself and try to get to the bottom of things… If you’re experiencing a headache, one solution would be to take a pill… What this pill would actually do, in most cases, it would mute the consequence… You have to ask yourself, ‘What is it that’s causing this headache?’”
    • 译文:我们需要分析自己,挖到问题的根源。吃止痛药只是掩盖症兆,真正的解决在于找到头痛的来源。
  3. “You don’t get to contribute to this abundance without freedom.”
    • 译文:自由是繁荣的前提,没有自由,你就无法体验或贡献这个世界的丰富性。
  4. “The struggle of the Hunger Artist is relevant to our modern-day attention economy.”
    • 译文:卡夫卡笔下的“饥饿艺术家”与现代人的注意力经济有着惊人的隐喻关联——当献艺变得陈旧乏味、无人观看时,表演者又该如何生存?
  5. “I think there is a lesson from all these huge pressure… The more pressure I get, the more resilient and defiant I become.”
    • 译文:外在压力越大,我的反弹和反抗就越强,这是反直觉的真理。

5. 🚀 行业启示与未来推演

5.1 短期影响 (1-3年)

  • 加密通讯的正规化与极端化:随着政府对 Telegram 的施压,用户对端到端加密和去中心化存储的需求将达到顶峰。类似的技术(如更安全的 ONYX、ICANN 管理的域名)将获得红利。
  • Telegram Mini Apps 的爆发:Durov 构建了一个完全独立的 App Store 生态,加上区块链支付的便利性,开发者将在 Telegram 上掀起一轮类似于早期移动 App 潮的创业浪潮。

5.2 长期终局 (5-10年)

  • 主权互联网的对抗:Telegram 可能会成为全球自由派黑客、流亡者、被审查媒体的避难所与通讯枢纽。这可能导致欧盟与美国政府密谋制定新的反加密法案,从而形成新的冷战技术分裂带。
  • 审美护城河:如果 UI/UX 的“艺术性”成为了区分头部产品的唯一维度,技术堆栈的复杂性将转化为 IP(知识产权的壁垒)。

5.3 行动建议

  • 对于开发者:不要只关注功能的堆叠,要关注交互的微观感知。Durov 证明了无论网络条件好坏,流畅的动画和极简的交互设计都对用户体验有决定性影响。
  • 对于创业者:反思招聘中“大厂经验”的陷阱。如果你的团队里有“分摊责任”的思维,尽早切除。
  • 对于个人:重拾“稀缺性”带来的创造力。在这个信息过载和算法投喂的世界里,有意识地“断网”、限制信息和娱乐摄入,可能是保持大脑高效运转的最强手段。

逐字稿

Introduction

Lex Fridman (00:00:00) The following is a conversation with Pavel Durov, Founder and CEO of Telegram, a messaging platform actively used by over 1 billion people. Pavel has spent his life fighting for freedom of speech, building tools that protect human communication from surveillance and censorship. For this, he has faced pressure from some of the most powerful governments and organizations on earth. In the face of this immense pressure, he has always held his ground, continuously fighting to protect user privacy and the freedom of all of us humans to communicate with each other. I got the chance to spend a few weeks with him and can definitively say that he’s one of the most principled and fearless humans I’ve ever met. Plus, when I posted that I’m hanging out with Pavel, a lot of people, fans of his, wrote to me asking if he does, in fact, privately live the disciplined ascetic life he’s known for. No alcohol, stoic mindset, strict diet and exercise, including a crazy amount of daily pull-ups and push-ups. No phone, except to occasionally test Telegram features, and so on.

(00:01:12) Yes, he’s 100% that guy, which made the experience of hanging out with him really inspiring to me. I’m grateful for it and I’m grateful to now be able to call him a friend. This podcast conversation is in parts philosophical, about freedom, life, human nature, and the nature of government bureaucracies. And it is also in parts super technical because to me, it’s fascinating that Telegram has a relatively small engineering team and yet is able to basically out-innovate all of its competitors with an insane rate of introducing new, unique features. Just like the meme of the Simpsons did it first, when you consider all the features we know and love in our communication apps, in almost every case, Telegram did it first. So we discuss it all, from the Kafkaesque situation he’s in the midst of France, to the roller coaster of his life and career, to his philosophy on technology, freedom, and the human condition.

(00:02:15) And by the way, while this entire conversation is in English, we’ll make captions and voiceover audio tracks available in multiple languages, including Russian, Ukrainian, French, and Hindi. On YouTube, you can switch between language audio tracks by clicking the settings gear icon, then clicking audio track, and then selecting the language you prefer. Huge thank you once again to ElevenLabs for their help with translation and dubbing, and with the bigger mission of breaking down barriers that language creates. They are truly one of the most remarkable companies I’ve ever had the pleasure of working with. This is the Lex Fridman podcast, to support it please check out our sponsors in the description. And now, dear friends, here’s Pavel Durov.

Philosophy of freedom

Lex Fridman (00:03:07) You’ve been an advocate for freedom for many years, writing that you should be ready to risk everything for freedom. What were some influences and insights that help you arrive at this value of human freedom?

Pavel Durov (00:03:21) I get to experience the difference between a society with freedom and a society without freedom pretty early in life. I was four years old when my family moved from the Soviet Union to northern Italy, and I could see that a society without freedom cannot enjoy the abundance of opinions, of ideas, of goods and services. Even for a four or five-year-old kid, it was obvious. You can’t experience all the toys, the ice cream of sorts, the cartoons in the Soviet Union that you can access in Italy. And then I got to realize something even more important. You don’t get to contribute to this abundance without freedom. And at this point it was pretty obvious to me.

Lex Fridman (00:04:14) You also wrote “Свобода дороже денег”. It translates to, “Freedom matters more than money.” How do you prevent these values for freedom, being corrupted by money, by people with influence, by people with power?

Pavel Durov (00:04:29) Well, the biggest enemies of freedom are fear and greed, so you make sure that they don’t stand in your way. If you imagine the worst thing that can happen to you and then make yourself be comfortable with it, there is nothing more left to be afraid of. So you stand your ground and you remember that it’s worth living your life according to the principles that you believe in, even though this life can end up being shorter than a longer life, but lived in slavery.

Lex Fridman (00:05:08) Do you contemplate your mortality? You think about your death?

Lex Fridman (00:05:13) Are you afraid of it?

Pavel Durov (00:05:14) In a way, you have to go against your instinct of self-preservation, and it’s not easy. We are all biological beings, hard-coded to be afraid of death. Nobody wants to die, but when you approach it rationally, you live and then you die. There’s no such thing as your death in your life. You stop experiencing life once you die. So you have to ask yourself this question, is it worth living a life full of fear of death, or it’s much more enjoyable to forget about this and live your life in a way that makes you immune to this fear? At the same time remembering that death exists, so that every day would count.

Lex Fridman (00:06:03) Yeah, remembering that death exists makes you deeply feel every moment that you do get.

Pavel Durov (00:06:11) That’s why I love reminding myself that I can die any day.

No alcohol

Lex Fridman (00:06:15) In many ways you live a pretty stoic existence. I got a chance to spend a couple of weeks with you. In many ways, you seek to minimize the negative effects of the outside world on your mind. You’ve written, quote, “If you want to reach your full potential and maintain clarity of mind, stay away from addictive substances. My success and health are the result of 20 plus years of complete abstinence from alcohol, tobacco, coffee, pills, and illegal drugs. Short-term pleasure isn’t worth your future.” Let’s talk about each one of these. Alcohol. What’s been your philosophy behind that?

Pavel Durov (00:06:57) That one is quite easy. When I was 11 years old, my biochemistry teacher, he gave me this book he wrote, it was called The Illusion of Paradise, and there he would describe the biological and chemical processes that happen in your body once you consume this or that substance. It was mainly related to illegal drugs, but alcohol was one of these addictive substances that he covered. So it turns out that when you drink alcohol, the thing that happens is that your brain cells become paralyzed. They become literally zombies. And then next day, sometime after the party is over, some of your brain cells die and never get to normal. So think about this. If your brain is this most valuable tool you have in your journey to success and happiness, why would you destroy this tool for short-term pleasure? This sounds ridiculous.

Lex Fridman (00:08:06) Yeah, in many ways it’s a poison we’re letting in our body. But by way of advice, what advice would you give to people who consider not drinking? A lot of people use alcohol to enable them to have a vibrant social life. There’s a lot of pressure from society at a party to drink so they can socialize. So what advice would you give to them, to people who imagine having a social life without alcohol?

Pavel Durov (00:08:37) Well, first of all, don’t be afraid to be contrarian. Set your own rules. Secondly, if you feel you need to drink, there must be some problem you’re trying to conceal. There’s some fear you’re not ready to confront, and you have to address this fear. If there is a good-looking girl you’re afraid to approach, get rid of this fear, approach her, practice. Do it again and again. It’s pretty banal, but this advice works.

Lex Fridman (00:09:11) Fix the underlying problem, which is usually at the very bottom, is always going to be fear. Work on that.

Pavel Durov (00:09:17) And very often people are trying to escape something in their lives with alcohol. What is it they’re trying to escape? What is this problem? You have to get to the bottom of it. Your mind is trying to tell you something valuable, and instead of addressing it directly, you are flooding it in alcohol, which is a spiritual painkiller, but works only temporarily and then you have to pay the debt with interest.

Lex Fridman (00:09:51) So what do you do? I mean, you’ve been in a lot of gatherings, a lot of parties. Is there some challenges to saying no?

Pavel Durov (00:09:58) For me, not at all. I’ve been always ready to stand my ground and say no when I feel something’s not right. And it’s extraordinary how easily we humans are affected by what we perceive as a majority. Because nobody since ancient times, since million years ago wants to be left out by the tribe. We are scared that we won’t become accepted anymore, which thousands of millions of years ago meant we’re going to starve to death. So we have to consciously fight this inclination to be agreeable with everything that the majority imposes on you because it’s quite clear that many things that the majority, many activities the majority is engaging in are not bringing you any good.

Lex Fridman (00:11:03) So that’s another fear you have to face, going into a party and the fear of being the outcast at that party, of being different than others at that party, at that social gathering. In the crowd of humans, be different. That’s a fear.

Pavel Durov (00:11:17) That’s a fear. And it’s quite irrational if you think about it. It was something that made a lot of sense 20,000 years ago. It makes zero sense today because if you think about it, if you do the same thing everybody else around you is doing, you don’t have any competitive advantage and you don’t get to become outstanding at some point in your life.

Lex Fridman (00:11:45) Yeah, that’s one of the things we talked about by way of advice is, if you want to be successful in life, you want to be different.

Lex Fridman (00:11:56) And perhaps, I think you said you want to achieve mastery at a niche. So find a niche at which you can pursue with all your effort and achieve mastery, and the niche being different than anything that anybody else is doing. Can you explain that a little bit more?

Pavel Durov (00:12:13) So obviously in order to contribute to the society you’re in, to the economy of the country you live in, you have to do something that is valuable. But if you’re doing something that everybody else is doing anyway, what’s the value of it? Now it sounds easier than it is done, to do something that nobody else is doing, because we humans are surrounded by all kinds of information, which makes us want to copy what we’re perceiving. At the same time, there are so many areas which you can explore, that have nothing to do with the information you receive on the daily basis. So it’s extremely important to curate the information sources that you have, so that you wouldn’t be somebody who is left to the will of AI-based algorithmic feed telling you what’s important so that you end up consuming the same information, the same stuff, the same memes, the same news as everybody else.

(00:13:24) But rather you should be proactive. You should deliberately try to set a goal, an area that you want to explore, and then actively search information that is relevant to this field, so that one day you can become the world’s number one expert in this field. And it’s not that difficult to do that. You have to just remain consistent because nobody else is trying to do that. Everybody else is just reading the same news and discussing the same news every day. But this way they don’t get to have a competitive advantage.

No phone

Lex Fridman (00:14:08) Yeah, majority of the population becomes slaves to the AI-driven recommender systems, and so the content everybody’s fed is the same thing and we all become the same. On that point, one of the different things you do is, you don’t use a phone except occasionally to test Telegram features, but I’ve been with you for two weeks, I haven’t seen you use a phone at all in the way that most people use a phone, like for their social media. So can you describe your philosophy behind that?

Pavel Durov (00:14:40) I don’t think a phone is a necessary device. I remember growing up, I didn’t have a mobile phone. When I was a student at the university, I didn’t have a mobile phone. When I finally got to use a mobile phone, I never used phone calls. I was always in airplane mode or mute. I hated the idea of being disturbed. My philosophy here is pretty simple, I want to define what is important in my life. I don’t want other people or companies, all kinds of organizations telling me what is important today, and what I should be thinking about. Just set up your own agenda and the phone gets in your way.

Lex Fridman (00:15:40) It provides distractions, it guides what you should be looking at, what you will be looking at. So you don’t want that. You want to quiet the mind. You want to choose what kind of stuff you let inside your mind.

Pavel Durov (00:15:55) Yes, because this way I can contribute to the progress of society. Or at least I like to think this way and this makes me happier.

Lex Fridman (00:16:03) How often do you find quiet time to just think and focus deeply on work without any distractions? You mentioned to me that you value quiet mornings.

Pavel Durov (00:16:13) Yes. So the thing I’m trying to do, I try to allocate as much time as possible for sleep. Now, even if I allocate say 11 or 12 hours for sleep, I won’t sleep for 11 or 12 hours. So what I end up doing is, I end up lying in bed thinking. And some people hate it. They say, “Well, you have to take a sleeping pill,” but I never take pills. I love these moments. I get so many brilliant ideas, or at least they seem brilliant to me at the moment, while I’m lying in bed, either late in the evening or early in the morning. That’s my favorite time of the day. Sometimes I wake up, I go take a shower, still without a phone.

(00:17:03) Beautiful ideas can come to you while you’re doing your morning exercise, your morning routine without a phone. If you open your phone first thing in the morning, what you end up being is a creature that is told what to think about for the rest of the day. Same is true in a way if you’ve been consuming news from social media late at night. But then how do you define what is important and what you really want to become in life? Now, I’m not saying you have to completely stay away from all sources of information, but take some time to think about what’s really important for you and what you want to change in this world.

Lex Fridman (00:17:51) So you definitely try to avoid digital devices for as many hours as possible in the morning, just to have the quiet thinking time, plus the crazy amounts of push-ups and squats?

Pavel Durov (00:18:02) I know it’s counterintuitive because I founded one of the largest social networks in the world, after which I founded the second-largest messaging app in the world. And you’re supposed to be really connected, but the conclusion you reach very early is that the more connected and accessible you are, the less productive you are. And then how can you run this thing if you’re constantly bombarded by all kinds of information, most of which is irrelevant to the success of what you’re trying to build? The entire world can be fascinated by a fight, a quarrel between the world’s richest man and the world’s most powerful man. But for the vast majority of these people following this saga, it’s irrelevant. It won’t change their lives, and in any case, they can’t affect it, so it’s a bit pointless. Of course, there are people who are engaged in activities that require them to be up-to-date of everything that’s going on, but 99% of people aren’t.

Lex Fridman (00:19:19) Yeah, the internet, social media presents to us drama in such a way that we think it’s the biggest thing in the world, the most important thing in which the tides of history will turn. But in reality, most things will not turn the tides of history. And so I guess our challenge is to figure out what is the timeless thing? What is the thing that’s happening today that’s still going to be true in 10, 20 years? And from that, decide what you’re going to do. And that’s very difficult on social media because everybody’s outraged. The news of the day, whatever the quarrel is, that’s the thing that everyone thinks the world will end because of this thing, and then another thing happens the next day.

Pavel Durov (00:20:04) And they’re trying to influence your emotions.

Pavel Durov (00:20:08) And that’s how you get into trouble because you can be forced to make conclusions that are not in your best interest.

Discipline

Lex Fridman (00:20:17) I’ve seen you be, once again, quite stoic about your emotions. You ever get angry? You ever get lonely? You ever get sad? The roller coaster of human emotion, and what do you do with that when you make difficult decisions?

Pavel Durov (00:20:31) I’m a human being like everybody else. I do get to experience emotions. Some of them are not very pleasant, but I believe that it’s the responsibility of every one of us to cope with these emotions and to learn to work through them. Self-discipline is particularly important because without it, how can you overcome this seemingly endless loop of negativity or despair that ultimately leads to depression for some people? I normally never have depression. I don’t remember having depression in the last 20 years, at least. Maybe when I was a teenager. But one of the reasons for that is I start doing things.

(00:21:25) I identify the problem, I can see a solution, and I start executing the strategy. If you are stuck in this loop of being worried about something, nothing’s ever going to change. And people often make this mistake thinking, “Oh, I should just have some rest and then regain energy.” This is not how it works. You gain energy by doing something, so you start doing something, then it happens, you feel motivated, you feel inspired. And then ultimately you do something else, a little bit more, a little bit more. And then in a few years, who knows? You may end up achieving great things.

Lex Fridman (00:22:12) Yeah, that’s the thing people are confused about. If you’re stuck in a depressive cycle, even when you really, really, really, really don’t want to do anything, you have to do something. Try to make progress, because the good feeling comes at the end of that. The whole point is to do first and then feel, not feel and then do.

Pavel Durov (00:22:33) Exactly. And going to the gym is a good example. There are many days when you don’t want to start working out, but you have to overcome this initial reluctance, and then you get to a point where you enjoy it and you think, “Oh my God, it was such a good idea to come to the gym today.” But it’s similar to pretty much every activity. You have to write some code? Write a small piece of code first, and then you get inspired. Then you’ll come up with more ideas. You need to write a novel? Just write a paragraph. This is pretty obvious and it’s not a secret, but because we are bombarded with all kinds of information that is not really important for us in terms of becoming successful, we often forget the important things, and this is one of them.

Lex Fridman (00:23:32) We’ve been working out every single day. You have been working out for many years pretty intensively, so I think a lot of people would love to know what’s your perfect daily workout regimen? Let’s say on a daily, weekly basis?

Pavel Durov (00:23:50) I do 300 push-ups and 300 squats every morning. And in addition to that, I go to the gym normally five, six times a week, spending between one and two hours every day.

Lex Fridman (00:24:04) So push-ups and squats are still a big part of your routine?

Pavel Durov (00:24:07) Yes, this is how I start my day. I’m not sure they do a lot in terms of changing your body, but they’re definitely a good way to practice self-discipline because you don’t want to do these push-ups in the morning most of the days. Squats are particularly boring. They’re not that hard, they’re just boring, but you overcome it and then it’s much easier to start doing other things related to your work. For example, when I can, I also take an ice bath because it’s another exercise of self-discipline. I think the main muscle you can exercise is this muscle, the muscle of self-discipline. Not your biceps or your pecs or anything else. Because if you get to train that one, everything else just comes by itself.

Lex Fridman (00:25:07) Everything else becomes easy. We should mention, I went with you to Banya, and I think it’s fair to say you’re nuts in terms of how much you can handle. And I didn’t even see the worst of it. Can you just speak to your crazy escapades in the Banya, what value you get from it? So both the heat and the cold.

Pavel Durov (00:25:31) I don’t know if it’s crazy. I think it’s quite natural and normal by this time, but maybe I just got used to it. So Banya is this extreme kind of sauna practiced by Eastern Europeans, but it is done in a way that maximizes heat and they also use all kind of herbs and branches, and it’s a much more holistic and natural experience. Then the necessary part of it is you get the cold plunge and then you go back. And again, this is one of those things that maybe in the moment it’s not always that pleasant, particularly if you go to extreme temperatures, you don’t feel great.

(00:26:24) I don’t always feel great, but this feeling is passing. It’s only a few minutes. Same with the ice bath. You have to suffer a bit and then you get to feel great for hours and days after. What’s more, it gives you this long-term health benefits. In a way you can look at it as alcohol in reverse. Alcohol will give you this short, fleeting pleasure for an hour, for a couple of hours, but then you will be paying for it with long-term negative consequences. I’d rather do Banya and ice bath.

Lex Fridman (00:27:09) We swam the length of a large lake in France a couple times. Can you talk through why you value these multi-hour swims?

Pavel Durov (00:27:17) Yeah. I love swimming for hours. The longest I swam was five and a half hours in Finland. It was quite cold. I got lost in the process, barely could find my way back. But the reason I do it, yes, you feel great after. You’re shaking a little bit, you feel great after. You cross a huge lake, and I cross many lakes, Geneva Lake, Zurich Lake. And every time you feel this achievement, which makes you happy, makes you feel strong, and then you’re more ready to do other challenges. And of course, when you know you’re going to start a journey that will last a few hours, you are reluctant to do it. But you swim for 10 minutes and then for 20 minutes and then for 30 minutes, and it teaches you this incredible patience that I think is necessary if you want to achieve anything in life.

Lex Fridman (00:28:23) And it’s pretty meditative, lake versus ocean.

Pavel Durov (00:28:27) Yes. And you don’t have to go too fast. You can be slow and enjoy the moment.

Lex Fridman (00:28:33) Until you get lost and it’s five and a half hours. Did you panic about whether you were going to be able to find the shore, find your way out?

Pavel Durov (00:28:39) Not really, I’m a reasonably stress-resilient person. I didn’t panic at that moment. And there were worse swims I had that were shorter, but involved accidents and you know about some of them. So that wasn’t the worst by far. But an important thing about swimming and physical activity in general is that it makes your mind clear and your thinking process becomes more efficient. Because at the end of the day, the efficiency of our brain is limited by how much sugar and oxygen our heart can push through the blood to our brain. How can you make this go faster? How do you make your lungs more efficient? How do you make your heart more efficient in doing that?

(00:29:33) Physical activity is the only way I know of. So it’s not just staying healthy or trying to look good, it’s also being productive. It’s also being stress resilient. All of these qualities are necessary if you want to run a large company, if you want to start a company. I’m surprised when I started doing this more than 10 years ago, that more CEOs didn’t engage in sports. The situation changed in the last several years, which is great. Because back in the day, if you take 20 years ago, there was this stereotype that if you are strong, you must be not very smart and vice versa. Which is a complete lunacy. Very often these two things go together.

Lex Fridman (00:30:34) So for you working out is not just about staying healthy, it’s actually valuable for the work that you do as a tech leader, as an engineer, as a technologist.

Pavel Durov (00:30:43) Oh yes. When I can’t train, I can instantly feel that stress is creeping on me. So even in situations when I’m constrained, I can’t go to the gym, I would just keep doing push-ups. I just keep doing squats.

Lex Fridman (00:31:06) Yeah, I mean that’s the cool thing about body weight exercise. You could just do it anywhere. You could just pop off 50, 100 push-ups before a meeting.

Pavel Durov (00:31:16) Don’t you feel weird when you have a day without physical activity?

Lex Fridman (00:31:21) Yeah. If I go a day without doing push-ups, at the very minimum, it’s a shitty day.

Pavel Durov (00:31:27) And if you can do pull-ups, it’s even better.

Lex Fridman (00:31:30) Yeah. I got to ask you about your diet too. No processed sugar, no fast food, no soda. Intermittent fasting, sometimes once a day only, sometimes a couple times a day. So take me through your philosophy on the no sugar, no soda, just clean food.

Pavel Durov (00:31:47) Well, sugar is pretty easy because it’s addictive. The more you consume sugar, the more you want it, the hungrier you get. So if you want to stay efficient and healthy, why consume processed sugar? You’ll just end up snacking all the time. Intermittent fasting. So eating only within six hours and not eating for 18 hours every day also brings structure into your day and into your eating habits. So you don’t crave sugar anymore because you know if you eat sugar and then you’re unable to snack, you’re just punishing yourself. I read a few books on longevity. I think something everybody agrees on is that sugar is harmful.

(00:32:48) No, I’m not militant about sugar. You can eat berries, fruit, if you feel your body needs it, but it’s not true to think it’s necessary to consume sweet things. Not for children, not for adults. Red meat, I stopped eating it about 20 years ago because I just felt heavy every time I had it. So I guess it’s individual. It’s my metabolism. My digestive system isn’t agreeing with this kind of food. So I normally eat seafood of all kinds and vegetables. This is the basic source of calories for me.

Lex Fridman (00:33:37) Yeah, and like all things, you said, “Short-term pleasure isn’t worth your future.” So a lot of things we all know, that alcohol is destructive to the body. Tobacco, pills, processed food, sugar, but society puts that on you, makes it very difficult to avoid. So I guess it all boils down to just discipline.

Pavel Durov (00:33:56) Yes, and trying to identify the real cause of an issue you’re experiencing. If you’re experiencing a headache, one solution would be to take a pill and then the headache disappears. What this pill would actually do, in most cases, is mute the consequence, your feeling of pain. It’s a painkiller. It will not eliminate the root cause. So you have to ask yourself, “What is it that’s causing this headache? Do I need to drink some water? Is the air quality here bad? Do I need to start getting more sleep? Is there something wrong with people around me? They’re stressing me out.” There must be some reason why you’re experiencing a headache. But if you take a pill, you’re not removing this reason, you’re actually making it worse because this harmful factor is still there.

Pavel Durov (00:35:00) It’s like you’re piloting a helicopter and there is some red signal, a red lamp starts to blink, and it starts producing a bad, unpleasant noise. What would you do? You would try to figure out the cause and eliminate it. Maybe there is some mountain next to you and you have to avoid it. Or do you take a hammer and smash the signal? I think the answer is quite obvious. So, why are we constantly doing this regardless? Oh, because everybody else is doing it. Because there’s a whole industry trying to persuade you that this is the right thing to do. So, it’s incredibly important to analyze yourself and try to get to the bottom of things.

Lex Fridman (00:35:48) So you generally try to avoid all pills, all pharmaceutical products?

Pavel Durov (00:35:53) Yes. I’ve been staying away from all of that since I became an adult. When you’re a teenager, your mom would typically say, “We need to take this pill, otherwise the world collapses.” Once I became a grown-up, I said, “No, I don’t think that the producers of pill are incentivized in the right way. They’re not really interested in eliminating the root of the problem.” They would rather have me dependent on the pills they’re producing so that I could buy them forever. No, I’m not saying that you should never take pills. There are obviously some diseases that you can only fight with antibiotics, for example. So, I’m not suggesting we go back to the Middle Ages, but what I’m saying is we overuse pills.

Lex Fridman (00:36:59) Yeah, it’s always good to study and deeply understand the incentives under which the world operates so that you don’t get swept up into the forces that operate under these incentives. Big Pharma is certainly one of them. Pharmaceutical companies have a huge incentive to keep the problem going versus solving the problem. It’s wise.

Pavel Durov (00:37:19) This is something I practice every day. I read some piece of news and I ask myself, “Who benefits from me reading this?” Then you can end up coming to this conclusion that maybe 95% of things we read in the news have been written and published because somebody wanted you to buy some product, support some political cause, fight some war, donate some money, do something that would benefit other people. It’s not a problem to support causes that you truly believe in, as long as it was your intentional choice and you’re not being manipulated into fighting other people’s wars.

Lex Fridman (00:38:14) And that takes us back to the original thing we started talking about, which is freedom. One of the ways to achieve freedom of thought is to remove your mind from the influences, the forces that manipulate you. It’s really important to realize that a large percentage of the content you consume, especially on the internet, is designed to manipulate your mind. You have to disconnect yourself. Be very proactive in understanding what the biases and the incentives are, so you can think clearly, independently, and objectively.

Pavel Durov (00:38:51) Again, it ties back to abstaining from alcohol, because if your mind is clouded, how can you analyze yourself? You’ll always be dependent on the opinions of others. You’ll always follow the mainstream. And then whatever the authorities or whoever is in charge tells you, you believe it, because you don’t have a tool of your own to rely on to come to your own conclusions.

Lex Fridman (00:39:27) I have to ask you, this is something that came up. You don’t watch porn. I don’t think I’ve heard you talk about this before. What’s the philosophy behind not watching porn? There’s a lot of people that talk about porn in general having a very negative effect on young men on their view of the world, on their development of their sexuality and how they get into relationships and all that stuff. So, what’s your philosophy in not consuming porn?

Pavel Durov (00:39:55) I don’t watch porn because I just feel it’s a surrogate, a substitute for a real thing, and it is not necessary in my life. If anything, it just forces you to exchange some energy, some inspiration for a fleeting moment of pleasure. It doesn’t make sense. In any case, as I said, it’s not the real thing. So, as long as you can access the real thing, you don’t need to watch porn. But then if you can’t access the real thing, you shouldn’t watch porn either, because it means there’s some deficiency in your life, some problem that you have to overcome.

Lex Fridman (00:40:45) Yeah, analyze the underlying cause. Again, this goes back to the theme of investing in a long-term flourishing versus short-term pleasure. There’s a theme to the way you approach life.

Pavel Durov (00:41:02) I try to be strategic. I try to act under assumption that I’m not going to die in one hour from now and I’m going to stick around for a bit despite the fact that we are all mortal. So, why would I exchange the mid and long term for the short term? It doesn’t make any sense.

Lex Fridman (00:41:23) Quick pause, bathroom break.

Pavel Durov (00:41:24) Yeah, let’s take a break.

Telegram: Lean philosophy, privacy, and geopolitics

Lex Fridman (00:41:26) All right. We took a break and now we’re back. I got to ask you about Telegram, the company. I got to meet some of the brilliant engineers that worked there. Telegram runs lean relative to other technology companies that achieve the scale that Telegram does. It has very few employees. So, how many people are on the core team? Let’s say the core engineering team.

Pavel Durov (00:41:48) The core engineering team is about 40 people. This includes back-end, front-end, designers, system administrators.

Lex Fridman (00:42:02) Can you speak to the philosophy behind running a company with so few employees?

Pavel Durov (00:42:10) Well, what we realized really early is that quantity of employees doesn’t translate the quality of the product they produce. In many cases, it’s the opposite. If you have too many people, they have to coordinate their efforts, constantly communicate, and 90% of their time will be spent on coordinating the small pieces of work they’re responsible for between each other. The other problem with having too many employees is that some of them won’t get enough work to do, and if they don’t get enough work to do, they demotivate everybody else by their mere existence. They’re still there, they’re still getting the salary, but they don’t do anything.

(00:43:01) If they don’t do anything, more often than not, they will start trying to find their purpose elsewhere, maybe inside your team, but not by doing productive work, but by finding problems that don’t exist within the team. That can disrupt the team and the mood inside it even further. Also, when you intentionally don’t allow some of your team members to hire more people to help them, they’ll be forced to automate things. In our case, we have tens of thousands of servers around the world, almost 100,000 distributed across several continents and data centers.

(00:44:02) If you try to manage this system manually without automation, you will probably end up hiring thousands of people, tens of thousands of people. But if you rely on algorithms and the team is forced to put together algorithms in order to manage it, then it becomes much more scalable, much more efficient, and interestingly, much more reliable as well.

Lex Fridman (00:44:31) And more resilient to the changing geopolitics, to the changing technology, all of that. Because if you automate the distributed aspect of the data storage and all the compute, then that’s going to be resilient to everything the world throws at you. I suppose if you have people managing all of it, it becomes stale quickly.

Pavel Durov (00:44:54) Yes, humans are attack vectors, and if you have a distributed system that runs itself automatically, you have a chance of increasing the security and speed of your service, which is what we did with Telegram, while also making it much more reliable. Because if some part of the network goes down, you can still switch to the other parts of it.
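
As a hypothetical sketch of what “managing servers with algorithms instead of people” can look like in practice, and not Telegram’s actual tooling, a small control loop might continuously health-check the fleet and route traffic away from unhealthy data centers. The server names, failure rate, and functions below are invented for illustration only.

```python
import random
import time

# Hypothetical data centers and their last known health (True = healthy).
SERVERS = {
    "dc-eu-1": True,
    "dc-eu-2": True,
    "dc-asia-1": True,
    "dc-us-1": True,
}

def probe(server: str) -> bool:
    """Stand-in for a real health check (latency, error rate, TCP probe, ...)."""
    return random.random() > 0.05  # simulate roughly a 5% chance of a failed check

def control_loop(routing_table: dict, interval_s: float = 1.0, rounds: int = 10) -> None:
    """Re-check every server and keep the routing table pointing only at healthy ones."""
    for _ in range(rounds):
        for server in routing_table:
            healthy = probe(server)
            if routing_table[server] != healthy:
                # In a real system this would trigger rerouting, alerting, or re-provisioning.
                print(f"{server}: {'recovered' if healthy else 'down, rerouting traffic'}")
            routing_table[server] = healthy
        time.sleep(interval_s)

if __name__ == "__main__":
    control_loop(SERVERS)
```

At the scale described above, the same pattern would be extended with automated re-provisioning and alerting, which is what lets a small team operate a very large fleet.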

Lex Fridman (00:45:25) Yeah. One of the big ways you protect user privacy is how you store the data. The infrastructure side of Telegram is distributed across many legal jurisdictions, along with the decryption keys. So, it’s encrypted in the cloud, and the decryption keys are split and kept in different locations so that no single government or entity can access the data. Can you explain the strength of this approach?

Pavel Durov (00:45:55) The way we designed Telegram is that we never wanted any humans, any employees, to have any access to private messaging data. That’s why, since 2012, when we started coming up with this design, we have always invested a lot of effort into making sure that nobody can mess with it. A new hire, or any existing employee, can’t break the system in a way that would allow them to access users’ messages. Then of course we launched end-to-end encrypted messaging that is even more protected, but it has certain limitations, so you still have to rely on an encrypted cloud. So, an interesting engineering challenge was how you make sure that no point of failure can be created within your team or outside of it.
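
As a rough illustration of the idea Lex describes, splitting a decryption key so that no single location can reconstruct it alone, here is a minimal sketch. This is a hypothetical example, not Telegram’s actual code or protocol; the names `split_key` and `recover_key` and the share count are made up for the sketch, and production systems would typically use a threshold scheme such as Shamir’s secret sharing rather than this all-or-nothing XOR split.

```python
import os

def split_key(key: bytes, shares: int = 3) -> list:
    """Split a secret key into `shares` pieces; every piece is needed to recover it."""
    random_shares = [os.urandom(len(key)) for _ in range(shares - 1)]
    last = key
    for share in random_shares:
        # XOR the key with each random share; the final piece is what remains.
        last = bytes(a ^ b for a, b in zip(last, share))
    return random_shares + [last]

def recover_key(pieces: list) -> bytes:
    """XOR all pieces together to reconstruct the original key."""
    out = bytes(len(pieces[0]))
    for piece in pieces:
        out = bytes(a ^ b for a, b in zip(out, piece))
    return out

if __name__ == "__main__":
    key = os.urandom(32)               # a 256-bit data-encryption key
    pieces = split_key(key, shares=3)  # e.g. one piece per jurisdiction
    assert recover_key(pieces) == key  # any single piece alone reveals nothing about the key
```

In a deployment like the one described, each piece would live in a different legal jurisdiction, so hardware seized in one place reveals nothing on its own.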

Lex Fridman (00:46:58) So no employee can even access user messages. So, that’s the thing. We talk about encryption, we talk about privacy, we talk about security, all these kinds of things. I think the number one thing that people are concerned about, about which there’s also misinformation, is about private messages. So, Telegram is very, very protective of the private messages of users. So, you’re saying employees never can access the private messages. Have any governments or intelligence agencies ever accessed private user messages in the past?

Pavel Durov (00:47:38) No, never. Telegram has never shared a single private message with anyone, including governments and intelligence services. If you try to access any server in any of the data center locations, it’s all encrypted. You can extract all the hard drives and analyze them, but you won’t get anything. It’s all encrypted in a way that is undecipherable. That was very important for us. That’s why we can say with confidence that there has never been a leak of data from Telegram. Not in terms of private messages, not in terms of, say, contact lists.

Lex Fridman (00:48:28) Do you see in the future a possible scenario where you might share user private messages with governments or with intelligence agencies?

Pavel Durov (00:48:39) No. We designed the system in a way that makes that impossible. It would require us to change the system, and we won’t do that, because we made a promise to our users. We would rather shut Telegram down in a certain country than do that.

Lex Fridman (00:48:56) So that’s one of the principles you operate under: protecting user privacy.

Pavel Durov (00:49:03) I think it’s fundamental. Without the right to privacy, people can’t feel fully free and protected.

Lex Fridman (00:49:11) I mean, this is a good place to ask. I’m sure you’re pressured by all kinds of people, all kinds of organizations to share private data. Where do you find the strength and the fearlessness to say no to everybody, including powerful intelligence agencies, including powerful governments, influential, powerful people?

Pavel Durov (00:49:33) I guess part of it is just me being me. I have stood up for myself and for my values since I was a little kid. I always had issues with my teachers because I would point out their mistakes during classes. At the end of the day, what’s important is to remind yourself that you have nothing to lose. They can think they can blackmail you with something, they can threaten you with something, but what is it they can really do to you? Worst case, they can kill you, but that brings us back to the first part of our discussion. There’s no point living your life in fear.

(00:50:21) As for Telegram, it’s incredibly successful, but if we lose one market or two markets or pretty much all of the markets, I don’t care that much. It won’t affect me, it won’t affect my lifestyle in any way. I’ll still be doing my pushups. So, you don’t like encryption, you don’t like privacy, you think you should ban encryption in your country, like the European Union is trying to do now for all the member states, well, go ahead and do that. We’ll just quit this market. We won’t operate there. It’s not that important. They all think that somehow we profit from their citizens, and the only goal tech companies have is extracting revenues. It’s true, most tech companies are like this, but there are projects like Telegram which are a bit different and I’m not sure they realize that.

Lex Fridman (00:51:23) So for you, the value of maintaining your integrity in relation to your principles is more important than anything else. Of course, we should say that you also have full ability and control to do just that because you, Pavel Durov, own 100% of Telegram. So, there isn’t anybody else with a say on this question.

Pavel Durov (00:51:47) There are no shareholders, which is quite unique.

Lex Fridman (00:51:52) Very unique. I don’t think there’s anything even close to that in any major tech company.

Pavel Durov (00:51:56) And this allows us to operate the way we operate, to build this project and maintain it based on certain fundamental principles, which by the way, I think everybody believes in. I think the right to privacy is included in the constitution of most countries, at least most Western countries, but it’s still under attack almost every week. It often starts with well-meaning proposals. Oh, we have to fight crime, we have to do that, we have to protect the children. But at the end of the day, the result is the same. People lose their right to such fundamental thing as privacy. They sometimes lose their right to express themselves, to assemble.

(00:52:47) This is a slippery slope that we witnessed in pretty much every autocratic country, or country that used to be free and then became autocratic. No dictator in the world ever said, “Let’s just strip away your rights because I want more power for myself and I want you to be miserable.” They all justified it with very reasonable-sounding justifications, and then it came gradually, in stages. After a few years, people would find themselves in a position where they’re helpless. They can’t protest. Every message they send is monitored. They can’t assemble. It’s over.

Lex Fridman (00:53:39) So you see Telegram as a place where people from all walks of life, from every nation, can speak their mind, can have a voice. In the geopolitical context, you’re saying that governments becoming autocratic is naturally the way of the world. Human nature and the nature of governments: they become more censorious. They begin to censor, always justifying it in their minds, perhaps assuming that they’re doing good.

Pavel Durov (00:54:08) Perhaps some of them assume they’re doing good, but interestingly, it always results in the state accumulating more power at the expense of the individual. Then where does it stop? We humans are not very good at finding the right balance, and in this case, the right balance between chaos and order, between freedom and structure. We tend to go to extremes.

Lex Fridman (00:54:44) I think you still consider yourself a libertarian. There is something about government that always over time naturally builds a larger and larger bureaucracy. In that machine of bureaucracy, it accumulates more and more power. It’s not always that one individual member of that bureaucracy is the one that corrupts the initial principles on which the government was founded, but just something over time, you forget. You begin to censor. You begin to limit the freedoms of the individual, the ability of the individuals to speak, to have a voice, to vote. It just gradually happens that way.

Pavel Durov (00:55:29) And the government is not some abstract notion. The government consists of people and these people have goals. They would naturally be inclined to increase the level of influence, to have more subordinates, to have more resources. That’s how you end up in an endless loop of ever-increasing taxes, ever-increasing regulation, which ultimately suffocates free market, free enterprise, and free speech. So, you do want to have very, very strict limitations on the extent the government can increase its powers at the expense of citizens. Ironically, you don’t have those limitations.

(00:56:22) In all countries which are considered to be free, it’s supposed to be the constitution that protects everybody, but interestingly, it doesn’t always work this way. They are able to find very tricky phrasings in order to carve out exceptions, and then the exception becomes the rule.

Arrest in France

Lex Fridman (00:56:49) On this topic, I’d love to talk to you about the recent saga of you being arrested in August of last year in France. I think I should say that it’s one of the worst overreaches of power I’ve seen applied to a tech leader in recent history, in all of history. So, it’s tragic, but I think it speaks to the thing that we’ve been talking about. So, maybe you can tell the full saga of what happened? You arrived in France.

Pavel Durov (00:57:24) I arrived in France last year in August, just for a short two-day trip, and then I see a dozen armed policemen greeting me and asking me to follow them. They read me a list of something like 15 serious crimes that I’m accused of, which was mind-boggling. At first, I thought there must be some mistake. Then I realized they were being serious and they were accusing me of all possible crimes that the users of Telegram, or some users, have allegedly committed, and they think I should be responsible for this, which again, like you said, is something that never happened in the history of this planet. No country, not even an authoritarian one, did that to any tech leader, at least at this scale.

(00:58:37) There are good reasons for that, because you are sacrificing a big part of your economic growth by sending these messages to the business and tech community. So, they put me in a police car and I found myself in police custody. Small room, no windows, just a narrow bed made of concrete. I spent almost four days there. In the process, I had to answer some questions from the policemen. They were interested in how Telegram operates. Most of it is public anyway, and I was struck by the very limited understanding, or should I say even the lack of understanding, on behalf of the people who initiated this investigation against me, of how technology works, how encryption works, how social media work.

Lex Fridman (00:59:57) I mean, there’s something darkly poetic about a tech founder of a platform where a billion people are communicating with each other, and you’re on concrete, no pillow for days, no windows. I’m a huge fan of Franz Kafka and he’s written about the absurdity of these kinds of situations, hence the Kafkaesque stories. There’s a story he wrote, perhaps predicted, literally about this situation, called The Trial, where a person is arrested for no reason that anybody can explain and is stuck in the judicial system for a long time. Fascinatingly, in that story nobody, neither the person arrested nor any individual member of the system itself, fully understands what is happening.

(01:00:45) Nobody can truly answer the questions, and eventually the person, spoiler alert, is mentally broken by the whole system, which is what bureaucracy can do in its most absurd form. It breaks the spirit, the human spirit in all of us. That’s the negative side of bureaucracy.

Pavel Durov (01:01:05) I agree with you on the absurdity of this thing because if this was a good faith attempt to fix an issue, there were so many ways to reach out to Telegram, to reach out to me personally, voice their concerns, and solve any alleged problem in a way that is conventional and diplomatic the way every other country on this planet solves these problems, including with Telegram. We did it dozens of times.

Lex Fridman (01:01:43) Yeah, you have a nice page showing this. These are details that most people don’t really think about, but Telegram is at the forefront of moderating CSAM and terrorist groups. There’s a nice page, telegram.org/moderation, that shows just the incredible number of groups and channels engaged in terrorist activity and CSAM activity that are actively found and blocked by Telegram. A lot of this work, like you said, because of the automation, is done with machine learning; the scale is insane.

(01:02:22) This is stuff that most noobs like me who are just chatting it up on Telegram don’t think about, but there’s just an immense number of people essentially doing things that violate the law on there and you have to find them immediately and catch it. I guess all platforms have to deal with it. Telegram was doing a great job of dealing with that content. What you’re saying is the French government had no idea. Do they even know what machine learning is?

Pavel Durov (01:02:53) It’s a concept that is challenging to explain to them, but I think they will learn much more about it by the end of this investigation. That’s my hope. In any case, you’re right. If you look at Telegram, we’ve been fighting harmful content that is publicly distributed on our platform for 10 years, actually since the time we launched public channels on Telegram. For something like eight years, we have had daily transparency reports on how many channels related to child abuse or terrorist propaganda we’ve taken down each day.

(01:03:41) Every day we’d take down maybe hundreds of them, and if you include all kinds of content that we remove, all the accounts, groups, channels, posts, that would amount to millions of pieces of content every week, hundreds of thousands every day. Then somebody would read the newspaper, get enraged because they read something about child porn, a subject that is very emotionally charged, and start doing something not based on data, logical thinking, and laws, but based on emotions driven by inaccurate input.

Lex Fridman (01:04:36) Yeah, I think we should make pretty clear that there’s no world, no reason that the French government should have arrested you, but here we are. That’s the situation you’re in. So, to be clear, you have to show up in front of a judge. All of this is beautifully absurd. It would be hilarious if it wasn’t extremely serious. You have to show up in front of a judge every certain amount of time. What is that experience like?

Pavel Durov (01:05:01) In France, they have this role of investigative judge. I don’t think you have it in many other places in the world. It means I’m not on trial, I’m being investigated. In France, it’s not just the police or prosecutor asking me questions. It’s a judge, which in my experience is still more like a prosecutor, but it’s called a judge. That makes it harder to appeal. So, if you are limited in, say, the countries where you can travel, then appealing that restriction will take you a lot of time. The investigation itself should never have been started. It’s an absurd and harmful way of solving an issue as complicated as regulating social media. It is just the wrong tool. So, we objected to and appealed the investigation itself. We did that last year, I believe.

(01:06:14) We still haven’t even been given a hearing date for the appeal, because the process is painfully slow, not just for me but for everybody, which made me realize the system may be broken on many levels. You have other entrepreneurs affected by the French justice system telling me horror stories about their experiences, where businesses got paralyzed by very unnecessary actions of investigative judges that ended up being unjustified and biased. In the end, you can perhaps solve it when you reach a higher court and you’ll get justice, but you’ll lose a lot of time and energy in the process. So, this is the one thing that is, I hope, different and will be different in this case compared to the story you told from Kafka.

Lex Fridman (01:07:31) I mean, but it does, as Kafka describes, break a lot of people with time. So, we should say that for a long time you were not allowed to travel out of France. Now you can travel to Dubai. We’re now in Dubai; I got to meet many of the people that work at Telegram. Telegram is headquartered in Dubai, but you’re not allowed to travel anywhere else. When do you think you’re coming to Texas to hang out with me over there?

Pavel Durov (01:08:01) That’s a hard question to answer because it doesn’t depend on just my actions. I can just say this, I’m patient. I will not let this limitation on my freedom dictate my actions. I will, if anything, double down on defending freedoms because I experienced firsthand what the absence of freedom feels like at least during these four days in police custody when you are just stuck, unable to communicate with people that are important to you, when you don’t even know what’s going on in the world in relation to you personally. So, I have no crystal ball that would tell me the future. I can’t say that I am pessimistic. I think we’ve been able to gradually remove most of the restrictions initially imposed on my freedom last August.

Lex Fridman (01:09:23) If the French government or the French intelligence agency want to have a back door or want to access private user messages, what would you say to them? Is there anything they can do to get access to the private user messages?

Pavel Durov (01:09:42) Nothing. My response would be very clear, but it won’t be very polite. So, I’m not sure.

Lex Fridman (01:09:52) It’s good to say here.

Pavel Durov (01:09:53) It’s good to say because you are wearing a tie.

Lex Fridman (01:09:57) Yeah, this is a serious adult, gentleman-like program.

Lex Fridman (01:10:00) But that is a concern that people have: that when you have so much pressure from governments, over time they’ll wear you down and you’ll give in. And then, of course, other places use that as propaganda to try to attack you; you get attacked by basically every nation. So, it’s a difficult medium in which to operate. It’s difficult to be you, fighting for freedom, fighting to preserve people’s privacy. But is there something you could say to reassure people that you’re not going to sacrifice any of the principles that you’ve just expressed if the French government just keeps wearing you down?

Pavel Durov (01:10:42) I think the French government is losing this battle; this battle is wrong. The more pressure I get, the more resilient and defiant I become. And I think I have proven that in the last several months, when there were attempts to exploit my situation of being stuck here in France by approaching me and asking me to do things in other countries: blocking certain channels, changing the way Telegram works. And not only did I refuse, I told the world about it, and I’m going to keep telling the world about every instance where any government, in this case in particular the French government, tries to force me to do anything. And I would rather lose everything I have than yield to this pressure, because if you submit to this pressure and agree with something that is fundamentally wrong and violates the rights of other people as well, you become broken inside, you become a shell of your former self on a deep biological and spiritual level.

(01:12:10) So, I wouldn’t do that. There are probably other people in the world that would consider that; I don’t care. If Telegram disappears, and this is something people don’t understand, including these intelligence services or governments, I don’t care, I’ll be fine. If they put me into prison for 20 years, which, let’s be clear, is not something that I think is realistic, but let’s just think about it as a hypothetical situation, I would rather starve myself to death and die there, reboot the whole game, than do something stupid.

Romanian elections

Lex Fridman (01:12:59) Let me ask you about an example of the thing you’re talking about. Tell the saga of Telegram in the Romanian election. So, amidst all this, you are still fighting to preserve the freedom of speech. What happened and what were some of the decisions you had to make?

Pavel Durov (01:13:16) So, when I got stuck in France, unable to leave the country for a few months, I was offered a meeting with the head of the state foreign intelligence services through a person I know quite well. He’s actually a well-known tech entrepreneur in France and he’s well-connected, and he said, “This guy wants to meet you.” I said, “Okay, fine, let’s do that, but I’m not promising anything.” I took the meeting and, in this meeting, I was asked to do what I see as restricting freedom of speech in Romania. I don’t know if you followed the whole saga with the Romanian elections: they had a presidential election last year, and the results got canceled. At the point when I had this meeting, Romania was preparing for a new presidential election, and the conservative candidate was not somebody the French government was supportive of, so they asked me whether I would be shutting down, or would be ready to shut down, channels on Telegram that supported the conservative candidate or protested against the pro-European candidate, as they called the guy they liked.

(01:14:49) I said, “Look, if there is no violation of the rules of Telegram, which are quite clear, you can’t call for violence. But if it’s a peaceful demonstration, if it’s a peaceful debate, we can’t do this; it would be political censorship. We protected freedom of speech in many countries in the world, be it in Asia, Eastern Europe, or the Middle East, and we’re not going to start engaging in censorship in Europe, no matter who is asking us.” I was very clear to the guy, who was the head of French intelligence. I said, “If you think that, because I’m stuck here, you can tell me what to do, you are very wrong. I would rather do the opposite every time,” and in a way that’s what I did. I had a small debate with him about the morality of this whole thing and then, at a certain point, I just disclosed the content of this entire conversation, because I never signed an NDA. I don’t ever sign NDAs with people like that; I want to be able to tell the world what’s going on.

(01:16:12) And it’s quite shocking to me that you would have people in the French government trying to take advantage of this situation. Even if they had nothing to do with the start of this investigation itself and just used it to reach their political or geopolitical goals, I consider it an attempt to humiliate me personally and millions of Telegram users collectively. And it’s quite strange that the same agency asked us to do certain things in Moldova as well. So, even before that, I think it was October last year or September, I was arrested in Paris in late August and then again approached through an intermediary and asked, “Would you mind taking down some channels in Moldova, because there is an election going on and we’re afraid there’s going to be some interference with these elections. Could you please connect with representatives of the government of Moldova and take care of it?” We said, “We’re happy to take a look at it and see if there is content there that is in violation of our rules.”

(01:17:50) And they sent us a list of channels and bots, some of them were … So, it was a very short list, and some of these channels and bots were indeed in violation of our rules and we took them down, only a few of them; the rest were okay. Then they said thank you and sent us another list of dozens of channels, many, many channels. We looked at these channels, and we realized that there was no solid foundation to justify banning them, and we refused to do that. But interestingly enough, the French intelligence services that were asking us to do this in Moldova let me know through the contact that, after Telegram banned the few channels that were in violation of our rules in Moldova, they talked to my judge, the investigative judge in this investigation that has been started against me, and told the judge good things about me, which I found very confusing and, in a way, shocking, because these two matters have nothing in common.

(01:19:27) Why would anyone talk to an investigative judge that is trying to find out whether Telegram did a good enough job in removing illegal content in France? What does Moldova have to do with it? I got very suspicious at that moment. Remember, it happened after we blocked a few channels that violated our rules, but before we refused to block a long list of other channels that were completely fine, which is people expressing political views which I may not agree with, but it’s their right to express them. Not extreme views, not views that call for violence. That was extremely alarming; that was the moment when I told myself that there may be more going on here than I initially thought. Initially I thought, yeah, some people are confused about how technology works, and then, after this case in Moldova, I got much more suspicious. So, by the time the head of the intelligence services met me to ask me to help them silence conservative voices in Romania, I was already wary of what could be going on next.

Lex Fridman (01:21:18) Yeah. So, clearly, this was a systematic attempt to pressure you to censor political voices that the French government doesn’t agree with. And we should say that you have fought for freedom of speech for left-wing groups and right-wing groups; it really doesn’t matter. So, you don’t have a political affiliation, a political ideology that you fight for; you’re creating a platform that, as long as they don’t call for violence, allows people from all walks of life, from all ideologies, to speak their mind. That’s the whole point. And it happens to be conservative voices in the Romanian election that the French government wanted to censor, because, currently, the French government leans left. But if you flipped everything around and the government were right-wing, you’d be fighting against censorship of left-wing voices, and you have, in the past, many times.

Pavel Durov (01:22:13) Exactly. Ironically, we received a request from the French police to take down a channel of far left protesters on Telegram in France. We refused to do that. We looked at the channel, peaceful protesters. It doesn’t matter for us whether we are defending the freedom of speech of people leaning right or leaning left. During COVID, we were protecting activists that were organizing the Black Lives Matter events and the other side, the protesters against lockdowns. We protect everybody as long as they are not crossing the lines and not starting to call to violence or incite damage to public property. It’s a fundamental right to assemble. It’s interesting that people who haven’t had this experience of living in countries that don’t have freedoms don’t always realize how dangerous it is to gradually compromise your values, your principles, your freedoms, your rights because they don’t understand what’s at stake.

Power and corruption

Lex Fridman (01:23:56) Yeah, these things become a slippery slope. So, you have, for many, many years, including currently, spoken very highly of France; you love French history, French culture. I think this situation, this historic wrong that’s been done, is, put simply, just a gigantic PR mistake for France. No entrepreneur who aspires to be the next Pavel Durov, to create the next Telegram, sees this and wants to operate in France. There is no justification for this arrest: there’s a misapplication of the law, all kinds of pressures, all kinds of behavior that seems politically motivated, all that kind of stuff, all the excessive regulation and the bureaucracy, a nightmare for entrepreneurs that dream of creating something impactful and positive for the world.

(01:24:50) So, what do you think needs to be fixed about the French government, the French system and then, zooming out, because you see similar kinds of things in Europe, that could enable entrepreneurs, that could reverse the trend that we seem to be seeing in Europe that is becoming less and less friendly to entrepreneurs? What can be fixed? What should be fixed?

Pavel Durov (01:25:20) I think European society must decide where they want their ever-increasing public sector to stop increasing, what they think should be the right size of government. Because today, if you take France for example, which is a beautiful country with a lot of talented people, public expenses are 58% of the country’s GDP, which is maybe as much as, or more than, in the late stage of the Soviet Union. So, you have this imbalance where you have many more people representing the state as opposed to people trying to bring the country’s economy forward by creating great products and great companies.

(01:26:26) The start-up field, and my field, the social media field, has been affected by it immensely. There was one great start-up in this realm in France in the last 10 years. It was this location-based social network, and it was eventually sold to Snapchat. But before it was sold, the founder asked me whether he should sell. I told him, “Never sell. You have a great thing going. You have lots of users, you have organic traction in many countries, and this is the first success story of this kind in France.” But then he sold anyway a couple of weeks later.

(01:27:12) And later I met him, he’s trying to do a new thing now, and I asked him, trying to understand what went wrong. One of the things he told me about is that, while he was trying to run his company, competing with Facebook, Instagram, Snapchat, having all this pressure from investors, trying to hire the best people and persuade them to go to Paris, and he did a great job by the way, he also got attacked by some silly investigation, again involving data protection issues, which lasted forever and was gradually sucking the blood out of his team and his company: constant interrogations, disclosure requests.

(01:28:14) And this is a young company, it significantly increases the level of stress and, at some point, I think the pressure was too much, he decided to, again, just sell it. Eventually it turned out that there was no issue, the investigation ended as far as I understand with no charges but, such investigations, they have a price, they have a cost.

(01:28:45) And unless society realizes the cost of projects, of companies, of start-ups that are never created, or are sold at a very early stage to the United States or other countries, resulting in decreased economic growth, things won’t change. We just talked to a guy a few days ago who left France and started a business here in Dubai, and one of the reasons he had to leave France is that the government started an investigation into his company and froze his bank accounts, and this investigation, which involved taxes, lasted for many, many years. I believe he said eight years.

(01:29:36) And at the end of these eight years, the government reached the conclusion that there was nothing wrong, he’s good, it’s okay. In the meantime, his corporate bank accounts were frozen and his business died. The only reason he was able to retain his sanity is that he moved to Dubai and started a new company, which is incredibly successful, and now he’s enriching this city, which we’re in right now, with his great ideas and creativity.

Lex Fridman (01:30:17) And by the way, having interacted with him, there’s a fire in his eyes, the human spirit that fuels entrepreneurship. Whatever that is, he doesn’t have to do it, he’s made a lot of money. He probably doesn’t have to do anything, but he still wants to create, and that fire is what fuels great nations. Build, build, build, build new stuff, expand, all of that, and regulation suffocates that.

Pavel Durov (01:30:40) You have to cherish these people.

Pavel Durov (01:30:42) But I guess the French public, or some part of the French public, was misled, I don’t know when, perhaps since the time of the French Revolution, to believe that entrepreneurs are somehow their enemies. They’re the evil rich people that are the cause of all problems, as if, if only you could make the rich share their ill-gotten wealth with the rest of the population, then every problem would be magically solved. In reality though, a lot of these people that are starting such companies with fire in their eyes are sacrificing their lives, their livelihood.

(01:31:27) They’re working 20 hours a day, they’re experiencing immense stress in order to fulfill the vision and bring value and good to the society around them. They create jobs, they create great services, they create great goods, they make your country grow, they make your people proud, you have to cherish them. But what does the system do to them? It squeezes them out because perhaps there was somebody in the tax authority that decided to advance their career and perhaps was too ambitious and not too smart so, as a result, a company was destroyed.

(01:32:17) And now the same entrepreneur, by the way, who we talked to is invited to come back to France. He’s been offered really good terms, he said they’re going to open this new venue on Champs-Élysées, we’re going to give you the best location, we’re going to fund part of it, tax breaks and he said, “Never. Just forget about this, it’s impossible. I’m not coming back to France.” He’s traumatized by the experience and he’s French, he was born there, he has a French passport. So, unless things like this change, France and the rest of Europe will keep struggling with economic growth, with budget deficits, with unemployment and all the other relevant social and economic metrics.

Lex Fridman (01:33:06) Yeah, it’s heartbreaking. I appreciate the history and the culture of many of these nations, and I hope Europe and France flourish, but these are not the components that are required for flourishing. Quick pause, I need a bathroom break.

Intense education

(01:33:24) All right, we had some tea, we’re back. Let’s go back a bunch of years to the beginning. You mentioned you went to school with super intensive education so I thought it’d be really interesting to look at some of the powerful aspects of that education from the languages to the math. Can you actually describe some of the rigorous aspects of it and what you gained from it?

Pavel Durov (01:33:48) At the age of 11, I got the opportunity to enter an experimental school in St. Petersburg, where I lived, and you had to pass a rigorous test to get accepted. The idea behind the school was that, if you try to squeeze as much information as possible into the brain of a teenager, with a focus on maths and foreign languages, then there will be some changes in the brain of the student that will allow the student to understand most other disciplines. As a result, we had a class that didn’t have any single focus; it was spread across a lot of disciplines. You would have at least four foreign languages, including Latin, English, French, German, and in addition you could take Ancient Greek. You would have classes like biochemistry or psychoanalysis, evolutionary psychology. The difference of this class, as opposed to other classes in the same school, which was part of St. Petersburg State University and called the Academic Gymnasium, was that, unlike other classes which specialized in some single subject like physics or maths or history, this one tried to take the best from all of these specialized classes and bring it into one curriculum. Since it was an experimental class, it wasn’t possible to become a straight-A student, to be excellent in all the subjects; it was always considered crazy to even try.

Lex Fridman (01:35:48) So, it’s assumed nobody’s able to handle it, you’re just pushing the limits of the human mind. Four languages in parallel, math, evolutionary psychology, just overwhelming the mind and see what happens.

Pavel Durov (01:35:59) Yes, see what happens. This was an experiment, and it was in the middle of the ’90s, remember, when Russia, particularly its educational system, wasn’t regulated as much as it is today. It was in between the two stages of Russian history: Soviet history and the modern Russian history of the 21st century. In any case, I learned a lot from that experience. First of all, the reason I got into this school is that I kept being kicked out of other schools.

Lex Fridman (01:36:38) Challenging authority?

Pavel Durov (01:36:39) I was good at all subjects but not behavior. We had this behavior grade in the Soviet Union in early ’90s, perhaps they even have it today, I’m not sure. I was very bad at behavior, always challenging the teachers, always pointing out their mistakes.

Lex Fridman (01:36:59) By the way, that’s not such a bad thing, right? If you were looking back, there’s some value to that for young people to, maybe respectfully, but challenge the authority, the wisdom of old, right?

Pavel Durov (01:37:14) I think I was very lucky to be able to do that and to be able to get away with it in the end because, normally, if you keep challenging authorities, you just get kicked out of all schools and then you end up nowhere. So, I eventually got into a school where challenging teachers was not fully okay but it was something that you could do and then you would start a debate with the teacher and normally they would allow you to express your point of view and then some objective truth may come out of it as a result.

(01:37:58) But at that point, I was pretty bored with my life; every teenager gets to a point when they have this sort of existential crisis. What’s the point of life? What am I even doing here? At some point, I decided, since I have to go to school anyway, I might as well try to do something impossible and become the best student and get an A, or what we called a five in the Russian system, in every single subject, and that kept me busy for a while.

(01:38:40) It was incredibly difficult because you didn’t have enough time. Even if you just studied all the time, not doing anything else, you didn’t have any time left to prepare all the homework tasks and get ready for all the tests. So, I ended up using the breaks between classes, but I got to the result I wanted to get to. I got an excellent mark in every subject, and that kept me happy for a while.

Lex Fridman (01:39:19) What did you understand about an effective education system from studying foreign languages while at the same time covering such a diversity of subjects? If you were to design an education system from scratch for young people, especially in the 21st century, what would that look like? You posted about the value of mathematics as a foundation for everything.

Pavel Durov (01:39:39) Yeah. I still think math is essential. It’s something that shapes your brain, it teaches you to rely on your logical thinking, to split big problems into smaller parts, put them in the right sequence, solve them patiently, trying again if it doesn’t work. This is exactly the same skill you’ll need in programming and project management, and when you start your own company. And it’s one of the few subjects at school which encourages you to develop your own thinking, as opposed to relying on what other people have to say and just repeating their opinions. That is extremely valuable. And of course, once you’re good at math, you can apply it in physics, in engineering, in coding. And it’s not surprising that most of the most successful tech founders and CEOs are very good at maths and coding because, ultimately, it’s the same mental skill that you rely on.

(01:41:05) But back then in the school, I realized something else as well, it’s that competition is really important, competition is key. This is what motivates a lot of teenagers when there is school and, if you remove competition out of the education system, you end up forcing kids to start competing elsewhere, for example, in video games. It’s a trend you see now in many countries, including in the West, when well-meaning authorities or parents say we don’t want our kids to be too stressed, we don’t want them to feel anxiety so let’s just get rid of all the public grading system, all these rankings of who won, who lost, we don’t want any of that.

(01:42:06) And part of it is justified but, as a result, some kids lose interest. Yes, you eliminate the losers but you end up eliminating the winners as well. And then, if you are overprotective of the kids in that age, they grow up, graduate schools, the universities and they’re still not prepared for real life because real life is constant competition for jobs, for promotions, for customers and it’s more brutal.

(01:42:47) What you have as a result is high suicide rates, high unemployment, all the things and negative trends you see now in many countries which thought eliminating competition from their education system was a good idea. They still persist, they still think competition is a bad thing, they try to eliminate competition from their economy as well to an extent saying we are going to make sure the losers don’t lose and the winners don’t get too much but, as a result, they make their entire systems less competitive, their entire economies.

(01:43:34) Some of them in Europe are now struggling to keep up with China, with South Korea, with Singapore, with Japan and other places where the education system was based on ruthless competition. So, this is a hard choice any civilization has to make. Do we support competition, understanding that, eventually, it leads to progress in science and technology and abundance for society at large, or do we remove competition, thinking that somehow we can shield the future generations from the stress that competition inevitably causes?

Lex Fridman (01:44:22) Yeah, it’s grounded in a good instinct of compassion, you don’t want people who suck at a thing to feel pain but it seems like struggle is a part of life, either you do it earlier or you do it later. And it’s true, that’s such a good point that competition does seem to be a really powerful driver of skill development, like you mentioned, pursuing mastery. There’s something in human nature that, especially for young people, if you can compete at a thing, you’re going to be really driven to get good at that thing. If you can direct that in the education system as China does, as many nations like you mentioned do, then you’re going to develop a lot of brilliant people.

Lex Fridman (01:45:00) … do, then you’re going to develop a lot of brilliant people, resilient people, people that are ready to create epic shit in the world.

Pavel Durov (01:45:07) I think there is a lot of evidence proving that we are biologically wired to compete and establish our understanding of what our qualities are and talents are in relation to other people around us, and this is one of the ways society self-regulates.

Nikolai Durov

Lex Fridman (01:45:30) Speaking of competition, your brother, Nikolai, he’s a mathematician, programmer, expert in cryptography. He won the IMO, the International Mathematics Olympiad, getting a gold medal three times, won the ICPC programming contest two times, has two PhDs in mathematics, and you have worked together for many years creating the incredible technologies that we’ve been talking about. So what have you learned about life from your brother?

Pavel Durov (01:46:02) Well, first of all, I must say I learned pretty much everything from my brother, everything I know, because when we were kids, we slept in the same bedroom, beds a few feet away from each other, and I kept bugging him with questions. I would ask him about dinosaurs and galaxies and black holes and Neanderthals, everything I could think of, and he was my Wikipedia back in the time when we didn’t have internet access. He’s a unique prodigy kid, probably one in a billion.

(01:46:45) He started reading at the age of three, I think, and he pretty fast got so advanced in maths, that by the age of six, he could already read really sophisticated books on astronomy. Sometimes when he did it in public places, like buses or metro, my mom was criticized by people who were witnessing it. They would tell her, “Why are you mocking your own kid with this serious book? It’s obvious that the kid can’t understand everything there. It’s too complicated even we don’t understand anything there. There’s some formulas,” and he was already sucking in this knowledge. He just has this thirst for information.

(01:47:39) So he was the source of all kinds of great facts, useful things, inspiring things. He taught me pretty much everything I know. At the same time, he’s incredibly modest and kind, and this is something I think a lot of people who think they’re smart but are not genuinely intelligent lack. More often than not, people who are truly intelligent are also kind and compassionate.

Lex Fridman (01:48:21) You actually have been staying out of the public eye for the most part. You’ve done very few interviews, you’re pretty low-key, but your brother is on another level. He’s been staying out of the public eye. What’s behind that?

Pavel Durov (01:48:34) Part of it is his natural modesty. He doesn’t need to do it. He doesn’t feel this urge to show off, brag about stuff. I tried to avoid it as well, but at a certain point I realized that me being too private, too secretive becomes a liability because it creates this void, this emptiness that people and organizations that don’t like Telegram very much are willing to fill with inaccurate information and they’re willing to spread the narratives about Telegram, which can result in strange situations, some of which we discussed earlier. For example, this French investigation.

Lex Fridman (01:49:32) Yeah, I’ve gotten to know you more and more and there’s a deep integrity to you that I think is good to show to the world. There are a lot of attack vectors on user privacy and I think the most important, the last wall of protection, is the actual people that are running the company, so it’s important to some degree for you to be out there showing your true self.

Programming and video games

(01:49:55) So we should say that also, you didn’t mention it, but you were a programmer from an early age. You started coding at 10. The first thing you built was a video game at 11, and then eventually, 10 years later, at 21, you programmed the initial versions of VK single-handedly. Can you talk to me about your programming journey that led to the creation of VK? What was the VK stack? Is it PHP mostly? How did you figure out how to program websites, all of that?

Pavel Durov (01:50:27) Yeah, I probably wasn’t that interested in websites at first. I didn’t even have access to the internet when I was 10 years old, but I liked video games. I didn’t have enough of them, and the scarcity forced me to start building them, my own computer games, just to play myself.

(01:50:49) It’s actually an interesting thing that we sometimes don’t realize it, but scarcity leads to creativity, and one of the reasons you have so many people who love to code coming from the Soviet Union or other places which didn’t have much access to modern technology, and more importantly modern entertainment, is that perhaps we were not so much distracted by all this abundance of different entertainment options, which is not to say it’s bad to have those options. It’s just a fact that we sometimes don’t appreciate.

(01:51:34) So I started to build computer games. My brother would sometimes guide me. For example, I would create a turn-based strategy. Of course, two-dimensional. Back then, three-dimensional was too much for me. But it wasn’t very slick in terms of scrolling FPS, the frames-per-second parameter, and I asked my brother how to optimize it. He would guide me, and this kind of learning and training really shaped my coding skills when I was younger.

(01:52:21) Then I started to create video games for my classmates, when we played, for example, tic-tac-toe on an infinite field in my class during the breaks. Not tic-tac-toe with three in a row; this was about five in a row, on an infinite field. This is a much more interesting game and it gets quite complicated if you keep playing it. My classmates used to love it, and some of my classmates were really smart, champions of math olympiads, sons and daughters of professors at the university, and I decided, “No, I want to win every single time. I don’t want to lose even a single time. So how do I win? I need to practice more, but how do I practice more? I need an opponent stronger than myself.”

(01:53:08) So I coded this game so that I would play against the computer, and the computer would calculate, I think, four moves in advance to choose the optimal strategy. That wasn’t enough. Four moves in advance, I would still win over it. If I tried to calculate five or six, it was too slow, so I asked my brother to help me out here. So he made this algorithm. Eventually, I trained myself to win every single time, even against the computer (back then, we didn’t have modern CPUs), and I could still retain some self-confidence.
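What Pavel describes is a classic depth-limited game-tree search. A minimal sketch of that idea for five-in-a-row on an unbounded board, assuming a dictionary-based board and a crude heuristic (illustrative only, not his original code):

```python
# Five-in-a-row with a fixed lookahead ("four moves in advance").
# Board: dict mapping (x, y) -> 1 or -1; empty cells are simply absent.

DIRS = [(1, 0), (0, 1), (1, 1), (1, -1)]

def line_length(board, x, y, dx, dy, player):
    """Consecutive stones of `player` through (x, y) along direction (dx, dy)."""
    n = 1
    for s in (1, -1):
        cx, cy = x + s * dx, y + s * dy
        while board.get((cx, cy)) == player:
            n += 1
            cx, cy = cx + s * dx, cy + s * dy
    return n

def wins(board, x, y, player):
    return any(line_length(board, x, y, dx, dy, player) >= 5 for dx, dy in DIRS)

def candidates(board):
    """Only empty cells next to existing stones, to keep the branching factor small."""
    cells = set()
    for (x, y) in board:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (x + dx, y + dy) not in board:
                    cells.add((x + dx, y + dy))
    return cells or {(0, 0)}

def score(board, player):
    """Crude heuristic: squared line lengths for `player` minus the opponent's."""
    total = 0
    for (x, y), p in board.items():
        s = sum(line_length(board, x, y, dx, dy, p) ** 2 for dx, dy in DIRS)
        total += s if p == player else -s
    return total

def best_move(board, player, depth=4):
    """Negamax search `depth` plies deep; returns (move, value) for `player`."""
    best, best_val = None, float("-inf")
    for move in candidates(board):
        board[move] = player
        if wins(board, *move, player):
            val = float("inf")
        elif depth == 1:
            val = score(board, player)
        else:
            val = -best_move(board, -player, depth - 1)[1]
        del board[move]
        if best is None or val > best_val:
            best, best_val = move, val
    return best, best_val
```

With a depth of four plies this is playable on a modern machine; pushing it to five or six without pruning blows up combinatorially, which matches the slowdown he describes on the hardware of the time.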

(01:53:54) I would go back to school during breaks, play with my classmates, and soon people started to lose interest. None of my classmates wanted to play this game anymore. I killed the game because there’s…

VK origins & engineering

(01:54:09) So after that, when I got into the St. Petersburg State University, it was quite boring to just study because it was too easy. So I thought, “What can I do there?” I created a website for the students of my faculty first. I organized the creation of digital answers to all exams and digitized versions of all lectures, which was something very unique back then. Remember, it was 25 years ago. I put together a website where I would publish all these materials, and pretty soon it became super popular. I opened a discussion forum there. In a few years, I expanded to the university with all of its other departments, and then to other universities. We ended up having tens of thousands of users just as a students’ portal. We had all kinds of social features there, friends lists, photo albums, profiles, blogs. All of it.

(01:55:29) It was quite successful, and after I graduated from the university, one of my ex-classmates from the school reached out to me after reading about my successes in a newspaper, the main business newspaper of St. Petersburg, and he asked me, “Are you trying to build a Russian Facebook?” I said, “I’m not sure. What’s Facebook?” So we met. Since he had graduated from an American university two years before that, he showed me Facebook. I thought, “Well, I already have all of this technology, but it’s valuable to know which elements I should get rid of in order to scale this thing and have millions of users.”

(01:56:25) This is also something people don’t appreciate that sometimes in order to move forward and have more success, you have to get rid of things, including technology. Getting rid of features is super important.

Lex Fridman (01:56:40) Simplify, both for scaling and for making it amenable to just growing the user base where people get it immediately.

Pavel Durov (01:56:50) Yes. Otherwise, it’s just too complicated for the new user. The existing users will be happy, they’ll be praising you, they will be asking you to add more stuff to make it even more complicated, so it’s easy to lose track and get disoriented if you are only relying on the feedback of existing users.

(01:57:18) So as a result, I started the website called VKontakte or VK, it means “in touch” in Russian, initially to solve my own personal problem. I graduated the university that same year and I wanted to be in touch or remain in touch with my ex-classmates from the university and the other fellow students. And of course, as a 20-year-old, I wanted to meet other people, including good-looking girls.

(01:57:46) So I started to build it from scratch. For that one, I thought, “I’m not going to use any third-party libraries, modules because I want to make it as efficient as possible.” I was obsessing over every line of code, but then how do you start something that large? I didn’t have any prior experience creating a project of that scale, which would involve everything. Before, I would reuse some existing solutions. Here, I wanted to build from scratch.

(01:58:26) So I called my brother. He was a post-doc in Germany at the time, at the Max Planck Institute, and I asked him, “What should I start from?” And he told me, “Just build a module to authorize users, just to log in, not even to sign up, just to log in, because you can pre-populate the database with credentials and emails and passwords. It doesn’t really matter. But once you see that you can type in your password and email and you are in and it tells you, ‘Hello,’ using your name, then you will have a clear understanding of where to go from there.”
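That advice maps to a very small slice of code. A minimal sketch of the “log in first” module, with a pre-populated fake user table and hypothetical names (not VK’s code):

```python
# Pre-populate a fake "users" table, verify credentials, greet the user by name.
import hashlib

def hash_pw(password):
    # Illustrative only; a real system would use a salted, slow hash (bcrypt, argon2).
    return hashlib.sha256(password.encode()).hexdigest()

USERS = {  # "you can pre-populate the database with credentials and emails and passwords"
    "pavel@example.com": {"name": "Pavel", "pw_hash": hash_pw("secret")},
}

def login(email, password):
    user = USERS.get(email)
    if user and user["pw_hash"] == hash_pw(password):
        return f"Hello, {user['name']}"   # "...it tells you, 'Hello,' using your name"
    return None

print(login("pavel@example.com", "secret"))   # Hello, Pavel
print(login("pavel@example.com", "wrong"))    # None
```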

Lex Fridman (01:59:22) Yeah. I mean, that’s true.

Pavel Durov (01:59:24) That’s one of the best pieces of advice I’ve ever gotten in my life. It worked perfectly, by the way. I started to build it and before I knew it, I had photo albums, private messages, this guest book on the website. We used to call it “the wall” back on VK, and I guess in the early days of Facebook. We ended up building something even more sophisticated than Facebook at the time, with more features.

(01:59:54) I had a girlfriend at the time. I asked her, “We need to somehow come up with a database of all Russian schools and universities and the departments and subdivisions.” She did a great job trying to source all this information online, or sometimes writing emails to universities saying, “Which departments do you have exactly at this point? We need to know,” or reaching out to the Department of Education, first in Russia and then in Ukraine, and then eventually in Belarus and in Kazakhstan and other countries where VK ended up being the largest and most popular social network.

(02:00:38) So we did a few things that were quite unique at the time, and for the first almost a year, I was the single employee of the company. I was the backend engineer, the front-end engineer, the designer. I was the customer support officer. I was the marketing guy as well, coming up with all the wordings and the announcements, coming up with competitions to promote VK, which worked quite well. That was an incredible experience that gave me knowledge of every aspect of a social networking platform.

Lex Fridman (02:01:30) Also understanding of how much a single person can do.

Pavel Durov (02:01:32) Exactly. It’s one of the reasons why I’d like to think I’m an efficient project manager and product manager inside Telegram, because I will not take anything but ambitious deadlines from my team members. If somebody tells me, “Oh, I need three weeks to do that,” I always reply, “Well, I built the first version of VK in just two weeks. Why would you need three weeks? It seems like something you could make real in just three days. Three weeks? What are you going to do for the rest of the three weeks apart from these three days?”

(02:02:18) And the team knows me, and that’s why we at Telegram are able today to move at a very good pace of innovation. Every month we’re pushing several meaningful features, I think out-competing everybody else in this industry in terms of what you can do within a short timeframe. So yes, that experience was invaluable.

(02:02:52) As for the stack, I started from PHP and MySQL, Debian Linux, but very soon I realized, “I need to optimize this.” I started using Memcached. Apache servers were not enough anymore. We had to set up NGINX. And my brother was still living in Germany, so he couldn’t help me much for the first year of building VK. Sometimes I would manage to get through to him on a call. I would use an old-school wired phone to call him. I said, “What do I do? How do I install this thing called NGINX? I’m not a Linux guy.” If he felt particularly kind that day and not too busy, he would show me the way to do it or set it up himself, but for the most part, I had to rely on just myself.
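The Memcached step he mentions is what is now called the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. A minimal sketch with a plain dict standing in for the Memcached client and a stub for the SQL query (an illustrative assumption, not VK’s code):

```python
import time

cache = {}          # stand-in for a Memcached client
CACHE_TTL = 60      # seconds a cached row stays fresh

def query_db(user_id):
    # Stand-in for something like "SELECT * FROM users WHERE id = %s"
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry and entry[0] > time.time():       # cache hit: skip the database entirely
        return entry[1]
    row = query_db(user_id)                    # cache miss: one database query
    cache[key] = (time.time() + CACHE_TTL, row)
    return row
```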

(02:03:53) Having him there helped, though, when we started to grow fast and started to scale it, because at first, you realize, “One server is not enough. I need to buy another one. Then another one and another one.” The database should be on a different server. Then you have to split the database into tables. Then you have to come up with a way to shard the tables using some criteria that would make sense, that wouldn’t break your user experience.
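Splitting tables across servers by some stable criterion is what is now called sharding: each user’s rows live on exactly one database server, so lookups never have to fan out across all of them. A minimal sketch with made-up hostnames:

```python
DB_SHARDS = ["db1.internal", "db2.internal", "db3.internal", "db4.internal"]

def shard_for(user_id):
    # Simple modulo sharding; real systems often use consistent hashing so that
    # adding a new shard does not reshuffle every existing user.
    return DB_SHARDS[user_id % len(DB_SHARDS)]

assert shard_for(42) == shard_for(42)   # the same user always lands on the same shard
```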

(02:04:28) When we got to over a million users and more than a dozen servers, surviving without input from my brother in terms of taking care of the scaling aspect became impossible. I remember asking him to come back, “You need to help me with this thing. It’s starting to be really big.” What was worse is that, since we became popular, somebody started to do DDoS attacks on us, as always happens. And then we had people that wanted to buy a share of VK, and interestingly, every time we had a negotiation day, the DDoS attacks intensified, so we had to come up with a way to fight it. I remember having many sleepless nights trying to figure it out.

Lex Fridman (02:05:30) So that was your introduction to all kinds of bad actors: DDoS, business. Then later you’d find out there’s such a thing called politics, and then later, geopolitics. But these are the initial stages, that it’s not just about creating cool stuff, it’s having to deal with, as you now have to deal with on Telegram, seas of bad actors trying to test the limits of the system, trying to break the system.

Pavel Durov (02:06:02) Unfortunately. If we didn’t have bad actors and pressure, it would be the best job ever. You just get to create.

Lex Fridman (02:06:12) Yeah, yeah. And so the help from your brother, like you mentioned, NGINX and sharding the tables, some of this scaling work is algorithmic in nature. It’s almost like theoretical computer science. So it’s not just about buying more computers, it’s figuring out how to algorithmically make everything work extremely fast, so some of it’s mathematics. Some of it is pure engineering, but some of it is mathematics.

Pavel Durov (02:06:44) Yeah. So at that stage, I could do the basic stuff. I could understand how I implement scalability into the code base, how I shard my tables in the database, where I include Memcached instead of direct requests to the database. That was quite easy because it was still PHP back in the day.

(02:07:14) When my brother got back from Germany somewhere around 2008, I asked him, “Can we make it even more efficient? Can we make it super fast and, at the same time, require even fewer servers to maintain the load?” And he said, “Yes, but PHP is not enough. I’ll have to rewrite a big part of your data engines in C and C++.” I said, “Okay, let’s do that.”

(02:07:47) He invited a friend of his to help him, another absolute champion of the world programming contest, twice in a row, and they put together the first customized data engine, which was far more efficient than just relying on MySQL and Memcached because it was, first of all, more specialized, more low-level.

Lex Fridman (02:08:19) So they rewrote it in C, C++?

Pavel Durov (02:08:21) A large chunk of it. For example, the search, the ad engine, because VK had targeted ads, they built that. It was very efficient what they did. Eventually, the private messaging part, the public messages part. At some point, we realized there are very few websites online that load faster than VK.

Pavel Durov (02:08:49) I remember in 2009, I went to Silicon Valley and I met Mark Zuckerberg for the first time, and some of the other core team members of early Facebook. Remember, Facebook was just four or five years old. And everybody kept asking me, “How come even here in Silicon Valley, VK loads faster than Facebook? Everything seems to appear instantly on your website. What’s the secret sauce?” That was one of the things that made them very curious.

Lex Fridman (02:09:25) And that was always important to you, to have very low latency to make sure the thing loads because that’s one of the things Telegram is really known for. Even on crappy connections and all that kind of stuff, it just works extremely fast. Everything is fast.

Pavel Durov (02:09:37) As one of the core technological ideas, we prioritize speed. We think that people can notice the difference, even if it’s just a 50 millisecond difference. The difference is subconscious. It also allows us not just to be faster and more responsive, but also more efficient when it comes to the infrastructure, the expenses. Because if your code executes faster, it means you need fewer computational resources to run it.

(02:10:16) So there is no way you can lose by making things faster, and that’s why we have always been very careful when hiring people. I would only hire a person if I’m ultimately certain they are the best option, because if you hire somebody who is maybe a little bit distracted, inexperienced, you may end up with inefficiencies in your code base that result in tens of millions of dollars of losses. And think about the responsibility: if we jump to today from the VK days, Telegram is used by over a billion people. They open it dozens of times every day. Imagine the app opens with a slight delay, say, a half-second delay. Multiply that by dozens of times a day, by a billion people. It’s centuries, millennia lost for humanity without any reason other than just being sloppy.
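A back-of-the-envelope check of that figure, with assumed numbers: one billion users, roughly twenty app opens a day, an extra half second per open.

```python
users, opens_per_day, extra_delay_s = 1_000_000_000, 20, 0.5    # assumed numbers
wasted_seconds_per_day = users * opens_per_day * extra_delay_s  # 1e10 seconds
wasted_years_per_day = wasted_seconds_per_day / (365 * 24 * 3600)
print(round(wasted_years_per_day))   # ~317 years of collective waiting, every single day
```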

Hiring a great team

Lex Fridman (02:11:24) That is so important to understand and so wise that it’s actually, if you’re just a little bit careless as a developer, you can introduce inefficiencies that are going to be very difficult to track down because you don’t know that it can be faster. The code doesn’t scream at you saying, “This could be much faster.” So you have to actually, as a craftsman, be very careful when you’re writing a code and always thinking, “Can this be done much more efficiently?” And it can be tiny things because they all propagate throughout the code, and so there’s a real cost in having a careless developer anywhere in the company because they can introduce that inefficiency and all the other developers won’t know. They’ll just assume it kind of has to be that way.

(02:12:11) So there’s a real responsibility for every single individual developer that’s building any component of an app like Telegram to just always ask, “Okay, can this be done more efficiently? Can this be done more simply?” And that’s one of the most beautiful aspects, the art forms of programming, right?

Pavel Durov (02:12:32) Oh, yes, because when you manage to discover a way to simplify things, make them more efficient, you feel incredibly happy and proud and accomplished.

(02:12:47) And to your point, I can recall a few instances in my career where firing an engineer actually resulted in an increase in productivity. Say you have two Android engineers building the app and they just can’t make it. They’re not keeping up with the pace of the feature release schedule. And you think, “I probably have to hire a third one,” but then you notice that one of them is really weird, falling behind the schedule, complaining some of the time, doesn’t assume responsibility. And you ask, “So what if I just fire this person?” And you fire this person. In a few weeks, you realize you actually never needed a third engineer. The problem was this guy, who created more issues and more problems than he solved.

(02:13:49) That is so counterintuitive because in developing tech projects, we tend to think that you just throw more people into something and then things get solved miraculously by themselves just because more people means more attention from them now.

Lex Fridman (02:14:12) That’s, again, extremely powerful. Steve Jobs talked about A players and B players, and there’s something that happens when you have B players, which is like the folks you’re talking about. Introduced into a team, they can somehow slow everybody down. They demotivate everybody. And it’s very counterintuitive that you basically, part of the work of creating a great team is removing the B players. It’s not just hiring more, generally speaking. It’s finding the “A players” and removing the people that are slowing things down.

Pavel Durov (02:14:48) Oh, yes, because the other thing that people don’t realize is how demotivating working with a B player is. Everybody can tell if the other person, the other engineer they’re working with, is really competent. And it’s very visible if the person is not competent. They’re asking the wrong questions, they keep lagging behind. And at a certain point, if you’re an A player, you get this dissatisfaction, this feeling that you are not able to realize your full potential, accomplish what you’re really meant to accomplish, because of this person working next to you or pretending to work next to you.

(02:15:37) And by the way, in some cases, it’s not because the person is lazy. In some cases it’s just the mental, the intellectual ability is not there. It’s not about experience. Most often it’s about natural ability and persistence. In 90% of cases, it’s just the inability to focus on one task for an extended period of time. Not everybody has this ability. So for people who do have this ability, it’s an insult to work alongside someone who is distracted and cannot go deep in the projects that they’re responsible for.

Lex Fridman (02:16:27) On this small tangent, what’s your hiring process? You’ve shown and you’ve talked about how you often use competitions, coding competitions, to find great engineers. What’s your thinking behind that?

Pavel Durov (02:16:40) Well, it’s in line with my overall philosophy. I think competition leads to progress. If you want to create an ideal process for selecting the most qualified people for certain specific tasks you have in mind, what can be better than a competition? A coding contest where everybody who wants to join your company as an engineer or just wants to get some prize money or validation can demonstrate their skills, and then we just select the best. Or if we are not certain because there’s not enough data to hire somebody, we just repeat the contest with another task, get more data, get more winners, then repeat again.

(02:17:31) And at some point, you realize, “Oh, actually this guy has competed in 10 of our contests since he was 16 years old or 14 years old. Now he’s 20 or 21. He won in eight of these competitions. He seems to be really good in JavaScript on Android, Java, and also C++. Why not hire this person?” There’s some consistency there.

(02:18:04) And a lot of these people, they have never worked in a big company before, which is priceless because in a big company, people tend to shift responsibility. They have this shared responsibility wherein nobody fully understands who can take credit for a project, who can take blame for a project. Inside Telegram, it’s pretty clear, and these competitions are the closest experience to what people will have when working at Telegram.

(02:18:46) So for example, we want to implement a certain very tricky animation and redesign of the profile page in Telegram’s Android version. And the Android app is an open-source app. Anybody can take its code and play with it. So as a result, we would not just select the best person and hire this person, we would also select the best solution to the problem, because we would not ask the contestants to solve trivial problems. It’s something that’s valuable. It saves a lot of time for us in terms of development.

(02:19:24) And because I always had these large social media platforms which I could use to promote these competitions, and somehow both VK and Telegram were very popular among engineers and designers and other tech people, I never had an issue promoting these contests and finding the right people. And what can be better, for an employee of your company, than somebody who has been a user of it? This person has no prior experience of using Telegram.

Pavel Durov (02:20:00) This person has no prior experience of using Telegram. Their understanding would be very limited. Why would I even try to hire somebody from LinkedIn who worked at Google and other companies, is used to receiving a salary for nothing, is used to shifting responsibility and being stuck in endless meetings, and has a very limited understanding of what Telegram stands for? It’s just crazy if you think about it.

Telegram engineering & design

Lex Fridman (02:20:40) Because of that, you’re extremely selective and slow in hiring. People really have to earn their spot. I got a chance to sit in on one of the team meetings where people discuss the different features that are being developed, the different ideas, some of which are at the very cutting edge, and so you get to see behind the scenes how it’s possible to have such a fast rate of idea generation. You generate the idea, you implement the prototype, and then eventually it becomes an actual feature in the product. That’s why you have this kind of half hilarious, half incredible fact that, compared to WhatsApp and Signal, you’ve led the way on many features. Many of the features we take for granted now, many of which we know and love, like the auto-delete timer. That was seven years ahead of any other messenger. Message editing, replies. These are all obvious things now; for some of them, I’d even forgotten they weren’t always there. I think the auto-delete timer is a really brilliant idea.

Pavel Durov (02:21:54) We implemented it in 2013 in the Secret Chats. The funny thing about it is that when other apps started to copy it, WhatsApp seven years after and then Signal and some other of these apps, they initially even copied the exact timestamps. For example, if we had one, three and five seconds, they would also have one, three and five seconds. They tried not to change it because they were not sure what the magic sauce behind the feature was. Ironically, it happens with many of these things. For example, when we designed how you reply to a message, you have a small snippet showing that you’re replying to this message while you’re typing your response, and then there is a small snippet inside the message itself that, if you tap on it, highlights the original message you’re replying to. Seems pretty obvious, but there are certain design decisions that we were implementing at the time, and we got this vertical line on the left and all these other small things that are completely arbitrary, you can do it in a different way, but somehow the entire industry ended up copying exactly that solution. Now whenever you go to WhatsApp, Instagram direct, Facebook Messenger, Signal, it doesn’t matter, you will see exactly the same or pretty much similar experience, because nobody really wants to take the risk and innovate. If something works, why not just copy it?

Lex Fridman (02:23:32) We should say that it’s done extremely well. The vertical line and the highlighting, I mean, all of these are tiny little strokes of genius. By highlighting the text in a certain way, from a design perspective it makes it very clear that this part was written before and the thing under it is your reply. The distinction between the different formatting, the text. Listen, I know how much typography is an art form. There are a lot of interacting graphic, artistic elements inside Telegram that all have to play together extremely well. Like you pointed out to me, this thing that just blew my mind, which is that the background gradient of Telegram shifts. It changes and it adjusts really nicely to the bubbles, the chat bubbles, and then there are graphic elements on top of the gradient that all interplay together. All of that has to work really nicely without sacrificing clarity. Everything’s just intuitive. That’s very difficult to create. That is art. On top of that, super fast.

Pavel Durov (02:24:40) That’s the hardest part. To make it look so that designers love it is one thing. The real challenge is to make it look the way the designers love it and make it work on the weakest devices possible. The oldest, cheapest smartphones you can imagine. Take the moving gradient on the background of every Telegram chat: this is something most people don’t notice, but they can feel it.

Lex Fridman (02:25:13) They notice it subconsciously or something like that. There is a pleasant feeling. There’s a feeling, there’s a pleasant feeling when you’re reading a chat and that’s where the design contributes to that. I think a gradient really does. I really love that about Telegram, the gradient. Not the technical thing you described, but the feeling of it and then the technical aspect of creating that feeling is incredible. I could probably come up with all kinds of algorithms of rendering that gradient that’s going to be super inefficient and so doing that efficiently is like…

Pavel Durov (02:25:46) Or efficient, but not too beautiful, because even doing something as trivial as a gradient can result in noticeable lines in the gradient, so that a person can instantly say, “Oh no, it’s not the right thing.” You have to introduce certain randomness there, and then you have the gradient, but it’s not enough. It’s too plain. You want to have a certain pattern as an overlay, but it should be simple enough not to distract you from the content, yet entertaining enough to create a good feeling about the whole app. Another question: what kind of objects do you want to include in this pattern, and how will this pattern work? Will it be based on pixels, or will it be vector-based, so it will be infinitely scalable and high quality? For the default pattern and the default background, which is based on four colors (it’s not a gradient based on two colors, it’s four colors and they’re constantly shifting), I probably looked through several thousand variations, because this is such an important decision to make. It’s the default background. Of course, you can change it; you can set up your own four colors for that.
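One common way to build such a background is to blend one colour per corner at every pixel and then move the corner points over time. This sketch shows only the blending step and is an assumption about the general approach, not Telegram’s actual renderer:

```python
def lerp(a, b, t):
    """Linear interpolation between two RGB tuples."""
    return tuple(int(a[i] + (b[i] - a[i]) * t) for i in range(3))

def four_point_gradient(w, h, c_tl, c_tr, c_bl, c_br):
    """A w x h grid of RGB values bilinearly blended from the four corner colours."""
    rows = []
    for y in range(h):
        ty = y / (h - 1)
        left, right = lerp(c_tl, c_bl, ty), lerp(c_tr, c_br, ty)
        rows.append([lerp(left, right, x / (w - 1)) for x in range(w)])
    return rows

# Tiny 4x4 swatch with four arbitrary corner colours; animating the corner
# positions frame by frame gives the "constantly shifting" feel he describes.
swatch = four_point_gradient(4, 4, (219, 221, 187), (107, 165, 135),
                                   (213, 212, 155), (136, 178, 119))
```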

Pavel Durov (02:27:10) Yes, you can do it and you want to rely on certain deeply hard-coded biological properties of the human mind. Which color do you want to use? Is it going to be blue? Is it going to be yellow? Is it going to be green? Each color has a different meaning in our brain and what kind of objects you want to put there? Something from our childhood? Something from nature or something that can create a different kind of mood? This is just one detail of the app. There are many details. When you send a message, you are done typing a message and you just then tap send and then the message gradually appears in the chat. How does it happen? You want the input field to slowly morph into the actual message.

Lex Fridman (02:28:03) To the message. Yeah.

Pavel Durov (02:28:04) You want this to be done regardless of the contents of the message, because sometimes the width will be different. Sometimes it will contain media or a link preview or other stuff that will change the message bubble. You go through countless different scenarios and make sure every one of them works great, even if the message contains 4,000 characters. Then you look at all the platforms, iOS, Android, and all the old devices, all kinds of outdated operating systems and hardware, and you cross the two, because you can have a really bad old phone running the newest operating system version, so what do you do? What kind of bugs do you get there? Then of course, since Telegram works on tablets as well and our iOS version works on an iPad, which I love a lot, you have to understand that everything can be really big. It can consume a lot of space on your screen, and then it will require more computational resources to render it. There are a lot of nuances to it, but as long as you obsess over every small detail, at least every detail that really counts, you can get to a user experience… If you’re really used to Telegram, if you’ve been a regular user for at least a few weeks, going back to any other messaging app feels like a serious downgrade.

Lex Fridman (02:29:53) Yeah, I mean there’s so many really magical moments. For example, the way a message evaporates when you delete it, that is a really pleasant experience.

Pavel Durov (02:30:05) Oh yeah. Boy, was it hard to make, particularly on Android. This is the Thanos snap effect, right? The message is broken into tens of thousands of particles, which go away like dust in the wind. It looks great, but it was so hard to make.

Lex Fridman (02:30:28) Probably one of my favorite GUI graphical things. It’s just art. It’s pure art. It’s incredible. It’s good to hear that it has been really fought over and thought through. It’s extremely well done.

Pavel Durov (02:30:45) No, you can’t pull it off if you’re not going deep in this. Then you don’t want to distract people from their communication with all this additional animation. You want them to be invisible in a way.

Lex Fridman (02:31:06) They create the feeling, but they don’t create distraction.

Pavel Durov (02:31:09) Yes. In order to do that, you have to overcome even more challenges. For example, you mentioned this deletion effect, where the message evaporates. If you show the animation first, and only then have the message preceding the deleted one and the message following it move closer to each other, it doesn’t feel right. It feels too long, too imposing. What you want is for the message to disappear while the messages around it move closer to each other to fill the resulting gap. Then imagine what that involves: redrawing the entire screen. On top of this very complicated animation, you have to think about things like what kind of messages were before and after it. It just adds to the complexity.
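The timing point is that both effects run off one shared clock instead of one after the other. A purely illustrative sketch:

```python
def ease(t):
    """Simple ease-in-out curve on [0, 1]."""
    return t * t * (3 - 2 * t)

def frame(t):
    """One shared clock t in [0, 1] drives both effects simultaneously."""
    opacity = 1 - ease(t)   # the deleted message evaporates...
    gap = 1 - ease(t)       # ...while the neighbours slide in to fill the gap
    return opacity, gap

# The version he says "doesn't feel right" would be sequential instead:
# evaporate on t in [0, 0.5], then close the gap on t in [0.5, 1].
```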

Lex Fridman (02:32:14) Once again on all kinds of devices, all kinds of operating systems, all kinds of tablets, phones, desktop, all of that.

Pavel Durov (02:32:21) Once you accomplish it, it gives you this immense sense of pride, because nobody is doing this. Nobody really cares. In a way, maybe they’re right not to care. Maybe nobody notices this, but there is something about it that feels wrong when such things are neglected, because I understand that every day, tens of millions of people around the world are deleting messages. What kind of experience do they get? Is it an experience that, maybe even subconsciously, inspires them and makes their hearts sing even a little bit? Fills them with joy? Lightens up their mood, even a little bit, by 0.001%? Or is it something that is just basic? I think if we can bring some value into people’s lives, even through these subtle details, we definitely have to invest our time in it.

Lex Fridman (02:33:32) Some joy. Not just value like productivity, but joy. I think Steve Jobs and Jony Ive talked about this: they would put so much love and effort into the design of everything, including things that weren’t visible in the initial PCs, personal computers, because they believed that somehow, through osmosis, the users would be able to feel the love that the designers put into the thing, and you’re absolutely right. It’s not about deleting messages. I feel a little inkling of joy when I see that evaporation animation. It’s just nice. I’m happier because of it. I feel that effort and I think a billion users feel that.

Pavel Durov (02:34:21) People like when other people care.

Lex Fridman (02:34:23) Yeah, yeah, yeah. That’s exactly what it is. Of course there’s the more sexy things like all the emojis and the stickers, the gifts, many of those are just, they’re a little like art pieces.

Pavel Durov (02:34:39) That’s again an intersection of art and technology, because you look at the stickers, which Telegram launched way before most of these other apps-

Lex Fridman (02:34:48) Three years and eight months ahead.

Pavel Durov (02:34:50) … ahead of WhatsApp, yes. The stickers that WhatsApp ended up launching three years and eight months after were not really good in their first version, because they just did regular GIFs or WebM videos, which were not based on vector graphics. What we did is vector animations. Each of these stickers is only several kilobytes, sometimes maybe 20, 30 kilobytes maximum in size, but it has 180 frames. We were able to run them at 60 frames per second on all devices. It was a very challenging thing to do. We had so much headache trying to make it work. Nobody even tried to do anything like this before us, because it’s crazily difficult. As a result, you have these fluid animations. You have this really nice user experience. Somebody sends you a sticker, you don’t have to wait for it to load, because it’s so lightweight and it starts moving instantly.

(02:35:58) Then of course, it’s not just engineering. You have to find designers that are able to create the stickers using vector graphics, which means they’re based on curves described by formulas, not just created as photographs with pixels. Where do you find these people? Again, we did competitions, but it was not easy to assemble a team of artists/engineers, I would say, that are able to do something like this. This is a unique form of art, and this allowed us to do a revolution in stickers and then another revolution in animated emoji that you can add into messages, custom animated emoji. I don’t think anybody did that. I think Telegram is still the only one allowing users to do that, because you can include 100 animated emoji in a message and they will all be animated and moving and your device won’t crash. It’s probably unnecessary and crazy, but we think somewhere in this intersection of art and engineering, true quality is created.
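Telegram’s animated stickers are widely documented as gzip-compressed Lottie JSON (the .tgs format), which is why a few kilobytes can describe 180 frames: the file stores vector curves and keyframes, not pixels. A minimal sketch of inspecting one with only the standard library (the file name is hypothetical):

```python
import gzip, json

# A .tgs file is gzipped Lottie/Bodymovin JSON.
with gzip.open("sticker.tgs", "rt", encoding="utf-8") as f:
    anim = json.load(f)

frames = anim["op"] - anim["ip"]            # Lottie out-point minus in-point
print(f'{anim["w"]}x{anim["h"]}, {frames:.0f} frames at {anim["fr"]} fps')
print(f'{len(anim["layers"])} vector layers')   # shapes are curves, not bitmaps
```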

(02:37:14) Then of course, more recently we expanded into what we call Telegram Gifts, which are essentially blockchain-based collectibles that you can display on your Telegram profile so that they get social relevance, but you can also use them to congratulate your friends and close ones on their birthdays and other holidays, and that was received extremely well.

Lex Fridman (02:37:41) Yeah, they can hold value, they can increase in value, you can trade them, in that aspect. But to me, still, it’s the vector graphics, and it’s not just simple graphics, it’s incredibly intricate graphics. The vector format makes it very efficient, but it also enables and incentivizes the artist to create super detailed, intricate elements. Then the final result, you would think it wouldn’t matter, but the final result has a lot of stuff going on, and it allows you to scale on arbitrary devices. Now it’s like this little… Usually GIFs, from back in the day and still in meme form, are low resolution, so people don’t usually put details and intricate art into them, but here with vector graphics it’s like a million things going on. It allows you to play with different animations. Like you showed me this thing where you hold the send button for a while, and you can share with the person you’re messaging this animation that you’ve encoded. There’s a bunch of stuff going on when they read the message.

Pavel Durov (02:38:59) Yes, we have a lot of features like that when we use this art to allow people to express themselves and most people don’t even know about these features.

Lex Fridman (02:39:10) I didn’t know about it. That was cool. That was cool.

Pavel Durov (02:39:12) The other application of the same technology is reactions on Telegram because we made it a goal to make sure that people feel joy when they just send you a like. Something so trivial as just adding a like to a message should be an action that you want to perform again and again and again.

Encryption

Lex Fridman (02:39:43) Another feature, on the more serious side, is end-to-end encryption. You led the industry in that. It was launched one year and three months ahead. Can you speak to why you decided to add end-to-end encryption and how you developed the encryption algorithm in the beginning? What was your thinking behind that?

Pavel Durov (02:40:03) In 2013, when we were launching Telegram, we were aware of the serious issue with privacy that Edward Snowden made very clear. We thought, yes, we’re designing this product in a way that is already extremely secure, but we want to make sure that not even we can access user messages. We understood very clearly that a bunch of people who were born in Russia don’t necessarily inspire trust. That’s why we made Telegram open source, so all our apps have been available on GitHub since 2013, and then we added end-to-end encryption in our Secret Chats, which WhatsApp copied a few years after. One year and three months after us they just started to test it. They rolled it out, I think, in 2016, which is three years after us, and the only reason the rest of the industry had to do it, I think, is because we set the standard.

(02:41:23) It was incredibly important back in the day, and at the same time we realized certain limitations of end-to-end encryption. Within that design, that architecture, you can’t support very large chat communities with consistent, persistent chat histories. You can’t support huge one-to-many channels. You’d have issues maintaining bots that have lots of incoming messages. Multiple-device support becomes tricky. People will end up losing some of the documents they share. We saw a lot of issues, and we ended up having this sort of hybrid experience where, depending on your use case and your requirements, you can choose the level of encryption that you want to have.

Lex Fridman (02:42:27) That’s why you chose to go opt-in for end-to-end encryption. The trade off there that you are describing is between for people who really care about specific messages, extreme privacy on those messages and usability, like being able to sync across multiple devices, having groups that are 200,000 people. All of those features, quality of life features, there’s a trade-off between those and end-to-end encryption. You lean towards letting users enable end-to-end encryption for cases when they want to be super secure.

Pavel Durov (02:43:04) Yes. Secret Chats are not just end-to-end encrypted. There are certain limitations that are both a feature and a bug. For example, you can’t screenshot them. You can’t forward any document, any message from them, which is not necessarily something you need when you are trying to get some work done and you are just communicating with your team on a project. It became very clear to us that there are different needs here, and if you try to combine both in one type of chat, you will end up losing a lot of utility. We at Telegram don’t use any collaboration tool for teamwork. We use Telegram to build Telegram. We felt it instantly when we tried to switch to, say, Secret Chats to share large documents and get work done; they were just not adapted for it. At the same time, if you’re really paranoid and you think, “I don’t want to be screenshotted, I don’t want to have any leaks, I don’t even trust Telegram, I only trust code,” Secret Chats are the best option. I believe they are the most secure means of communication today.

Open source

Lex Fridman (02:44:36) We should say that there’s a lot of other aspects to this that are important. For example, Telegram is the only app that has open source reproducible builds for both Android and iOS. Why is this important?

Pavel Durov (02:44:49) You need reproducible builds in order to verify that the app really does what it claims, really encrypts data in a way that it is described on its website. For that you need to make your apps open source for any researchers to have a look at it. Telegram has been open source since 2013. Apps like WhatsApp have never been open source, so you don’t really know what they’re doing and how exactly they encrypt your messages. What’s important here though is to understand whether the version of the app that you download from the app store corresponds exactly to the source code that you can view on GitHub. For that you need reproducible builds.

(02:45:48) As you said, Telegram is the only popular messaging app that does that. We allow people to make sure, both on Android and on iOS, that the source code of Telegram on GitHub and the app you are actually using are the same app. I think it’s incredibly important, not just to gain people’s trust, but just to stay transparent and open about it. When I make this claim that Telegram’s Secret Chats are the most secure way of communicating, I really mean it, because I haven’t seen any fact contradicting this claim, at least among the popular messaging apps, say WhatsApp, Signal, iMessage. None of them have reproducible builds on both iOS and Android. None of them have, at least at the same level, put so much effort into making sure that the algorithms that you use in order to encrypt data are not algorithms that have been handed to you by some agency in order to create a honeypot, at least from what I know about our competitors. I don’t think they went through the same process.
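Verifying a reproducible build ultimately reduces to a byte-for-byte comparison between a binary you built yourself from the published source and the binary the store ships. A minimal sketch of that final comparison step, with hypothetical file names (Telegram publishes its own step-by-step guides for the full build process):

```python
import hashlib

def sha256(path):
    """Stream a file through SHA-256 so large binaries don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

store_digest = sha256("telegram-from-store.apk")          # what users actually install
local_digest = sha256("telegram-built-from-github.apk")   # built from the published source
print("reproducible" if store_digest == local_digest else "MISMATCH")
```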

Lex Fridman (02:47:23) We should say that the entirety of the software stack in Telegram is done from scratch internally to Telegram. We’re talking about not just the encryption, but everything running on the servers. The servers are built out, the hardware and the software are all done internally, which is one of the ways you reduce the attack surface on the entire stack that handles the messages.

Pavel Durov (02:47:45) It does make it more secure, because if Snowden’s revelations taught us anything, it’s that very often open source tools, modules, libraries that are used by everybody ended up having certain flaws and security issues that make software vulnerable. It’s also a way to make sure you are doing things the most efficient way possible, but it’s extremely difficult to do that. You really have to have exceptional talent in your team to achieve this level of thoroughness, to go to a low level of coding that allows you to recreate from scratch database engines, web servers, entire programming languages, because the programming language we use on the back end to develop the API for the client apps is also entirely built by our team.

Lex Fridman (02:49:01) Removing or minimizing the reliance on open source libraries is extremely difficult, as most companies rely heavily on open source libraries.

Pavel Durov (02:49:09) Well, I wouldn’t say we are completely independent from that. We use Linux on the back end. There’s no way of avoiding it for us at the moment, but for the most part we are much more self-reliant than most other apps.

Edward Snowden

Lex Fridman (02:49:26) You mentioned Edward Snowden. A long time ago you wanted to work together with him, perhaps to share expertise, to understand the full realm of what it takes to achieve cybersecurity. What do you make of his case? What lessons do you learn from what he has uncovered and maybe even broadly, what impact has his work had on the world, do you think?

Pavel Durov (02:49:53) Well, the main lesson is that not everything is what it seems. You would discover, and this is something that I found quite shocking at the time, that a lot of people who you thought were security and cryptography experts ended up being agents of the NSA in one way or the other, promoting flawed encryption standards. You would end up discovering that your government, which was supposed to be limited in how it can surveil its people, actually doesn’t consider itself that limited. That was very valuable for the world to understand.

(02:50:50) I guess it can also be a lesson demonstrating that we humans don’t get the balance right. 9/11 created a situation where the government had to respond, and it responded, but it overreacted. It ended up eroding certain basic rights and freedoms, including the right to privacy, because the government always wants to increase its powers and the government always tries to do it at the expense of citizens. You have a situation where the cure is worse than the disease. I think it was incredibly brave to do what Edward did. I didn’t get to work with him. I’ve never seen him in person; we keep in touch, we sometimes communicate, but we’re not close. Still, I think what he did is laudable. I hope someday we meet.

Intelligence agencies

Lex Fridman (02:51:59) You yourself have faced the full force of various governments, intelligence agencies. Is there any intelligence agency you’re afraid of? Any government you’re afraid of?

Pavel Durov (02:52:15) I think I should be equally afraid of all of them, or equally not afraid, in a way. It’s not that one intelligence service can kill you and another can’t.

Lex Fridman (02:52:26) They all can kill you?

Pavel Durov (02:52:27) I guess they all can kill me one way or the other, but it’s a matter of whether I’m afraid of death.

Lex Fridman (02:52:34) This goes back to the beginning of our conversation, I think, multiple times. You’re in general fearless in the face of the pressure.

Pavel Durov (02:52:42) That would be a very bold statement, but I proved to be quite stress resilient and it’s not that you don’t have fear. You can have fear, but you overcome this fear. I don’t think there is anything at this point that can happen to change the way I am.

Iran and Russia government pressure

Lex Fridman (02:53:11) You went through a lot from 2011 to 2014, government pressure that you refused to give into, that led you to create Telegram and let go of VK. Then in 2018, Russia and Iran decided to ban Telegram. That was another example of pressure. Can you take me through that saga in 2018?

Pavel Durov (02:53:35) In 2018 Telegram started to become popular. I think we had something like 200 million users and it increasingly became popular in places like Iran and Russia and other countries where sometimes people have something to hide from the government. In Iran, people use Telegram to protest against the government. They had these huge channels that they would use to organize the protests and eventually the government couldn’t keep up. They decided to ban Telegram. People would still keep using it though using VPNs. It didn’t help. The government invested a lot in coming up with their own messaging app. They had several teams competing for the title of the nationally reigning messaging app. All these apps failed. People still preferred Telegram. Interestingly, Iran banned Telegram, but WhatsApp wasn’t banned.

Pavel Durov (02:55:01) WhatsApp wasn’t banned. Or at least they unbanned WhatsApp soon after. At the same time, starting in mid-2017 or late 2017, Russia demanded that Telegram hand them the encryption keys. They thought such things exist, something that would allow them to read the messages of every person on Telegram, or at least every person on Telegram in Russia. And we told them it’s impossible. If you have to ban us, ban us. And this is what they ended up doing in spring 2018. And that was quite fun, because they were trying to block our IP addresses, but we were prepared for that and we came up with this technology that allowed us to rotate IP addresses, replacing them with new ones every time the censor blocked our existing addresses. And it was completely automated. We had millions of IP addresses. We would be burning through them. We set up this movement called Digital Resistance, where system administrators and engineers all around the world, both inside and outside Russia, could set up their own proxy servers and their own IP addresses for Telegram to rely on in order to bypass censorship.
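A minimal sketch of the rotation idea: keep a pool of addresses and advance to the next one as soon as the censor blocks the current one. Everything here, the pool and the distribution step, is illustrative rather than Telegram’s actual implementation:

```python
from itertools import cycle

ip_pool = cycle(["203.0.113.10", "203.0.113.11", "198.51.100.7"])  # documentation-range IPs
current = next(ip_pool)

def push_new_address_to_clients(ip):
    """Hypothetical distribution step (in practice, pushed to apps and proxy operators)."""
    print(f"advertising new endpoint {ip}")

def on_blocked():
    """Call when probes show the current address is unreachable from inside the country."""
    global current
    current = next(ip_pool)
    push_new_address_to_clients(current)
```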

Apple

(02:56:41) We ended up spending, I think, millions of dollars on that. And as a result, the censor went crazy there. They would ban IP addresses, then large subnets of IP addresses, then huge subnets, which resulted in a weird situation where parts of the country’s infrastructure started to go down. People were trying to pay for groceries in the supermarkets and nothing would work because the Russian censor blocked too many IP addresses and some of the subnets were used to host other unrelated services. Even some Russian social networks and media got affected. Banks, too. So they had to start being more selective in how they combat our anti-censorship tools.

(02:57:41) The biggest resistance we got at the time was from Apple. Apple didn’t allow us to update Telegram in the App Store for at least four weeks, saying that we had to come to an agreement with Russia first, and we said it’s not possible. They said, “We will allow you to push your update for Telegram worldwide except for Russia.” We didn’t want to do that. We almost lost hope. At some point I said, “Maybe this is the only way. Maybe we should leave the Russian market. Stop allowing users from Russia to download the app from the App Store.” Which would mean it’s over. We helped organize certain protests in defense of Telegram and privacy and freedom of speech in 2018 in Moscow. It was hilarious, people flying paper airplanes.

Pavel Durov (02:58:49) And at some point I decided I have to make a statement. I have to say that Apple sided with the censor. That we are trying to do the right thing here, but without Apple we can’t do much because people can’t download the app anymore. I published it in my channel and then the New York Times picked it up with the picture of the protesters flying paper airplanes. Apple was criticized in that story and I thought, well, Apple should probably come back to the right side of history here. And I waited for one day and two days. In the meantime, since we’d been unable to update Telegram for more than a month, it started to fall apart because the new version of iOS came out and it made the old versions of Telegram obsolete. Some features that used to work stopped working and users all over the world started to suffer. People that had nothing to do with Russia, from other parts of the world, experienced issues with Telegram. So it was really serious and I said to my team, you know what, if by 6:00 P.M. today, I think it was a Friday, nothing changes and Apple doesn’t allow us to push the version of Telegram through, let’s just forget about the Russian market. Let’s keep going because the rest of the world is more important. It’s sad, but what can we do?

Lex Fridman (03:00:44) Which, by the way, removes all the people that want to protest, all the people that want to talk in Russia, and removes their ability to have a voice in the most popular messaging app in that part of the world.

Pavel Durov (03:00:55) Yes. Magically, 15 minutes before the time I was planning to remove Telegram from the Russian App Store in order to proceed globally, Apple reached out to us and said, “It’s okay. Your update is approved.” And we managed to keep playing this hide-and-seek game with the censor, bypassing censorship through Digital Resistance. In Iran, it was a little bit different because we realized it would have been too expensive to try to come up with all these IP addresses, and in addition, it was not clear whether we would be in violation of the sanctions regime. So we did something else. We created an economic incentive for people who would set up proxy servers for Telegram. Any person, say an Iranian engineer, could set up a proxy server, distribute its address among users in Iran, and whoever connected through the proxy of this person would be able to see a pinned chat, an ad placed there by the system administrator, the owner of the proxy. And this is how you can monetize your proxy. So it created this market, which resulted in Iranians fixing their own problem. And as a result, we kept millions or maybe tens of millions of Iranian users. I think Telegram is still banned in Iran today, but we probably have something like 50 million people relying on Telegram in that country.

Lex Fridman (03:03:08) So that people find a way around.

Pavel Durov (03:03:10) People find a way around.

Poisoning

Lex Fridman (03:03:11) That’s ingenious. That’s really great to hear. I have to ask you about this. After having spent many days with you, I learned of something that you never talked about at the time and have not talked about to this day: that there was an assassination attempt on you using what appears to be poisoning in 2018. To me, it showed the seriousness of this fight you’re waging to uphold freedom of speech for everyone, for all people of Earth. I have to say it would mean a lot to me if you tell me this story.

Pavel Durov (03:03:55) Well, this is something I never talked about publicly because I didn’t want people to freak out, particularly at the time. It was spring 2018. We were trying to raise funds for TON, a blockchain project, working with all kinds of VCs and investors. In the meantime, we had a couple of countries trying to ban Telegram. So it wasn’t exactly the best moment for me to start sharing anything related to my personal health. But that was something that is hard to forget. I never fall ill. I believe I have perfect health. I very rarely have headaches or a bad cough. I don’t take pills because I don’t have to take pills. And that was the only instance in my life when I thought I was dying.

(03:05:05) I came back home, opened the door of my townhouse, the place I rented. I had this weird neighbor and he had left something for me there around the door. And an hour later, when I was already in my bed … So I was living alone. I felt very bad. I felt pain all over my body. I tried to get up and go to the bathroom, but while I was going there, I felt the functions of my body starting to switch off. First the eyesight and hearing, then I had difficulty breathing. Everything accompanied by very acute pain. Heart, stomach, all blood vessels. It’s a difficult thing to explain, but one thing I was certain about is, yeah, this is it.

Lex Fridman (03:06:25) You thought you were going to die.

Pavel Durov (03:06:26) Yeah. This is it. Because I couldn’t breathe. I couldn’t see anything. It was very painful. I thought, it’s over. I thought, well, I had a good life. I managed to accomplish a few things. And then I collapsed on the floor, but I don’t remember it because the pain covered everything. I found myself on the floor the next day. It was already bright and I couldn’t stand up. I was super weak. I looked at my arms and my body; blood vessels were broken all over my body. Something like this had never happened to me. I couldn’t walk for two weeks after. I stayed at my place and I decided not to tell most of my team about it because, again, I didn’t want them to worry. But it was tough. That was tough.

Lex Fridman (03:07:35) Did that make you afraid of the road you are walking, meaning all the governments, all the intelligence agencies, all the people like we mentioned? It’s like you’re playing a video game. You started with VK where you’re just trying to build a thing that scales, and all of a sudden you find out there are DDoS attacks on the security, the integrity of the infrastructure, and then you realize there’s politics, and then you realize there’s geopolitics, and all of these forces are interested in controlling channels of communication. And you’re just a curious guy who created a platform for everybody on the earth to talk, and all of a sudden you realize there are a lot of people attacking you. How did that change your view? Did that make you more scared of the world?

Pavel Durov (03:08:42) Interestingly, not at all. If anything, I felt even more free after that. It wasn’t the first time I thought I was going to die. I had an experience a few years before that, also in relation to my work, when I assumed something bad was going to happen to me. But after you survive something like this, you feel like you’re living on bonus time. So in a way, you died a long time ago, and every new day you get is a gift.

Lex Fridman (03:09:32) And the first time you’re referring to, would that have to do with the complexity that was happening with the pressure from the government on VK? The increasing pressure, and you had to figure out what to do, and you understood that you were losing control of VK at that moment.

Pavel Durov (03:09:52) The first of these instances was in December 2011. In December 2011 there were these huge protests on the streets of Moscow. People didn’t trust the integrity of the election results for the State Duma in Russia. I remember 2011, I still lived in Russia running VK. There was no Telegram. So the government demanded that we take down the opposition groups of Navalny from VK that had hundreds of thousands of members and that were used to organize this protest. And I very publicly refused to do that. I just decided it’s not the right thing to do. People have the right to assemble. And I mocked the prosecutor who handed me that demand. I put out a scan of it, and next to it a photo of a dog in a hoodie with its tongue out. And I said, this is my official response to the prosecutor’s request to ban the opposition groups. That was very funny at the moment. But then I had armed policemen trying to get into my apartment, and I thought about many things at that moment. I asked myself, did I make the right choice? And I came to the conclusion that I made the right choice, and I asked myself, what would be the next thing that would logically follow from this? And I realized they’re probably going to put me in prison, so what am I going to do about it? I asked myself.

(03:12:04) And I told myself, I’m going to starve myself to death. It’s something that probably many men have. They’re ready to die for other people or certain principles they strongly believe in. I’m not alone here. I guess Edward Snowden was ready to die as well, or some other people like Assange. Also, at that moment, I realized there’s no way to communicate securely. I need to tell my brother what’s going on. They’re probably going after him. How do I tell him without betraying him? Because in 2011, remember, WhatsApp was already there. I think they launched in 2009, but it had zero encryption. All messages were plain text in transit, meaning that even your system administrator, let alone your carrier, had access to your messages. It was only after Telegram started this push for encryption that these other apps suddenly remembered that privacy was in their DNA, as WhatsApp’s founders famously stated, but it must have been a dormant gene in 2011.

Pavel Durov (03:13:33) In 2011, there was no way to send a message in a secure way. And I also told myself, if I’m going to survive this, I’m definitely launching a secure messaging app. Somehow it ended up not being too bad. I was summoned to the prosecutor, answered some silly questions, fewer questions than I had to answer more recently in the French investigation case. But it was the beginning of the end. It was clear that there was no way I was going to be allowed to run VK the way I wanted to run it. That was the moment I packed my backpack and just started to wait. I moved to a hotel and realized that any day I could leave the country. I kept running VK. I started to design Telegram and assemble the team. But I knew my days in Russia were numbered.

Lex Fridman (03:15:01) First, I really have to say, for myself and, I think, for millions, maybe hundreds of millions, maybe the entirety of Earth: thank you for putting your life on the line in those cases. I think freedom of speech is fundamental to the flourishing of humanity, and it depends on people willing to put everything on the line for their principles. So thank you. Quick pause. I need a bathroom break. All right, we’re back. And once again, we had a super long day, and the fact that you would spend many hours with me, thank you for powering through. We got this. It’s already late at night.

Pavel Durov (03:15:45) Thanks for doing this.

Lex Fridman (03:15:47) Okay. So there is increasing indication, I think, from things I’ve seen online, that Russia is considering banning Telegram. First of all, do you think this might happen, what effect do you think this might have on humanity, and in general what do you think about this?

Pavel Durov (03:16:07) It can definitely happen. As you said, there are certain indications. There have been certain attempts to partially ban it. Telegram is no longer accessible in parts of Russia such as Dagestan, and it will be incredibly sad if Russia resumes its attempts to ban Telegram, because currently it’s used by the population for all kinds of purposes, not just personal communication or economic and business activities; it’s also the only platform which allows the Russian people to access independent sources of information. If you think about media outlets such as the BBC or any other non-Russian source of information, they’re only accessible in Russia through Telegram in the form of Telegram channels. Their websites are banned. Some other social media sites are banned. And as you said, there are indications that Russia is planning to migrate users from existing messaging apps such as WhatsApp and Telegram to their own homegrown tool, which would of course be fully transparent to the government and wouldn’t allow voices independent from the government to express themselves.

(03:17:53) It’s certainly an alarming trend. We see these attempts in countries that are not famous for protecting freedom of speech, but also increasingly in countries that have been known to protect freedoms. And this creates a vicious circle, because when European countries try to fight freedom of speech under pretexts that sound legitimate, such as combating misinformation or election interference, they create precedents and they legitimize restrictions to freedom of speech, which can then in turn be used by authoritarian regimes, and they would say in places like China or Iran that they’re not doing anything different. It’s the norm now to restrict voices that don’t go in line with the narrative.

(03:19:11) That’s sad, because one of the things that makes our life interesting is this abundance of different viewpoints of different people that we get to experience. If you limit the freedom of people, you inevitably decelerate economic growth, the level of happiness, the way people can contribute to society, the way people can express themselves. I personally think it would be a huge mistake to ban a tool like Telegram in any country, particularly a large country such as Russia, because the Russian people are incredibly talented and resilient people. They’re among the first to start utilizing some of these recent innovations that Telegram implements. They’re the early adopters. I’d say them and also the Americans, perhaps other people from Eastern Europe like Ukrainians, and Southeast Asians, they’re among the first people to start using any new addition that we launch. They’re incredibly hungry for innovation.

Lex Fridman (03:20:32) So all that said, as part of the propaganda and in general, there are attacks on you all over the place. There’s misinformation. I’ve read a bunch of things that are, I think in a systematic way, lying about you, lying about Telegram from all angles. Why do you get attacked so much by everybody?

Pavel Durov (03:20:56) For protecting freedom of speech. It’s not a way to make a lot of friends. Because you would inevitably find yourself in a situation where you would be protecting the freedom of the opposition to the current government in any country to express themselves. And then the initial reaction, a very basic instinctive reaction of any government, would be to say the opposition shouldn’t be trusted and allowed to express themselves because they’re actually agents of some foreign rival, a geopolitical force that wants to destroy our country. This is something that every authoritarian regime in history used. You take Stalinist Russia or Nazi Germany, Maoist China, they always used the same trick. They say, “We need to limit your freedom of speech because these people who are masquerading as opposition are actually the agents of this other country that wants to take over.” That’s why their citizens forget about their freedoms. And now increasingly you see similar attempts in free countries.

(03:22:33) The initial instinct from, say, President Macron’s team, when they’re confronted with some footage, for example the footage of his wife slapping him, would be to say it’s all fake Russian imagery, something that is inaccurate, something that is misinformation or interference. And then when they are confronted with more information, they have to refine the narrative. So when you find yourself in a situation where you’re running a platform like Telegram, and you protect the freedom to express ideas that don’t go in line with the mainstream narrative, you often find yourself in this crossfire where the forces in power will say that you must be working with some foreign government that they don’t like. Inevitably they would say that, oh, if you’re protecting these voices, it’s not right. They love you when you are protecting freedom of speech in a country that is far from them, or better yet in a country that is their geopolitical rival. They praise you for that. But then they have this bipolar attitude when you do the same in their own country and they say, “No, no, no, no, no. We loved you for protecting freedom of speech, but not here, not in my backyard. We don’t need it here. We’re all right. We have free press.”

(03:24:28) And then you find yourself in this weird spot. The Ukrainians say you work for the Russians. The Russians say you work for the Ukrainians. And all this schizophrenia is something that we had to deal with for some time, because it’s a very easy way to attack you. At some point you don’t understand where it is coming from. Is it our competitors? We must give credit to our competitors if it’s their invention to launch these kinds of rumors, because at a certain point they must have realized they can’t compete technologically on the product side, so they must do something like this. Or it’s just governments launching these rumors, trying to discredit the platform, trying to scare their citizens away from it, because they understand that their power and grip on their own country is in danger as long as they allow a pro-freedom platform to operate.

Lex Fridman (03:25:39) And through all of this, we should say over and over, that you are simply preserving the freedom of speech for all people of Earth no matter what they believe, as long as they don’t call for violence, and as long as they’re not doing some of the criminal activity that we discussed, including terrorist organizing. But other than that, it doesn’t matter what they believe. Left-wing or right-wing, you’re just preserving their freedom of speech. Do you think the people of Ukraine, the people of Russia, the people of Iran, people all over the world understand that despite the propaganda against you?

Pavel Durov (03:26:14) I think people are smart. Every time I meet somebody from one of these countries you mentioned in real life, or people recognize me in the street, say here in Dubai, they come over, they seem incredibly grateful and understanding. The propaganda in each of these countries would tell them a number of things, but they learned to discount it. The reason they’re so happy that Telegram exists is that the way they can understand the world around them is to receive conflicting, mutually exclusive viewpoints from sources that hate each other and try to understand what really is true. Because there’s no such thing as an unbiased source of information. When the war in Ukraine started in 2022, I instantly realized Telegram was going to be used to spread propaganda by both sides. And I didn’t want Telegram to be used as a tool for war, and I publicly suggested maybe we should just suspend the activity of all politics-related channels in both countries for the duration of the war. Maybe we shouldn’t have channels in these two countries.

(03:27:55) And then, interestingly, people from both countries revolted against this. They told me, both people in Ukraine and in Russia, that I don’t get to babysit them and decide for them what sources of information they should be granted access to. They are grown-ups that can make these decisions for themselves. They understand that there is a lot of propaganda. They learned to see through this propaganda. They learned to be able to tell truth from lies. And in this time of war, it was particularly valuable for them to receive as much information as possible, because their relatives, their friends were getting affected and are still getting affected, and they want to understand what’s going on. At that point I realized people are smart, people get it, people can see through it. If you ask most people in any of these countries, do you agree that access to Telegram should be restricted for whatever reason, they would say no.

Lex Fridman (03:29:19) They hunger to have a voice.

Pavel Durov (03:29:21) They need a voice, and they need a place to share their opinion securely.

Lex Fridman (03:29:28) I have to ask about the question of leadership. In the Le Point interview, the journalist said that you’re often compared to Elon Musk, and you highlighted some interesting nuances around that, that you’re quite different. That Elon runs several companies at once, while you only run one. And Elon can lean more on the emotional side while you deliberate and think deeply before acting. Can you expand on this? Also there’s an interesting point that you made that everybody’s weakness is also a strength.

Lex Fridman (03:30:00) The same point you made, that everybody’s weakness is also a strength. Everybody’s strength is also a weakness. There’s a dual nature to all our characteristics. So on the topic of Elon, what have you learned from his style of leadership? What do you respect about him?

Pavel Durov (03:30:20) First of all, I don’t think there is such a thing as a negative personal trait. In most cases, our bad traits and our good traits are the same trait, or at least have the same source. Of course, there are some extreme examples, but I’d say for 99% of people, if you analyze their character, their bravery in some situations can be seen as recklessness in other situations. Depending on circumstances, you would see exactly the same personality trait, and it would be either a good thing or a bad thing. Because humanity is perfect as a whole, and each of us is different for a reason. We have evolved to be different, to complement each other’s abilities, so that together we’re invincible.

(03:31:20) And even if you take a person as complicated as Elon, I believe that certain traits that Elon demonstrates that people criticize about him are also the sources of his strength. For example, his emotionality is derived from the fact that he cares about issues deeply, and he’s willing to start as many wars and as many fights as it takes to change the world in the direction that he thinks is right. He also seems to be able to extract motivation from all these wars and personal conflicts, which is again, not something to be underestimated. At a certain point in the life of a successful entrepreneur, the question of motivation starts to be the primary question. If we’re talking about the richest person in the world and the most famous entrepreneur in the world, you have to wonder how does he motivate himself?

(03:32:40) And if starting a war on X, debating certain issues or getting personal with other CEOs, criticizing them, if these activities help Elon to innovate and start new projects, he should be doing more of it. There’s nothing wrong with being non-agreeable. Actually, it’s one of the main traits of a successful entrepreneur, not agreeing with things. And somebody like Elon, but there’s no somebody like Elon, it’s just Elon, I think, at least among the entrepreneurs I know and have personally interacted with, he’s unique in the sense that he keeps launching new things, running them in parallel, and he doesn’t seem to be stretched too thin. Well, some people think he is, but he manages to still demonstrate success in all or most of his endeavors. So again, you can criticize Elon for being emotional, but would he be the same person without this? I doubt that.

Lex Fridman (03:34:11) And the incredible teams he’s motivated, too. There’s an element of that which you’ve spoken about, the team at Telegram. Assembling a team of A players, as we’ve talked about, is a skill in itself. And that’s also a big part of the leaders that we’ve discussed; they’re judged in part by the team they assemble.

Pavel Durov (03:34:39) Yes. And one of the necessary character features to enable that is to be ready to be unpleasant. You have to be ready to insult some people. If their work is inferior, you have to be ready to fire them without remorse. So in order to be an efficient and great entrepreneur and enrich the world with innovations, you have to do unpleasant things. Most people will shy away from it. And in a certain sense, entrepreneurs sacrifice their peace of mind in order to contribute to the world around them. And Elon is a great example of that.

Lex Fridman (03:35:31) I have to ask you about the big picture of Telegram. We’ve already talked about the fact that you own 100% of it, and there’s a lot on the business side of it; the business structure of Telegram is fascinating. You’ve invested a hundred million, maybe hundreds of millions of dollars of your own money. As far as I know, you take a salary of, what, $1?

Pavel Durov (03:35:57) One dirham is one third of that.

Lex Fridman (03:36:01) One-third of a dollar. And 2024 was the first time Telegram was profitable. So one of the interesting questions here, which we could talk about for many hours, but I’d love to get a high-level picture. You’ve left what I understand, what I think, is a huge amount of money on the table by sticking to your principles. For example, not doing advertising that’s based on users’ private data, which basically every social media company does. So the only advertisement that Telegram does is based on channels and groups, based on the topic, not the private data of the individuals. And the other thing, which is also gangster and incredible, is you don’t do a news feed, which is the most addictive and engagement-inducing aspect of social media, which feeds the very kind of addictive downside of the internet.

(03:37:02) The distraction, the engagement, drama farming aspect that we’ve talked about in the very beginning that you tried to resist, that you think is damaging the human mind at scale. So anyway, that’s just speaking to the fact that you’re leaving a lot of money on the table. So how the hell were you able to be profitable? What are the ways that Telegram makes money?

Pavel Durov (03:37:23) Yeah. We had to innovate a lot in order to reach a point where we are profitable without having to resort to dubious business activities involving exploiting the personal data of users, something that most of our competitors do. Because money has never been the primary goal, at least not for me. When I sold the remaining share of my first company, and I had to do it below market price because I didn’t get to leave Russia completely without pressure, I reinvested the vast majority of everything in Telegram. Telegram is an operation that is losing money for me personally. I didn’t extract more from Telegram than I invested in it. I never sold a single share, but I also didn’t want to sell Telegram. So how do you reach a point where you’re profitable without sacrificing your values?

(03:38:40) One of the ideas we explored was a subscription model, but only for certain additional features. We wanted to keep all the existing features free and just add more business-related tools or tools for advanced users that they would have to pay for, say $4 or $5 a month. It was quite unprecedented at the time. It wasn’t considered a viable option for messaging apps to do that. We launched the premium subscriptions for Telegram in 2022, and now we have over 15 million paid subscribers. This is some very significant recurring revenue. We would receive more than half a billion dollars from premium subscriptions alone this year, and it’s growing fast. For that, we had to innovate a lot. We included over 50 different features in the premium package. And then, how do you make an app that is already more powerful than any other messaging app on the market even more useful, so that people would be ready to pay for this extra? That wasn’t easy. That took a lot of effort.

Lex Fridman (03:40:19) And you’re constantly adding features.

Pavel Durov (03:40:21) We’re constantly adding features.

Lex Fridman (03:40:22) It’s actually fun to watch just the rate of adding, and some of them are subtle, like the updates, improvements, and expansions of polls, for example.

Pavel Durov (03:40:32) Yeah. So you keep improving the existing features and adding new ones. And every time you add a new feature, you don’t want to clutter the app. So in a way, they’re not in your way, they’re invisible. That’s not an easy thing to do. And most of the features maybe are not even known to the majority of our users, but when you need them, they’re there. So premium is one source of our revenue. We also have ads, but they’re context-based, not targeted. Of course, we leave probably 80% of the value on the table because we’re not ready to engage in all these practices exploiting personal data.

Lex Fridman (03:41:15) Just to be clear, targeted ads are what most social media companies, most tech companies that do any kind of advertisement, do. And that’s the kind of advertisement that uses personal data from users. Just to clarify. And when you said 80%, that’s a lot of money.

Pavel Durov (03:41:34) Of course, because we would never use, for example, your personal messaging data or your contacts data or your metadata or your activity data to target ads. It’s sad that this kind of exploitation became synonymous with the internet industry. But we are happy with the fact that we managed to make Telegram profitable despite that. We are also experimenting a lot with blockchain-based technologies. We’re the first app to allow people to directly own their username or their digital identities using smart contracts and NFTs, removing Telegram from the picture. So for example, Telegram cannot confiscate your username from you. It’s impossible. We do a lot of things related to the ecosystem of Telegram. We have a thriving mini app platform, with millions of mini app developers launching their own bots and applications.

Lex Fridman (03:42:48) So a lot of people are making millions of dollars on the Telegram platform.

Pavel Durov (03:42:53) Yes. We enabled them to receive payments from the users through the in-app purchase mechanism provided by Apple and Google, which I think was the first attempt of this kind to allow that both on iOS and Android on a big platform, so that third-party developers of mini apps, which are basically websites so deeply integrated into Telegram that you can’t tell whether they’re standalone or part of the overall experience, can get paid. And by providing this payment option, we’re able to extract a commission from these transactions. But it’s a very low commission. Presently it’s 5%. So we’re not greedy here. We want people to succeed in building these tools for our users. We understand that mini apps bring us users. The more users we have, the more successful and relevant Telegram becomes. We need third-party developers. I think at this point, Telegram gives developers by far the most powerful tools to create.

TON

Lex Fridman (03:44:21) Plus there’s a bot API. And I mean you have to tell me about the TON blockchain and the crypto ecosystem available through Telegram. So what is the TON, aka The Open Network, blockchain?

Pavel Durov (03:44:34) TON is a blockchain technology that we initially developed in 2018 and 2019, and we started to develop it because we needed a blockchain platform to be integrated deeply into Telegram, because we believe in blockchain. We think it’s one of the technologies that enable freedom. But at the time, if you look at Bitcoin, if you look at Ethereum, they were not scalable enough to cope with the load that our hundreds of millions of users would create. They would just become congested. And I asked my brother, “Can we create a blockchain platform that would be inherently scalable, so that no matter how many users or transactions there are, it would split into smaller pieces, which we call ShardChains, and would still process all transactions?” And he thought for a few days and said, “Yes, it’s possible, but it’s not easy.” And we started building it.

(03:45:37) We ended up succeeding in developing that technology, but we couldn’t release it because the SEC, the Securities and Exchange Commission in the United States, was unhappy with the way the fundraise for TON was conducted. So we had to abandon the project and the open source community took over. Luckily, because we constantly conducted those contests for third-party developers, there was a thriving community around TON, which now stood for The Open Network as opposed to its prior name, Telegram Open Network. And so this project eventually got launched without our direct involvement. And it’s thriving now, because everything we do, like I said, these blockchain-based tokenized usernames and Telegram accounts, are all based on TON and its smart contracts.

(03:46:55) It’s the only way for third-party developers and creators to withdraw the funds that they earn through our revenue sharing programs. For example, with channel owners, we do a 50-50 split of ad revenues. It’s also the only way to transact on Telegram. For example, if you want to buy ads on Telegram, you should use TON. All the new things we launch, for example the gifts that we mentioned earlier, which you can define as a reinvented, socially relevant NFT integrated into a billion-user ecosystem, but at the same time available on-chain, transferable, something you can own directly, are also based on TON. It’s an incredibly fast growing space. We only launched them half a year ago, and now, as a result of these Telegram gifts, TON has become I think the largest or the second largest blockchain in terms of daily NFT trading volumes.

Lex Fridman (03:48:19) So yeah, like you mentioned, it is a layer-one technology, as opposed to being built on top of Ethereum or Bitcoin, and it’s able to achieve the scale and the speed of transactions that’s needed for something like Telegram. And like you also mentioned, the gifts. You recently launched some Snoop Dogg gifts. Are there going to be some other celebrities in the pipeline?

Pavel Durov (03:48:46) Yeah, I’m a big fan of Snoop, and that’s why when they reached out and suggested doing something together, we said, “Let’s launch the Snoop-related gifts.” And it was really fun. We managed to sell $12 million worth of gifts within 30 minutes.

Lex Fridman (03:49:03) 30 minutes. Well, there you go. I even got a few. But yeah.

Pavel Durov (03:49:09) After this, we had many requests from really high-profile influencers that, in a way, are lining up.

Lex Fridman (03:49:19) So from my perspective as a fan, it’s just interesting to see what kind of art you create for any kind of celebrities, athletes, musicians, because the Snoop gifts are all just, going back to our previous conversation, beautiful pieces of art that encapsulate certain memes, certain aspects of Snoop that everybody knows, these cultural icons that he represents. It’s cool. And the detail of the art of the individual gifts is just incredible.

Pavel Durov (03:49:53) And each of these gifts is scalable because it’s vector-based. It references certain points in Snoop’s creative biography, and each of them has countless different versions. We had to create over 50 distinctive versions of each. And then each individual piece is unique because it also has a unique background and a unique icon on the background. It’s something that we reinvented because we didn’t like the old-school NFTs. First of all, they were not relevant socially, because okay, you have an NFT, where do you demonstrate it? On Telegram, a Telegram gift is there next to your name. It’s part of your digital identity on Telegram. And then you can create collections of gifts and show them off on your profile page.

(03:50:50) But also, the other thing that we wanted to reinvent is the aesthetic part of it. Most NFTs are just ugly, and they’re not based on any sophisticated technology. So what we did with Snoop’s gifts, I think, represents an example of a mixture between art and technology that is beautiful, aesthetically pleasing, and at the same time very accurate in terms of references to this specific artist’s biography, which I think is quite rare. I’m quite proud of it. I think it’s a new trend, a new phenomenon. It’s only half a year old, so let’s see where it goes. We’re going to select our next influencer or artist to be part of it.

Lex Fridman (03:51:51) Hey listen, I’m really proud. I got a Snoop gift next to my name, and I figured out that you can add even more by pinning them. It’s like a cool little art icon.

Pavel Durov (03:52:02) We didn’t expect it, by the way. We just had a lot of fun launching these things. And then we realized that in one of the first collections we sold each piece at something like $5, and the minimum price of any item in this collection currently is something like $10,000. And it keeps going up. So I was quite surprised by the reception. I realized that when you are trying to monetize a social media platform in a way that is consistent with your values, you are forced to find ways that benefit your users, not exploit them. People love these gifts. People love the fact that they can congratulate a person close to them with something valuable and at the same time something beautiful. Also, some people make a business out of it, which is funny. They resell these gifts. We recently met a guy who earned several million dollars just from buying and selling gifts.

Lex Fridman (03:53:17) It’s a real market.

Pavel Durov (03:53:18) It’s a real market. It’s just something that he did in a few months. And last year, when we launched many new features for the mini apps on Telegram and the payment options for them and the other monetization options, the same guy earned $12 million from mini apps. And I know several people saying, “Totally, I earned $10 million.” “I earned $3 million in just a matter of months single-handedly.” Sometimes they would have a team of two, three people. So whenever I hear stories from people who were able to build businesses on top of Telegram, it makes me incredibly proud.

Bitcoin

Lex Fridman (03:54:05) And mini apps include games, they include tools, services of any kind. It’s an app within the ecosystem of Telegram. Let me ask you about crypto in general. So you’ve been an early supporter of cryptocurrencies, Bitcoin. You bought into Bitcoin early on. You kept buying. Maybe you could speak to the reasoning why you kept buying Bitcoin. Do you think Bitcoin will go to a million dollars? Do you think it’ll keep increasing, Bitcoin and all the other cryptocurrencies?

Pavel Durov (03:54:40) I was a big believer in Bitcoin since more or less the start of it. I got to buy my first few thousand Bitcoin in 2013, and I didn’t care much. I think I bought at the local maximum, something like $700 per Bitcoin, and I just threw a couple of millions in there. A lot of people, after Bitcoin went down the next year to somewhere close to 300, 200, started to express their sympathy to me. “Poor Pavel. You made this horrible mistake investing in this new thing, but don’t feel bad about it. We still have some respect for you.” And my response to them was, “I don’t care. I’m not going to sell it. I believe in this thing. I think this is the way money should work. Nobody can confiscate your Bitcoin from you. Nobody can censor you for political reasons.”

(03:55:52) This is the ultimate means of exchange. And again, I’m now talking about Bitcoin, but it relates to cryptocurrencies in general. So I have been able to fund my lifestyle, so to say, from my Bitcoin investment. Some people think that if I’m able to rent nice locations or fly private, it’s because I somehow extract money from Telegram. Like I said, Telegram is a money-losing operation for me personally. Bitcoin is something that allowed me to stay afloat. And I believe it will come to a point when Bitcoin is worth $1 million. Just look at the trends. The governments keep printing money like there’s no tomorrow. Nobody’s printing Bitcoin. There is predictable inflation, and then it stops at a certain point. Bitcoin is here to stay. All the fiat currencies? That remains to be seen.

Two chairs dilemma

Lex Fridman (03:57:13) Let me ask you a deeply philosophical serious question. In your first Telco interview, you had two interesting chairs in the background. I think they reference a now legendary meme. The choice is Пики точёные или хуи дрочёные (Russian: “Sharpened pikes or jerked-off cocks.”) What is the philosophical wisdom in the dilemma that these two chairs present? Have you had to face the dilemma yourself personally?

Pavel Durov (03:57:37) Not this exact dilemma. I think this is a riddle that people have to face in Russian prisons. And metaphorically, it’s describing all the situations where you’re presented with a choice between two suboptimal options. When you’re running a big business or when you’re running a large country, it is similar. You sometimes face this dilemma, what are you going to do, this very horrible thing or this also very horrible thing? So I think the right answer to this riddle is not to do any of these things. Reframe the question, design a solution that turns a disadvantage into an advantage, and then use it to cope with the other side of the problem. So do you know the answer to that riddle?

Lex Fridman (03:58:44) No. Somebody on the internet said, “Не ходи туда, где задают такие вопросы”, which is basically try to avoid the situations where such dilemmas present themselves or there is no right answer.

Pavel Durov (03:59:02) This is one of the ways to answer this question. If you got into a tricky situation, then probably earlier you made a certain mistake-

Lex Fridman (03:59:11) You fucked up already.

Pavel Durov (03:59:12) Should have been avoided. But the other quite creative answer to this question is that you take the sharp objects from one of the chairs, the spikes, and then use them to cut off the objects from the other chair. And you know what objects I’m talking about?

Lex Fridman (03:59:38) That’s a very engineering solution. I’m glad somebody came up with that.

Pavel Durov (03:59:43) I believe this is the right answer. We’re often being manipulated by politicians, by corporate leaders to make a choice from two suboptimal options. And then when we are forced to make this choice and we make this choice, it’s almost as if it’s something that we have to assume responsibility for. I don’t think we should be buying into that.

Lex Fridman (04:00:12) Okay. And this theme of absurdity and ridiculousness, there’s an object here that appeared in… Not many people seem to have noticed this. People should go watch your excellent conversation at the Oslo Freedom Forum. Behind you, and I’m no archeologist, but I believe this is, how should I put it, a walrus penis bone, and it was behind you. You told me that you brought it with you to France and back to Dubai. I assume it brings you luck of some sort. Why did you bring it with you everywhere?

(04:01:00) Is it kind of like in America they have a wishbone? Is it just a large wishbone? Because the wishbone brings you luck. And I should also point out that, just like with Telegram, with the art, there are tiny little walruses. And thanks to you, I had to also find out that a lot of mammals have a bone inside their penis. And the evolutionary advantage, I guess, of having a bone is quite obvious. It actually raises the question of why humans don’t have an actual bone inside their penis. A lot of questions there.

Pavel Durov (04:01:31) That’s a very interesting subject. The reason I have this is because a tribe that is almost gone, almost extinct, in Siberia and Mongolia, called the Evenki, passed this gift to me. Normally they would craft something like this only for their most respected leaders. It is supposed to be a token of their appreciation for bravery, courage, leadership. Ironically, it also translates in a very specific way into the Russian language. In Russian, walrus’s penis means something a bit funny, which is often used to describe nothing. So for example, if you’re being requested by, say, a certain government or a certain business partner to provide something that you’re not willing to provide, you can just politely have this penis bone in the background while you’re doing the video call and hope that they would…

Lex Fridman (04:02:52) Through osmosis, figure out the deep message. It is an indirect rebellion. By the way, in the former Soviet Union, and in a lot of places throughout history, some of the rebellion had to take this kind of symbolic, metaphoric form, through poetry, through children’s stories. It’s the beauty of human language and art that we’re able to do that, to say F-U to whatever forces try to overpower us. We say F-U through poetry, through art, and sometimes through a rather large walrus penis bone carried by what appears to be either a happy sumo wrestler or a cat of some sort.

Pavel Durov (04:03:39) They asked a lot of questions about this walrus penis bone at the airport, both here in the UAE and in France. They are always very interested in this thing.

Children

Lex Fridman (04:03:53) There seems to be some confusion over how many kids you have. It’s often said to be over 100. Can you explain how many kids you have?

Pavel Durov (04:04:06) The truthful answer to this question is I don’t really know exactly how many biological kids I have. Because at a certain point in my life, about 15 years ago, I decided that it was a good idea to be a sperm donor. Initially, a friend of mine asked me to help because he and his wife were trying to have a baby, and they experienced certain health issues that prevented them from doing it the natural way. And he asked me, he told me, “We don’t want to just rely on some random anonymous genetic material. We want somebody we know and respect to be the biological father of our kid.” And I said, “You’ve got to be kidding me. Sounds ridiculous. What are we even talking about?”

Pavel Durov (04:05:00) … I mean, sounds ridiculous. What are they even talking about? But then I realized it’s actually a serious issue, and they were not the only couple struggling with that. So eventually, I got persuaded into doing more of it. I can’t say I am incredibly proud of that, but I think it was the right thing to do, particularly at the time when I thought, “Okay, I probably don’t have much time on this planet left. Things are getting trickier and trickier. So if I can help some couples have babies, let’s do it.”

(04:05:37) And then more recently, when I was working on my will, I realized that I shouldn’t make a distinction between the kids conceived naturally and the kids who are just my biological kids that I’ve never seen. As long as they can establish their shared DNA with me someday, maybe 30 years from now, they should be entitled to a share of my estate after I’m gone. And that made a lot of noise in the news for some reason. People get very excited by this kind of news. I get a lot of messages from people claiming they’re my kids. I get a lot of requests from people asking me to adopt them. The memes were priceless. But while I understand that it’s not a thing that most people do, I don’t see anything wrong with it. If anything, I think more people should be donating sperm.

Lex Fridman (04:06:52) So we should say, the 100-plus kids is from that. You also have naturally conceived kids. It was a pretty bold decision from a financial perspective to treat them all equally. And also quite interesting was that you said that they don’t receive any money for the first few decades of their life. Can you describe that thinking?

Pavel Durov (04:07:24) Yeah, I think overabundance paralyzes motivation and willpower. It’s extremely harmful, particularly for young boys, to grow up in an environment where they can be proud, not of their own achievements, but of their father’s achievements or their father’s wealth. This removes the incentive to work on developing their own skills, removes the incentive to study, to work. So I thought if they’re going to have this money, it should be something that they would only get when they’re already adult. It’s still risky, but one of the reasons I decided it makes more sense to divide this huge wealth that I’m likely to leave behind among a hundred or more than a hundred people is that it won’t be too much for every single descendant. But at the same time, some people did the calculation, it’s still many, many millions of dollars for each child, so I’m not sure it helps too much.

Lex Fridman (04:09:12) On the topic of abundance, offline we had a lot of fascinating philosophical discussions. One of which was about the mouse paradise experiment, also known as Universe 25. It’s an experiment from the 1960s and early ’70s conducted by ethologist John B. Calhoun. We can talk about this one for hours also, I’m sure. But it was an experiment with a few hundred individual mouse compartments, and they provided the mice with unlimited food, water, nesting material, no predators, stable temperatures, and frequent cleaning. Basically the definition of abundance as far as mice go.

(04:09:56) The interesting aspect of this experiment is that at first the population doubled, it grew very quickly. But then it leveled off, and certain really negative social things started happening: mothers neglected or even killed their young, violent attacks and hypersexual activity became widespread. Some “beautiful ones,” largely inactive, well-groomed mice, withdrew, refusing to mate or interact. So all of these societal qualities that we see as negative for the functioning of a society started to emerge because of the abundance. And finally, the collapse. The reproduction rates crashed, social dysfunction spread to the next generation, and eventually the population went extinct. It didn’t just plummet to a low level, it plummeted steadily to zero despite the ongoing resource abundance. As the description states, the last mouse died surrounded by untouched food and water. I mean, there’s deep wisdom in that about abundance. You’ve mentioned this in different contexts throughout this conversation: it seems like scarcity, like constraints, like non-abundance is essential for human flourishing, which is a counterintuitive notion. It’s true for mice, and I think it’s probably true for humans too.

Pavel Durov (04:11:27) We have evolved to overcome scarcity. Almost by definition, there has never been such a thing as an infinite amount of food or entertainment in our lives before now. We seem, as a species, to lose our ability to identify purpose in a world where you have everything and everything loses its meaning. Restrictions are important. I think, though, that they should come from within. It should be self-restriction rather than imposed restriction, in order to create purpose and meaning in life. In a way, I was lucky in a very counterintuitive way, because I grew up poor. I didn’t have money when I was a teenager. I had the same jacket for years, which was bought on a secondhand marketplace. My father wouldn’t receive his salary as a university professor for months because the Russian state was almost bankrupt back then. My mom had to juggle two jobs to take care of us. It was not easy, but it also created purpose. It created meaning. It created priorities. It allowed us to focus on things that mattered, allowed us to develop our character and intellectual abilities.

(04:13:17) Now, if we had everything, why do anything? These mice suffered societal collapse that was irreversible, and this is not an accident. This kind of experiment has been repeated countless times. At a certain point, social dysfunction and the erosion of social roles becomes contagious, and the society gradually degrades into a chaotic collection of individuals unable to take care of the next generation or even to produce the next generation, and it goes extinct.

Lex Fridman (04:14:14) It’s fascinating, because we’re creating technologies, and this is the problem AI is posing to our future generations to solve: AI may very well create abundance. So we will potentially be like these mice. Whether it’s AI or other kinds of technologies, they increasingly give more and more to all of us. And that is a good thing: decrease the amount of suffering in the world, increase the quality of life. But as we reach towards that abundance, it might create a real challenge for the fabric that connects us, rooted in our biology as developed by evolution.

Pavel Durov (04:14:54) We should find the right balance between chaos and order, between self-restriction and freedom for creativity.

Father

Lex Fridman (04:15:03) Your father recently celebrated his 80th birthday. You had a conversation with him. He gave you some life advice. I think you mentioned to me one of the things he said was not to just speak of your principles, but to live them, to lead by example. I think this is something you already do well. Can you maybe speak to what you’ve learned about life from your father, some of the lessons he told you in the conversation you had with him on his birthday?

Pavel Durov (04:15:40) I’m incredibly lucky to have my father. He’s a person who wrote countless books on Ancient Rome and Ancient Roman literature, dozens of scientific papers, and I always remember him working. He would be busy typing his books and articles on an old-school typewriter back in the late ’80s, early ’90s. He was relentless. The example he set for myself and my brother was priceless. Some people make this mistake of thinking that you can instill the right principles in the future generation or into your kids by saying things to them, but kids are smart. They discount words, they look at the actions. So observing our father was a big lesson by itself. It wasn’t necessary for him to say anything to us. And then at the same time, he was incredibly patient, emotionally resilient.

(04:17:06) My mom, great woman, incredibly smart, highly educated, but she would sometimes try to test the patience of my father. It’s a trait rooted in our biology. There’s an evolutionary explanation for that. Women sometimes tend to do that, and he demonstrated incredible patience all the time. He told me recently, “You shouldn’t give the wrong example to the people around you and in particular to your kids, because you can do the right thing nine times out of 10, but you make a mistake once, and they will instantly copy it. If you’re telling your kids not to use a smartphone, but you’re using a smartphone all the time yourself, and coming up with all kinds of sophisticated, brilliant explanations why they shouldn’t be using a smartphone, it won’t land. It’s bound to fail. So you lead by example.”

(04:18:19) There are numerous other lessons: staying positive, looking at the bright side, never despairing, being honest. He told me the last time I spoke to him that AI can have consciousness, can be creative, but it cannot have conscience, in a way. It cannot be moral. It cannot have deeply rooted principles. It cannot have integrity in the meaning that we understand it as human beings.

Lex Fridman (04:18:57) I love the fact that you’re talking to your 80-year-old father, and you’re talking about AGI and the difference between the human spirit, human nature, and what AGI, AI is able to achieve. And conscience is the thing that humans have, the ability to know right from wrong.

Pavel Durov (04:19:23) This is the lesson that he gave me. One of my goals in life is never to disappoint him.

Quantum immortality

Lex Fridman (04:19:33) Another thing we’ve talked about, which I think is a fascinating topic, is the power of the mind, power of thought. Do you believe you can affect your life and reality by thinking about it, by manifesting it into being? What do you think?

Pavel Durov (04:19:55) There are many explanations why it works. One thing most people agree on is that setting goals and staying positive and confident does allow you to achieve the things you want to achieve. It’s very hard to believe though that you can just manifest things into being without applying effort in the direction that seems to be logical. Maybe some people exist that can just sit on the bank of a river and materialize things by the power of their thought. But I’m not sure I’m one of these people. I always found it more easy to believe that if you couple this optimism and faith with logical action, then it is bound to be successful.

Lex Fridman (04:21:04) Prolonged effort, hard work, coupled with positive focus, thinking about the thing.

Pavel Durov (04:21:13) Oh yes, over many, many, many days. It’s possible to imagine our world as a high dimensional universe where humans have the ability to navigate through it with the power of belief, which is coupled with positive emotion and logical thinking. But we are getting into an esoteric realm. We don’t have any proof of that. But we also know that we probably at this point haven’t discovered even 1% about this universe.

Lex Fridman (04:22:00) I agree with you fully, and I like what you said in the way you were thinking about it. You’ve told me before that maybe there’s a way that with effort and with the focused mind, you can shape, you can morph the landscape of probabilities around you. It’s a nice way to visualize it, that somehow our effort and our focus changes the things that are likely and less likely. And by focusing on it, we make the thing more and more likely, at least as an estimate, as the kind of field that we, through our thoughts and through our actions, change that field. And then there’s eight billion of us doing so, and together there’s this collective intelligence that creates the world we see around us like the mice. Like you said, us as a humanity together are perfect. I like that you said that.

Pavel Durov (04:23:05) I admire your belief in the fact that we get to experience this together because it’s not obvious. Maybe each of us experiences his own or her own universe, and maybe every second of the universe splits into a billion of different universes, and everything that can happen happens. And there is a universe where, say, I died in 2013. Maybe every time I die, I actually get to shift to a parallel universe when I don’t die. And then it keeps going, and at certain points we achieve this quantum immortality when we’re 1,000 years old, but a lot of people from other versions of reality think we’re long gone.

Lex Fridman (04:24:04) Yeah. This is something you explained to me, the idea of quantum immortality, which is a thought experiment, which I find deeply fascinating, people should look into it, which is very crisp, clean consequence of the many worlds interpretation of quantum mechanics that we as conscious beings can’t experience our death. As we branch into these many worlds, only the living consciousnesses get to experience it. So in some sense, yeah, there’s many universes. If we were to seriously take the many worlds interpretation of quantum mechanics, there’s many universes where you died many times, especially you, and I’m glad we’re in the universe where we get to share the table with this impressive bone, a little humor, and a lot of serious topics covered today. Once again, I can’t say enough. Again, thank you from me. Again, thank you from hundreds of millions of people that follow your work, for you fighting for the freedom of all of us to speak and creating a platform where we can do so. Thank you so much for talking today, brother. It’s been an honor getting to know you and to be able to call you a friend.

Pavel Durov (04:25:22) Thank you for saying that. I’m also incredibly grateful to you and to the fact that I happened to be in this version of reality when I haven’t died, at least yet, and hopefully we’ll get to spend more fun moments in the years to come together.

Lex Fridman (04:25:44) Thank you, brother.

(04:25:45) Thank you for listening to this conversation with Pavel Durov. To support this podcast, please check out our sponsors in the description. Now, let me try to articulate some things I’ve been thinking about. If you’d like to submit questions or topics like this for me to talk about in the future, go to lexfridman.com/ama.

Kafka

(04:26:05) I’d like to use this opportunity to talk about Franz Kafka, one of my favorite writers. The reason he has been on my mind is that his work The Trial and the case of Pavel Durov in France have, let’s say, eerie parallels, both metaphorically and literally. Of course, The Trial is a work of fiction, but I think it is often useful to go to the surreal world of literature, even the over-the-top dystopian variety like 1984, Animal Farm, Brave New World, The Trial, The Castle, The Metamorphosis, even The Plague by Albert Camus, all to understand our real world and the destructive paths we have the potential to go down together, which also hopefully helps us understand how to avoid doing so.

(04:26:55) So let me zoom out and speak about Franz Kafka. Who was he? He was an insurance clerk who wrote at night. He died young and almost completely unknown, and he asked for his manuscripts to be burned. Luckily for us, his friend Max Brod refused to do so, giving us the work of what I consider to be one of the 20th century’s greatest writers. In his work, Kafka wrote about the cold machine-like reduction of humans to case files through the labyrinth of institutional power. He wrote about an individual’s feeling of guilt even when a crime has not been committed, or more generally, he wrote about the feeling of anxiety that is part of the human condition in our modern, chaotic world.

(04:27:42) His writing style was to use short, declarative sentences to describe the surreal and the absurd, and in so doing, effectively, I think, convey the feeling of an experience versus simply describing the experience. For example, famously, his work, The Metamorphosis, opens with the following lines, “As Gregor Samsa awoke one morning from uneasy dreams, he found himself transformed in his bed into a gigantic insect. He was lying on his hard armor-plated back, and when he lifted his head a little, he could see his dome-like brown belly divided into stiff arched segments, on top of which the bed-quilt could hardly keep in position and was about to slide off completely. His numerous legs, which were pitifully thin compared to the rest of his bulk, waved helplessly before his eyes.”

(04:28:38) Kafka, I think, effectively uses this image of being transformed into a giant bug stuck on his back to convey a feeling of helplessness and uselessness to his family, to his job, to society. The feeling of being a burden to everyone, dehumanized, alienated, and abandoned. The feeling of being only temporarily valued as long as he served some function for his job or for his family, and quickly discarded otherwise. I will probably talk about this work in more depth at another time, because it is so haunting, and I think it is such a profound description of the burden of existence in modern society for many people.

(04:29:24) But here, let me talk about another of his works, The Trial. In this novel, the main character, Josef K, is a successful bank officer, and he’s arrested on his birthday for an unspecified crime by a kind of amorphous court whose authority is everywhere and nowhere. He navigates a labyrinth-like legal system where everyone knows about his case, but no one can really explain it. The so-called trial never actually occurs in any conventional sense. Instead, Josef K’s entire life becomes the proceedings leading up to the trial. In a sense, the trial is the state of being accused itself, a permanent condition rather than a singular event. Kafka’s genius in this work was to show that modern institutions don’t need to hold trials; they just need to hold you in the permanent looming possibility of one.

(04:30:21) Public attention to this case, both positive and negative, gives Josef K a feeling of constantly being judged by people around him. This wears at his mind, and his psychological well-being begins to deteriorate. In a sense, the trial doesn’t need to convict him. The internal psychological turmoil and the external social scrutiny perform a conviction and the eventual execution. Exactly one year after his arrest, Josef K is visited by two men, who walk him courteously through the city to an abandoned quarry and stab him in the heart, without Josef K resisting. To me, The Trial shows that tyranny’s final victory isn’t when it kills you; it’s when you hold still for the knife, not because you’re forced, but because you’ve been exhausted into submission. Once again, it is a haunting story of the soullessness of bureaucracy in its suffocation of the human spirit. I highly recommend this short book, and I’ll probably talk about it even more in the future. I don’t think it’s especially useful for me to speak to any parallels between The Trial and Pavel Durov’s case, because after all, The Trial is a work of fiction. But on a positive note, let me report that as far as I saw, Pavel has maintained optimism and a general positive outlook throughout this whole process. What I always fear in such cases is that a bureaucratic system can wear people down, exhaust them into surrendering. I saw none of that with Pavel. I don’t think he knows how to give up or give in, no matter how much pressure he’s under. Again, this is truly inspiring to me.

(04:32:09) Also, now that we’re talking about it, let me mention some other of Kafka’s works that were moving to me. The Castle has a similar description as The Trial does of the absurd inaccessibility of those in authority, of the nightmarish bureaucracy. The character in The Castle is also named K. Both bureaucracies operate through exhaustion, endless deferrals, procedures, waiting rooms. Again, highly relevant to modern times.

(04:32:37) I can also highly recommend Kafka’s In the Penal Colony and A Hunger Artist. Both are too interesting and weird to explain in depth here. But let me say, A Hunger Artist is a story that I think is relevant to our modern-day attention economy, where so many people want to be famous. It tells the story of a, let’s say, professional faster who performs starvation in a cage as entertainment, and he slowly loses his audience to newer spectacles, so much so that eventually when he starves himself to death, nobody cares.

(04:33:14) Kafka’s work is heavy. It serves as a warning for the nightmare that civilization can become, and yet I think it is also a source of optimism, because when we can recognize elements of our own world in Kafka’s stories, when we can see elements of our institutions in The Trial or in The Castle, when we can see ourselves in Gregor Samsa, we’re not just diagnosing the disease, we’re proving that we’re still human and wise enough to see it and name it. Kafka gave us the goal: to resist such systems that try to dehumanize us and to ensure that individual freedom and the human spirit keep flourishing. I think it will. I have faith in us humans. I love you all.

DHH:编程的未来、人工智能、Ruby on Rails、生产力与育儿 (2025-07-12)

DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting (2025-07-12)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:Ruby on Rails 创始人及 37signals CTO DHH (David Heinemeier Hansson),在技术范式(如云计算、AI 编程)快速迭代的当下,与 Lex Fridman 探讨贯穿其职业生涯的软件开发持久原则与反主流的商业哲学。
  • 核心论点:本对话的核心是一套以人为本的软件开发与商业世界观。DHH 认为,软件开发的终极目标应是优化“程序员的幸福感”,即通过追求代码的美学、工具的简洁和心流的体验,来最大化创造者的潜能与满足感。这一理念不仅体现在他对 Ruby 语言优雅性的推崇,更延伸至他对技术架构(“雄伟的单体”优于微服务)、商业模式(盈利优于增长,小团队优于规模扩张)和未来趋势(警惕 AI 带来的技能退化)的批判性思考中。他挑战了硅谷“复杂性即进步”、“增长不惜一切代价”的主流叙事,主张通过构建简单、自主、可持续的系统,个人与小团队不仅能创造出卓越的产品(如 Basecamp、Shopify),更能实现一种更完整、更有意义的职业与人生。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:编程语言的“人文主义”——以 Ruby 为例

  • 核心观点:优秀的编程语言应将“程序员的幸福感”置于首位,其设计哲学应服务于人类的直觉与美学,而非仅仅为了机器的解析便利。Ruby 之美在于它信任并赋能程序员,视他们为“软件作家”,而非需要被严格约束的“工程师”。
  • 原理解构:这是一种将编程从纯粹的工程学科提升到兼具人文与艺术属性的视角。
    • 最小化认知负荷:通过消除“行噪声”(line noise),如不必要的分号、括号和美元符号,让代码读起来更像自然语言,降低了阅读和理解的心理摩擦。
    • 信任而非约束:与 Java 设计者 James Gosling 认为“普通程序员很蠢,需要被限制”的哲学相反,Ruby 的创造者 Matz 相信程序员有能力驾驭“锋利的刀”(sharp knives),如元编程和开放类(monkey-patching)。这种信任允许开发者扩展语言本身,创造出领域特定语言(DSL),如 Rails 中的 5.dayshas_many :comments,使代码更具表现力。
    • 美学优先unless 关键字的存在,以及判断方法以问号结尾(如 user.admin?)等设计,并非出于性能或功能的必要,而是纯粹为了提升代码的可读性和诗意。这是以人为中心的体现,愿意为人类的愉悦感增加解释器的实现复杂度。
  • 证据/案例:DHH 多次对比了 Python 的 def __init__(self, ...) 与 Ruby 的 def initialize,前者充满了为解释器服务的下划线和 self 样板代码,后者则更符合人类直觉。5.times { ... } 的简洁性被誉为无出其右。Rails 中的 Active Record 框架是元编程和 DSL 的集大成者,让数据库关系描述变得极其直观。
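
为直观说明上面提到的“开放类 + DSL”机制,下面给出一个极简示意(假设在普通 Ruby 环境中运行;`SECONDS_PER_DAY`、`weekend?` 等均为本示例自行定义,并非 Rails/ActiveSupport 的真实源码):

```ruby
# 开放类示意:直接"重新打开"核心类 Integer 与 Time 来补充领域方法。
# 仅用于说明原理,常量与方法名均为本示例的假设。
class Integer
  SECONDS_PER_DAY = 86_400

  # 让 5.days 读起来像自然语言,返回以秒计的时长
  def days
    self * SECONDS_PER_DAY
  end
end

class Time
  # 判断方法以问号结尾,贴近口语:Time.now.weekend?
  def weekend?
    saturday? || sunday?
  end
end

puts 5.days             # => 432000
puts Time.now.weekend?  # => true 或 false
```

正是这种随手扩展核心类的能力,让 Rails 能把领域语义直接写进语法层面;代价则是要求开发者自律地使用这把“锋利的刀”。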

维度二:“雄伟的单体架构”(The Majestic Monolith) 的复兴

  • 核心观点:对于绝大多数 Web 应用而言,单体架构远优于当下流行的微服务架构。微服务过早地引入了分布式系统的复杂性,而单体则能将整个系统保持在单个开发者可以理解的认知范围内,从而实现更高的开发效率和系统内聚性。
  • 原理解构:这是对“康威定律”的一种反向应用。康威定律指出系统设计会反映组织的沟通结构。DHH 认为,大型组织(如 Netflix、Meta)因其庞大的团队而“被迫”采用微服务,但创业公司和小团队错误地将此“解药”当作“维生素”来效仿,导致不必要的复杂性。单体架构的优势在于:
    • 零网络延迟:方法调用取代了网络请求,避免了分布式系统中最棘手的故障模式和延迟问题。
    • 认知整体性:开发者可以轻松地在代码库中追溯逻辑,理解整个系统的运作方式,而无需跨越多个服务和 API 边界。
    • 统一的技术栈:简化了开发、测试和部署流程。
  • 证据/案例BasecampHEY 这两个成熟且盈利的产品,其核心 Ruby 代码量均在 10 万行左右,完全由小团队维护。更具说服力的是 Shopify,这个承载了全球巨大电商交易量的巨头,其核心系统依然是一个庞大的 Rails 单体应用,证明了单体架构的可扩展性远超人们的普遍认知。

维度三:云端“还乡”——对公有云成本与复杂性的反思

  • 核心观点:对于工作负载相对稳定的成熟企业而言,公有云(特别是 AWS)并非承诺中的“更便宜、更简单”的解决方案,而是一种成本高昂的“奢侈品”。自建硬件(On-premise)能以更低的成本获得更强的性能、更高的自主权和更贴近互联网本质的架构。
  • 原理解构:DHH 解构了公有云的核心卖点:
    • 成本神话:AWS 近 40% 的高利润率本身就违背了“规模经济带来低价”的承诺。通过购买硬件,37signals 将基础设施成本削减了 1/2 到 2/3。他认为,云是“租”,而自建是“买”,对于长期需求,买总比租便宜。
    • 简单性神话:AWS 的配置(如 IAM 规则)极其复杂,其复杂性已经超过了管理自己的 Linux 服务器。
    • 速度的误用:“分钟级扩展数千台服务器”的能力对绝大多数业务是伪需求。对于可预测的增长,提前采购硬件的周期(几周)是完全可以接受的。
  • 证据/案例:37signals 的年度云账单曾高达 320 万美元。他们在 6 个月内将 7 个主要应用迁出云端,没有增加一名运维人员,预计 5 年内节省 1000 万美元。他还提到了 xAI 团队自建万张 GPU 集群,证明了在技术前沿,拥有硬件同样是关键优势。
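
把这里的“租 vs 买”逻辑摊开来算一笔粗账会更直观。下面的示意沿用对话中提到的 320 万美元年度云账单与“5 年节省约 1000 万美元”的量级;其中硬件购置与托管费用是为凑出该量级而假设的示例数字,并非 37signals 公布的真实成本:

```ruby
# 粗略示意:云(租)与自建(买)的 5 年总成本对比,仅比较数量级。
annual_cloud_bill = 3_200_000   # 美元/年,取自对话中提到的年度云账单
hardware_purchase = 1_000_000   # 假设:一次性购置服务器
colo_per_year     = 1_000_000   # 假设:每年托管 + "白手套"运维服务

years = 5
cloud_total = annual_cloud_bill * years                 # => 16_000_000
owned_total = hardware_purchase + colo_per_year * years # =>  6_000_000

puts format("cloud: $%d", cloud_total)
puts format("owned: $%d", owned_total)
puts format("saved: $%d", cloud_total - owned_total)    # 约 10_000_000
```

对负载可预测、会长期运行的业务,这正是“买总比租便宜”的直观体现;而云端“分钟级弹性扩容”的价值,只有在负载剧烈波动时才真正兑现。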

维度四:AI 时代的“编程手艺人”精神

  • 核心观点:AI 是一个强大的“结对程序员”,能极大地提升学习和探索效率,但不应让它完全“驾驶”编程过程。过度依赖 AI 代码生成(DHH 称之为“vibe coding”)会侵蚀程序员的核心能力,因为真正的技能是通过手指在键盘上反复练习、与代码亲密接触而内化的。
  • 原理解构:DHH 强调了过程本身的人类价值。编程不仅仅是为了得到一个可执行的产物,其过程本身,即“用双手雕琢代码”,是创造性满足感和心流体验的源泉。
    • “用手指学习”:他类比学习弹吉他,看再多视频也不如亲手按压琴弦。肌肉记忆和对工具的直觉是通过物理互动建立的。
    • 能力流失的恐惧:当他尝试让 AI 为其 Omakub 项目写 Bash 脚本时,他发现自己反复询问同样的问题却记不住,感受到了“能力正从指尖流失”。
    • 保持手艺:他选择将 AI 作为一个独立的咨询窗口,而非深度集成在编辑器中,以保留自己作为“工匠”的主动性和对代码的掌控感。他宁愿退休,也不愿放弃亲自编程的乐趣。
  • 证据/案例:他在学习 Bash 语言时,发现只有亲手把 AI 生成的代码重新敲一遍,才能真正掌握它。他将自己定位为“chiseling my code”(雕琢我的代码),与让 AI 驱动的“project manager of a murder of AI crows”(一群 AI 乌鸦的项目经理)形成鲜明对比。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • 云是默认选项的反论:在初创企业和科技界普遍将 AWS/GCP/Azure 视为基础设施的默认起点时,DHH 提供了翔实的案例,证明对于成熟业务,“云端还乡”不仅可行,而且在经济和战略上都极为有利。
    • 动态类型的辩护:在 TypeScript 几乎成为 JavaScript 社区“专业”代名词的今天,DHH 激烈地捍卫动态类型,认为静态类型为换取他并不看重的“工具便利性”(如自动补全)而牺牲了代码的美学、简洁性和元编程的灵活性。
    • “小即是美”的商业哲学:与硅谷主流的 VC 驱动、追求独角兽地位和百亿市值的“增长叙事”完全相反,DHH 主张保持小规模、追求盈利、拒绝风投,认为这才是通往可持续幸福和创造优质产品的道路。
  • 盲点与局限

    • JavaScript 生态的“存在主义焦虑”:DHH 尖锐地指出,JS 社区在 2010-2020 年间经历的剧烈技术栈 churn(框架和工具的不断更迭)是一种“精神失常”。他将其归因于开发者为了逃避自己只是“CRUD 猴子”(增删改查数据库的工人)这一事实而产生的“存在主义恐惧”,需要通过过度复杂化来寻求慰藉和自我价值。
    • GDPR 的“善意地狱”:他将遍布全球的 Cookie Banner 称为“欧洲在科技领域失败的纪念碑”,是“善意直通地狱”的典型案例。一项旨在保护隐私的法规,最终沦为对全球用户体验的巨大破坏,却没有任何实际益处,且官僚体系无力纠正这一明显错误。
  • 未解之谜

    • AI 是否会终结编程职业:对话承认,没人能确切知道 AI 对编程的长期影响。编程是否会像马匹一样,从主要的生产工具变为一种娱乐性的“手艺”?
    • “Vibe Coding”的未来:通过提示词生成并迭代修改代码的“Vibe Coding”,究竟是一种浅薄的技能,还是未来一种全新的、有深度的核心编程能力?DHH 对此持怀疑态度,但承认这是一个悬而未决的问题。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “If you go to goddamn Mars on one of Elon’s rockets and you try to access a web page, you’ll still see a cookie banner. No one in the universe is safe from this nonsense.”

    • 中文意译:“就算你搭埃隆的火箭去了该死的火星,试图访问一个网页,你还是会看到一个 Cookie 弹窗。全宇宙没人能逃过这等蠢事。”
    • 语境:在批判 GDPR 带来的全球性 Cookie 弹窗灾难时,用极致的夸张手法描绘其无处不在和荒谬性。
  2. “A lot of people, I think, are very uncomfortable with the fact that they are essentially crud monkeys. They just make systems that create, read, update, or delete rows in a database and they have to compensate for that existential dread by over complicating things.”

    • 中文意译:“我认为,很多人对自己本质上只是‘增删改查的猴子’这一事实感到极度不适。他们做的系统只是在数据库里创建、读取、更新或删除数据行,他们必须通过过度复杂化来补偿这种存在主义的恐惧。”
    • 语境:分析 2010 年代 JavaScript 社区过度追求复杂框架和工具链的深层心理动机。
  3. “His [James Gosling, creator of Java] view of humanity was rather dark. His view of humanity was programmers at the average are stupid creatures. They cannot be trusted with sophisticated programming languages…”

    • 中文意译:“他(Java 创始人 James Gosling)的人性观相当灰暗。他认为,普通程序员是愚蠢的生物,不能信任他们使用复杂的编程语言……”
    • 语境:对比 Ruby 与 Java 的设计哲学,强调 Ruby 建立在对程序员智慧和创造力的信任之上,而 Java 则倾向于通过严格限制来防止错误。
  4. “Mojito Island is a mirage. It always was. There is no retirement for ambitious people.”

    • 中文意译:“(退休后的)莫吉托岛是个海市蜃楼,一直都是。对于有雄心壮志的人来说,根本不存在退休这回事。”
    • 语境:反驳“拼命工作几年然后财务自由退休”的创业神话,指出对于创造型人才,持续工作和解决难题本身就是幸福感的来源,彻底的“放松”反而是一种地狱。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈:对“No-build”理念(如 Rails 8)的兴趣会增加,更多中小型和盈利型公司可能会重新评估自建基础设施(On-premise)与公有云的成本效益,尤其是在利率上升、注重成本的宏观环境下。“单体优先”的架构思想可能在初创公司中获得更多拥护者。
    • 产品形态:更多关注提升“全栈个人开发者”生产力的集成工具会出现,以对抗行业过度分工的趋势。
    • 竞争格局:DHH 的“云端还乡”案例将成为行业内的重要参考,可能会催生更多帮助企业简化“下云”过程的服务商。
  • 长期终局 (5-10年)

    • 行业分化:如果 DHH 的设想成真,软件开发行业可能出现两极分化:一边是少数精英“软件工匠”,使用高度集成、以人为本的工具栈(如未来的 Rails)构建优雅、持久的商业应用;另一边是大量的“AI 协调员”或“系统集成者”,负责管理和编排由 AI 生成或由大型平台提供的服务。
    • 云的商品化:公有云巨头可能被迫降低利润率,回归到更纯粹的计算、存储和网络等基础设施即服务(IaaS)角色,而高利润的平台即服务(PaaS)和软件即服务(SaaS)层将面临来自更高效的自建方案的竞争。
    • 商业文化变迁:可能会出现一股“反 VC”的创业文化潮流,更多创始人会选择“37signals 模式”——小而美、持续盈利、创始人驱动,以此作为对抗“独角兽或死亡”文化的另一种成功范式。
  • 行动建议

    • 开发者:投资于持久的技能,而非短暂的框架。深入理解 HTTP、SQL、Linux 等基础,同时选择一个能让你感到“幸福”的生态系统并深耕。警惕 AI 带来的技能退化风险,有意识地进行“手动挡”练习,保持对代码的亲密接触。
    • 投资者:在评估技术公司时,要穿透“技术选择”的表象,审视其背后的哲学和成本结构。一个采用单体架构并自建基础设施的公司,其利润率和运营效率可能远高于一个完全依赖微服务和公有云的“时髦”公司。
    • 创业者:认真审视“默认路径”。不要盲目追随 VC、微服务、公有云的潮流。思考你的终极目标是什么:是快速退出,还是建立一个让你享受工作、掌控命运、长期盈利的“人生事业”?DHH 的经历证明,后者不仅是可能的,而且可能更令人满足。

总结 (gemini-3-flash-preview)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:本次对话受访者是 David Heinemeier Hansson (DHH),Ruby on Rails 框架创始人、37signals CTO 兼赛车手。在技术泡沫与复杂性泛滥的当下,DHH 分享了他对软件工程回归本质、技术自主权以及商业长期主义的深刻见解。
  • 核心论点:现代软件开发正陷入“复杂性陷阱”。从过度依赖云基础设施(AWS 等)到推崇极其复杂的 JavaScript 框架,开发者正逐渐失去对工具和代码的掌控感。DHH 提出了一种反思性的世界观:主张程序员幸福感优先、提炼代码的美学价值、回归单体应用(Monolith)、通过云退出(Cloud Exit)重掌硬件主权,并坚持两人的极致小团队模式。他认为软件开发不应是枯燥的工程堆砌,而是一场追求“流(Flow)”的创意写作,旨在通过极致的效率和简洁性来对抗行业的平庸化与科层制。

2. 🧠 深度观点解析 (Deep Dive Analysis)

A. 开发者人体工程学:程序员幸福感作为核心指标

  • 核心观点:编程语言应为人类的头脑服务,而非仅仅为了机器的高效解析。
  • 原理解构:DHH 强调 Ruby 的设计哲学——“程序员幸福感”。与 Python 的“唯一正确路径”或 Java 的“限制人类犯错”不同,Ruby 通过元编程 (Metaprogramming)DSL (领域特定语言) 赋予开发者极大的表达自由。它剔除了分号、大括号等“行噪声 (Line Noise)”,使代码读起来像诗歌或散文。
  • 案例:Ruby 中 5.days 这种表达方式,背后是扩展了基础整数类的逻辑。这种“对人类信任”的底层逻辑,使得开发者能在极高带宽的语义环境下工作。

B. “云退出”运动:重构经济与技术主权

  • 核心观点:公有云(Cloud)已从创业者的加速器变成了成熟企业的“智商税”。
  • 原理解构:DHH 指出 AWS 等云服务宣称的“便宜、简单、快速”中,只有“快速扩容”在极端情况下成立。对于业务稳定的企业,云服务 40% 的毛利率正是企业损失的利润。通过重回自建硬件 (On-premise) 配合“白手套”数据中心服务,企业可以大幅降低成本。
  • 案例/数据:37signals 通过退出云端,预计在 5 年内节省 1000 万美元,且团队规模并未因管理物理服务器而增加,这打破了“云端更省人”的迷思。

C. 软件开发中的“反复杂性”:单体 vs. 微服务

  • 核心观点:微服务是针对超大规模组织(如 Netflix)的组织架构产物,中小型团队盲目跟风是“自残”。
  • 原理解构:DHH 提倡“雄伟的单体” (The Majestic Monolith)。微服务将函数调用转变为网络调用,引入了巨大的分布式系统不确定性。单体应用通过概念压缩,允许单个程序员理解系统的全局,从而实现极高的开发速度。
  • 案例:Shopify 在黑五期间每秒处理百万级请求,底层依然运行在 Ruby on Rails 的单体结构之上,证明了单体架构完全具备支撑世界级规模的能力。

D. AI 时代的编程:Vibe Coding vs. 基础能力

  • 核心观点:AI 是卓越的“配对程序员 (Pair Programmer)”,但“氛围编程 (Vibe Coding)”会导致能力的退化。
  • 原理解构:DHH 认为编程能力的习得必须通过“手指在键盘上的敲击(Fingers in the sauce)”。AI 虽然能快速生成代码,但如果不亲手编写,开发者将失去对底层逻辑的深度理解,最终沦为无法排查错误的“点击猴子”。
  • 案例/类比:就像看健身视频无法强身健体、看吉他教程无法学会弹奏,编程是一门必须通过肌肉记忆和逻辑试错建立的技能。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:规模并非成功的唯一标准。主流价值观认为不追求成为 Atlassian (Jira 开发商) 或独角兽就是失败。DHH 提出“够用就好”的哲学:37signals 坚持不接受风投,通过极高的利润率和 50-60 人的团队规模,实现了远超大厂高管的生活质量与创作自由。
  • 盲点与局限:静态类型的迷信。行业普遍认为大型项目必须使用静态类型(如 TypeScript)。DHH 尖锐地批评这是一种“工业上的多余工作”,TypeScript 增加了代码的冗余和“类型体操”,却并未从根本上减少逻辑错误。
  • 未解之谜:开源协议的“礼品交换”边界。针对 WordPress 创始人 Matt 与 WP Engine 的冲突,DHH 指出目前的开源治理正面临挑战。当“馈赠(Gift)”被商业巨头大规模收割时,BDFL(终身仁慈独裁者)是否该拥有“收回馈赠”的权力?目前尚无共识,但 Matt 的强力干预被 DHH 视为是对开源生态诚信根基的破坏。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “No one anywhere who’s serious believes that cookie banners does anything good for anyone… It’s a monument to good intentions leading straight to hell.” (任何严肃的人都不相信 Cookie 横幅有任何好处……它是“通往地狱的路是由善意铺成的”这一谚语的纪念碑。) —— 语境:讨论欧洲过度监管对互联网体验的破坏。
  2. “A lot of people are very uncomfortable with the fact that they are essentially CRUD monkeys. They have to compensate for that existential dread by overcomplicating things.” (很多人对自己本质上是“数据库增删改查猴子”这一事实感到不安。他们必须通过过度复杂化系统来补偿这种存在主义焦虑。) —— 语境:分析微服务和复杂架构滥用的心理动机。
  3. “Mojito Island is a mirage. There is no retirement for ambitious people.” (莫吉托岛是一个海市蜃楼。对于有野心的人来说,根本没有所谓的退休。) —— 语境:讨论财富自由后的心理状态,强调创造力驱动的生活才有意义。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 云原生的回流:随着利率上升和企业对利润的渴求,将会有更多中型 SaaS 企业效仿 37signals 尝试“云退出”。
    • Rails 8 的“无构建”趋势:Web 开发将抛弃复杂的 JS 构建工具,回归原生浏览器解析能力(Import maps 等)。
  • 长期终局 (5-10年)
    • 编程的“骑马化”:随着 AI 接管 90% 的基础代码编写,手动编程将从一种“生存技能”转变为一种“精英手艺”或“娱乐活动”,如同汽车普及后的骑马。
    • 小团队的爆发:AI 赋能下,1-2 人的公司(Solopreneur)将能支撑数百万美元收入的业务,传统的臃肿研发组织将面临解体。
  • 行动建议
    • 对开发者:不要只做“氛围程序员”,要像学吉他一样练习基础语法,保持“手感”。
    • 对创业者:慎重接受风投(VC)。如果业务本身盈利,保持独立性(Bootstrap)能带来更长的生命周期和更高的幸福感。
    • 对架构师:默认选择单体架构,除非业务规模真的突破了单台服务器的物理极限。

总结 (glm-4.7-flash)

1. 🎯 核心论题与背景

  • 对话背景:对话对象是 David Heinemeier Hansson (DHH),知名程序员、Ruby on Rails(Rails)框架创建者、37signals(Basecamp 开发商)联合创始人及勒芒(Le Mans)耐力赛赛车手。讨论背景置于 AI 爆发、遥测数据突增以及行业回潮至 "No Build"(无构建)的十字路口,审视软件开发工具、企业文化与个人幸福感之间的冲突。
  • 核心论点:尽管科技界正在经历以 AI 和分布式系统为代表的激进变革,但 DHH 坚持认为,核心并没有改变——我们编写的依然是 CRUD 应用,目的是解决人类协作与商业问题。对话旨在捍卫 "开发者体感" (Developer Ergonomics) 的优先级,批判以企业为中心的复杂性(如 TypeScript 的类型系统、JS 的构建链、极繁的微服务架构),并回归到 "软件作家" 的本质。DHH 强调真正的幸福感源于 "Flow State" (心流),而非不断增加的代码行数或将来的巨额财富,并提出在正确的利基市场,极简的部署模式(单机/自托管)比云端模式更具成本效益和可控性。

2. 🧠 深度观点解析

A. 开发者工具的回退与 “No Build” 运动

  • 核心观点:现代前端开发经历了长达十年的 “黑暗时代”(依赖 Webpack、Jest 等臃肿的构建工具),DHH 呼吁通过 Rails 8 回归 90 年代的 “硬核极简主义”。
  • 原理解构:高级语言的编译速度和 NPM 等生态系统的确定性问题,使得在文本编辑器中直接编写、FTP 上传并运行成为可能。真正的进步不应是让机器更复杂,而是让工具更符合人类的直觉——即 “写完就能跑” 的成就感。
  • 证据/案例:DHH 提到 Pieter Levels 使用 PHP、jQuery 和 SQLite 即可保持极高的开发者生产率;Vercel 的 “Serverless” 和各种框架虽然在方法论上领先,但在开发者体验上增加了负向的摩擦力。

B. 程序员身份:从 “工程师” 到 “作家”

  • 核心观点:主张程序员应追求 “程序员幸福感”,认为代码的美感与可读性应当与逻辑同等重要,而不是为了防御性编程而牺牲代码的优雅。
  • 原理解构:与 Java/Rust 等语言设计者(视程序员为需要保护/管束的笨拙生物)不同,Ruby 设计者 Matz 相信人类的潜能,故意在语法中保留歧义(如双大括号、宽松的类型检查),让代码像诗歌一样可读。这对应 Ruby 的 “Soft Ramp”(软坡道)哲学。
  • 证据/案例:对比 Python 中 __init__ 为了兼容旧短语而被搞得面目全非,与 Ruby 中极简的 initialize 或直接的方法链;他自己定义 5.days 扩展方法来提高系统可维护性的案例。

C. 架构哲学:单体重结构优于微服务

  • 核心观点:对于大多数初创和成长型公司,过早引入微服务和分布式架构是 “愚蠢的”(premature optimization)。
  • 原理解构:微服务的目的是将复杂度去中心化以解决工程师脑力不足的问题,但对于单个程序员或小型团队,增加网络调用和系统间耦合(分布式编程第一定律:不要分布式)只会徒增复杂性。单体架构允许系统在脑海中和代码中保持完整性。
  • 证据/案例:认为 Shopify(年 GMV 达千亿美金级别)仍在单一大单体上运行(约 500 万行代码),证明了 Monolith 的可行性;Basecamp 和 HEY(各约 10 万行代码)完全由极小的工程团队维护。

D. 资本主义与环境:自托管与反云端

  • 核心观点:企业不应忽视基础设施成本,自托管对于长期运行的基础软件是更经济、更符合互联网 90 年代精神(分布式节点)的选择。
  • 原理解构:AWS 等云厂商以极高的毛利率(约 40%)赚取利润,云计算的 "省心、省钱、更快" 是伪命题(性能不及自购硬件,成本高于自托管运维,仅在突发流量场景下占优)。随着硬件演进(TSMC 制程),单机计算成本急剧下降。
  • 证据/案例:Basecamp 撤出云端后节省了大量费用,且将构建过程封装为 “Omakub”(标准化 Linux 环境),在 30 分钟内即可重置开发环境,超越了传统 CI/CD 的便利性。

E. AI、学习曲线与 “Vibe Coding”

  • 核心观点:AI 是极佳的结对编程伙伴,但不能取代亲手编写的成就感;真正的编程技能需要通过 “动手写” 来习得。
  • 原理解构:学习编程类似于学吉他,看视频教程(AI 生成)无法替代 “手指按下琴弦” 的肌肉记忆。未来的编程可能分为两类:高级的 “创意/编辑者”(写 Prompt、调试、定义架构)和低级的 “执行者”(码农)。
  • 证据/案例:尝试让 AI 自动编写 Bash 脚本时,DHH 感觉到能力正在从自己的指尖流失;强调 "Instagram 大法"(看别人的美景图)无法让你学会 "登山"。

3. 💡 反直觉与批判性视角

  • 打破共识:主流观点认为 “威胁程序员饭碗的是 AI”,DHH 反其道行之,认为 AI 的洪水先淹没的将是初级和重复性工作,而高级的 “软件作家”(解决人类协作用例、 Crafting good software)因为能从 AI 生成的内容中把控审美和架构,反而更有竞争力。他认为编程的真正门槛不再是语法,而是 “如何提出正确的问题”。
  • 盲点与局限:DHH 对 "成熟团队必须使用复杂工程架构" 持全盘否定态度,这在极大规模(如 Google 或超大社交网络的核心层)可能存在逻辑缺陷,因为这些团队的 "沟通成本" 不仅源于人,更源于系统本身的复杂度。但他对未达到该规模的初创公司的建议非常精准。
  • 软件的停滞与永恒:尽管 AI 瞬息万变,但网页本质在 30 年间几乎没变(CRUD、表单交互)。DHH 指出,如果尼尔森·诺曼 50 年前创建了现代网页,他现在只需要画布支持 4K 分辨率。真正的平台革新(Web、AI、智能手机)只是历史的低概率事件,科技界更多时候是在做像素级的 bump 磨合。

4. 💎 金句与高光时刻

  • “Everyone is a CRUD monkey, and they have to compensate for that existential dread by over-complicating things.”
    • (中文意译:每个人最终都是 CRUD 猴子,他们为了掩饰这种存在的焦虑,不得不把事情搞得极其复杂。)
  • “The internet is ugly in part because of cookie banners.”
    • (中文意译:互联网之所以丑陋,部分原因就在于 Cookie 弹窗。)
  • “Regress to complexity, but bring the power. That’s the goal.”
    • (中文意译:回归复杂性,但保留算力。这正是 “No Build” 的目标。)
  • “Mojito Island is a mirage.”
    • (中文意译:退休后的海岛生活(过得很舒服)只是海市蜃楼。真正的幸福在于持续挑战极限(Flow)。)

5. 🚀 行业启示与未来推演

  • 短期影响 (1-3年):开发者工具链将出现两极分化。激进的用户(如 DHH)正在推动回归 “CLI + 文本文件” 的极简工作流,并推广 Linux 作为首选桌面平台。小巧、轻量、单一栈的前端框架将重新流行,以对抗繁重的构建系统和混乱的依赖地狱。
  • 长期终局 (5-10年):行业将朝着两极分化演进。一端是追求极致性能和复杂性的超级大厂(分布式系统、AI 训练集群),另一端是大量 “自托管” 的独立开发者和小团队,他们运行着 400 小时就能搭建起来的高性能服务器,利用软件的欢乐而非压榨人力来获客。
  • 行动建议
    • 对于开发者/创业者:向内看,审视你的技术栈是否增加了 "工具噪音"。如果你不是在开发 CUDA 驱动程序,Ruby、Go 或简单的 JS 可能比 TypeScript 和复杂的配置更值得尝试。优先考虑一名程序员在特定时间段的生产力,而不是大规模团队的协调成本。
    • 对于投资方:资金流向将从单纯的 "模型计算能力"(Training Compute)转向 "增效工具"(Developer Tools,即让人类写代码更快、更好、更快乐的工具)。
    • 对于个人生涯:避免进入管理陷阱(被会议填满日程)。坚持全栈能力并追求代码的 “诗意”(可读性),因为在 AI 时代,能驾驭 AI 并创造出具有人类审美和意图的作品,才是稀缺价值。

逐字稿

Episode highlight

DHH (00:00:00) No one anywhere who’s serious believes that cookie banners does anything good for anyone, yet we’ve been unable to get rid of it. This is the thing that really gets me about cookie banners too. It’s not just the EU, it’s the entire world. You can’t hide from cookie banners anywhere on this planet. If you go to goddamn Mars on one of Elon’s rockets and you try to access a web page, you’ll still see a cookie banner. No one in the universe is safe from this nonsense.

(00:00:26) It sometimes feels like we’re barely better off. Web pages aren’t that different from what they were in the late ’90s, early 2000s. They’re still just forms. They still just write to databases. A lot of people, I think, are very uncomfortable with the fact that they are essentially crud monkeys. They just make systems that create, read, update, or delete rows in a database and they have to compensate for that existential dread by over complicating things. That’s a huge part of the satisfaction of driving a race car is driving in at the edge of adhesion, as we call it, where you’re essentially just a tiny movement away from spinning out. Doesn’t take much. Then the car starts rotating. Once it starts rotating, you lose grip and you’re going for the wall. That balance of danger and skill is what’s so intoxicating.

Introduction

Lex Fridman (00:01:21) The following is a conversation with David Heinemeier Hansson, also known as DHH. He is a legend in the programming and tech world, brilliant and insightful, sometimes controversial, and always fun to talk to. He’s the creator of Ruby on Rails, which is an influential web development framework behind many websites used by millions of people, including Shopify, GitHub, and Airbnb. He is the co-owner and CTO of 37signals that created Basecamp, HEY, and ONCE.

(00:01:57) He is a New York Times best-selling author together with his co-author, Jason Fried, of four books, Rework, Remote, Getting Real, and It Doesn’t Have To Be Crazy At Work. And on top of that, he’s also a race car driver, including being a class winner at the legendary twenty-four-hour Le Mans race. This is the Lex Fridman podcast. To support it, please check out our sponsors in the description and consider subscribing to this channel. And now, dear friends, here’s DHH.

Lex Fridman (00:02:32) For someone who became a legendary programmer, you officially got into programming late in life, and I guess that’s because you tried to learn how to program a few times and you failed. So can you tell me the full story, the saga of your failures to learn programming? Was Commodore 64 involved?

DHH (00:02:53) Commodore 64 was the inspiration. I really wanted a Commodore 64. That was the first computer I ever sat down in front. And the way I sat down in front of it was I was five years old and there was this one kid on my street who had a Commodore 64. No one else had a computer, so we were all the kids just getting over there and we were all playing Yie Ar Kung-Fu. I don’t know if you’ve ever seen that game. It was one of the original fighting games. It’s really a great game and I was playing that for the first time at five years old, and we were like seven kids sitting up in this one kid’s bedroom all taking our turn to play the game. And I just found that unbelievably interesting. And I begged and I begged and I begged my dad, “Could I get a computer?” And he finally comes home. He’s like, “I got you a computer.” I was like, yes, my own Commodore 64. And he pulls out this black, green and blue keyboard that’s an Amstrad 464. I was like, “Dad, what’s this?”

Lex Fridman (00:03:53) The disappointment.

DHH (00:03:54) This is not a Commodore 64. But it was a computer. So I got my first computer at essentially six years old, that Amstrad 464. And of course, the first thing I wanted to do, I wanted to play video games. And I think the computer, which he by the way had traded for a TV and a stereo recorder or something like that, came with two games. One was this Frogger game where you had to escape from underground. It was actually kind of dark, like this frog, you’re trying to get it out from underground. I was pretty bad at it. And I only had those two games and then I wanted more games. And one way to get more games when you’re a kid who doesn’t have a lot of money and can’t just buy a bunch of games is to type them in yourself. Back in ’84, ’85, magazines would literally print source code at the back of their magazines and you could just sit and type it in.

(00:04:46) So I tried to do that and it would take like two hours to type this game into the Amstrad, and of course I’d make some spelling mistake along the way and something wouldn’t work and the whole thing… I wasn’t that good of English, I was born in Denmark. So I was really trying to get into it because I wanted all these games and I didn’t have the money to buy them. And I tried quite hard for quite a while to get into it, but it just never clicked. And then I discovered the magic of piracy, and after that I basically just took some time off from learning to program because well now suddenly I had access to all sorts of games. So that was the first attempt around six, seven years old. And what’s funny is I remember these fragments. I remember not understanding the purpose of a variable.

(00:05:34) If there’s a thing and you assign something, why would you assign another thing to it? So for some reason, I understood constants. Constants made sense to me, but variables didn’t. Then maybe I’m 11 to 12, I’ve gotten into the Amiga at this point. The Amiga, by the way, still perhaps my favorite computer of all time. I mean, this is one of those things where people get older and they’re like, oh, the music from the ’80s was amazing. To me, even as someone who loves computers and love new computers, the Amiga was this magical machine that was made by the same company that produced the Commodore 64 and I got the Amiga 500 I think in ’87.

Lex Fridman (00:06:16) Look at this sexy thing. That is a sexy machine right there.

DHH (00:06:19) This is from an age by the way where computing wasn’t global in the same sense, that different territories had different computers that were popular. The Amiga was really popular in Europe, but it wasn’t very popular at all in the US as far as I understand. It wasn’t popular in Japan. There were just different machines. The Apple II was a big thing in the US. I’d never even heard of Apple in the ’80s in Copenhagen. But the Amiga 500 was the machine that brought me to want to try it again. And do you know what’s funny? The reason I wanted to try it again was I remembered the first time I tried to learn and then there was this programming language that was literally called EasyAMOS, like the easy version of AMOS. I’m like, if it’s easy AMOS, how hard can it be? I’ve got to be able to figure this out.

(00:07:04) And this time I tried harder. I got into conditionals, I got into loops, I got into all these things and still, I couldn’t do it. And on the second attempt, I really got to the point of maybe I’m not smart enough. Maybe it’s too much math. I like math in this sort of superficial way. I don’t like it in the deep way that some of my perhaps slightly nerdier friends did, who I had tremendous respect for, but I’m not that person. I’m not the math geek who’s going to figure it all out. So after that attempt with EasyAMOS and failing to even get… I don’t even think I completed one even very basic game. I thought, programming’s just not for me. I’m going to have to do something else. I still love computers. I still love video games.

(00:07:53) I actually at that time had already begun making friends with people who knew how to program, who weren’t even programming EasyAMOS, they were programming with freaking Assembly. And I would sit down and just go, the moves and the memories and the copies, how do you even do this? I don’t even understand how you go from this to Amiga demos for example. That was the big thing with the Amiga. It had this wonderful demo scene in Europe. It’s this really interesting period of time in the Amiga’s history where you had all these programmers spread out mostly all over Europe who would compete on graphic competitions where you could probably bring one of these different-

DHH (00:08:36) On this thing. They would make these little almost like music videos, combining some MIDI music, combining some cool graphics, and they would do all of it in like 4K. Four kilobytes that is. Not four Ks of resolution. Four kilobytes of memory. And I just thought that was such a cool scene. This was obviously pre-internet. It was even pre-BBS, bulletin board systems, to some extent. It was you swap your demo software with someone else by sending them a disk in the mail, like the 3.5s. And I was enamored with that whole scene. I was enamored with what they were able to create and I just wanted to be a part of it even though I kind of didn’t have any skills to contribute. And that’s how I got into running BBSs.

(00:09:22) I didn’t learn programming then and I wouldn’t learn programming until much later, until I was almost 20 years old. The bulletin board systems existed in this funny space where they were partly a service to the demo scenes allowing all these demo groups to distribute their amazing demos. And then it was also a place to trade piracy software, pirated software. And I ended up starting one of those when I was 14 years old in my tiny little bedroom in Copenhagen. I had my, at that point, Amiga 4000. I had three telephone lines coming in to my tiny room.

DHH (00:10:00) Which is funny because again, I’m 14 years old. By the time I was installing my third line, you had to get someone from the telephone company to come do it. I get this guy and he’s just looking around, like what is this? Why the hell is a 14 year old having three phone lines into their tiny little bedroom? What’s going on here? Why are all these modems blinking red and black and making funny sounds?

Lex Fridman (00:10:23) Did your parents know?

DHH (00:10:24) They did and they didn’t. They knew I had the phone lines. They knew I had the computer. I don’t think they really understood that I was trading pirated software that was both illegal and whatever else was going on.

Lex Fridman (00:10:38) Oh, we should probably say that in Europe, maybe you can comment on this, especially in Eastern Europe, but Europe in general, piracy I think was more acceptable than it was in the United States. I don’t know, maybe it’s just my upbringing-

DHH (00:10:52) Even that conversation wasn’t present. I never spoke to anyone growing up in Denmark-

Lex Fridman (00:10:56) That piracy is wrong.

DHH (00:10:57) Who had any moral qualms whatsoever about piracy. It was just completely accepted that you’re a kid, you want a lot of games, you don’t have a lot of money. What do you do? You trade. Some people would occasionally buy a game. I mean, I once bought a Sega Master System and I bought one game because that was what I could afford. I got After Burner II, I don’t know if you’ve ever played that game. It’s a pretty bad implementation on the Sega Master System, but it was like 600 kroner.

(00:11:28) And I was making money at that time doing newspaper delivery. I had to do that for a month to afford one game. I liked video games way too much to wait a month just to get one game. So piracy was just the way you did it, and that was how I got into running this bulletin board system, being part of the demo scene, being part of the piracy scene to some extent. And then also at some point realizing, oh, you can actually also make money on this and this can fund buying more phone lines and buying more modems and buying more Amigas. Oh yeah, that was one of the demo parties. These were amazing things.

Lex Fridman (00:12:04) What am I looking at?

Lex Fridman (00:12:06) Look at all those CRT monitors.

DHH (00:12:08) All these CRT monitors. Again, when I was 14, I don’t understand fully why my parents allowed this, but I traveled from Copenhagen, the capital of Denmark to [inaudible 00:12:20], this tiny little town in Jutland on the train with a bunch of dudes who were late teens, in their twenties. I’m 14 years old. I’m lugging my 14-inch CRT monitor with my computer in the back to go to the party. That was what it was called. That was the biggest demo scene party at that time and it was exactly as you see in that picture, thousands of people just lining up with their computers, programming demos all day long and trading these things back and forth.

Lex Fridman (00:12:48) That’s kind of awesome. Not going to lie. It’s a little ridiculous.

DHH (00:12:52) It’s totally awesome, and I miss it in ways where the internet has connected people in some ways, but the connection you get from sitting right next to someone else who has their own CRT monitor, who’s lugged it halfway around the country to get there is truly special because it was also just this burst of creativity. You’re constantly running around, you’re constantly surrounded by people who are really good at what they could do, they’re really good at programming computers. It’s infectious. It was part of that pang I felt then going like, oh man, why can’t I figure this out? I mean, why can’t I even figure out EasyAMOS? It’s kind of frustrating.

Lex Fridman (00:13:28) But on your third attempt, you were a little more successful.

DHH (00:13:30) So third attempt is when I start getting it. This is when I start helping out, let’s say, building things for the internet. So around ’95 I think it is, or ’96, I discovered the internet. Actually in ninth grade, that was my first experience. I went to some university in Denmark and in ninth grade we had this excursion and they sat us down in front of a computer and the computer had Netscape Navigator, the first version, or maybe it was even the precursor to that, and they had a text editor and us kids [inaudible 00:14:06] hey, build something on the internet. And it was just HTML and the first thing you do is like, oh, I can make the text blink by just putting in this tag and saving it? That moment, that was actually when I reawakened the urge to want to learn to program because I got a positive experience.

(00:14:23) All the other experiences I had with programming was I’d spend hours typing something in, I click run and it wouldn’t work, and I’d get an error message that made no sense to me as a kid either at six or seven or at 12. And here I am sitting in front of a computer connected to the internet and I’m making text blink. I’m making it larger. I’m turning it into an H1 or an H2. And these guys out here, we just did it for like an hour and a half and suddenly I go, oh, I can make things for the internet that someone in Germany can be able to access and see, and I don’t have to ask anyone for permission? This is super cool. I’ve got to do more of this. So I got into the internet. I got into working with HTML, and I still had all these friends from these demo parties, and I started working with them on creating gaming websites.

(00:15:11) Rather than buy the video games, I’d review them. This was another good way of getting new video games: to walk down to some store and say like, hey, I’m a journalist. I’m like this fifteen-year-old kid and they’re looking at me. “You’re a journalist?” “Yeah, can I borrow some games?” Because this was when games moved on to the PlayStation and these other things. You couldn’t just as easily pirate, at least not at first. So I went down there, did all that, and that started the journey of the internet for me. I started working on these gaming websites, working with programmers, figuring out that I could do something, I could work on the HTML part.

(00:15:44) It’s not really programming, but it kind of smells like it. You’re talking to a computer, you’re making it put text on the screen and you’re communicating with someone halfway around the world. So that became my pathway back into programming, and then slowly I picked up more and more of it. First website I did with someone, one of these programmers from the demo scene that was dynamic was asp.net. It wasn’t even actually called .net. That was what we started on, and then we moved on to PHP and PHP was when I finally got it, when it finally clicked, when conditionals and loops and variables and all of that stuff started to make sense enough to me that I thought, I can do this.

Lex Fridman (00:16:26) So would it be fair to say that we wouldn’t have DHH without PHP and therefore you owe all of your success to PHP?

DHH (00:16:33) A hundred percent, that’s true. And it’s even better than that because PHP to me didn’t just give me a start in terms of making my own web applications. It actually gave me a bar. In many ways I think the pinnacle of web developer ergonomics is late ’90s PHP. You write this script, you FTP it to a server and instantly it’s deployed. Instantly it’s available. You change anything in that file and you reload, boom, it’s right there. There’s no web servers, there’s no setup. There’s just an Apache that runs mod PHP, and it was essentially the easiest way to get a dynamic web page up and going, and this is one of the things I’ve been chasing that high for basically the rest of my career. It was so easy to make things for the internet in the mid to late ’90s.

(00:17:26) How did we lose the sensibilities that allowed us to not just work this way but get new people into the industry to give them those success experiences that I had adding a freaking blink tag to an HTML page, FTPing a PHP page to an Apache web server without knowing really anything about anything? Without knowing anything about frameworks, without knowing anything about setup. All of that stuff have really taken us to a place where it sometimes feels like we’re barely better off. Web pages aren’t that different from what they were in the late ’90s, early 2000s. They’re still just forms. They still just write to databases.

(00:18:06) A lot of people, I think are very uncomfortable with the fact that they are essentially crud monkeys. They just make systems that create, read, update or delete rows in a database, and they have to compensate for that existential dread by over-complicating things. Now, that’s a bit of a character. There’s more to it and there’s things you can learn for more sophisticated ways of thinking about this, but there’s still an ideal here, which is why I was so happy you had Pieter Levels on because he still basically works like this. And I look at that and go, man, that’s amazing.

Lex Fridman (00:18:39) Yeah, you’re chasing that high. He’s been high all along.

Lex Fridman (00:18:43) Using PHP, jQuery and SQLite.

DHH (00:18:47) I think it’s amazing because he’s proving that this isn’t just a nostalgic dream. He’s actually doing it. He’s running all these businesses. Now, some of that is, as he would admit up first upfront, is that he’s just one guy. And you could do different things when you’re just one guy. When you’re working in a team, when I started working on a team, when I started working with Jason Fried on Basecamp, we at first didn’t use version control together.

(00:19:16) I used version control for myself, and then I thought, do you know what? Designers, they’re probably not smart enough to figure out CVS and therefore I was just like, no, no, no, you just FTP it up. You just FTP it. They knew how to do FTP. And then after the third time I had overwritten their changes I was like, goddamn it, I guess I’ve got to teach Jason CVS to not do that again. But I think there’s still way more truth to the fact that we can work the way we did in the ’90s, work the way Pieter works today even in the team context, and that we’ve been far too willing to hand over far too much of our developer ergonomics to the merchants of complexity.

JavaScript

Lex Fridman (00:19:57) And you’ve been chasing that with Rails 8. So how do you bring all the cool features of a modern framework and make it no build, make it as easy to create something and to ship it as it was in the ’90s with just PHP? It’s very difficult for me to beat the Pieter Levels approach of just… It’s so easy to just ship some PHP.

DHH (00:20:21) And it should be. Why should it be harder than that? Our computers today are almost infinitely faster than what they were in the ’90s. So shouldn’t we be able to work in even easier ways? We should be looking back on the ’90s and go, oh, that was way too complicated. Now we have more sophisticated technology that’s way faster and it allows us to work in these easier to use ways. But that’s not true. But now you can see the line I draw in my work with Ruby on Rails, and especially with Rails 8. No build to me is reaching back to that ’90s feeling and going, now we can do some of those things without giving up on all the progress. Because I do think you can get too nostalgic. I do think you can start just fantasizing that everything was better in the ’90s. It wasn’t.

(00:21:10) I mean, I was there, there was a lot of things that sucked. And if we can somehow find a way to combine the advantages and advances we’ve had over the past 20 years with that ease of developer ergonomics, we can win. No build is a rejection of the part of web development I’ve hated the most in the past 10, 15 years, which is the JavaScript scene. And I don’t say that as someone who hates JavaScript. I mean, I often joke that JavaScript is my second favorite program language. It’s a very distant second. Ruby is by far and away number one, but I actually like JavaScript. I don’t think it’s a bad language. It gets a lot of flak. People add a string of two plus a one and it gives something nonsense, and I just go, yeah, but why would you do that? Just don’t do that. The language is actually quite lovely, especially the modern version.

(00:22:02) ES6, that really introduced a proper class syntax to it, so I could work with JavaScript in many of the same ways that I love working with Ruby. It made things so much better. But in the early 2010s until quite recently, all of that advancement happened in pre-processing, happened in build pipelines. The browsers couldn’t speak a dialect of JavaScript that was pleasant to work with so everyone started pre-compiling their JavaScript to be able to use more modern ways of programming with a browser that was seen as stuck with an ancient version of JavaScript that no one actually wanted to work with. And that made sense to me, but it was also deeply unpleasant. And I remember thinking during that time, the dark ages as I refer to them with JavaScript, that this cannot be the final destination. There’s no way that we have managed to turn the internet into such an unpleasant place to work where I would start working on a project in JavaScript using Webpack and all of these dependencies, and I would put it down for literally five minutes and the thing wouldn’t compile anymore.

(00:23:14) The amount of churn that the JavaScript community, especially with its frameworks and its tooling, went through in the decade from 2010 to 2020 was absurd. And you had to be trapped inside of that asylum to not realize what an utterly perverse situation we had landed ourselves in. Why does everything break all the time? I mean, the joke wouldn’t be just that the software would break, that would annoy me personally. But then I’d go on Hacker News and I’d see some thread on the latest JavaScript release of some framework, and the thread would be like, someone would ask, well, aren’t we using the thing we just used three months ago? And people would be like, that thing is so outdated. That’s so three months ago. You’ve got to get with the new program, we’re completely rewriting everything for the [inaudible 00:24:07] time and anything you’ve learned in the framework you’ve been spending the last amount of time on, it’s all useless. You’ve got to throw everything out and you’ve got to start over. Why aren’t you doing it stupid idiot?

Lex Fridman (00:24:18) Is that a kind of mass hysteria that took over the developer community you think? Like where you have to keep creating new frameworks and new frameworks and are we past that dark age?

DHH (00:24:29) I think we’re getting out of it and we’re getting out of it because browsers have gotten so much better. There was a stagnation in browser technology. Some of it was an overhang all the way back from IE5. So IE5 essentially put the whole internet development experience into a deep freeze because Microsoft won the browser wars in the mid-2000s, and then they basically disbanded their browser development team because they’re like all right, job done, we don’t need any more innovation on the internet. Can we just go back to writing Windows forms or something now that we control everything? And it really wasn’t until obviously Firefox kind of kindled a little bit of something. Then Chrome got into the scene and Google got serious about moving to web forward, that you had a kindling of maybe the browser could be better. Maybe the browser wasn’t frozen in time in 2005. Maybe the browser could actually evolve like the development platform that it is. But then what happened was you had a lot of smart people who poured in to the web because the web turned out to be the greatest application development platform of all time. This was where all the money was being made. This was where all the billionaires were being minted. This was where the Facebook’s and whatever of the world came to be. So you had all of this brain power applied to the problem of how to work with the web, and there were some very smart people with some I’m sure very good ideas who did not have programmer happiness as their motivation number one. They had other priorities and those priorities allowed them to discount and even rationalize the complexity they were injecting everywhere. Some of that complexity came from organizational structure. When you have a company like Facebook for example that does depend on the web and want to push it forward, but have sliced the development role job into these tiny little niches… I’m a front-end glob pipeline configurator.

(00:26:41) Oh yeah, well, I’m a front-end whatever engineer. And suddenly the web developer was no longer one person. It was 15 different roles. That in itself injected a ton of complexity. But I also want to give it the bold case here, which was that some of that complexity was necessary to get to where we are today, that the complexity was a bridge. It wasn’t the destination, but we had to cross that bridge to get to where we are today where browsers are frankly incredible. The JavaScript you can write in a text file and then serve on a web server for a browser to ingest is amazing. It’s actually a really good experience. You don’t need any pre-processing. You could just write text files, send them to a browser, and you have an incredible development-

Lex Fridman (00:27:25) And we should also say that it can kind of be broken, at least the HTML, but even the JavaScript could be a little bit broken and it kind of still works. Like maybe it half-ass works, but just the amount of mess of smelly code that a browser has to deal with is insane.

DHH (00:27:44) This is one of the hardest problems in computing today is to parse the entire internet. Because thankfully for us as web developers, but perhaps not so much for the browser developers, every webpage that has ever been created minus the brief period with Flash still runs today. The webpage I did in ninth grade would render on a modern browser today, 30 years later.

DHH (00:28:11) That is completely crazy when you think about the amount of evolution we’ve had with the web, how much better we’ve made it, how many more standards browsers have adopted. It’s essentially an Apollo project today to create a new browser, which is why it doesn’t happen very often, which is why even companies like Microsoft had to throw in the towel and say, we can’t do it. Now, I actually don’t think that’s good for the web. There is the danger of the monoculture if we just get a single browser engine that runs everything, and we are in danger of that. I love the fact that the Ladybird project, for example, is trying to make a new browser engine from scratch. I’ve supported that project. I would encourage people to look into that. It’s really a wonderful thing. It’s staffed by a bunch of people who worked on other browser projects in the past.

Lex Fridman (00:28:57) Truly independent web browser.

DHH (00:28:59) We really need that. But I can hold that thought in my head at the same time I hold the thought in my head that Google Chrome was pivotal to the web surviving as the premier web development platform. If it had not been for Google and their entire business depending on a thriving open web, Apple, Microsoft I think would’ve been just as fine to see the web go away to disappear into being something that’s just served native mobile applications and native desktop applications that they could completely control. So I have all sorts of problems with Google, but it’s not Chrome. Chrome is a complete gift to web developers everywhere, to the web as a development platform, and they deserve an enormous amount of credit I think for that. Even if it’s entangled with their business model and half of Chrome is code that spies on you or informs targeted ads and a bunch of things I’m not a big fan of, I can divorce that from the fact that we need champions in the corner of the web who have trillions of dollars of market cap value riding on the open web.

Google Chrome and DOJ

Lex Fridman (00:30:16) We’re going to take tangents upon a tangent upon a tangent. So let’s go to Chrome. I think Chrome positive impact on humanity is immeasurable for reasons that you just described. On the technology front, the features that present the competition they created, it’s spurred on this wonderful flourishing of web technologies. But anyway, I have to ask you about the recent stuff with the DOJ trying to split up Chrome and Google. Do you think this is a good idea? Do you think this does harm?

DHH (00:30:47) It’s a disaster. And I say that as someone who’s been very sympathetic to the antitrust fight, because I do think we have antitrust problems in technology, but the one place where we don’t have them by and large is with browsers, is with the tools we use to access the open web. First of all, we have Firefox. Now, Firefox is not doing all that great, and Firefox has been propped up by Google for many years to deter from exactly what’s going on with the DOJ that they were the only game in town. Apple has Safari. I have a bunch of problems with Apple too, but I love Safari. I love the fact that we have a premier browser running on a premier operating system that people can’t turn the web into just a Chrome experience. But I also think that the open web needs this trillion dollar champion, or at least benefits from it.

(00:31:44) Maybe it doesn’t need it, but it certainly benefits from it. And of all the things that are wrong with monopoly formation in technology, Chrome is the last thing, and this is why I get so frustrated sometimes about the monopoly fight, that there are real problems and we should be focusing on the premier problems first like the toll booths on our mobile phones. There are far bigger problems. It’s not the open web, it’s not the tools that we use to access the open web. If I don’t want to use Chrome, if my customers of my businesses that run on the internet don’t want to use Chrome, they don’t have to. We’re never forced to go through it. The open internet is still open. So I think it’s a real shame that the DOJ has chosen to pursue Google in this way. I do think there are other things you can nail Google for, their ad monopoly maybe, or the shenanigans they’ve done in controlling both sides of the ad ledger, that they both control the supply and the demand.

(00:32:45) There are problems. Chrome isn’t it. And you end up making the web much worse. And this is the thing we’ve always got to remember when we think about legislation, when we think about monopoly fights: you may not like how things look today and you may want to do something about it, but you may also make it worse. The good intentions behind the GDPR in Europe have currently amounted to what? Cookie banners that everyone on the internet hates, that help no one do anything better, anything more efficient, that save no privacy in any way, shape or form. It has been a complete boondoggle that has only enriched lawyers and accountants and bureaucrats.

Lex Fridman (00:33:29) Yeah, you said that the cookie banner is a monument for why Europe is losing, is doing the worst of all the regions in tech.

DHH (00:33:40) It’s a monument to good intentions leading straight to hell, and Europe is actually world-class in good intentions leading straight to hell.

Lex Fridman (00:33:53) So hell is the cookie accept button, that you have to accept all cookies. That’s what hell looks like. Over and over, you don’t actually ever get to the web page.

DHH (00:34:03) Just on a human scale, try to imagine how many hours every day are wasted clicking that away and how much harm we’ve done to the web as a platform that people enjoy because of them. The internet is ugly in part because of cookie banners. Cookie banners were supposed to save us from advertisement, and advertisement can make the web ugly. There are plenty of examples of that, but cookie banners made the entire internet ugly in one fell swoop, and that’s a complete tragedy. But what’s even worse, and this is why I call it out as a monument to everything the EU gets wrong, is that we have known this for a decade. No one anywhere who’s serious believes that cookie banners do anything good for anyone, yet we’ve been unable to get rid of them.

(00:34:50) There’s this one piece of legislation that’s now I think 10 or 12 years old. It’s a complete failure on every conceivable metric. Everyone hates it universally, yet we can’t seem to do anything about it. That’s a bankruptcy declaration for any body of bureaucrats who pretend or purport to make things better for not just citizens but people around the world. This is the thing that really gets me about cookie banners, too. It’s not just the EU, it’s the entire world. You can’t hide from cookie banners anywhere on this planet. If you go to goddamn Mars on one of Elon’s rockets and you try to access a webpage, you’ll still see a cookie banner. No one in the universe is safe from this nonsense.

Lex Fridman (00:35:33) Probably the interface on the rocket.

DHH (00:35:36) It’d be slower. You have basically 150 second ping time, so it’ll take you 45 seconds just to get through the cookie banners from Mars.

Lex Fridman (00:35:46) All right, let’s walk back up the stack of these recursive tangents we’ve been taking. So Chrome, we should say, at least in my opinion, is not winning unfairly. It’s winning in the fair way by just being better.

DHH (00:36:03) It is. If I was going to Steelman the other side just for a half second, people would say, well, maybe yes, most people do sort of begrudgingly agree that Chrome is a pretty good browser. But then they’ll say the reason it got dominance was distribution, and the reason it got distribution was because Google also controls Android and therefore can make Chrome the default browser on all these phones.

(00:36:27) Now, I don’t buy that, and the reason I don’t buy that is because on Android, you are actually allowed to ship a different browser that has a browser engine that’s not the same as Chrome. Unlike on iOS, where if you want to ship a browser, Chrome, for example, ships for iOS, but it’s not Chrome, it’s Safari wrapped in a dress, and every single alternative browser on iOS has to use the Safari web engine. That’s not competition. That’s not what happened on Android.

(00:36:57) Again, I think there are some nuances to it, but if you zoom out and you look at all the problems we have with Big Tech, Chrome is not it. Chrome won on merits. I have begrudgingly switched to Chrome on that realization alone. As a web developer, I just prefer it. I like Firefox in many ways. I like the ethos of it, but Chrome is a better browser than Firefox, full stop.

Lex Fridman (00:37:21) And by the way, we’ve never mentioned Edge. Edge is also a good browser.

DHH (00:37:26) Because it’s also Chrome in a dress.

Lex Fridman (00:37:27) But it never gets the love. I don’t think I’ve ever used Bing, and I’m sure Bing is really nice.

DHH (00:37:34) Maybe you have, because do you know what is Bing in a dress? DuckDuckGo, which is actually the search engine that I use. DuckDuckGo gets its search results from Bing, or at least it used to. If they changed that, that would be news to me.

Lex Fridman (00:37:47) Well, maybe everything is just a wrapper or a dress. Everything is wearing a dress underneath. There’s some other turtles-

Ruby programming language

Lex Fridman (00:37:56) The turtles, the dresses, all the way down. Okay, what were we talking about? We got there from JavaScript and from you learning how to program. So eventually the big success stories came when you built a bunch of stuff with PHP and you were actually shipping things.

Lex Fridman (00:38:15) And that’s when the Ruby story came. So your big love affair with programming began there. So can you take me there? What is Ruby? Tell the story of Ruby. Explain Ruby to me.

DHH (00:38:28) PHP was what converted me from just being able to fondle HTML and turn out some web pages to actually being able to produce web applications myself. So I owe a tremendous debt of gratitude to PHP in that regard. But I never thought of PHP as a calling, as in: I’m a professional programmer who writes PHP, that’s who I am, and that’s what I do. I thought of PHP as a tool I needed to smack the computer with until it produced the web applications I wanted. It was very much a means to an end. I didn’t fall in love with PHP. I’m very grateful that it taught me the basics of programming, and I’m very grateful that it set the bar for the economics. But it really wasn’t until Ruby that I started thinking of myself as a programmer. The way that came about was that the first time I ever got hired as a professional programmer to write code was actually by Jason Fried, my business partner still.

(00:39:31) All the way back in 2001, I had been working on these gaming websites in PHP for essentially 18 months at that point. No one had been paying me to do code in that regard, and I connected with Jason Fried over an email sent from Copenhagen, Denmark to Chicago, Illinois, to a person who didn’t know who I was. I was just offering unsolicited advice. Jason had asked a question on the internet, a question about PHP, and I’d sent him the answer to that question, and we started talking and then we started working, which by the way is a miracle of what the internet can allow. How can a kid in Copenhagen who’s never met this guy in Chicago connect just over email and start working together? By the way, we’re still working together now 24 years later. That’s incredible. But we started working together and we started working together on some client projects.

(00:40:25) Jason would do the design, 37signals would do the design. I would bring the programming PHP. And after we work on I think two or three client projects together in PHP, we kept hitting the same problem that whenever you work with a client, you start that project off an email, “Oh, yeah, let’s work together. Here’s what we’re building.” And you start trading more and more emails and before a few weeks have passed, you got to add someone to the project. They don’t have the emails, they don’t have the context. You send them, “Where’s the latest file?” “Oh, I’ve uploaded it on the FTP. It’s like final, final V06 2.0.” Right? That’s the one to get. It’s just a mess, a beautiful mess in some ways. It’s a mess that still runs the vast majority of projects to this day. Email is the lowest common denominator. That’s wonderful.

(00:41:13) But we had dropped the ball a couple of times in serious ways with customers and we thought we can do better. We know how to make web applications. Can’t we just make a system that’s better than email for managing projects? It can’t be that hard. We’ve been doing blogs, we’ve been doing to-do lists. Let’s put some of these things together and just make a system where everything that anyone involved in the project needs is on one page. And it has to be simple enough that I’m not going to run a seminar teaching you how to use the system. I’m just going to give you the login code. You’re going to jump into it. So that’s Basecamp. When we started working on Basecamp, I, for the first time in my experience working with Jason, had the freedom of technology choice. There was no client telling me, “Yeah, PHP, that sounds good. We know PHP. Can you build it in PHP?”

(00:42:06) I had free rein. At that time I’d been reading IEEE magazine and a couple of other magazines back from the early 2000s where Dave Thomas and Martin Fowler had been writing about programming patterns and how to write better code. These two guys in particular were both using Ruby to explain their concepts because Ruby looked like pseudocode. Whether you were programming in C or Java or PHP, all three constituencies could understand Ruby because it basically just reads like English. So these guys were using Ruby to describe the concepts, and at first, I would read these articles just for the concepts they were explaining and I’d be like, “What is this programming language?” I mean, I like the concept you’re explaining, but I also want to see the programming language. Why haven’t I heard of this?

(00:43:02) So I started looking into Ruby and I realized at that time, Ruby might not be known by anyone, but it’s actually been around for a long time. Matz, the Japanese creator of Ruby, had started working on Ruby back in ’93 before the internet was even a thing. And here I am in 2003, 10 years later, picking up what seems like this hidden gem that’s just lying in obscurity, in plain sight. But Dave Thomas and Martin Fowler, I think, successfully put me and a handful of other people on the trail of a programming language that hadn’t been used much in the west, but could be. So I picked up Ruby and I thought, this is very different. First of all, where are all the semicolons? I’d been programming in PHP, in ASP, I’d even done some Pascal. I’d looked at some C. There were semicolons everywhere.

(00:44:05) That was the first thing that struck me: where are the damn semicolons? And I started thinking, actually, why do we have semicolons in programming? They’re to tell the interpreter that there’s a new line of instructions, but I don’t need them as a human. Oh, someone is looking out for the human here, not for the machine. So that really got me interested. And then I thought to myself, do you know what? I know PHP quite well. I’m not an amazing programmer. I haven’t been working in programming for all that long, but maybe I can figure it out. I’m going to give myself two weeks. I’m going to write a proof of concept where I talk to a database, pull some records, format them a bit, and display them on an HTML page. Can I figure that out in a couple of weeks? It took about one weekend and I was completely mesmerized. I was completely mind blown because Ruby was made for my brain like a perfectly tailored glove by someone I’d never met. How is this even possible?
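
For reference, a minimal sketch of the kind of weekend proof of concept described here: pull some records from a database, format them, and display them on an HTML page. The sqlite3 gem, the posts.db file, and the posts table are hypothetical stand-ins, not anything from the actual project.

```ruby
require "sqlite3"
require "erb"

# Hypothetical database and table, purely for illustration.
db = SQLite3::Database.new("posts.db")
posts = db.execute("SELECT title, created_at FROM posts ORDER BY created_at DESC LIMIT 10")

# Format the records and render them as an HTML list.
template = ERB.new(<<~HTML)
  <ul>
    <% posts.each do |title, created_at| %>
      <li><%= title %> (<%= created_at %>)</li>
    <% end %>
  </ul>
HTML

puts template.result(binding)
```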

Beautiful code

Lex Fridman (00:45:14) We should say maybe paint the picture of the certain qualities that Ruby has, maybe even compare it to PHP. We should also say that there’s a ridiculous thing that I’m used to that I forget about, that there’s dollar signs everywhere.

DHH (00:45:31) That’s what I like to call line noise.

Lex Fridman (00:45:31) Line noise. Line noise. That’s such a beautiful phrase. So there’s all these things like that in programs, and with Ruby, I mean there are some similarities with Python there. It just looks kind of like natural language. You can read it normally.

DHH (00:45:47) Here’s a loop that does five iterations. You can literally type the number five, dot, and now I’m calling a method on the number five. By the way, that’s one of the beautiful aspects of Ruby, that primitives like integers are also objects, and you can call five dot times, open bracket. Now you’re iterating over the code in that bracket five times. That’s it.

Lex Fridman (00:46:15) Okay, that’s nice.

DHH (00:46:16) That’s not just nice, that’s exceptional. There’s literally no other programming language that I know of that has managed to boil away the line noise that almost every other programming language would inject into a five-time iteration over a block of code to that extent.
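
A minimal sketch of the iteration being described, in plain Ruby; nothing here is specific to any project:

```ruby
# `times` is an ordinary method on the integer 5; the block is the loop body.
# No loop keyword, no counter boilerplate, no semicolons.
5.times do |i|
  puts "iteration #{i}"
end

# The brace form works the same way.
5.times { puts "hello" }
```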

Lex Fridman (00:46:32) Wow. That’s a really nice… Well, thank you for giving that example. That’s a beautiful example. Wow, I don’t think I know a programming language that does that. That’s really nice.

DHH (00:46:41) Ruby’s full of that. So let me dive into a couple of examples because I really think it helps paint the picture and let me preface this by saying I actually, I like the ethos of Python. I think the Ruby and the Python community share a lot of similarities. They’re both dynamic interpreted languages. They’re both focused on immediacy and productivity and ease of use in a bunch of ways, but then they’re also very different in many other ways. One of the ways they’re very different is aesthetically.

(00:47:12) Python to me, I hope I don’t offend people too much. I’ve said this before, it’s just ugly, and it’s ugly at its base because it’s full of superfluous instructions that are there for legacy reasons from when Guido made Python back in ’87 that are still here in 2025, and my brain can’t cope with that. Let me give you a basic example. When you make a class in Python, the initializer method, the starting method, is def, okay, fair enough. That’s actually the same as Ruby: D-E-F, definition of a method. Then it is underscore, not one underscore but two, init, underscore underscore, open parenthesis, self, comma, and then the first argument.

Lex Fridman (00:48:03) Yeah, the whole self thing. Yeah.

DHH (00:48:06) I look at that and go, “I’m sorry I’m out. I can’t do it.” Everything about it offends my sensibilities to the core. Here you have the most important method that all new objects or classes have to implement, and it is one of the most aesthetically offensive ways of typing initialize that I’ve ever seen anywhere, and you guys are okay with this?

Lex Fridman (00:48:29) Hey, you’re making me… It’s like you’re talking about my marriage or something like this, and I’m now realizing I’ve been in a toxic relationship all along. I just got used to it.

DHH (00:48:39) That to me by the way, was the magic of Ruby.

Lex Fridman (00:48:39) That’s the problem.

DHH (00:48:41) It opened my eyes to how beautiful programs could be. I didn’t know. I’d been working in ASP, I’d been working in PHP. I didn’t even have the concept that aesthetics, beautiful code, was something we could optimize for. That’s something we could pursue, and even more than that, that we could pursue it above other objectives. That Ruby is as beautiful as it is, it’s not an accident and it’s not easy. Ruby itself is implemented in C. It’s very difficult to parse Ruby code because Ruby is written for humans and humans are messy creatures. They like things in just the right way. I can’t fully explain why the underscore, underscore, init, underscore, underscore repulses me, but it does. And when I look at the Ruby alternative, it’s really instructive. So it’s def, same part, D-E-F, space, initialize, parentheses, and not even parentheses if you don’t need to call it with arguments; there aren’t even parentheses.

(00:49:44) That in itself is actually also a major part. If the human doesn’t need the additional characters, we’re not just going to put them in because it’d be nicer to parse for the computer. We’re going to get rid of the semicolons, we’re going to get rid of the parentheses, we’re going to get rid of the underscores, we’re going to get rid of all that ugliness, all the line noise and boil it down to its pure essentials and at the same time, we’re not going to abbreviate. This is a key difference in the aesthetics between Ruby and Python as well. Init is shorter to type, it’s only four characters. Initialize is a lot longer, but it looks a lot better and you don’t type it very often, so you should look at something pretty. If you don’t have to do it all the time, it’s okay that it’s long.
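
A small sketch of the two constructor spellings being compared; the User class and its name argument are made up for illustration, and the Python form is shown only as a comment:

```ruby
# Python, as discussed above:   def __init__(self, name):
# Ruby: `def initialize`, no underscores, and `new` calls it for you.
class User
  def initialize(name)
    @name = name
  end
end

user = User.new("Ada")  # allocates the object and invokes initialize
```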

(00:50:29) Those kinds of aesthetic evaluations are rife all over the Ruby language. But let me give you an even better example. The if conditional, that’s the bedrock of all programming languages. They have the if conditional. If you take most programming languages, they’ll have if, that’s basically the same in almost every language, space, start parenthesis, we all do that. And then you have, perhaps, let’s say you’re calling an object called user, dot, is admin, close parenthesis, close parenthesis, start bracket, and here’s what we’re going to do if the user’s an admin, right? That would be a normal programming language. Ruby doesn’t do it like that. Ruby boils almost all of it away. We start with the if. Okay, that’s the same. No parentheses necessary, because there’s no ambiguity for the human to distinguish that the next part is just a single statement. So you do if, space, user dot admin, question mark, no open brackets, no parentheses, no nothing. On the next line, here’s your conditional body.

(00:51:45) That question mark means nothing to the computer, but it means something to the human. Ruby put in the predicate method style purely as a communication tool between humans. It’s actually more work for the interpreter to be able to see that this question mark is there. Why is this question mark in here? Because it just reads so nicely. If user admin question mark, that’s a very human phrase, but it gets better. You can turn this around. You can have the statement you want to execute before the conditional. You can do user.upgrade, say you’re calling an upgrade method on a user, space, if, space, user.admin question mark. We do the thing, if the thing is true, instead of saying if the thing is true, do the thing. But it gets even better. This is why I love this example with the conditional because you can keep diving into it. So let’s flip it around. user.downgrade if exclamation point, meaning not, user.admin, that’d be a typical way of writing it. Ruby goes, that exclamation point is line noise. Why do we have if and then an exclamation point that’s ugly? We could do user.downgrade unless user.admin question mark.

DHH (00:53:17) That to me is an encapsulation of the incredible beauty that Ruby affords the programmer through ambiguity that is only there to serve the human reader and writer. All of these statements we’ve just discussed, they’re the same for the computer. They’ll compile down to the same C code. They’ll compile down to the same assembly code. It makes no difference whatsoever. In fact, it just makes it harder to write an interpreter. But for the human who gets to choose whether the statement comes before the conditional, or whether to use the predicate method, it’s just incredible. It reads like poetry at some point.
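
The four conditional forms walked through above, as a runnable sketch; the User class with its admin?, upgrade, and downgrade methods is hypothetical:

```ruby
class User
  def initialize(admin)
    @admin = admin
  end

  # Predicate method: the question mark is purely for the human reader.
  def admin?
    @admin
  end

  def upgrade
    puts "upgraded"
  end

  def downgrade
    puts "downgraded"
  end
end

user = User.new(true)

# Plain form: no parentheses around the condition, no braces.
if user.admin?
  user.upgrade
end

# Trailing conditional: the statement first, the condition after.
user.upgrade if user.admin?

# Negation with the exclamation point...
user.downgrade if !user.admin?

# ...and the same thing with `unless`, no exclamation point needed.
user.downgrade unless user.admin?
```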

Lex Fridman (00:53:55) It’s also incredible that one language designer is creating that. Guido van Rossum also. It’s like one person gets to make these extremely difficult decisions, because you have to think about how it all gets parsed, and you have to think about the thousands, or if it’s a popular language the millions, of people that end up using this and what they feel. That question mark for the if statement, what does that feel like for the user?

DHH (00:54:24) That’s what Matz thought about because he started his entire mission off a different premise than almost every programming language designer that I’d heard at least articulate their vision, that his number one goal was programmer happiness. That his number one goal was the affordances that would allow programmers to articulate code in ways that not just executed correctly, but were a joy to write and were a joy to read. That vision is based on a fundamentally different view of humanity. There’s no greater contrast than between Matz and James Gosling, the designer of Java. I wanted to listen to James talk about the design of Java. Why was it the way it was? Why was it so rigid? He was very blunt about it, which by the way, I really appreciate and I think Gosling has done a tremendous job with Java, but his view of humanity is rather dark.

(00:55:24) His view of humanity was that programmers, on average, are stupid creatures. They cannot be trusted with sophisticated programming languages because they’re going to shoot their foot off or their hand off. And that would be kind of inconvenient to the regional development office of a mid-tier insurance company writing code that has to last for 20 years. Now it’s actually a very Thomas Sowell view of constrained capacity in humans that I’ve come to appreciate much later in life. But it’s also a very depressing view of programmers, that there are just certain programmers who are too dumb to appreciate code poetry. They’re too ignorant to learn how to write it well. We need to give them a sandbox where they just won’t hurt themselves too much.

(00:56:20) Matz went the complete opposite direction. He believes in humanity. He believes in the unlimited capacity of programmers to learn and become better, so much so that he’s willing to put the stranger at his own level. This is the second part I truly appreciate about Ruby. Ruby allows you to extend base classes. You know how we just talked about five dot times as a way to iterate over a statement five times. That five is obviously a base class, it’s a number. Do you know what? You can add your own methods to that. I did, extensively. In Rails, we have something called Active Support, which is essentially my dialect of Ruby for programming web applications. I’ll give you one example. I’ve added a method called days to numbers. So if you do 5.days, you get five days in seconds, because seconds are the way we set cache expiration times and other things like that. So you can say the cache expires in 5.days and you’re going to get whatever-

DHH (00:57:35) … 5 times 24 times 60 times 60 is, or whatever the math is, right? Very humanly readable. In a normal programming language, you would type out the seconds and then you would have a little comment above it saying this represents five days. In Ruby, you get to write 5.days. But even better than that, Matz didn’t come up with it. Matz didn’t need 5.days. I needed that because I needed to expire caches. I was allowed by Matz to extend his story with my own chapters on equal footing, such that a reader of Ruby could not tell the difference between the code Matz wrote and the code that I wrote.

(00:58:16) He trusted me as a complete stranger from Denmark who he’d never met to mess with his beautiful story. That level of trust is essentially unheard of. I know there are other programming languages that allow similar things with macros and so forth, but none do it in a way like Ruby does it. None do it with an articulated vision of humanity, a trust in humanity, like Matz does. That is the opposite end of the spectrum from Java.
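
A simplified sketch of the kind of core-class extension being described. The real Active Support version returns a duration object rather than a bare integer; this stand-in only shows the shape of the trick:

```ruby
# Reopen Ruby's built-in Integer class and add a `days` method.
class Integer
  def days
    self * 24 * 60 * 60  # express the number of days in seconds
  end
end

puts 5.days   # => 432000

# Hypothetical usage in the spirit of the example above:
# cache.expires_in = 5.days
```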

Lex Fridman (00:58:46) Yeah, I mean for my aesthetic sensibilities, just the way you described 5.days, that’s really pleasant to me. I could see myself sitting alone sleep-deprived and just writing that. It’s just an easy thing. You can write it in a long way with a comment. You can write in multiple lines, you could do… And now with AI, I’m sure it’s going to generate it correctly, but there’s something really pleasant about the simplicity of that. I’m not sure what that is, but you’re right. There is a good feeling there. I’m sure we’ll talk about happiness from all kinds of philosophical angles, but that is what happiness is made of. That little good feeling there.

DHH (00:59:29) Exactly. It’s the good feeling that comes out of a concept compressed to its pure essence. There’s nothing you can take away from that statement that’s superfluous.

Lex Fridman (00:59:39) But see, I also want to push back a little bit because it’s not… Because I also programmed in Perl a bunch just to be cool. So it’s not all about compression.

DHH (00:59:51) No, you can compress it too far. Perl golf is a thing where you can turn programs into something that’s unreadable for humans. Now the great thing about Perl was that it came out before Ruby. Matz was a great student of Wall, was a great student of Perl, was a great student of Python and Smalltalk and Lisp. He took inspiration from all of these prior attempts at creating good programming languages and really edited down the very best bits into this. So he was able to learn from his lessons. But what I found incredible about Ruby is that here we are, 2025, Ruby has been worked on for over 30 years and essentially the first draft is 90% of what we’re still using.

(01:00:38) There was almost a sense of divine inspiration in wherever Matz was when he was writing that initial version of Ruby, something that transcended time to such a degree that no one has still even begun to reach it. This is the other thing I always find fascinating. I generally believe in the efficient market theory, that if someone comes up with a better mousetrap or a better idea, others will eventually copy them to such an extent that perhaps the original mousetrap is no longer even remembered. No one has been able to copy that essence of Ruby. They’ve borrowed elements, and that’s totally fine, but Ruby still stands taller than everyone else on these metrics, on this trust in humanity and programmers.

Lex Fridman (01:01:21) And we should also say that maybe the perfect programming language by that metric, and the successful language, are often different things. There’s something wonderful about the Brendan Eich story of creating JavaScript. There’s something truly beautiful about the way JavaScript took over the world. I recently got to visit the Amazon jungle, and one of my favorite things to do is just to watch the ants take over anything, everything. It’s just a nice distributed system. It’s a messy thing that doesn’t seem to be ordered, but it just works, the machinery of it.

DHH (01:01:58) Worse is Better. I mean, that’s actually the name of a pattern in software development, and in other ways it is the pattern of Linux. Linux was quantifiably worse than, I think it was Minix at the time, and other efforts that were more cathedral, less bazaar, and it still won. There’s something to it, that the imperfections can help something go forward. It’s actually a trick I’ve studied to the degree that I now incorporate it in almost all open source that I do. I make sure that when I release the first version of any new thing I work on, it’s a little broken. It’s a little busted in ways that invite people to come in and help me. Because there’s no easier way to get the collaboration of other programmers than to put something out that they know how to fix and improve.

Lex Fridman (01:02:49) Yeah, that’s awesome.

DHH (01:02:49) But Ruby is somehow or was at least a little bit different in that regard. Not in all regards. Matz got the ethos of the language, the design of language just right. But the first versions of Ruby were terribly slow. It’s taken, I mean hundreds of man-years to get Ruby to be both this beautiful yet also highly efficient and really fast.

Lex Fridman (01:03:15) We should say that the thing that made you fall in love with this particular programming language is Metaprogramming.

DHH (01:03:21) Yes. So that takes all of these elements we’ve just talked about and turns them up to 11. I’ll explain metaprogramming real simple.

DHH (01:03:29) Metaprogramming is essentially a version of the 5.days. You get to add keywords to the language. Active Record is the part of Rails that communicates with the database. This is a system where every table in the database is represented by a class. So if we take the user example again, you do class User, descends from Active Record Base, and then the first line you can write is this: I want my users to have many posts, or have many comments. Let’s do that. We’re making some system where users can make comments. The very next line is, has underscore many, space, colon comments.

(01:04:15) Now you’ve set up a dependency between users and comments that will give you a whole host of access and factory methods for users to be able to own comments, to create comments, to update comments. In that line alone, “has many” looks like a keyword. It looks like it’s part of the Ruby language. That’s metaprogramming. When Rails is able to add these elements to how you define a class, and then that runs code that adds a bunch of methods to the user class, that’s metaprogramming.

(01:04:49) And when metaprogramming is used in this way, we call it domain-specific languages. You take a generic language like Ruby and you tailor it to a certain domain, like describing relationships in a database at an object level. This is one of those early examples where you can do: user has many comments, belongs underscore to, space, colon account. Now you’ve set up a one-to-one relationship; before, we had a one-to-many relationship. Rails is rife with all these kinds of domain-specific languages, where sometimes it doesn’t even look like Ruby. You can’t identify Ruby keywords. You can just identify what looks like keywords in its own programming language. Now again, I know that Lisp and others also do this stuff. They just do it with the maximum amount of line noise that can ever be crammed into a programming language, and Ruby does it at a level where you cannot tell my metaprogramming from Matz’s keywords, and with zero line noise.
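
A sketch of the declarations being described, plus a toy version of the mechanism behind them. The User, Comment, and Account models are hypothetical, and the toy_has_many example is not how Rails actually implements it, just the shape of the metaprogramming trick:

```ruby
require "active_record"  # assumes the activerecord gem is installed

# In a Rails app: has_many and belongs_to read like keywords, but they are
# class-level method calls that generate a family of methods on the class
# (user.comments, user.comments.create, user.account, and so on).
class User < ActiveRecord::Base
  has_many :comments
  belongs_to :account
end

class Comment < ActiveRecord::Base
  belongs_to :user
end

# A toy illustration of the mechanism: a class-level method that defines
# instance methods while the class body is being evaluated.
class ToyModel
  def self.toy_has_many(name)
    define_method(name) do
      @collections ||= Hash.new { |hash, key| hash[key] = [] }
      @collections[name]
    end
  end
end

class Post < ToyModel
  toy_has_many :comments   # looks like a keyword, is just a method call
end

post = Post.new
post.comments << "first!"
puts post.comments.inspect  # => ["first!"]
```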

Lex Fridman (01:05:56) Yeah, I should say that my first love was Lisp. So there’s a slow tear that you can’t see.

DHH (01:06:01) I’ve actually never written any real Lisp myself.

Lex Fridman (01:06:04) Well, how can you judge it so harshly then?

DHH (01:06:07) Because I have two eyes and I can look at code and my aesthetic sensibilities forbid me to even go much further, which is a limitation, I know. I should actually dive into Lisp because I’ve found that I’ve learned a lot just diving into, maybe I’m insulting Lisp again here, but the past of programming languages. With Smalltalk, for example, I think Smalltalk is an incredible experiment that also worked, but isn’t suitable for today’s programming environments.

Dynamic typing

Lex Fridman (01:06:36) I love that we’re talking about Ruby so much, and what beautiful code is and what a beautiful programming language is. So one of the things that is, I think, implied, maybe you made it explicit in your descriptions there, is that Ruby uses dynamic typing versus static typing. And you have been not just saying that it’s a nice thing, but that you will defend dynamic typing to the death. That freedom is a powerful freedom to preserve.

DHH (01:07:04) It’s the essence of what makes Ruby Ruby. This is why I don’t fully understand when people call for Ruby to add static typing because to me it’s the bedrock of what this is. Why would you want to turn one of the most beautiful languages into something far uglier? This is one of my primary objections to static typing. It’s not just that it limits you in certain ways. It makes metaprogramming harder. I write a bunch of metaprogramming. I’ve seen what it takes to do metaprogramming in TypeScript. That was actually one of the things that just really sent me on a tear of getting TypeScript out of some of the projects that I’m involved with.

(01:07:42) We pulled TypeScript out of Turbo, one of the front-end frameworks that we have, because I tried to write metaprogramming in TypeScript and I was just infuriated. I don’t want that experience, but I also don’t want it from an aesthetic point of view. I hate repetition. We’ve just talked about how much I love that Ruby boils all of these expressions down to their essence.

DHH (01:08:00) You can’t remove one dot. You can’t remove one character without losing something. The moment you go for static typing, you declare at least … I know there are ways to do implied typing and so forth, but let’s just take the stereotypical case. Capital U User, I’m declaring the type of the variable. Lowercase user, I’m now naming my variable, equals uppercase User, or new uppercase User. I’ve repeated user three times. I don’t have time for this. I don’t have sensibilities for this. I don’t want my Ruby polluted with this. Now, I understand all the arguments for why people like static typing. One of the primary arguments is that it makes tooling easier. It makes it easier to do auto-complete in editors, for example. It makes it easier to find certain kinds of bugs, because maybe you’re calling methods that don’t exist on an object and the editor can actually catch that bug before you even run it. I don’t care.
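
A tiny sketch of the repetition contrast being described; the User class is hypothetical, and the statically typed line is shown only as a comment for comparison:

```ruby
# Hypothetical User, just so the line below runs.
User = Struct.new(:name)

# The statically typed declaration described above, roughly:
#   User user = new User("Ada");   // "user" spelled out three times
#
# The dynamically typed Ruby version names things once:
user = User.new("Ada")
puts user.name
```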

(01:09:11) First of all, I don’t write code with tools, I write it with text editors. I chisel it out of the screen with my bare hands. I don’t auto-complete. This is why I love Ruby so much, and this is why I continue to be in love with the text editor rather than the IDE. I don’t want an IDE. I want my fingers to have to individually type out every element of it, because it will force me to stay in the world where Ruby is beautiful. Because as soon as it gets easy to type a lot of boilerplate, well, guess what? You can have a lot of boilerplate. Every single language basically that has great tooling support has a much higher tolerance for boilerplate because the thinking is, well, you’re not typing it anyway, you’re just auto-completing it. I don’t want that at all. I want something where the fabric I’m working in is just a text file, there’s nothing else to it. So these things play together. There’s the aesthetic part, there’s the tooling part, there’s the metaprogramming part.

(01:10:16) There’s the fact that Ruby’s ethos of duck typing … I don’t know if you’ve heard that term before. It’s essentially not about, can I call this method if an object is of a certain class? It’s, can I call this method if the object responds to it? It’s very much out of Smalltalk in that regard. You don’t actually check whether that class has the method, which allows you to dynamically add methods at runtime and do all sorts of really interesting things that underpin all the beautiful metaprogramming that we do in Ruby. I don’t want to lose any of that, and I don’t care for the benefits. One of the benefits I’ve seen touted over and over again is that it’s much easier to write correct software. You’re going to have fewer bugs. You’re going to have fewer NullPointerExceptions, you’re going to have less of all of this stuff. Yeah, I don’t have any of that. It’s just not something that occurs in my standard mode of operation. I’m not saying I don’t have bugs, of course I do, but I catch those bugs with unit testing, with integration testing.

(01:11:19) Those are the kinds of precautions that will catch logical bugs, things that compile but are wrong, along with the uncompilable stuff. So I’ve never been drawn into this world, and part of it is because I work on a certain class of systems. I fully accept that. If you’re writing systems that have five, 10, 50 million lines of code with hundreds, thousands or tens of thousands of programmers, I fully accept that you need different methods. What I object to is the idea that what’s right for a code base of 10 million lines of code, with 100,000 programmers working on it, is also the same thing I should be using in my bedroom to create Basecamp, because I’m just a single individual. That’s complete nonsense. In the real world, we would know that that makes no sense at all. That you don’t, I don’t know, use your Pagani to go pick up groceries at Costco. It’s a bad vehicle for that. It just doesn’t have the space, you don’t want to muddy the beautiful seats. You don’t want to do any of those things.

(01:12:21) We know that certain things that are very good in certain domains don’t apply to all. In programming languages, it seems like we forget that. Now, to be fair, I also had a little bit perhaps of a reputation of forgetting that. When I first learned Ruby, I was so head over heels in love with this programming language that I almost found it inconceivable that anyone would choose any other programming language at all to write web applications. I kind of engaged in the evangelism of Ruby on Rails in that spirit, as a crusade: I just need to teach you the gospel. I just need to show you this conditional code that we just talked about, and you will convert at the point of a sharp argument. Now, I learned that’s not the way, and part of the reason it’s not the way is that programmers think differently. Our brains are configured differently. My brain is configured perfectly for Ruby, perfectly for a dynamically duck-typed language that I can chisel code out of a text editor with.
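
A small sketch of the duck-typing idea mentioned above: you never ask what class an object is, only whether it responds to the message you want to send. The classes and method names here are hypothetical:

```ruby
class Invoice
  def notify
    puts "emailing the invoice"
  end
end

class SlackChannel
  def notify
    puts "posting to the channel"
  end
end

# No class check anywhere: if it quacks (responds to notify), we call it.
def announce(target)
  target.notify if target.respond_to?(:notify)
end

announce(Invoice.new)       # => emailing the invoice
announce(SlackChannel.new)  # => posting to the channel
announce(Object.new)        # quietly does nothing
```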

Scaling

(01:13:22) Other people need the security of an IDE. They want the security of classes that won’t compile unless you call the methods on it. I have come to accept that, but most programmers don’t. They’re still stuck in essentially, I like static typing. Therefore, static typing is the only way to create reliable, correct systems. Which is just such a mind-blowing, to be blunt, idiotic thing to say in the face of mountains of evidence to the contrary. This is one of the reasons I’m so in love with Shopify as the flagship application for Ruby on Rails. Shopify exists at a scale that most programmers will never touch. On Black Friday, I think Shopify did one million requests per second. That’s not one million requests of images, that’s of dynamic requests that are funneling through the pipeline of commerce. I mean, Shopify runs something like 30% of all E-commerce stores on the damn Internet. A huge portion of all commerce in total runs through Shopify and that runs on Ruby on Rails. So Ruby on Rails is able to scale up to that level without using static typing in all of what it does.

(01:14:45) Now, I know they’ve done certain experiments in certain ways, because they are hitting some of the limits that you will hit with dynamic typing. Some of those limits you hit with dynamic typing are actually, by the way, just limits you hit when you write 5 million lines of code. I think the Shopify monolith is about 5 million lines of code. At that scale, everything breaks because you’re at the frontier of what humans are capable of doing with programming languages. The difference in part is that Ruby is such a succinct language that those 5 million, if they had been written in, let’s just say Go or Java, would have been 50 or 25. Now, that might have alleviated some of the problems that you have when you work on huge systems with many programmers, but it certainly would also have compounded them; try to understand 25 million lines of code.

Lex Fridman (01:15:33) So the thing does scale. That’s a persistent myth, that it doesn’t scale; Shopify and others show otherwise, and Shopify I think is a great example. By the way, I love Shopify and I love Toby.

DHH (01:15:45) You’ve got to have Toby on. I just talked to him this morning.

Lex Fridman (01:15:47) For sure. He’s a brilliant … I got to hang out with him in the desert somewhere, I forget, in Utah. He’s just a brilliant human. shopify.com/lex has been supporting this podcast for the longest time. I don’t think actually Toby knows that they sponsor this podcast. I mean, it’s a big company, right?

DHH (01:16:05) It’s a huge company. I think just under 10,000 employees, market cap of $120 billion, GMV of a quarter of a trillion every quarter.

Lex Fridman (01:16:16) He’s involved with the details though.

DHH (01:16:18) He is, very much so. Funny story about Toby, Toby was on the Rails core team back in the mid-2000s. Toby himself-

DHH (01:16:28) … wrote Active Merchant, which is one of the frameworks for creating shops. He wrote the Liquid templating language that Shopify still uses to this day. He has a huge list of contributions to the Rails ecosystem and he’s the CEO of the company. I think it’s very inspiring to me, because it’s so at the opposite end of what I like to do. I like to chisel code with my own hands most of the day; he runs a company of almost 10,000 people. That is literally, world commerce depends on it, a level of criticality I can’t even begin to understand. Yet, we can see eye to eye on so many of these fundamental questions in computer science and program development. That is a dynamic range: Rails being able to encompass being a great tool for the one developer who’s just starting out with an idea … who doesn’t even fully know everything, who is right at the level where PHP would have been a good fit in those late ’90s. Because yeah, I can probably upload something to an FTP server and so on.

(01:17:33) Rails does have more complexity than that, but it also has so much longer a runway. The runway goes all the way to goddamn Shopify. That is about the most convincing argument I can make for dynamic range, that we can do a lot of it. And even having said that, Shopify is the outlier of course. I don’t think about Shopify as the primary target when I write Rails, I think of the single developer. Actually, I do think about Shopify, but I don’t think about Shopify now. I think of Shopify when Toby was writing Snowdevil, which was the e-commerce store he created to sell snowboards. That was the pre-Shopify Shopify he created all by himself. And that was possible because Ruby on Rails isn’t just about beautiful code, it’s just as much about productivity. It’s just as much about the impact that an individual programmer is able to have.

(01:18:24) That they can build a system where they can keep the whole thing in their head and be able to move it forward, such that you can go from one developer sitting and working on something … and that something is Shopify, and it turns into what it is today. When we talk about programming languages and we compare them, we often compare them at a very late stage. Like, what is the better programming language for, let’s say Twitter in 2009 when it’s already a huge success? Twitter was started on Ruby on Rails. They then hit some scaling problems, it was a big debacle at the time. They then ended up, I think, rewriting it in some other language, which by the way I think is the best advertisement ever for Ruby on Rails, because nothing fucking happened for 10 years after they switched over, essentially zero innovation. Some of that was because they were doing a long conversion, and all of the early success in part came because they had the agility to quickly change and adapt and so forth. That’s what startups need. That’s what Shopify needed, that’s what Twitter needed.

(01:19:24) That’s what everyone needs, and that’s the number one priority for Ruby on Rails, to make sure that we don’t lose that. Because what happens so often when development tools and programming languages are driven by huge companies is that they mirror their org chart. React and everything else needed to use it is in some ways a reflection of how Meta builds Facebook. Because of course it is, because of course it’s an abstraction of that. I’m not saying React isn’t a great tool and that it can’t be used by smaller teams, of course it can, but it’s born in a very different context than something like Ruby on Rails.

Lex Fridman (01:20:00) Let me say as a small aside … because I think we might return to Shopify and celebrate it often, just a personal note. This particular podcast has way more sponsors, and sponsors that want to be sponsors, than I could possibly ever have. It’s really, really important for me to not give a shit and to be able to celebrate people. I celebrate people, I celebrate companies, and I don’t care that they’re sponsoring. I really don’t care. I just want to make that very explicit, because we’re going to continue saying positive things about Shopify. I don’t care, stop sponsoring, it doesn’t really matter to me. Yeah, I just want to make that explicit. But to linger on the scaling thing with the Twitter and the Shopify, can you just explain to me what Shopify is doing with the JIT? What did they have to try to do to scale this thing, because that’s kind of an incredible story, right?

DHH (01:20:59) Yeah. One of the great contributions that Shopify has made to the entire Ruby ecosystem … not just Rails, but in particular Rails, is YJIT. YJIT is their JIT compiler for Ruby that just makes everything a lot more efficient. At Shopify scale, eking out even a five, 10% improvement in Ruby’s overhead and execution time is a huge deal. Now, Shopify didn’t need YJIT. Shopify was already running on the initial version of Ruby that was I think 10 times slower than what we have today, if you look back upon the Ruby 1.8.6 that Toby probably started on, just as I started on. That was enough to propel Shopify to the scale that it has today. A lot of the scaling conversation is lost in a failure to distinguish two things. Scale is one package we talk about when there are really multiple packages inside of it. One is runtime performance, latency, how fast can you execute a single request? Can it happen fast enough that the user will not notice? If your Rails request takes a second and a half to execute, the user’s going to notice. Your app is going to feel slow and sluggish.
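
For reference, a hedged sketch of how YJIT is typically switched on and checked in recent CRuby releases (assuming Ruby 3.2 or later built with YJIT support); consult your Ruby version’s docs for the exact flags:

```ruby
# Usually a flag or an environment variable, not a code change:
#   ruby --yjit app.rb
#   RUBY_YJIT_ENABLE=1 bin/rails server
#
# From inside a running process you can check whether it took effect:
puts RUBY_DESCRIPTION                                 # includes "+YJIT" when the JIT is active
puts defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?  # => true when compiled in and enabled
```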

(01:22:16) You have to get that response time down below, let’s say, at least 300 milliseconds. I like to target 100 milliseconds as my latency. How much of that kind of performance, that kind of latency, can you squeeze out of a single CPU core? That tells you something about what the price of a single request will be. But then there’s whether you can deal with one million requests a second, like Shopify is doing right now: if you have one box that can do 1,000 requests a second, you just need X boxes to get up to a million. What you’ll actually find is that when it comes to programming languages, they’re all the same in this way. They all scale, largely, beautifully horizontally, you just add more boxes. The hard part of scaling a Shopify is typically not the programming language, it’s the database. That’s actually one of the challenges that Shopify has now: how do you deal with MySQL at the scale that they’re operating at? When do you need to move to other databases to get worldwide performance? All of these things. The questions about scaling Ruby are economic questions.

(01:23:28) If we’re spending so-and-so much on application servers, if we can get just 5% more performance out of Ruby, well, we could save 5% of those servers and that could filter down into the budget. Now, that analysis boils down to basically one thing: Ruby is a luxury language. It’s a luxury, the highest luxury, in my opinion. It is the Coco Chanel of programming languages, something that not everyone can afford, and I mean this in the best possible way. There are some applications on the Internet where each request has so little value, you can’t afford to use a luxurious language like Ruby to program in it. You simply have to slum it with a C or a Go or some other low-level language, or a Rust, talk about line noise there.

Lex Fridman (01:24:17) That’s like the thrift store of languages.

DHH (01:24:19) Exactly. What you need, you need something very low level to do it. You can’t afford to use a luxury language to build it with. That’s not true of Shopify. It wasn’t true of Basecamp even back in 2004. It’s not been true of 99% of all web applications ever created, because the main cost component of 99% of web applications is not CPU cores. It’s human cores. It’s the human capacity to understand and evolve systems. It’s their personal productivity. I did a calculation once when someone had for the 400th time said, “Oh, if you switch from Ruby to some faster language, you could save a bunch of money.” I calculated it out that at the time … and I think the last time I did this calculation was almost a decade ago, we were spending about 15% of our operating budget on Ruby application servers. So for me, to improve my cost profile of the business by seven percentage points, I’d have to pick something twice as fast. That’s quite hard.
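
The back-of-the-envelope arithmetic behind that claim, as a tiny sketch; the 15% figure is the one quoted above, and everything else follows from it:

```ruby
# If Ruby app servers are 15% of the operating budget, a language twice as
# fast halves that slice: 15% becomes 7.5%, roughly a seven-point saving.
ruby_server_share = 0.15
speedup           = 2.0

savings = ruby_server_share - (ruby_server_share / speedup)
puts "budget saved: #{(savings * 100).round(1)} percentage points"  # => 7.5
```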

(01:25:27) Versus, if Ruby and Ruby on Rails was even 10% more productive than something else, I would move the needle far more, because making individual programmers more productive actually matters a lot more. This is why people are so excited about AI. This is why they’re freaking out over the fact that a single programmer in Silicon Valley, who makes $300,000 a year, can now do the work of three or five, at least in theory. I haven’t actually seen that fully in practice. But let’s just assume the theory is correct, if not now, then in six months, that’s a huge deal. That matters so much more than whether you can squeeze a few more cycles out of the CPU when it comes to these kinds of business applications. If you’re making Unreal Engine rendering stuff, like Tim Sweeney you had on, yeah, he needs to really sweat all those details. The Nanite engine can’t run on Ruby. It’s never going to, it was not meant for that, fine. These kinds of business applications absolutely can.

(01:26:25) And everything people are excited about AI for right now, that extra capacity to just do more, that was why we were excited about Ruby back in the early 2000s. It was because I saw that if we could even squeeze out a 10% improvement of the human programmer, we’d be able to do so much more for so much less.

Future of programming

Lex Fridman (01:26:47) We probably argue about this, but I really like working together with AI, collaborating with AI. I would argue that the kind of code you want AI to generate is human-readable, human interpretable. If it’s generating Perl golf code, it’s not a collaboration. So it has to be speaking the human … it’s not just that you’re writing the prompts in English, you also want to read the responses in a human-interpretable language like Ruby, right? So that actually is beneficial for AI too. Because you’ve said that for you the sculptor, the elitist Coco Chanel sculptor, you want on your fancy keyboard to type every single letter yourself with your own fingers. But the benefit of Ruby also applies once the code is written by AI and you’re actually doing the editing part with your own fingers, because you can interact with it, because it’s human interpretable.

DHH (01:27:47) The paradigm I really love with this was something Elon actually said on one of your shows when you guys were talking about Neuralink, that Neuralink allows the bandwidth between you and the machine to increase. That language, either spoken or written, is very low bandwidth. If you are to calculate just how many bits we can exchange as we’re sitting here, it’s very slow. Ruby has a much higher bandwidth of communication; it conveys so much more concept per character than most other programming languages do. So when you are collaborating with AI, you want really high bandwidth. You want it to be able to produce programs with you, whether you’re letting it write the code or not, that both of you can actually understand really quickly. And that you could compress a grand concept, a grand system into far fewer parts that both of you can understand. Now, I actually love collaborating with AI too. I love chiseling my code, and the way I use AI is in a separate window. I don’t let it drive my code. I’ve tried that. I’ve tried the Cursors and the Windsurfs and I don’t enjoy that way of writing.

(01:29:03) One of the reasons I don’t enjoy that way of writing is, I can literally feel competence draining out of my fingers. That level of immediacy with the material disappears. Where I felt this the most was, I did this remix of Ubuntu called Omakub when I switched to Linux. It’s all written in Bash. I’d never written any serious amount of code in Bash before, so I was using AI to collaborate, to write a bunch of Bash with me, because I needed all this. I knew what I wanted, I could express it in Ruby, but I thought it was an interesting challenge to filter it through Bash. Because what I was doing was setting up a Linux machine, that’s basically what Bash was designed for. It’s a great constraint. But what I found myself doing was asking AI for the same way of expressing a conditional, for example, in Bash over and over again. That by not typing it, I wasn’t learning it. I was using it, I was getting the expression I wanted, but I wasn’t learning it. I got a little scared.

(01:30:08) I got a little scared, is this the end of learning? Am I no longer learning if I’m not typing? The way I, for me, recast that was, I don’t want to give up on the AI. It’s such a better experience as a programmer to look up APIs, to get a second opinion on something, to do a draft, but I have to do the typing myself because you learn with your fingers. If you’re learning how to play the guitar, you can watch as many YouTube videos as you want, you’re not going to learn the guitar. You have to put your fingers on the strings to actually learn the motions. I think there is a parallel here to programming, where programming has to be learned in part by the actual typing.

Lex Fridman (01:30:50) I’m just really, this is fascinating. Listen, part of my brain agrees with you 100%, part doesn’t. I think AI should be in the loop of learning. Now, current systems don’t do that, but I think it’s very possible for Cursor to, say, basically force you to type certain things. So if you set the mode to learning … I don’t want this to be, give up on AI. I think vibe coding is a skill, so for an experienced programmer it’s too easy to dismiss vibe coding as a thing.

DHH (01:31:31) I agree, I wouldn’t dismiss it.

Lex Fridman (01:31:32) But I think you need to start building that skill and start to figure out, how do you prevent the competency from slipping away from your fingers and brain? How do you develop that skill in parallel to the other skill? I don’t know. I think it’s a fascinating puzzle though. I know too many really strong programmers that just avoid AI, because it’s currently a little too dumb.

DHH (01:31:57) Yes. It’s a little too slow, is actually my main problem. It’s a little too dumb in some ways, but it’s a little too slow in other ways. When I use Claude Code, the terminal version of Claude … which is actually my preferred way of using it, I get too impatient. It feels like I’m going back to a time where code had to compile and I had to go do something else, boil some tea while the code is compiling. Well, I’ve been working in Ruby for 20 years, I don’t have compile waits in me anymore, so there’s that aspect of it. But I think the more crucial aspect for me is, I really care about the competence. I’ve seen what happens to even great programmers the moment they put away the keyboard, because even before AI, this would happen as soon as people would get promoted. Most great programmers who work in large businesses stop writing code on a daily basis because they simply have too many meetings to attend to, they have too many other things to do, and invariably they lose touch with programming.

(01:32:57) That doesn’t mean they forget everything, but if you don’t have your fingers in the sauce, the source, you are going to lose touch with it. There’s just no other way. I don’t want that because I enjoy it too much. This is not just about outcomes. This is what’s crucial to understand: programming, for programmers who like to code, is not just about the programs they get out of it. That may be the economic value. It’s not the only human value. The human value is just as much in the expression. When someone sits down with a guitar and plays Stairway to Heaven, there’s a perfect recording of that that will last for eternity. You can just put it on Spotify, you don’t actually need to do it. The joy is to command the guitar yourself. The joy of a programmer, of me as a programmer, is to type the code myself. If I elevate, if I promote myself out of programming, I turn myself into a project manager, a project manager of a murder of AI crows, as I wrote the other day. I could have become a project manager my whole career.

(01:34:05) I could have become a project manager 20 years ago if I didn’t care to write code myself and I just wanted outcomes. That’s how I got started in programming, I just wanted outcomes. Then I fell in love with programming, and now I’d rather retire than give it up. Now, that doesn’t mean you can’t have your cake and eat it too. I’ve done some vibe coding where I didn’t care that I wasn’t playing myself. I just wanted to see something that was an idea in my head. I wanted to see something, that’s fine. I also use AI all day long. In fact, I’m already at the point where if you took it away from me, I’d be like, oh my God, how do we even look things up on the Internet anymore? Is Stack Overflow still around, are forums still a thing? How do I even find answers to some of these questions I have all day long? I don’t want to give up AI. In fact, I’d say with the way I like to use AI, I’m getting smarter every day because of it, because I’m using AI to have it explain things to me.

(01:35:02) Even the stupid questions I would be a little embarrassed to even enter into Google, AI is perfectly willing to give me the ELI5 explanation of some Unix command I should have known already but I don’t. I’m sorry, can you just explain it to me? Now I know the thing. So at the end of the day, of me working with AI all day long, I’m a little bit smarter, like 5%. Sorry, not 5%, half a percent maybe, that compounds over time. But what I’ve also seen when I worked on the Omakub project and I tried to let AI drive for me, I felt I was maybe half a percent dumber at the end of the day.

Lex Fridman (01:35:41) Okay, you’ve said a lot of interesting things. First of all, let’s just start at the very fact that asking dumb questions, if you go to Stack Overflow and ask a dumb question or read somebody else’s dumb question and the answer to it, there’s a lot of judgment there. AI, sometimes to an excessive degree, has no judgment. It usually says, oh, that’s a great question.

Lex Fridman (01:36:02) Yeah. Oh, that’s wonderful. Yeah. I mean, it’s so conducive to learning. It’s such a wonderful tool for learning and I too would miss it. It’s a great basically search engine into all kinds of nuances of a particular programming language, especially if you don’t know it that well. Or APIs you can load in documentation, it’s just so great for learning. For me personally, I mean, on the happiness scale, it makes me more excited to program. I don’t know what that is exactly. Part of that is the … I’m really sorry, Stack Overflow is an incredible website but there is a negativity there. There’s a judgment there. It’s just exciting to be with a hype man next to me just saying, yeah, that’s a great idea. I’ll say, no, that’s wrong, I’ll correct the AI. The AI will say, you’re absolutely right, how did I not think about that? You’re ready to go. I’m like, holy shit, I’m having, it’s like a buddy that’s really being positive and is very smart and is challenging me to think.

(01:37:12) And even if I never use the code it generates, I’m already a better programmer. But actually the deeper thing is, for some reason I’m having more fun. That’s a really, really important thing.

DHH (01:37:23) I like to think of it as a pair programmer for exactly that reason. Pair programming came into vogue in the 2000s, where you’d have two programmers in front of one machine and you’d push the keyboard between you. One programmer would be driving, they’d be typing. The other programmer would essentially sit and watch the code, suggest improvements, look something up. That was a really interesting dynamic. Now unfortunately, I’m an introvert, so I can do that for about five minutes before I want to jump off a bridge. So it doesn’t work for me as a full-time occupation, but AI allows me to have all the best of that experience all the time. Now, I think what’s really interesting is what you said about it making it more fun. I hadn’t actually thought about that, but what it’s made more fun to me is to be a beginner again. It made it more fun to learn Bash successfully for the first time.

(01:38:14) Now, I had to do the detour where I let it write all the code for me, and I realized I wasn’t learning nearly as much as I hoped I would. That I started doing once I typed it out myself. But it gave me the confidence that, you know what? If I need to do some iOS programming myself … I haven’t done that in a long time; probably six years ago was the last time I dabbled in it. I never really built anything for real. I feel highly confident now that I could sit down with AI and I could have something in the app store by the end of the week. I would not have that confidence unless I had a pair programming buddy like AI. I don’t actually use it very much for Ruby code. I’m occasionally impressed whenever I try it, like, oh, it got this one thing right, that is truly remarkable and it’s actually pretty good. And then I’ll ask two more questions and I go like, oh yeah, okay, if you were my junior programmer I’d start tapping my fingers and going like, you’ve got to shape up.

(01:39:05) Now, the great thing of course is, we can just wait five minutes. The Anthropic CEO seems to think that 90% of all code by the end of the year is going to be written by AI. I’m more than a little bit skeptical about that, but I’m open-minded about the prospect that manual programming potentially will turn into the horse: something we do recreationally, no longer a mode of transportation to get around LA. You’re not going to saddle up and go to the grocery store and pick up stuff from Whole Foods in your saddlebags. That’s just not a thing anymore. That could be the future for programming, for manual programming, entirely possible. I also don’t care. Even though we have great renditions of all the best songs, as I said, there are millions of people who love to play the guitar. It may no longer have as much economic value as it once did. That, I’m quite convinced, is true: we perhaps have seen the peak.

(01:40:01) Now, I understand the paradox, when the price of something goes down, actually the overall usage goes up, and total spend on that activity goes up. That could also happen maybe. But what we’re seeing right now is that a lot of the big shops, a lot of the big companies, are not hiring like they were five years ago. They’re not anticipating they’re going to need tons more programmers. Controversially, Toby actually put out a memo inside of Shopify asking everyone who’s considering hiring someone to ask the question, could this be done by AI? Now, he’s further ahead on this question than I am. I look at some of the code in the trenches and I go like, I’d love to use AI more, and I see how it’s making us more productive. But it’s not yet at the level where I just go like, oh, we have this project, let me just give it to the AI agent and it’s going to go off and do it.

Lex Fridman (01:40:47) But let’s just be honest, you’re like a Clint Eastwood type character cowboy on a horse seeing cars going around. You’re like, well-

DHH (01:40:56) That’s part of it. I think it is important to have that humility, that what you are good at may no longer be what society values. This has happened a million times in history … that you could have been exceptionally good at saddle making, for example. That’s something that a lot of people used to care about because everyone rode a horse. And then suddenly riding a horse became this niche hobby, that there’s some people care about it, but not nearly as many. That’s okay. Now, the other thing of this is, I’ve had the good fortune to have been a programmer for nearly 30 years. That’s a great run. I try to look at life in this way, that I’ve already been blessed with decades of economically viable, highly valuable ways of translating what I like best in the working world, to write Ruby code. That that was so valuable that I could make millions and millions of dollars doing it, and if that’s over tomorrow, I shouldn’t look at that with regret. I should look at it with gratitude.

Lex Fridman (01:41:57) But you’re also a highly experienced, brilliant and opinionated human being. So it’s really interesting to get your opinion on the future of the horse, because there are a lot of young people listening to this who love programming, or who are excited by the possibility of building stuff with software, with Ruby on Rails, that kind of language, and now the possibility.

Lex Fridman (01:42:25) Is it a career? And if indeed a single person can build more and more with the help of AI, how do they learn that skill? Is this a good skill to learn? I mean, that to me is the real mystery here because I think it’s still absolutely true that you have to learn how to program from scratch currently, but how do you balance those two skills? Because I too, as I’m thinking now, feel there is a scary slipping away of skill that happens in a matter of really minutes on a particular piece of code. It’s scary in a way that driving skill, when you have a car drive for you, doesn’t quite slip away that fast. So that really scares me. When somebody comes up to me and asks me how do I learn to program? I don’t know what the advice is because I think it’s not enough to just use Cursor or Copilot to generate code.

DHH (01:43:28) It’s absolutely not enough. Not if you want to learn, not if you want to become better at it. If you just become a tap monkey, maybe you’re productive in a second, but then you have to realize, well, can anyone just tap if that’s all we’re doing is just sitting around all day long tapping? Yes, yes, yes, yes, yes. That’s not a marketable skill. Now, I always preface this both to myself and when I speak to others about it, is rule number one: nobody fucking knows anything. No one can predict even six months ahead.

Future of AI

(01:43:58) Right now, we’re probably at peak AI future hype because we see all the promise, because so much of it is real and so many people have experienced it themselves. This mind-boggling thing that the silicon is thinking in some way that feels eerily reminiscent of humans. I’d actually say the big thing for me wasn’t even ChatGPT, it wasn’t even Claude. It was DeepSeek. Running DeepSeek locally and seeing the think box where it converses with itself about how to formulate the response. I almost wanted to think, is this a gimmick? Is it doing this as a performance for my benefit, and that’s not actually how it thinks? Because if this is how it actually thinks, okay, I’m a little scared. It’s incredibly human that it thinks in this way, but where does that go? So in ’95, one of my favorite movies, one of my favorite B movies came out, The Lawnmower Man.

DHH (01:44:57) Incredible movie about virtual reality. Being an avatar and living in VR, the story was a mess, but the aesthetics, the world it built up, was incredible, and I thought, we’re five years away. I’m going to be living in VR now. I’m just going to be floating around. I’m going to be an avatar. This is where most humans are going to spend most of the day. That didn’t happen. We’re 30 years later, VR is still not here. It’s here for gaming. It’s here for some specialized applications. My oldest loves playing Gorilla Tag. I don’t know if you’ve tried that. That’s basically the hottest VR game. Wonderful. It’s great. It’s really hard to predict the future because we just don’t know. And then you factor in AI, and you have even the smartest people go like, “I don’t think we fully understand how this works.”

Lex Fridman (01:45:49) But then on the flip side, you have Moore’s law that seemed to work for many, many years, decreasing the size of transistors, for example. Flash didn’t take over the internet, but Moore’s law worked, so we don’t know which one AI is.

DHH (01:46:07) It is what it is. And this is what I find so fascinating too. I forget who did this presentation, but someone in the web community gave this great presentation on the history of the airplane. So you go from the Wright brothers flying in, what was it, 1903 or something like that, and 40 years later you have jet flight, just an unbelievable amount of progress in four decades. Then in ’56, I think it was, the whole design for the Boeing 747’s precursor was laid down, and basically nothing has happened since. Just minor tweaks and improvements on the flying experience since the ’50s. Somehow, if you were to predict where flying was going to go, and you were sitting in ’42 and you remembered the Wright brothers flying in ’03 and you were seeing jet engines coming, you’re like, “We’re going to fly to the stars in another two decades.”

(01:47:04) We’re going to invent super mega hypersonic flights that are going to traverse the earth in two hours, and then that didn’t happen. It tapped out. This is what’s so hard about predicting the future. We can be so excited in the moment because we’re drawing a line through early dots on a chart, and it looks like those early dots are just going up and to the right, and sometimes it just flattens out. This is also one of those things where we have so much critical infrastructure, for example, that still runs on COBOL, which about five humans around the world really, truly, deeply understand. It’s possible for society to lose a competence it still needs because it’s chasing the future.

(01:47:44) COBOL is still with us. This is one of the things I think about with programming. Ruby on Rails is at such a level now that 50 years from now, it’s exceedingly likely that there are still a ton of Ruby on Rails systems running. Now, it’s very hard to predict what that exact world is going to be like, but yesterday’s weather tells us that if there’s still COBOL code from the ’70s operating Social Security today, and we haven’t figured out a clean way to convert that, let alone understand it, we should certainly be humble about predicting the future.

(01:48:16) I don’t think any of the programmers who wrote that COBOL code back in the ’70s had any idea that in 2025 checks were still being cut off of the business logic that they had encoded back then. But that just brings me to the conclusion on the question of what a young programmer should do. You’re not going to be able to predict the future. No one’s going to be able to predict the future. If you like programming, you should learn programming. Now, is that going to be a career forever? I don’t know, but what’s going to be a career forever? Who knows? A second ago we thought that it was the blue-collar labor that was going to be abstracted. First, it was the robots that were going to take over. Then Gen AI comes out, and then all the artists suddenly look like, “Holy shit, is this going to do all animation now? Is it going to do all music now?”

(01:48:59) They get real scared, and now I see the latest Tesla robot going like, “Oh, maybe we’re back now to blue-collar being in trouble because if it can dance like that, it can probably fix a toilet.” So no one knows anything, and you have to then position yourself for the future in such a way that it doesn’t matter that you pick a profession or path where if it turns out that you have to retool and re-skill, you’re not going to regret the path you took. That’s a general life principle. For me, how I look at all endeavors I involved myself in is I want to be content with all outcomes.

(01:49:39) When we start working on a new product at 37 Signals, I set up my mental model for success and I go, “Do you know what? If no one wants this, I will have had another opportunity to write beautiful Ruby code to explore greenfield domain, to learn something new, to build a system I want, even if no one else wants it.” What a blessing, what a privilege. If a bunch of people want it, that’s great. We can pay some salaries, we can keep the business running, and if it’s a blowaway success, wonderful. I get to impact a bunch of people.

Vibe coding

Lex Fridman (01:50:13) I think one of the big open questions to me is how far you can get with vibe coding, whether the approach for a young developer is to invest most of the time into vibe coding or into writing code from scratch. So vibe coding, and I’m leaning into the meme a little bit, meaning you have this idea of a thing you want to create, you generate the code, and then you fix it both with natural language in the prompts and manually. You learn enough to manually fix it. So that’s the learning process: how you fix code that’s generated. Or you write code from scratch and have the LLMs kind of tab, tab, tab, add extra code. Which part do you lean on? I think to be safe, you should find the beauty and the artistry and skill in both, right? So there should be some percent of your time just writing from scratch and some percent vibe coding.

DHH (01:51:16) There should be more of the time writing from scratch if you are interested in learning how to program. Unfortunately, you’re not going to get fit by watching fitness videos. You’re not going to learn how to play the guitar by watching YouTube guitar videos. You have to actually play yourself. You have to do the sit-ups. Programming, understanding, learning almost anything requires you to do. Humans are not built to absorb information in a way that transforms into skills by just watching others from afar. Now, ironically, it seems AI is actually quite good at that, but humans are not. If you want to learn how to become a competent programmer, you have to program. It’s really not that difficult to understand. Now, I understand the temptation, and the temptation is there because vibe coding can produce things, perhaps in this moment, especially in a new domain you’re not familiar with, with tools you don’t know perfectly well, that are better than what you could do, or that would take you much longer to get to, but you’re not going to learn anything.

(01:52:15) You’re going to learn in this superficial way that feels like learning but is completely empty calories, and secondly, if you can just vibe code it, you’re not a programmer. Then anyone could do it, which may be wonderful. That’s essentially what happened with the Access database. That’s what happened with Excel. It gave accountants the capacity to become software developers, because the tools became so accessible to them that they could build a model for how the business was going to do next week, which would have required a programmer prior to Excel. Now it didn’t, because they could do it themselves. Vibe coding enables non-programmers to explore their ideas in a way that I find absolutely wonderful, but it doesn’t make you a programmer.

Lex Fridman (01:53:02) I agree with you, but I want to allow for room for both of us to be wrong. For example, vibe coding could actually be a skill that, if you train it, and by vibe coding let’s include the step of correction, the iterative correction, it’s possible, if you get really good at that, that you’re outperforming the people that write from scratch, that you can come up with truly innovative things, especially at this moment in history while the LLMs are a little bit too dumb to create super novel things and a complete product, but they’re starting to creep close to that. So if you are investing time now into becoming a really good vibe coder, maybe this is the right thing to do. If it’s indeed a skill, and we kind of meme about vibe coding, like sitting back, it’s in the name, but if you treat it seriously, become a competitive vibe coder, and get good at riding the wave of AI and at the skill of editing code versus writing code from scratch, it’s possible that you can actually get farther in the long term.

(01:54:12) Maybe editing is a fundamentally different task than writing from scratch if you take that seriously as a skill that you develop. I see. To me, that’s an open question. I just think I personally, now you’re on another level, but just personally, I’m not as good at editing the code that I didn’t write. That’s a different-

Lex Fridman (01:54:38) No one of this generation is, but maybe that’s a skill. Maybe if you get on the same page as the AI, because there’s a consistency to the AI. It really is like a pair programmer with a consistent style and structure and so on. Plus, with your own prompting, you can control the kind of code it writes. I mean, it could legitimately be a skill.

DHH (01:54:59) That’s the dream of the prompt engineer. I think it’s a complete pipe dream. I don’t think editors exist that aren’t good at writing. I’ve written a number of books. I’ve had a number of professional editors. Not all of them wrote their own great books, but all of them were great writers in some regard. You cannot give someone pointers if you don’t know how to do it. It’s very difficult for an editor to be able to spot what’s wrong with a solution if the editor couldn’t make the solution themselves. The capacity to be a good editor is the reward you get from being a good doer. You have to be a doer first. Now, that’s not the same as saying that vibe coding, prompt engineering won’t be able to produce fully formed amazing systems even shortly. I think that’s entirely possible, but then there’s no skill left, which maybe is the greatest payoff of all.

(01:55:57) Wasn’t that the whole promise of AI anyway, that it was just all natural language, that even my clumsy way of formulating a question could result in a beautiful, succinct answer? That, to me, is actually a much more appealing vision than that there’s going to be these special prompt engineering wizards who know how to tickle the AI just right to produce what they want. The beauty of AI is to think that someone who doesn’t know the first thing about how AI actually works is able to formulate their idea and their aspirations for what they want, and the AI could somehow take that messy clump of ideas and produce something that someone wants.

(01:56:35) That’s actually what programming has always been. There’s very often been people who didn’t know how to program, who wanted programs, who then hired programmers, who gave them messy descriptions of what they wanted, and then when the programmers delivered that back said, “Oh, no, actually that’s not what I meant. I want something else.” AI may be able to provide that cycle. If that happens to the fullest extent of it, yeah, there are not going to be as many programmers around, but presumably someone will still, at least for the foreseeable future, have to understand whether what the AI is producing actually works or not.

Lex Fridman (01:57:11) As an interesting case study, maybe a thought experiment, if I wanted to vibe code Basecamp or hey, some of the products you’ve built, what would be the bottlenecks? Where would I fail along the way?

DHH (01:57:30) What I’ve seen when I’ve been trying to do this, trying to use vibe coding to build something real is you actually fail really early. The vibe coding is able to build a veneer at the current present moment of something that looks like it works, but it’s flawed in all sorts of ways. There are the obvious ways, the meme ways that it’s leaking all your API keys, it’s storing your password in plain text. I think that’s ultimately solvable. It’s going to figure that out, or at least it’s going to get better at that, but its capacity to get lost in its own Labyrinth is very great right now. You let it code something and then you want to change something and it becomes a game of Whack-A-Mole real quick.

(01:58:09) Pieter Levels, who’s been doing this wonderful flight simulator, was talking about that, where at a certain scale the thing just keeps biting its own tail. You want to fix something and it breaks five other things, which I think is actually uniquely human because that’s how most bad programmers are at a certain level of complexity with the domain. They can’t fix one thing without breaking three other things, so in that way it’s almost, in some sense, a positive signal that the AI is going to figure this out, because it’s on an extremely human trajectory right now. The kind of mistakes it’s making are the kind of mistakes that junior programmers make all the time.

Rails manifesto: Principles of a great programming language

Lex Fridman (01:58:43) Yeah. Can we zoom out and look at the vision, the manifesto, the doctrine of Rails? What are some of the things that make a programming language, a framework, great, especially for web development? So we talked about happiness.

Lex Fridman (01:59:00) The underlying objective of Ruby. What else?

DHH (01:59:04) So you’re looking at the nine points I wrote out in I think 2012 and first, before we dive into them, I want to say the reason I wrote it down is that if you want a community to endure, you have to record its values and you have to record its practices. If you don’t, eventually you’re going to get enough new people come in who have their own ideas of where this thing should go, and if we don’t have a guiding light helping us to make decisions, we’re going to start flailing. We’re going to start actually falling apart. I think this is one of the key reasons that institutions of all kinds start falling apart. We forget why Chesterton’s fence is there. We just go like, why is that fence there? Let’s yank it out. Oh, it was to keep the wolves out. Now we’re all dead.

(01:59:49) Oops. So I wanted to write these things down and if we just take them quick one by one, you talked about optimizing for programmer happiness. I put that at number one in homage to Matz, and that’s a lot about accepting that there is occasionally a trade-off between writing beautiful code and other things we want out of systems. There could be a runtime trade-off. There can be a performance trade-off, but we’re going to do it nonetheless. We’re also going to allow ambiguity in a way that many programmers by default are uncomfortable with. I give the example here of the interactive Ruby shell, where you can play with the language or even interact with your domain model. You can quit it in two ways, at least that I’ve found. You can write exit. Boom, you’re out of the program. You can write quit. Boom, you’re out of the program.

(02:00:38) They do the same thing. We just wrote both, or the people who built that wrote both exit and quit, because they knew humans were likely to pick one or the other. Python is the perfect contrast to this. In the Python interactive prompt, if you write exit, it won’t exit. It’ll give you a fucking lesson. It’ll basically tell you to read the fucking manual. It says, “Use exit() or Ctrl-D (i.e. end of file) to exit.” One is very human and the other is very engineer, and I mean both of them in the best possible way. Python is pedantic. Python’s stated value from the start is that there should preferably be one, and only one, way to do a certain thing. Ruby is the complete opposite. No, we want the full expression that fits different human brains such that it seems like the language is guessing just what they want.
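A minimal sketch of the contrast being described, written as an irb session rather than a standalone script (the Python message is the one quoted above):

```ruby
# In Ruby's interactive shell (irb), either word ends the session,
# matching whichever one a human reaches for first:
exit   # leaves irb
quit   # also leaves irb

# Python's REPL, by contrast, answers a bare `exit` with the lesson
# quoted above ("Use exit() or Ctrl-D ... to exit") instead of exiting.
```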

Lex Fridman (02:01:37) And part of that is also you described the principle of the least surprise, which is a difficult thing to engineer a language because it’s a subjective thing.

DHH (02:01:47) Which is why you can’t do it in one way, which is why I used the example of both exit and quit. The principle of least surprise for some people would be like, “Oh, exit. That’s how I get out of the prompt. For other people, it would be quit.” Why don’t we just do both?

Lex Fridman (02:02:01) Okay, so what’s the convention over configuration? That’s a big one.

DHH (02:02:05) That’s a big one. That’s a huge one. And it was born out of a frustration I had in the early days with especially Java frameworks where, when you were setting up a web application framework for Java back in the day, it was not uncommon to literally write hundreds if not thousands of lines of XML configuration files. Oh, I need this. I want the database to use the foreign keys as post underscore ID. No, no, no. I want it as post capital ID. Oh, no, no, no. You have to do a capital PID. There are all these ways where you can configure how foreign relation keys should work in a database and none of them matter. We just need to pick one and then that’s fine, and if we pick one and we can depend on it, it becomes a convention. If it’s a convention, we don’t have to configure it. If we don’t have to configure it, you can get started with what you actually care about much quicker.

(02:02:57) Convention over configuration is essentially taking that idea that the system should come pre-assembled. I’m not just handing you a box of fucking Legos and asking you to build the Millennium Falcon. I’m giving you a finished toy. You can edit, you can change it. It’s still built out of Legos. You can still take some pieces off and put in some other pieces, but I’m giving you the final product and this cuts against the grain of what most programmers love. They love a box of Legos. They love to put everything together from scratch. They love to make all these detailed little decisions that just don’t matter at all, and I want to elevate that up such that, hey, I’m not trying to take the decisions away from you. I just want you to focus on decisions that actually matter that you truly care about. No one cares about whether it’s post underscore ID or post ID or PID.
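A minimal sketch of what that looks like in an ActiveRecord model inside a Rails app; the model names and the legacy column here are hypothetical:

```ruby
# By convention, `belongs_to :post` assumes a `posts` table and a
# `post_id` foreign key on the comments table; nothing is configured.
class Comment < ApplicationRecord
  belongs_to :post
end

# Configuration only appears when you deviate from the convention,
# e.g. if a legacy schema really did use "PID" and a different class:
#
#   belongs_to :post, class_name: "Article", foreign_key: "PID"
```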

Lex Fridman (02:03:41) Yeah, great defaults.

Lex Fridman (02:03:44) It’s just a wonderful thing. You have all these aspirations that you’re going to build some kind of custom, most beautiful Lego castle that nobody’s ever built from these pieces, but in reality, to be productive in most situations, you just need to build the basic thing, and then on top of that is where your creativity comes in.

DHH (02:04:03) Absolutely, and I think this is one of those parts of the doctrine that a lot of programmers who get to use Ruby on Rails will begrudgingly acknowledge is a nice thing, even if they don’t really like it. It’s hard to beat the attraction to building with Legos from scratch out of programmers. That’s just what we like. This is why we’re programmers in the first place, because we like to put these little pieces together, but we can direct that instinct towards a more productive end of the stack.

Lex Fridman (02:04:33) Okay. What are some of the other ones?

DHH (02:04:35) The menu is omakase. It actually comes out of the same principle that great defaults really matter. If you look at everything that’s wrong with the JavaScript ecosystem right now, for example, it is that no one is in charge of the menu. There are a billion different dishes and you can configure just your tailored specific configuration of it, but no one has done the work to make sure it all fits together, so you have all these unique problems in the JavaScript ecosystem. For example, there’s probably 25 major ways of just doing the controller layer and then as many ways of talking to the database, so you get this permutation of N times N times N where no one is using the same thing.

(02:05:17) And if they are using the same thing, they’re only using the same thing for about five minutes, so we have no retained wisdom. We build up no durable skills. Rails goes the complete opposite way, saying, do you know what? Rails is not just a web framework. It is a complete attempt at solving the web problem, a complete attempt at solving everything you need to build a great web application, and every piece of that puzzle should ideally be in the box, pre-configured, pre-assembled.

(02:05:48) If you want to change some of those pieces later, that’s wonderful, but on day one you’ll get a full menu designed by a chef who really cared about every ingredient, and you’re going to enjoy it, and that’s again one of those things where many programmers think like, I know better, and they do in some hyperlocal sense of it. Every programmer knows better. This is what Ruby is built on, that every programmer knows better in their specific situation. Maybe they can do something dangerous, maybe they think they know better and then they blow their foot off and then they truly will know better because they’ve blown their foot off once and won’t do it again. But that’s what “the menu is omakase” is about.

Lex Fridman (02:06:28) So you in general see the value in the monolith?

DHH (02:06:32) Yes. The integrated system.

DHH (02:06:35) That someone thought of the whole problem. This is one of the reasons why I’ve been on a crusade against microservices since the term was coined. Microservices was born out of essentially a good idea. What do you do at Netflix scale when you have thousands of engineers working on millions of lines of code? No one can keep that entire system in their head at one time. You have to break it down. Microservices can be a reasonable way to do that when you’re at Netflix scale. When you apply that pattern to a team of 20 programmers working on a code base of half a million lines of code, you’re an idiot. You just don’t need to turn method invocations into network calls. It is the first rule of distributed programming. Do not distribute your programming. It makes everything harder. All the failure conditions you have to consider as a programmer just becomes infinitely harder when there’s a network cable involved, so I hate the idea of premature decomposition and microservices is exactly that.
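A minimal sketch of the trade-off being described, with hypothetical names: the same question asked in-process versus over the network.

```ruby
# In a monolith, asking another part of the system is just a method call:
total = Billing.invoice_total(account_id)

# Extracted into a microservice, the same question becomes a network call,
# and every failure mode of the network becomes the caller's problem:
require "net/http"
require "json"

uri = URI("https://billing.internal/accounts/#{account_id}/invoice_total")
response = Net::HTTP.get_response(uri)
total = JSON.parse(response.body)["total"] if response.is_a?(Net::HTTPSuccess)
# ...plus the timeouts, retries, and partial failures you now have to handle.
```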

(02:07:35) The monolith says let’s try to focus on building a whole system that a single human can actually understand and push that paradigm as far as possible by compressing all the concepts such that more of it will fit into memory of a single operating human, and then we can have a system where I can actually understand all of Basecamp. I can actually understand all of HEY. Both of those systems are just over a hundred thousand lines of code. I’ve seen people do this at maybe twice, maybe three times that scale, and then it starts breaking down. Once you get north of certainly half a million lines of code, no individual human can do it, and that’s when maybe some degree of microservices can make sense.

Lex Fridman (02:08:12) Basecamp and HEY are both a hundred thousand?

DHH (02:08:14) A hundred thousand lines of code.

DHH (02:08:16) And that’s considering the fact that Basecamp, I think, has something like 420 screens, different ways and configurations.

Lex Fridman (02:08:23) Do you include the front end in that?

DHH (02:08:25) No, that’s the Ruby code. Well, it’s front end in the sense that some of that Ruby code is beneficial to the front end, but it’s not JavaScript for example. Now, the other thing we might talk about later is we actually write very little JavaScript for all of our applications. HEY, which is a Gmail competitor: Gmail ships, I think, 28 megabytes of uncompressed JavaScript. If you compress it, I think it’s about six megabytes, but it’s 28 megabytes uncompressed. Think about how many lines of code that is.

(02:08:48) When HEY launched, we shipped 40 kilobytes. It’s trying to solve the same problem. You can solve the email client problem with either 28 megabytes of uncompressed JavaScript or with 40 kilobytes if you do things differently, but it comes back to the same problem essentially. This is why I have fiercely fought splitting front end and back end apart. In my opinion, this was one of the great crimes against web development that we are still atoning for, that we separated and divided what was and should be a unified problem-solving mechanism. When you are working both on front end and back end, you understand the whole system and you’re not going to get into these camps that decompose and eventually you end up with shit like GraphQL.

Lex Fridman (02:09:36) Okay. Let’s fly through the rest of the doctrine. No one paradigm.

DHH (02:09:44) No one paradigm goes to the fact that Ruby is a fiercely object-oriented programming language at its core, but it’s also a functional programming language. The 5.times I told you about: you can essentially pass these anonymous function calls and chain them together, very much in the spirit of how true functional programming languages work. Ruby has even moved closer towards the functional programming end of the scale by making strings immutable. There are ideas from all different disciplines and all different paradigms of software development that can fit together. Smalltalk, for example, was only object-oriented and that was just it. Ruby tries to be mainly object-oriented, but borrow a little bit of functional programming, a little bit of imperative programming, be able to do all of that. Rails tries to do the same thing. We’re not just going to pick one paradigm and run it through everything.
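A small sketch of that blend in plain Ruby; the numbers are arbitrary:

```ruby
# Blocks are anonymous functions passed to methods, so iteration reads
# functionally even though everything is an object:
5.times { |i| puts i }

# Enumerable methods chain into small functional pipelines:
(1..10).select(&:even?).map { |n| n * n }.sum   # => 220

# The nudge toward immutability mentioned above: putting the magic comment
# `# frozen_string_literal: true` at the top of a file freezes its
# string literals.
```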

(02:10:35) Object orientation is at the center of it, but it’s okay to invite all these other disciplines in. It’s okay to be inspired. It’s okay to remix it. I actually think one of the main benefits of Rails is that it’s a remix. I didn’t invent all these ideas. I didn’t come up with ActiveRecord. I didn’t come up with the MVC way of dividing an application. I took all the great ideas that I had learned and picked up from every different camp and I put it together. Not because there was going to be just one single overarching theory of everything, but I was going to have a cohesive unit that incorporated the best from everywhere.

Lex Fridman (02:11:10) Is that idea a bit in tension with the beauty of the monolith system?

DHH (02:11:15) I think the monolith can be thought of as quite roomy, as a big tent. The monolith actually needs to borrow a little bit of functional programming for the kinds of problems that that discipline, that paradigm, excels at solving, even if you also want object orientation at its core. When I’ve looked at functional programming languages, I actually think there’s a lot to love, and then I see some of the crazy contortions they have to go through when part of the problem they’re solving calls for mutating something, and you go like, “Holy shit, this is a great paradigm for 90% of the problem, and then you’re twisting yourself completely out of shape when you try to solve the last 10.”

Lex Fridman (02:12:00) Ooh, Exalt beautiful code is the next one.

DHH (02:12:03) We’ve talked about that at length, and here’s a great example that really summarizes the domain-specific language quality of Ruby on Rails, that you can make code actually pleasant to write and read, which is really funny to me because, as we talked about, when I started learning programming, it wasn’t even a consideration. I didn’t even know that that could be part of the premise, part of the solution, that writing code could feel as good as writing a poem.

Lex Fridman (02:12:31) Class project, application record belongs to account has many participants, class name person, validates presence of name.
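The snippet being read aloud is presumably close to the classic Rails example; a reconstruction in Ruby:

```ruby
class Project < ApplicationRecord
  belongs_to :account
  has_many :participants, class_name: "Person"
  validates_presence_of :name
end
```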

DHH (02:12:41) See, you could read it out. You didn’t even change anything.

Lex Fridman (02:12:44) Like a haiku or something.

DHH (02:12:45) Right. Isn’t that beautiful?

Lex Fridman (02:12:47) Yeah, it’s nice. It’s really nice. There’s an intuitive nature to it. Okay, so I have specific questions there. I mean ActiveRecord, just to take that tangent, that has to be your favorite feature.

DHH (02:13:00) It’s the crown jewel of Rails. It really is. It’s the defining characteristic of how to work with Ruby on Rails. And it was born in an interesting level of controversy, because it actually uses a pattern that had been described by Martin Fowler in Patterns of Enterprise Application Architecture. One of the greatest books for anyone working on business systems, and if you have not read it, you must pick it up immediately. Patterns of Enterprise Application Architecture, I think it was published in 2001. It is one of the very few programming books that I have read many times over. It’s incredible. In it, Martin describes a bunch of different patterns for how to build business systems, essentially. And ActiveRecord is a little bit of a footnote in there. The pattern is literally called ActiveRecord. You can look it up. It’s called ActiveRecord. I wasn’t even creative enough to come up with a name of my own, but it allows the creation, the marriage, of database and object orientation in a way that a lot of programmers find a little off-putting.

(02:14:04) They don’t actually want to pollute the beautiful object-oriented nature of that kind of programming with SQL. There was a rant by Uncle Bob the other day about how SQL is the worst thing ever. Okay, fine, whatever. I don’t care. This is practical. We are making crud applications. You’re taking things out of an HTML form and you’re sticking them into a database. It’s not more complicated than that. The more abstractions you put in between those two ends of the spectrum, the more you’re just fooling yourself. This is what we’re doing. We’re talking to SQL databases.

(02:14:39) By the way, quick aside, SQL was one of those things that has endured the onslaught of NoSQL databases and structureless data for the better part of a decade and still reigns supreme. SQL was a good thing to invest your time in learning. Every programmer working with the web should know SQL to a fair degree. Even if they’re working with an ORM, an object-relational mapper like ActiveRecord, you still need to understand SQL. What ActiveRecord does is not so much try to abstract the SQL away behind a different kind of paradigm. It’s just making it less cumbersome to write, making it more amenable to building domain models on top of other domain models, since you don’t have to write every SQL statement by hand.

Lex Fridman (02:15:23) Let’s just say that ActiveRecord is an ORM, which is a layer that makes it intuitive and human interpretable to communicate with a database.

DHH (02:15:33) Even simpler than that. It turns tables into classes and rows into objects. I actually think most of SQL is very easy to understand. You can write some SQL golf, too, that’s very hard to understand, but SQL at its base was written for human consumption, and much of the criticism against SQL is that it’s actually quite verbose, especially if you’re doing things like inserts over and over again. It’s quite verbose: insert into table, parentheses, enumerate every column you want to insert, values, parentheses.

DHH (02:16:00) Then every value that fits with each column, parentheses. It gets tedious to write SQL by hand, but it’s actually very humanly readable. ActiveRecord just takes that tediousness away. It makes it possible to combine things in a way that a humanly describable language just doesn’t. It composes things into methods, and you can combine these methods and you can build structures around them. I don’t dislike SQL. A lot of things in programming, I just try to get rid of them; SQL wasn’t really one of them. It was just a sense of, “I don’t want to write the same thing over and over again.” It was a, “Can we be a little more succinct? Can we match it just slightly better to the object orientation, without trying to hide away the fact that we’re persisting these objects into a database?”

(02:16:47) That’s where I think a lot of ORMs went wrong. They tried to live in the pure world of objects, never considering that those objects had to be persisted into a SQL database, and then they came up with convoluted ways of translating back and forth. ActiveRecord says, “You know what? Just accept it.” This record, this object is not going to get saved into some NoSQL database, it’s going to be saved into a SQL database, so just structure the whole thing around that. It’s going to have attributes, those attributes are going to correspond to columns in the database. It’s not more complicated than that, so stop making it so.
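A minimal sketch of the contrast, with hypothetical table and column names; the raw SQL spells out every column and value, while ActiveRecord maps the table to a class and each row to an object:

```ruby
# Raw SQL, written by hand:
#
#   INSERT INTO posts (title, body, author_id)
#   VALUES ('Omakase', 'Great defaults matter.', 1);
#
# The same persistence through ActiveRecord, where query methods compose:
class Post < ApplicationRecord
end

post = Post.create(title: "Omakase", body: "Great defaults matter.", author_id: 1)
post.title                      # => "Omakase"
Post.where(author_id: 1).count  # chainable query methods instead of hand-written SQL
```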

Lex Fridman (02:17:22) Yeah, but I should say, I personally love SQL, because I’m an algorithms person, so I love optimization, I love to know how the databases actually work, so I can match the SQL queries and the design of the tables such that there is optimal… Squeeze the optimal performance out of the table. Okay. Based on the actual way that that table is used. I think that pushes to the point that there is value in understanding SQL. I wonder, because I started looking at ActiveRecord and it looks really awesome. Does that make you lazy? Not you, but a person that rolls in and starts using Rails, you can probably get away with never really learning SQL, right?

DHH (02:18:10) As long as you want to stay at the entry level of competence. This is actually my overarching mission with Rails: to lower the barrier of entry so far down that someone can start seeing stuff in their browser without basically understanding anything. They can run rails new blog, run a couple of generators. They have a whole system… They don’t understand anything, but it’s an invitation to learn more. Where I get fired up, and this ties back to the AI discussion, is when that’s turned into this meme that programmers no longer have to be competent. “The AI is going to figure it out, the generators are going to figure it out. I don’t need to know SQL, ActiveRecord is going to abstract it away from me.” No, no, no. Dude, hold up. The path here is competence. I’m trying to teach you things.

(02:18:58) I understand I can’t teach you everything in five minutes. No one who’s ever become good at anything worthwhile could be taught everything in five minutes. If you want to be a fully well-rounded application developer, that takes years, but you can actually become somewhat productive in a few days, you can have fun in a few days. For sure, you’re going to have fun in a few minutes, in a few hours, and over time, I can teach you a little more. ActiveRecord says like, “Yeah, yeah. All right, start here and then, next week, we’ll do a class on SQL.”

Lex Fridman (02:19:30) Actually, you have this beautiful expression that I love. That a great programming language, like Ruby, has a soft ramp, but the ramp goes to infinity.

Lex Fridman (02:19:40) Yeah. It’s super accessible, super easy to get started-

DHH (02:19:45) There’s always more to learn. This is one of the reasons I’m still having fun programming, that I’m still learning new things, I can still incorporate new things. The web is deep enough as a domain, you never going to learn all of it.

Lex Fridman (02:19:56) Provide sharp knives.

DHH (02:19:58) This is a good one, because another way of saying this… The opposite way of saying this, the Java way of saying it, is, “Do not provide foot guns,” right?

DHH (02:20:06) I don’t want to give you a sharp knife. You’re a child, you can’t handle a sharp knife. Here’s a dull butter knife, cut your damn steak, right? That’s a very frustrating experience. You want a sharp knife, even though you might be able to cut yourself. I trust humans in the same way that Matz trusts humans. Maybe you cut off a finger. All right, you’re not going to do that again. Thankfully, if it was a virtual finger, it’s going to grow back out. Your competence is going to grow, and it’s more fun to work with sharp tools.

Lex Fridman (02:20:35) That actually contributes to the ramp that goes to infinity.

Lex Fridman (02:20:39) Value integrated systems.

DHH (02:20:42) We hit on that one. Rails is trying to solve the whole problem of the web, not just one little component. It’s not leaving you a bunch of pieces you have to put together yourself.

Lex Fridman (02:20:51) Progress over stability.

DHH (02:20:52) You know what? If there’s one that’s dated, it’s probably that one. At this stage, Rails has been incredibly stable over many, many generations. The last major release, Rails 8, was basically a no-op upgrade for anyone running Rails 7. Rails 7 was almost a no-op upgrade for anyone running Rails 6. I used to think it required more churn to get progress, to stay on the leading edge of new stuff, and I wrote this before I experienced the indignity of the 2010s in the JavaScript community, where it seemed like stability was not just unvalued, it was actually despised. The churn in and of itself was a value we should be pursuing. If you were still working with the same framework three months later, you were an idiot, and I saw that and I actually recoiled. If I was going to write the doctrine today, I’d write that differently. I wouldn’t say, “Progress over stability.”

Lex Fridman (02:21:50) Maybe it’d be a function of the age of the programming language also.

DHH (02:21:55) Maybe or a deeper understanding of the problem. I think part of what’s so fascinating about technology is that we have this perception that everything constantly moves so fast. No, it doesn’t. Everything moves at a glacial pace. There is occasionally a paradigm shift, like what’s happening with AI right now, like what happened with the introduction of the iPhone in 2007, like what happened with the internet in ’95. That’s basically the total sum of my career, three things changed. Everything else in between was incremental small improvements. You can recognize a Rails application written in 2003. I know, because the Basecamp I wrote back then is still operating, making millions of dollars in ARR, servicing customers on the initial version that was launched back then, and it looks like the Rails code, if I squint a little, that I would write today. Most things don’t change, even in computing, and that’s actually a good thing. We saw with the JavaScript ecosystem, what happens when everyone gets just mad about constant churn. Things don’t change that often.

Lex Fridman (02:23:00) By the way, on that small tangent, you just visibly verbally changed your mind with the you of 15 years ago?

Why managers are useless

Lex Fridman (02:23:10) That’s interesting. Have you noticed yourself changing your mind quite a bit over the years?

DHH (02:23:17) I would say, “Oh, yes,” and then also, “Oh, no,” in the sense that there are absolutely fundamental things both about human nature, about institutions, about programming, about business that I’ve changed my mind on, and then I’ve also had experiences that are almost even more interesting, where I thought I had changed my mind and I tried it a new way, realized why I had the original opinion in the first place, and then gone back to it. It happens both ways. An example of the latter, for example, was managers at 37 Signals. For the longest time, I would rail against engineering managers as an unnecessary burden on a small or even medium-sized company, and at one point, I actually started doubting myself a little bit. I started thinking like, “Do you know what? Maybe all programmers do need a one-on-one therapy session every week with their engineering manager to be a whole individual.”

(02:24:11) We tried that for a couple of years where we hired some very good engineering managers who did engineering management the way you’re supposed to do it, the way it’s done all over the place, and after that, I thought, “No. No, I was right. This was correct, we should not have had managers.” Not every programmer needs a therapy session with an engineering manager every week, we don’t need these endlessly scheduled huddles, we don’t need all these meetings. We just need to leave people the hell alone to work on problems that they enjoy for long stretches of uninterrupted time. That is where happiness is found, that’s where productivity is found, and if you can get away with it, you absolutely should. Engineering management is a necessary evil when that breaks down.

Lex Fridman (02:24:54) What’s the case for managers then?

DHH (02:24:57) The case for managers is that, if you do have a lot of people, there’s a bunch of work that just crops up. The one-on-one is one example, that programmers need someone to check in with; there’s another idealized version, that someone needs to guide the career of juniors, for example, to give them redirecting feedback, and all this other stuff. It’s not that, in the abstract, I don’t agree with some of those things, but in practice, I’ve found that they often create more problems than they solve. A good example here is, can you get feedback from someone who’s not better at your job than you are? You can get some feedback, you can get feedback on how you show up at work. Are you being courteous to others? Are you being a good communicator? Okay, yes, but you can’t get feedback on your work, and that’s more important.

(02:25:44) It’s more important that you work under and with someone who’s better at your job than you are if you wish to progress in your career, and every single programmer I’ve ever worked with was far more interested in progressing in their career on that metric, getting better at their craft, than they were in picking up pointers that a middle manager could teach them. That’s not saying that there isn’t value in it, it’s not saying there isn’t value in being a better person or a better communicator. Of course, there is value in all those things, but if I have to choose one or the other, I value competence higher. Again, I caveat this a million times, because I know what people sometimes hear, they hear the genius asshole is just fine, and that’s great and you should excuse all sorts of malicious behavior if someone’s just really good at what they do.

(02:26:30) I’m not saying that at all. What I am saying is that the history of competence is a history of learning from people who are better than you, and that relationship should take precedence over all else. That relationship gets put aside a bit when an engineering manager is introduced. Now, the funny thing is this conversation ties back to the earlier things we were talking about. Most engineering managers are actually former programmers. They at least know programming to some extent, but what I’ve seen time and again is that they lose their touch, their feel with it very, very quickly and turn into pointy-haired bosses very, very quickly who are really good at checking for updates, “Just seeing where we are on project A here if you need anything,” or, “Are we ready to deliver?” Okay, yes. Also, no. Shut up, leave me the hell alone. Let me program and then I’ll come up for air.

(02:27:22) I’ll talk with other programmers who I can spar with, that we can learn something with, where I can turn the problems over with and we can move forward. If you look back on the history of the computer industry, all the great innovation that’s happened, it’s all been done by tiny teams with no engineering managers. Just full of highly-skilled individuals. You’ve had John Carmack on here. I used to look up to id Software so much, not just because I loved Quake, not just because I loved what they were doing, but because he shared a bit about how the company worked. There were no managers, or maybe they had one business guy doing some business stuff, but that was just to get paid. Everything else was basically just designers and programmers, and there were about eight of them and they created goddamn Quake 2. Why do you need all these people again?

(02:28:09) Why do you need all these managers again? I think, again, at a certain scale, it does break down. It’s hard to just have 100,000 programmers running around wild without any product mommies or daddies telling them what to do. I understand that. Then even as I say that, I also don’t understand it, because if you look at something like Gmail for example, that was like a side project done by Buchheit at Google at the time. So much of the enduring long-term value of even all these huge companies were created by people who didn’t have a god damn manager, and that’s not an accident. That’s a direct cause and effect. I’ve turned in some way even more militant over the years against this notion of management, at least for myself and knowing who I am and how I want to work, because the other part of this is I don’t want to be a manager, and maybe this is just me projecting the fact that I’m an introvert who don’t like to talk to people on one-on-one calls every week, but it also encapsulates how I was able to progress my career.

(02:29:06) I did not really go to the next level with Ruby or otherwise until I had a door I could close and no one could bother me for six hours straight.

Lex Fridman (02:29:15) In companies probably one of the reasons is it’s very easy to hire managers, and managers also delegate responsibility from you, so if you just have a bunch of programmers running around, your response… It’s work, it’s intellectual work to have to deal with the first principles of every problem that’s going on.

Lex Fridman (02:29:39) Managers are like, “You can relax, all will be taken care of,” but then they hire their own managers, and it just multiplies and multiplies and multiplies. I would love it if some of the great companies we have in the United States, if there was an extra side branch that we could always run… Maybe physicists can figure out how to split the simulation to where all the managers are removed. In that branch, the PR and the comms people also, and even the lawyers. Just the engineers, and let’s just see, and then we merge it back.

DHH (02:30:16) I have a sense we’ve been running that branch at 37signals for 20 years. I’ve experimented with forking over to the other side, I’ve experimented with having a full-time lawyer on staff, I’ve experimented with having engineering managers, and I can tell you life is much better at 50, 60 people when none of those individuals or none of those roles… It’s never about the individuals, it’s about the roles. None of those roles are in your organization full-time. Occasionally, you need a manager. Occasionally, you need a lawyer. I can play the role of manager occasionally, fine, and then I can set it back down to zero. It’s almost like a cloud service. I want a manager service I can call on for seven hours this week and then I want to take it down to zero for the next three months.

Lex Fridman (02:31:01) Yeah, I read, I don’t know if this is still the case, that Basecamp is an LLC and doesn’t have a CFO, like a full-time accountant. Is that [inaudible 02:31:10].

DHH (02:31:10) These days, we do have a head of finance. We did not for the first 19 years of the company’s life, I think. We got away with basically just having an accountant do our books in the same way you would do a small ice cream shop, except we would, over time, have done hundreds of millions of dollars in revenue. The scale seemed quirky and, at some point, you can also fall in love with your own quirkiness to a degree that isn’t actually healthy, and I’ve certainly done that over time, and we should have counted the beans a little more diligently, a little earlier. This was part of the blessing of just being wildly profitable and selling software that can have infinite margins, basically, that you can get away with a bunch of stuff that you perhaps shouldn’t. What partially taught me this lesson was when we realized we had not been collecting sales tax in different US states where we had nexus, and it took us about two years and $5 million in settlements and cleanups to get out of that mess. After that, I went like, “Okay, fine, we can hire a finance person.”

DHH (02:32:11) We now have a wonderful finance person, Ron, who actually ended up replacing something else we used to have. We used to have a full-time data analytics person who would do all sorts of insight mining for, “Why are people signing up for this thing?” We ran that for 10 years and realized, “You know what? If I can have either a data analytics person or an accountant, I’m picking the accountant.”

Small teams

Lex Fridman (02:32:30) I love this so much on so many levels. Can we just linger on that advice that you’ve given, that small teams are better? I think that’s really less… Less is more. What did you say before? “Worse is better”? Okay, I’m sorry.

DHH (02:32:47) Worse is better on adoption with technology a lot of times.

DHH (02:32:51) I think it actually comes out of the same thing. It comes out of the fact that many of the great breakthroughs are created by not even just tiny teams, but individuals, individuals writing something. An individual writing something, on some parameters, what they do is worse. Of course it’s worse when one person has to make something that a huge company has hundreds if not thousands of developers they can put to work on, but on so many other parameters, that worseness is the value, that less is the value. In Getting Real, which we wrote back in 2006, we talk about this notion of less software. When we first got started with Basecamp back in 2004, people would ask us all the time, “Aren’t you petrified of Microsoft? They have so many more resources, they have so many more programmers. What if they take a liking to your little niche here and they show up and they just throw a thousand programmers at the problem?”

(02:33:46) My answer, perhaps partly because I was like 24, was, first of all, “No, no care in the world,” but the real answer was they’re not going to produce the same thing. You cannot produce the software that Basecamp is with a team of 1,000 people. You will build the software that 1,000 people build, and that’s not the same thing at all. So many of the main breakthroughs in both end-user systems but also in open-source systems and fundamental systems are done by individuals or very small teams. Even all these classical histories of Apple have always been like, well, there’s a big organization, but then you had the team that was actually working on the breakthrough. It was four people, it was eight people, it was never 200.

Lex Fridman (02:34:32) The large team seems to slow things down.

Lex Fridman (02:34:37) It’s so fascinating, part of it’s the manager thing.

DHH (02:34:40) Because humans don’t scale, communication between humans certainly doesn’t scale. You basically get the network-cost effect. Every time you add a new node, it goes up exponentially. This is perhaps the key thing of why I’ve gotten to be so fond of having no managers at Basecamp, because our default team size is two. One programmer, one designer, one feature. When you’re operating at that level of scale, you don’t need sophistication, you don’t need advanced methodologies, you don’t need multiple layers of management, because you can just do. The magic of small teams is that they just do. They don’t have to argue, because we don’t have to set direction, we don’t have to worry about the roadmap. We can just sit down and make something, and then see if it’s good. When you can get away with just making things, you don’t have to plan, and if you can get out of planning, you can follow the truth that emerges from the code, from the product, from the thing you’re working on in the moment.
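A quick, illustrative sketch of the “network-cost effect” DHH describes: the number of possible pairwise communication channels in a team of n people is n(n-1)/2, which grows quadratically as headcount grows (he says “exponentially” in the flow of conversation, but the practical point, that coordination cost blows up, is the same). The team sizes below are arbitrary examples.

```python
# Pairwise communication channels in a team of n people: n * (n - 1) / 2.
# Illustrative only; the takeaway is how fast coordination overhead grows
# relative to a default team size of two.
def channels(n: int) -> int:
    return n * (n - 1) // 2

for n in (2, 8, 50, 1000):
    print(f"{n:>5} people -> {channels(n):>7} possible pairwise channels")
```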

(02:35:43) You know far more about what the great next step is when you’re one step behind, rather than if you try 18 months in advance to map out all the steps. “How do we get from here to very far away?” You know what? That’s difficult to imagine in advance, because humans are very poor at that. Maybe AI one day will be much better than us, but humans can put one foot in front of the other. That’s not that hard, and that allows you to get away without all that sophistication. The process becomes much simpler, you need far fewer people, it compounds, you need much less process, you need to waste less time in meetings. You can just spend these long glorious days and weeks of uninterrupted time solving real problems you care about and that are valuable, and you’re going to find that that’s what the market actually wants.

(02:36:33) No one is buying something because there’s a huge company behind it, most of the time. They’re buying something because it’s good, and the way you get something good is you don’t sit around and have a meeting about it, you try stuff, you build stuff.

Lex Fridman (02:36:48) It really is incredible what one person, honestly one person can do in 100 hours of deep work, of focused work. Even less.

DHH (02:36:58) I’ll tell you this, I tracked exactly the number of hours I spent on the first version of Basecamp. I was doing this, because at the time, I was working on a contract basis for Jason. He was paying me… I was going to say $15 an hour, that’s what I got paid when we first got started. I think he had bumped my pay to a glorious $25, but I was billing him, and I know that the invoice for the first version of Basecamp was 400 hours. That’s what it took for one sole individual in 2004 to create an entire system that has then gone on to gross hundreds of millions of dollars and continues to do extremely well. One person, just me setting up everything. Part of that story is Ruby, part of that story’s Rails, but a lot of it is also just me plus Jason plus Ryan plus Matt.

(02:37:46) That was the entire company at the time, and we could create something of sheer sustaining value with such a tiny team, because we were a tiny team. Not despite it. Small is not a stepping stone. This is the other thing that people get into their head, and this is one of the big topics of Rework, that it gave entrepreneurs the permission to embrace being a small team not as a waypoint, not as, “I’m trying to become 1,000 people.” No, I actually like being a small team. Small teams are more fun. If you ask almost anyone, I’m sure Toby would say this too, even at his scale, the sheer enjoyment of building something is in the enjoyment of building it with a tiny team. Now, you can have impact at a different scale when you have a huge company, I fully recognize that and I see the appeal of it, but in the actual building of things, it’s always small teams. Always.

Jeff Bezos

Lex Fridman (02:38:39) How do you protect the small team? Basecamp has successfully stayed small. What’s been the dragon you had to fight off? Basically, you make a lot of money, there’s a temptation to grow, so how do you not grow?

DHH (02:38:55) Don’t take venture capital.

Lex Fridman (02:38:56) Okay, that’s step one.

Lex Fridman (02:39:01) … everybody takes venture capital, so you already went.

DHH (02:39:05) That’s been the answer for the longest time, because the problem isn’t just venture capital, it’s other people’s money. Once you take other people’s money, completely understandably, they want a return, and they would prefer to have the largest return possible, because it’s not them sitting in the code, it’s not them getting the daily satisfaction out of building something, chiseling beautiful code poems out of the editor, right? They don’t get that satisfaction. They get the satisfaction maybe of seeing something nice put into the world, that’s fair, but they certainly also get a satisfaction of a higher return. There is this sense, certainly in venture capital, stated in venture capital, that the whole point of you taking the money is to get to $1 billion or more.

(02:39:44) Now, the path to that usually does go through running established playbooks, and then when it comes to software, the enterprise sales playbook is that playbook. If you’re doing B2B software, SaaS, you will try to find product market fit, and the second you have it, you will abandon your small and medium-sized accounts to chase the big whales with a huge sales force and, by then, you’re 1,000 people and life sucks.

Lex Fridman (02:40:10) That said, people are just curious about this. You’ve gotten a chance to get to know Jeff Bezos. He invested in Basecamp, not a controlling…

DHH (02:40:22) He bought secondaries. This was the funny thing, is that when… Investing has these two dual meanings. Normally, when people think about investing, they think you’re putting in growth capital, because you want the business to hire more people, to do more R&D, so they can grow bigger. Bezos didn’t do that, actually. He bought an ownership stake directly from Jason and I, and 100% of the proceeds of that purchase went into my and Jason’s bank accounts. Personal bank accounts. Not a single cent went into the account of the company, because we didn’t need the money to grow. What we needed, or what we certainly enjoyed, was, to some extent, maybe the vote of confidence, but more so the security of taking a little bit off the table, so that we dared turn down the big bucks from venture capital.

(02:41:14) It was essentially a vaccine against wanting to take a larger check from people who then wanted to take the company to something enormous that we didn’t want to go along with. Jeff gave Jason and I just enough money that we were comfortable turning all these people down in a way where, if it had turned belly up six months later, we wouldn’t have been kicking ourselves and gone, “We had something here that was worth millions, and now we have nothing and I have to worry about rent and groceries again.”

Lex Fridman (02:41:44) It is a vote of confidence. I’d love to hear Jeff’s side of this story of why, because he doesn’t need the money. I think it probably is just believing in people and wanting to have cool stuff be created in the world and make money off of it, but not like-

DHH (02:42:05) 100% the motivation for Jeff wasn’t a return, because he actually has a team, his private office, that runs these investments, who did the calculus on the investment pitch we gave him, which was so ridiculous that Jason and I were laughing our asses off when we were writing down our metrics. I was like, “No one’s going to pay this. No one is going to give us this multiple of this amount of revenue, and that’s fine.” I mean, we took the call essentially out of awe that Jeff Bezos even wanted to look at us. “Do you know what? We don’t want venture capital, we don’t need other people’s money, but let’s just give him a bullshit number that no sane person would actually say yes to, and then we can each go our own way.”

(02:42:48) His investment team said like, “Jeff, no way. This makes no economic sense at all, they’re asking for way too much money with way too little revenue,” and Jeff just went like, “I don’t care, I want to invest in this guy,” because to him, at the time, it was chump change. Jason and I each got a few million dollars; whatever the currency swing between the yen and the dollar was that day probably moved his net worth 10X more than our investment did. Jeff seemed genuinely interested in being around interesting people, interesting companies, helping someone go the distance. I actually look back on that relationship with some degree of regret, because I took that vote of confidence for granted in ways that I’m a little bit ashamed of. Over the years, I’ve been more critical about some of the things that Amazon has done, criticism that I still feel is justified.

(02:43:41) That’s just part of that processing of it, but on the economic sense, he gave us that confidence. He gave us the economic confidence, but then he also gave us the confidence of a CEO running, perhaps at the time the most important internet business in the US, showing up to our calls, which we would have with him once a year, and basically, just going like, “Yeah, you guys are doing awesome stuff. You should just keep doing awesome stuff. I read your book, it’s awesome. You launched this thing, it’s awesome. You should just do more of that. I don’t actually know how to run your business, you guys know.”

Lex Fridman (02:44:13) The book was out. From a fan perspective, I’m curious about how Jeff Bezos is able to see… Because to me, you and Jason are special humans in the space of tech, and the fact that Jeff was able to see that, right? How hard is it to see that?

DHH (02:44:29) He certainly saw it very early, and I think this is something that Jeff does better than almost anyone else. He spots that opportunity so far in advance, before anyone else has even opened their eyes to it, or certainly is willing to bet on it far earlier and far harder than anyone else is, and he’s just right time and again. We were not the only investment that he made and, certainly, Amazon had an extremely long-term vision, far longer than I have ever had the gumption to keep… I think of myself as a long-term thinker; I’m playing a child’s game compared to the game that Jeff is playing. When I looked at Amazon’s economics around the dot-com boom and bust, they looked ridiculous. They were losing so much money, they were so hated by the market. No one believed that it was going to turn into what it is, but Jeff did in a way that, that level of conviction, I really aspire to.

(02:45:23) I think one of the main things I’ve taken away from that relationship is that you can just believe in yourself. To that degree, against those odds? That’s ridiculous. He did that so many times that, at our level, it’s pathetic if I’m doubting myself.

Lex Fridman (02:45:42) Yeah. I think Amazon is one of those companies. It’s come under a bunch of criticism over the years. This is something about humans that I don’t appreciate so much, that we take for granted the positive that a thing brings real quick, and then we just start criticizing the thing. It’s the Wi-Fi on the airplanes.

Lex Fridman (02:46:04) I think Amazon, there could be a case made that Amazon is one of the greatest companies in the last 100 years.

DHH (02:46:15) For sure, I think it’s an easy case to make. What I also think is that the price you pay to be one of the greatest companies in the last 100 years is a lot of detractors, a lot of pushback, a lot of criticism. That this is actually order restored in the universe. One of my favorite teachers in all the time I’ve been on the internet is Kathy Sierra. I don’t know if you know her work, but she was active for only a few short years before the cruel internet ran her off, but she wrote a blog called Creating Passionate Users, and she carved into my brain this notion of balance in the universe. If you’re creating something of value that a lot of people love, you must create an equal and opposite force of haters. You cannot have people who love what you do without also having people who hate what you do.

(02:47:05) The only escape from that is mediocrity. If you are so boring and so uninteresting that no one gives a damn whether you exist or not, yeah, you don’t get the haters, but you also don’t get the impact of people who really enjoy your work. I think Amazon is that just at the massive scale, right? They’ve brought so much value and change to technology, to commerce that they must simply have a black hole size of haters. Otherwise, the universe is simply going to tip over.

Lex Fridman (02:47:34) Let me ask you about small teams. You mentioned Jason a bunch of times, Jason Fried. You have been partners for a long, long time. Perhaps it’s fair to say he’s more on the design, business side and you’re the tech, the engineering wizard. How have you guys, over all these years, creating so many amazing products, not murdered each other? It’s a great story of partnership. What can you say about collaboration? What can you say about Jason that you love, that you’ve learned from? Why does this work?

DHH (02:48:07) First, I’ll say we have tried to murder each other several times over the years, but far less, I think in the last decade. In the early days, our product discussions were so fierce that, when we were having them in the office and there were other employees around, some of them were legitimately worried that the company was about to fall apart, because the volume coming out of the room would be so high and sound so acrimonious that they were legitimately worried the whole thing was going to fall apart. You know what’s funny? Is that it never felt like that in the moment. It always felt like just a peak vigorous search for something better, and that we were able to stomach that level of adversity on the merits of an idea, because it was about the idea. It wasn’t about the person and it never really got personal. Not even never, really, it didn’t get personal. It wasn’t like, “Jason, you’re an asshole.” It was like, “Jason, you’re an idiot, and you’re an idiot because you’re looking at this problem the wrong way, and let me tell you the right way to do it.”

Lex Fridman (02:49:21) As a small tangent, let me say that some people have said, we’ll probably return to this, that you sometimes can have flights of temper on the internet and so on. I never take it that way, because it is of the same ilk. Maybe I haven’t seen the right traces of temper, but usually, it’s about the idea, and it’s just an excited, passionate human.

DHH (02:49:46) That’s exactly what I like to think of it as. It doesn’t always come across as that and I can see why spectators in particular sometimes would see something that looks like I’m going after the man rather than the ball. I do think I’ve tried to get better at that, but in my relationship with-

DHH (02:50:00) I do think I’ve tried to get better at that, but in my relationship with Jason, I think it’s worked so well because we have our own distinct areas of competence, where we fully trust each other. Jason trusts me to make the correct technical decisions. I trust him to make the correct design and product direction decisions, and then we can overlap and share on the business, on marketing, on writing, on other aspects of it. So that’s one thing, is that if you’re starting a business with someone where you do exactly the same as they do, and you’re constantly contesting who’s the more competent person, I think that’s far more difficult and far more volatile. So if you’re starting a business and you’re both programmers and you both work on the same kind of programming, good luck. I think that’s hard.

(02:50:49) I tried to pick an easier path, working with a designer, where I knew that at least half of the time I could just delegate to his experience and competence and say like, do you know what? I may have an opinion. I have an opinion all the time on design, but I don’t have to win the argument because I trust you. Now, occasionally we would have overlaps on business or direction where we’d both feel like we had a strong stake in the game and we both had a claim to competence in that area, but then for whatever reason, we also both had a long-term vision, where I would go, do you know what? I think we’re wrong here, but as I learned from Jeff Bezos, by the way, I’m going to disagree and commit. That was one of those early lessons he gave us, that was absolutely crucial and perhaps even instrumental in ensuring that Jason and I have been working together for a quarter of a century. Disagree and commit is one of the all-time Jeff Bezos greats.

Lex Fridman (02:51:42) I’m just surprised that Yoko Ono hasn’t come along. You know what I mean? There’s so many Yokos in this world.

DHH (02:51:51) It might’ve happened, if not for the fact that, in part, we don’t sit on each other’s laps all the time. Most of our careers, we haven’t even lived in the same city. I lived in Chicago for a couple of years while we were getting going after I’d moved to the US in 2005, but then I moved to Malibu and then I lived in Spain and then I lived in Copenhagen. And Jason and I, from the foundation of our relationship, learned how to work together in a remarkably efficient way where we didn’t have to actually talk that much. In any given week, I’d be surprised if Jason and I spent more than two hours in direct exchange and communication.

Lex Fridman (02:52:33) Yeah. Sometimes it’s the basic human frictions that just accumulate over time.

DHH (02:52:37) Yes. I think if you rub up against another person, that person damn well better be your spouse, if it’s too much for too long.

Lex Fridman (02:52:43) Yeah. But even there, COVID has really tested the relationship. It’s fascinating to watch.

DHH (02:52:48) It has, and I do think that having some separation, which is kind of counterintuitive because I think a lot of people think the more collaboration you can have, the better. The more ideas that can bounce back and forth, the better. And both Jason and I, for whatever reason, came to the conclusion early on in our careers, absolutely not. That’s complete baloney. This is why we were huge proponents of remote work. This is why I enjoy working in my home office where I can close the door and not see another human for six hours at a time. I don’t want to bounce ideas off you all the time. I want to bounce ideas off you occasionally and then I want to go off and implement those ideas.

(02:53:24) There’s way too much bouncing going on and not enough scoring, not enough dunking, and I think this is one of the great traps of the executive role. Once a founder elevates themselves all the way up to an executive, where what they’re doing is just telling other people what to do, that’s the realm they live in 24/7. They just live in the idea realm. Oh, I can just tell more people, more things, what to do and we can just see it happen. If you actually have to be part of implementing that, you slow your horse. Do you know what? I had a good idea last week. I’m going to save the rest of my good ideas until next month.

Why meetings are toxic

Lex Fridman (02:53:58) There is a temptation for the managers and for the people in the executive layer to do something, and that something usually means a meeting. And so that’s why you say-

DHH (02:54:11) Yes. Their job is telling other people what to do.

Lex Fridman (02:54:13) Yeah. And the meeting, so this is one of the big things you’re against is meeting-

DHH (02:54:17) Meetings are toxic. And this really, I think, ties into this with Jason and I. If I had to count up the total number of meetings we’ve had in 24 years of collaboration, where we sat in person in front of each other and discussed a topic, it’d probably be less than whatever three months at a FAANG company. We just haven’t done that that much. We haven’t worn it out. One of these funny metaphors that Trump came up with at one point was, a human has a limited number of steps in their life. That’s the longevity argument here. You can do so much activity and then you run out.

(02:54:53) There’s some kernel in that idea that can be applied to relationships. There’s some amount of exchange we can have. There’s some amount of time we can spend together where you can wear it out. Jason and I were diligent about not wearing each other out, and I think that is absolutely key to the longevity of the relationship, combined with that level of trust, and then combined with the level that we really like the work itself. We don’t just like the brainstorming, the [inaudible 02:55:21], where we just come up with good ideas. No, we like to do the ideas, and we like to be part of that process directly ourselves. I like to program, he likes to do design. We could go off and do our little things for long stretches of time, and occasionally come together and go like, hey, let’s launch a great product.

Lex Fridman (02:55:35) This might sound like I’m asking you to do therapy, but I find myself sometimes wanting or longing for a meeting because I’m lonely. Remote work is just sitting by yourself, I don’t know, it can get really lonely for long stretches of time.

DHH (02:55:56) Let me give you a tip. Get a wife.

Lex Fridman (02:56:00) Yes. God, damn it.

DHH (02:56:05) Family really is the great antidote to loneliness, and I mean that as sincerely as I can possibly say it. I certainly had exactly that feeling you described early in my career when I was working remotely, and it was just me living in an apartment, a total stereotype, where for the longest time when I first moved to Chicago, all I had on the floor was a mattress. And then I bought this big TV and I didn’t even mount it, and then I had a stack of DVDs. I was basically working a lot of the time and then I would just go home and I’d do that, and it wasn’t great. It really wasn’t. I do think that humans need humans. And if you can’t get them at work, and I actually sort of kind of don’t want them at work, at least I don’t want them for 40 hours a week. That’s not what I prefer.

(02:56:51) You need something else. You need other relationships in your life, and there is no greater depth of relationship than if you can find someone that you actually just want to spend a lot of time with. That’s key to it, and I think it’s key for both Jason and I that we’ve had families for quite a long time, and it grounds you in a way where the sprint of a startup can get traded in for the marathon of an enduring company, and you get settled in a way. We talked briefly about sometimes I get fired up. I mean, a lot of times, maybe even most of the time I get fired up about topics, but I don’t get fired up in the same way now as I used to when I was 24. I’m still extremely passionate about ideas and trying to find the right things, but having a family, meeting my wife, building a life around that has just mellowed everything out in a completely cliche way, but I think it’s actually key.

(02:57:51) I think if we could get more people, even younger, not to wait until they were in their god-damn 30s or early 40s to hitch up with someone, we’d be better off and we’d have more stable business relationships as well, because folks would get that nurturing human relationship somewhere else. Now, when I say all of that, I also accept that there are plenty of great businesses that have been built over the years that have not been built remote, that have been built by a gang of hooligans sitting in an office for immense hours at a time.

(02:58:23) I mean, both John Carmack and Tim Sweeney talked about that in the ’90s with their careers, that it was just basically work, sleep, hang out with the guys at the office, right? Totally fair. That never appealed to me. Both Jason and I saw eye to eye on the idea that 40 hours a week dedicated to work was enough, and that if we were going to go the distance, for not just the five to seven years it takes to build a VC case up to an exit, but for potentially 10 years, 20 years or further, we needed to become whole humans, because only that whole human-ness was going to go the distance, which included building up friendships outside of work, having hobbies, finding a mate and having a family. And that entire existence, those legs of the stool, the fact that work is not the only thing in life, is completely related to the fact that we’ve been around for 25 years. There’s way too much, especially in America, of false trade-offs. Oh, you want to build a successful business? Well, you can either have money or enjoyment or family or health, pick one.

(02:59:40) What? Why do we have to give up all of this? Now, again, I’m not saying there aren’t moments, periods of life where you can sprint, but I am saying if that sprint turns into a decade, you’re going to pay for it. And you’re going to pay for it in ways that I’ve seen time and again seem like a very bad trade, even if it works. And by the way, most of the time it does not. Most of the time startups go bust. Most of the time people spend five, seven years on something that does not pan out, and they don’t get the payout. And then they just sit with regret of like, what the fuck happened to my 20s? Early on, Jason and I basically made the pact that working together was not going to lead to that kind of regret, that we were going to allow ourselves and each other to build a whole life outside of work. And the fact that that worked is something I feel is almost like forbidden knowledge.

(03:00:38) Certainly in technology circles in the US, it’s something that we’ve tried to champion for 20 years and we still get flak for. Just two days ago, I had another Twitter beef with someone saying like, “Oh, well, okay, maybe it worked, but you didn’t turn into Atlassian, so you’re a failure. Basecamp isn’t Jira, so why are you even bothering?” And it’s such a fascinating winner-takes-all mentality that unless you dominate everyone else in all the ways, you’ve lost. When so much of life is far more open to multiple winners, where we can end up with a business that has made hundreds of millions of dollars over the years and we’ve kept much of that to do whatever we want, and that’s enough. That’s good. That’s great. That’s actually something worth aspiring to. Certainly, it should be a path for someone to consider choosing rather than the VC unicorn-or-bust mentality that dominates everything.

Case against retirement

Lex Fridman (03:01:39) Yeah. I’d love to ask you about this exchange so you can explain to me the whole saga, but just to linger on that a little bit: I think there’s a notion that success for a tech founder is like, work for a few years all out and then exit, sell your company for, I don’t know, hundreds of millions of dollars. That’s success. When it seems in reality, when you look at the people like you, really smart, creative humans, who they actually are and what happiness entails, it actually entails working your whole life a little bit. Because you actually love the programming, you love the building, you love the design and you don’t want to exit, and that’s something you’ve talked about really, really eloquently. So you actually want to create a life where you’re always doing the building and doing it in a way that hasn’t completely taken over your life.

DHH (03:02:40) Mojito Island is a mirage. It always was. There is no retirement for ambitious people. There is no just sitting back on the beach and sipping a mojito for what, for two weeks before you go damn crazy and want to get back into the action. That’s exactly what happens to most people who have the capacity to build those kinds of exits. I’ve never seen, I shouldn’t say never. I’ve almost never seen anyone be able to pull that off, yet so many think that that’s why they’re doing it. That’s why they’re sacrificing everything because once I get to the finish line, I’m golden, I’ve won, I can retire, I can sit back, I can just relax. And you find out that that kind of relaxation is actually hell. It’s hell for creative people to squander their God-given creative juices and capacities. And I was really lucky to read the book Flow by Mihaly Csikszentmihalyi early on [inaudible 03:03:39].

Lex Fridman (03:03:38) Nice, the pronunciations.

DHH (03:03:40) Do you know what? I had to practice that with AI over the last few days because I knew I was going to cite him and I butchered his name several times. So AI taught me how to pronounce that at least somewhat correctly. But his main work over his career was essentially the concept of flow that came out of a search for understanding happiness. Why are some people happy? When are they happy? And what he learned was quite illuminating. He learned that people aren’t happy when they sit on Mojito Island. They’re not happy when they’re free of all obligations and responsibilities. No. They’re happy in these moments where they’re reaching and stretching their capacities just beyond what they can currently do. In those moments of flow, they can forget time and space. They can sit in front of the keyboard, program a hard problem, think 20 minutes have passed and suddenly it’s been three hours.

(03:04:36) They look back upon those moments with the greatest amount of joy, and that is what peak happiness is. If you take away the pursuit of those kinds of problems, if you eliminate all the problems from your plate, you’re going to get depressed. You’re not going to have a good time. Now, there are people who can do that, but they’re not the same kind of people who built these kinds of companies. So you have to accept the kind of individual you are. If you are on this path, don’t bullshit yourself. Don’t bullshit yourself into thinking, I’m just going to sacrifice everything, my health, my family, my hobbies, my friends, but in 10 years I’m going to make it all up, because in 10 years I can do it.

(03:05:15) It never works out like that. It doesn’t work out on both ends of it. It does not work out if you’re successful and you sell your company, because you’ll get bored out of your mind after two weeks on retirement. It doesn’t work out if the company is a failure and you regret the last 10 years spent for nothing. It doesn’t work out if it all works and you stay in the business because it never gets any easier. So you’re going to fail on all metrics if you just go, there’s only work and nothing else. And I didn’t want that. I wanted the happiness of flow. I understood that insight was true, but I wanted to do it in a way where I could sustain the journey for 40 or 50 years.

Lex Fridman (03:05:53) And there’s another interesting caveat that I’ve heard you say, which is that if you do exit and you sell your company, and you want to stay in, you want to do another company, that’s usually not going to be as fulfilling because really your first baby like…

DHH (03:06:09) You can’t do it again, or most people can’t do it again. A, because their second idea is not going to be as good as the first one. It is so rare to capture lightning in a bottle like we have, for example, with Basecamp. I know this from experience because we’ve tried to build a lot of other businesses since, and some of them have been moderate successes, even good successes, but none of them have been Basecamp. It’s really difficult to do that twice. But founders are arrogant pricks, including myself, and we like to think, do you know what, we succeeded in large part because we’re just awesome. We’re just so much better than everyone else. And in some ways that’s true some of the time, but you can also be really good at something that matters for a hot moment. That door is open, the door closes. Now you’re still good at the thing, but it doesn’t matter. No one cares.

(03:06:54) There’s that part of it. And then there’s the part of it that going back to experience things for the first time only happens the first time. You can’t do it again. I don’t know if I have it in me to go through the bullshit of the early days again. And I say bullshit in the most endearing sense. It’s all great to do it. I know too much. This is one of the reasons why, whenever I’m asked the question, if you could tell your younger self something, what would you say to your younger self? I would fucking not say a thing. I would not rob my younger self of all the life experiences that I’ve been blessed with due to the ignorance of how the world works. Building up the wisdom about how the world works is a joy, and you’ve got to build it one brick at a time.

(03:07:40) If you’re just handed all the results, it’s like, oh, should we watch this movie? Here’s how it ends. I don’t want to watch the movie now. You spoiled it. I don’t want you to spoil my business experience. I don’t want to spoil any of my ignorance. The greatest blessing half the time when you’re starting something new is, A, you don’t know how hard it’s going to be. B, you don’t know what you don’t know. The adventure is the payoff. The responsibility is the payoff. This is something Jordan Peterson has really taught me to articulate, this notion that responsibility is actually key to meaning.

(03:08:16) Man’s Search for Meaning, Viktor Frankl talks about this as well, that we can endure any hardship if there’s a reason why. Now, he talked about it in truly life-altering concentration camp ways, but you can also apply it at a smaller scale, with less criticality, to even just your daily life: all that hardship in building the original business is responsibility you take upon yourself. The appeal, the reason you take that on, is in part because you don’t fully know what it entails. If I had known upfront how hard it would be, how much frustration there’d be along the way, if you had just told me that in a narrative before I got started, I would’ve been like, eh, maybe I should just go get a job.

Hard work

Lex Fridman (03:09:00) You said so many smart things there. Just to pick one, it’s funny that sometimes the advice givers, the wisdom givers have gone through all the bullshit, and so there is a degree to which you want to make the mistake. So I think I would still give the advice of you want to have a stretch of your life, where you work too hard, including anything that fails. I don’t think you can learn the lessons why that’s a bad idea in any other way except by doing it. There is a degree, but of course you don’t…

DHH (03:09:37) I think you should stretch. Should you have to stretch for a decade? I’m not so sure.

Lex Fridman (03:09:40) Yeah. The decade thing is, the 20s is a special time.

DHH (03:09:43) It’s a lot to trade. You don’t get your 20s back, you don’t get your 30s back, you don’t get your 40s back. I would’ve regretted it personally if I hadn’t done the other things I did in my 20s. If I hadn’t had the fun I had, if I hadn’t had the friends I had, if I hadn’t built up the hobbies that I did, if I hadn’t started driving race cars at an early enough age to actually get really good at it, if I had just gone all in on business, because I would’ve gotten the same out of it in the end. This is something Derek Sivers really taught me; he has this great essay about how when he went for a bike ride, he could go really hard, all out, and do the ride, I think, in whatever, 19 minutes, or he could enjoy the ride, go 5% slower, do the ride in 21 minutes and realize they’re only two minutes apart.

(03:10:32) Either I go all in all the time, there’s nothing else, I’m completely exhausted at the [inaudible 03:10:37], or I travel the same distance and I arrive maybe two minutes later, but I got to enjoy the scenery, listen to the birds, smell the flowers. That journey is also valuable. Now, I say that while accepting and celebrating that if you want to be the best at one thing in the world, no, you have to sacrifice everything. You have to be obsessed with just that thing. There is no instance of someone who’s the best in the world at something who’s not completely obsessed. I didn’t need to be best at anything. This was a rare blessing of humility I had early on, like, do you know what? I am not that smart. I’m not that good. I’m not that talented. I can do interesting things by combining different aspects and elements that I know, but I’m not going to be the best at anything.

(03:11:27) And that released me from this singular obsession with just going, I’m going to be the best programmer in the world. I know I’m not. I fucking failed at it twice before I even got how conditionals worked. I’m not smart enough to be the best at anything. I’m not dedicated enough to do that. That’s a bit of a blessing. And I think as a society, we have to straddle both celebrating peak excellence, which we do all the time, and celebrating the peak intensity of mission it takes to become that. And then also going like, do you know what? We don’t all need to be Michael Jordan. There’s only going to be one of those.

Lex Fridman (03:12:04) Well, we should say that there’s certain pursuits where a singular obsession is required. Basketball is one of them. By the way, probably racing. If you want to be the best at F-1 in the world-

DHH (03:12:17) If you want to be Senna, you got to be a maniac.

Lex Fridman (03:12:20) But I would argue that most disciplines, like programming, allow you, if you want to be, quote, unquote, “the best,” whatever that means. I think that’s judged at the end of your life. And usually if you look at that path, it’s going to be a nonlinear one. You’re not going to look like the life of an Olympic athlete who’s singularly focused. There’s going to be some acid there in the 20s or there’s going to be several detours, which, for the true greats, there’s going to be detours, and sometimes they’re not going to be a Steve Jobs acid type of situation. There’ll be just different companies you’ve worked for, different careers or different efforts you allocated your life to, but it’s going to be nonlinear. It’s not going to be a singular focus.

DHH (03:13:09) The way I think about this sometimes is I want a good bargain on learning. I can get into the top 5% of whatever I define as being good at something much, much more easily. Perhaps it’s 20 times easier, a hundred times easier to get into the top 5% than it is to get into the top 0.1%. That’s almost impossibly hard to get into. But if I’m content just being in the top 5%, I can be in the top 5% on five things at once. I can get really good at writing. I can get decent at driving a race car. I can become pretty good at programming, I can run a company, I can have a family.

(03:13:48) I can do a lot of things at the same time, and that gives me sort of that variety that was almost idealized by Karl Marx. He has this idea, oh, I’m going to fish in the morning and hammer in the evening and paint on the weekends, right? There’s a sense, for me at least, where his diagnosis of alienation was true, that just that tunnel vision, there’s just this one thing I’m going to focus on, gives me a sense of alienation I can’t stomach.

(03:14:15) When I’m really deep on programming, and sometimes I go deep for weeks, maybe even in a few cases months, I have to come up for air and I have to go do something else. Like, all right, that was programming for this year. I’ve done my part, and I’m going to go off riding or annoy people on the internet or drive some race cars, do something else, and then I can do the programming thing with full intensity again next year.

Why we left the cloud

Lex Fridman (03:14:38) Speaking of annoying people on the internet, you got to explain to me this drama. Okay, so what is this guy that said, “Imagine losing to Jira, but boasting they have a couple million dollars per year.” So this had to do with this almost now a meme decision to leave the cloud. DHH left the cloud. I think that’s literally a meme, but it’s also a fascinating decision. Can you talk through the full saga of DHH leaves the cloud, leaving AWS, saving money, and I guess the case this person is making now?

DHH (03:15:14) Is that we wasted our time optimizing a business that could have been a hundred times bigger if we’d just gone for the moon.

Lex Fridman (03:15:20) And for the moon includes?

DHH (03:15:22) Venture Capital includes other things, not caring about cost.

Lex Fridman (03:15:26) But also because AGI is around the corner, you should have been investing into AI, right? Is this just part of-

DHH (03:15:32) Sort of [inaudible 03:15:33]. I think it’s a bit of a muddy argument, but if we just take it at its peak ideal, which I actually think is a reasonable point, it’s that you can get myopically focused on counting pennies when you should be focused on getting pounds. I optimized our spend on infrastructure by getting out of the cloud, and that took some time, and I could have taken that time and spent it on making more features that would attract more customers, or spent even more time with AI, or done other things. Opportunity cost is real. I’m not denying that. I’m pushing back on the idea that, for a company of our size, saving $2 million a year on our infrastructure bill, a cut of somewhere between a half and two thirds that goes directly to the bottom line, which means it’s returned to Jason and I as owners and to our employees as part of our profit-sharing plan, isn’t totally worth doing.

(03:16:34) This idea that costs don’t matter is a very Silicon Valley way of thinking that I, again, understand at a certain scale maybe, but I also actually think it’s aesthetically unpleasing. I find an inefficient business, like an inefficient program full of line noise, to just be a splinter in my brain. I hate looking at an expense report and just seeing disproportionate waste. And when I was looking at our spend at 37signals a while back, a few years back, I saw bills that did not pass my smell test. I remembered how much we used to spend on infrastructure before the cloud, and I saw numbers I could not recognize in proportion to what we needed. Computers had gotten so much faster over time; shouldn’t things be getting cheaper? Why are we spending more and more money? Servicing more customers? Yes, but with much faster computers. Moore’s law should be lowering the costs, and the opposite is happening. Why is that happening? And that started a journey of unwinding why the cloud isn’t as great a deal as people like to think [inaudible 03:17:48].

AWS

Lex Fridman (03:17:48) Yeah. Can we look at the specifics just for people who don’t know the story and then generalize to what it means about the role of the cloud in the tech business? So the specifics is you were using AWS S3.

DHH (03:18:03) We were using AWS for everything. Hey.com launched as an entirely cloud app. It was completely on AWS for compute, for databases, for all of it. We were using all the systems the way they’re best prescribed, the way we were told we should. Our total cloud bill for Basecamp, our total spend with AWS, was I think $3.2 million or $3.4 million at its peak. That’s kind of a lot of money, $3.4 million. I mean, we have a ton of users and customers, but still, that just struck me as unreasonable. And the reason why it was so unreasonable was because I had the pitch for the cloud ringing in my ears: hey, this is going to be faster. This is going to be easier. This is going to be cheaper. Why are you trying to produce your own power? Do you have your own power plant? Why would you do that? Leave the computers to the hyperscalers. They’re much better at it anyway.

(03:18:58) I actually thought that was a compelling pitch. I bought in on that pitch for several years and thought, do you know what? I’m done ever owning a server again. We are just going to rent our capacity, and Amazon is going to be able to offer us services much cheaper than we could provide them ourselves, because they’re going to have these economies of scale. And I had Jeff’s words ringing in my ears: “My competitor’s margin is my opportunity.” That was something he used to drive amazon.com with, that if he could just make 2% when the other guy was trying to make 4%, he would end up with all the money, and on volume he would still win.

(03:19:34) So I thought that was the operating ethos for AWS. It turns out that’s not true at all. AWS, by the way, operates at almost a 40% margin. So just in that, there’s a clue that competition is not able to do the thing we like about capitalism, which is to lower costs and so forth. So the cloud pitch, in my optics, is fundamentally false. It did not get easier, first of all. I don’t know if you’ve used AWS recently. It is hella complicated. If you think Linux is hard, you’ve never tried to set up IAM rules or access parameters or whatever for AWS.
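For readers who haven’t touched it, here is a small, hypothetical example of the kind of IAM policy document DHH is gesturing at: even the simplest rule, “let this one app read one S3 bucket,” is its own JSON document, and real setups stack many of these on top of roles, trust relationships, and permission boundaries. The bucket name below is made up.

```python
import json

# A minimal, hypothetical IAM policy: allow one app to list one S3 bucket and
# read its objects. The bucket name is illustrative; real deployments involve
# many such documents plus roles, trust policies, and permission boundaries.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::example-app-files",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-app-files/*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```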

Lex Fridman (03:20:13) AWS was always difficult. It was always [inaudible 03:20:15].

DHH (03:20:14) Well, I think it’s gotten even more difficult, but yes. Now, some of that is, it’s difficult because it’s very capable and you have a bunch of capacity on tap, and there are reasons, I just don’t think they’re good enough to justify how complicated the whole jing-a-ma-jing has become. But what’s certainly true is that it’s no longer easier; it’s not easier to use AWS than it is to run your own machines, which we learned when we pulled out of the cloud and didn’t hire a single extra person. Even though we operate all our own hardware, the team stayed exactly the same. So you have this three-way pitch, right? It’s going to be easier; it isn’t. It’s going to be cheaper; it certainly wasn’t cheaper, and we’ve just proved that by cutting our spend on infrastructure by a half to two thirds. And it’s going to be faster. The last bit was true, but way too many people overestimated the value of that speed.

(03:21:05) If you need a thousand computers online in the next 15 minutes, nothing beats the cloud. How would you even procure that? If we just need another 20 servers, it’s going to take a week or two to get boxes shipped on pallets, delivered to a data center and unwrapped and racked and all that stuff. But how often do we need to do that? And how often do we need to do that if buying those servers is way, way cheaper, so we get vastly more compute for the same amount of money? Could we just buy more servers and not even care about the fact that we’re not hyper-optimized on compute utilization, that we don’t have to use things like automatic scaling to figure things out because we have to reduce costs? Yes, we can. So we went through this journey, starting with a realization in early 2023, when I had finally had enough of our bills.

(03:21:57) I wanted to get rid of them. I wanted to spend less money. I wanted to keep more of the money ourselves. And in just over six months, we moved seven major applications out of the cloud, in terms of compute, caching, and databases, onto our own servers. A glorious, beautiful new fleet bought from the king of servers, Michael Dell, who really, by the way, is another icon of mine. I saw he just celebrated 41 years in business. 41 years, this man has been selling awesome servers that we’ve been using for our entire existence. But anyway, these pallets arrived in a couple of weeks and we racked them up and got everything going, and we were out, at least with the compute part. We then had a long multi-year commitment to S3, because the only way to get decent pricing in the cloud, by the way, is not to buy on a day-to-day basis, not to rent on a day-to-day basis, but to bind yourself up to multi-year contracts. With compute, it’s often a year. That was the case for us.

(03:22:58) And with storage, this was four years. We signed a four-year contract to store our petabytes of customer files in the cloud to be able to get something just halfway decently affordable. So all of these projects came together to the point that we’re now saving literally millions of dollars, projected at about 10 million over five years. It’s always hard, how do you do the accounting exactly, and TCO this, that and the other thing, but it’s millions of dollars. But it’s not just that. It’s also the fact that getting out of the cloud meant returning to more of an original idea of the internet. The internet was not designed such that three computers should run everything. It was a distributed network such that the individual nodes could disappear and the whole thing would still carry on. DARPA designed this such that the Russians could take out Washington and they could still fight back from New York, that the entire communication infrastructure wouldn’t disappear, because there was no hub and spoke. It was a network. I always found that an immensely beautiful vision, that you could have this glorious…

DHH (03:24:00) An immensely beautiful vision that you could have this glorious internet and no single node was in control of everything, and we’ve returned to much more of a single-node-controlling-everything idea with these hyperscalers. When us-east-1, the main and original region for AWS, goes offline, which has happened more than a few times over the years, seemingly a third of the internet is offline. That in itself is just an insult to DARPA’s design. It doesn’t detract from the fact that what AWS built was marvelous. I think the cloud has moved so many things so far forward, especially around virtualization, automation, setup. It’s all those giant leaps forward for system administration that are allowing us now to be able to run things on-prem in a way that smells and feels much like the cloud, just at half the cost or less, and with the autonomy and the satisfaction of owning hardware.

(03:24:59) I don’t know the last time you looked at an actual server and took it apart and looked inside of it, but these things are gorgeous. I posted a couple of pictures of our racks out in the data center and people always go crazy for them, because we’ve gotten so abstracted from what the underlying metal looks like in this cloud age that most people have no idea. They have no idea how powerful a modern CPU is, they have no idea how much RAM you can fit into a 1U server. Progress in computing has been really exciting especially, I’d say, in the last four to five years after TSMC, with Apple’s help, really pushed the envelope. We sat still there for a while while Intel was spinning their wheels going nowhere, and then TSMC, with Apple propelling them, really moved things forward and now servers are exciting again. You’re getting jumps year over year in the 15, 20% range rather than the single digits we were stuck with for a while, and that all means that owning your own hardware is a more feasible proposition than it’s ever been, that you need fewer machines to run ever more, and that more people should do it because, as much as I love Jeff and Amazon, he doesn’t need another, whatever, 40% margin on all the tech stuff that I buy to run our business.

(03:26:19) And this is just something I’ve been focused on both because of the ideology around honoring DARPA’s original design, the practicality of running our own hardware, seeing how fast we can push things with the latest machines and then saving the money. And that has all been so enjoyable to do but also so counterintuitive for a lot of people because it seemed, I think, for a lot of people in the industry, that we’d all decided that we were done buying computers, that that was something we would just delegate to AWS and Azure and Google Cloud, that we didn’t have to own these things anymore. So, I think there’s a little bit of whiplash for some people that, oh, I thought we agreed we were done with that and then along come us and say, “Ah, you know what? Maybe you should have a computer.”
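As a rough, hedged sketch of the back-of-the-envelope math behind the move DHH describes above: the figures below are illustrative assumptions, not 37signals’ actual numbers; the transcript only gives a roughly $3.2 to $3.4 million peak AWS bill and a saving of about $2 million a year, around $10 million over five years.

```python
# Back-of-the-envelope cloud vs. owned-hardware comparison, in the spirit of
# the reasoning above. All figures are illustrative assumptions, not
# 37signals' actual numbers.
YEARS = 5

cloud_annual_spend = 3_200_000          # assumed yearly cloud bill at its peak
hardware_purchase = 2_500_000           # assumed one-time server fleet purchase
hardware_lifetime_years = 5             # amortize the fleet over its useful life
colo_power_bandwidth_annual = 700_000   # assumed colo + power + bandwidth per year

owned_annual = hardware_purchase / hardware_lifetime_years + colo_power_bandwidth_annual
savings_annual = cloud_annual_spend - owned_annual

print(f"Owned hardware, annualized: ${owned_annual:,.0f}")
print(f"Annual savings:             ${savings_annual:,.0f}")
print(f"Savings over {YEARS} years:     ${savings_annual * YEARS:,.0f}")
```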

Owning your own servers

Lex Fridman (03:27:07) Is there some pain points to running your own servers?

DHH (03:27:10) Oh, plenty. There’s pain points to operating computers of all kinds. Have you tried using a personal computer these days? Half the time, when my kids or my wife have a problem, I go like, “Have you tried turning it just off and on again?” Computers are inherently painful to humans. Owning your own computer, though, makes some of that pain worth it; there’s a responsibility that comes with actually owning the hardware that, to me at least, makes the burden of operating that hardware seem slightly more enjoyable. Now, there are things you have to learn, certainly at our scale too. We’re not just buying a single computer and plugging it into Ethernet, we have to have racks and racks of them and you’ve got to set them up with network cabling, and there is some specialized expertise in that, but it’s not like building nuclear rockets, where the expertise isn’t widely distributed.

(03:27:58) Literally, the entire internet was built on people knowing how to plug in a computer to the internet. Oh, ethernet cable goes here, power cable goes here, let’s boot up Linux. That’s how everyone put anything online until 10, 12 years ago when the Cloud took over. So, the expertise is there and can be rediscovered, you too can learn how to operate a Linux computer.

Lex Fridman (03:28:21) Yeah. And when you get a bunch of them, there’s a bunch of flashing LEDs and it’s just so exciting.

DHH (03:28:26) Well, that’s beautiful, calming, amazing. Computers are really fun. This is actually something I’ve gotten into even deeper after we moved out of the Cloud. Now, my next tingle is that, if you could move out of the Cloud, can you also move out of the data center? Personal servers have gotten really scarily quick and efficient, and personal internet connections rival what we connected data centers with just a decade or two ago. So, there’s a whole community around this concept of homelabbing, which is essentially installing server hardware in your own apartment, connecting it to the internet and exposing it directly to the internet, and that harks back to those glorious days of the ’90s when people building for the internet would host the actual website on their actual computer in the closet.

(03:29:20) And I’m pretty fired up about that, I’m doing a bunch of experiments, I’ve ordered a bunch of home servers for my own apartment. I marvel at the fact that I can get a five gigabit fiber connection now. Do you know what five gigabit could have done? That could have taken Basecamp to multiple millions of MRR, given that back then I ran the whole business on a single box with 2004 technology and probably a 100 megabit cable. The capacity we have access to, both in terms of compute and connectivity, is something that people haven’t readjusted to. And this happens sometimes in technology where progress sneaks up on you; this happened with SSDs, I love that by the way.

(03:30:04) We designed so much of our technology and storage approach and database design around spinning metal disks that had certain seek rate properties, and then we went to NVMe and SSDs, and it took quite a while for people to realize that the systems had to be built fundamentally differently now. The difference between memory and disk was now far smaller when you weren’t spinning these metal plates around with a little head that had to read off them; you were essentially just dealing with another type of memory. I think we’re a little bit in that same phase when it comes to the capacity of new businesses to be launched literally out of your bedroom.
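A rough back-of-the-envelope sketch of the two shifts described above: home connectivity versus what Basecamp originally ran on, and the shrinking gap between disk and memory once spinning platters gave way to NVMe. All figures below are illustrative order-of-magnitude assumptions, not numbers from the conversation.

```python
# Illustrative, order-of-magnitude figures only -- not measurements cited in the interview.

# Connectivity: a roughly 100 Mbit connection circa 2004 vs. a modern 5 Gbit home fiber line.
uplink_2004_mbit = 100
uplink_today_mbit = 5_000
print(f"Home bandwidth today is roughly {uplink_today_mbit / uplink_2004_mbit:.0f}x "
      f"what an early Basecamp-style business was served from.")

# Storage: typical random-access latency, spinning disk vs. NVMe SSD vs. RAM (rough ballparks).
latency_ns = {
    "7200rpm HDD seek": 8_000_000,    # ~8 ms
    "NVMe SSD random read": 100_000,  # ~100 microseconds
    "DRAM access": 100,               # ~100 nanoseconds
}
hdd, nvme, ram = latency_ns.values()
print(f"HDD is ~{hdd / ram:,.0f}x slower than RAM; NVMe is only ~{nvme / ram:,.0f}x slower,")
print("which is why designs tuned around seek times stopped making sense.")
```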

Lex Fridman (03:30:45) So, you can get pretty far with a large user base with homelabbing.

Lex Fridman (03:30:51) That’s exciting. That’s like the old school. That’s really exciting, right?

DHH (03:30:54) It’s bringing back the start-up in the garage in the literal, physical sense of the word. Now, some of that is, do we need to? You can get relatively cheap Cloud capacity if you don’t need very much.

Lex Fridman (03:31:07) Hell, yes, we need to. The feeling of doing that by yourself, of seeing the LED lights in your own home, there’s nothing like that.

DHH (03:31:17) There’s just an aesthetic to it that I am completely in love with and I want to try to push on. Now, is it going to be the same thing as getting out of the Cloud? I’m not sure. Our exit out of the cloud was not the exit out of the data center. We basically just bought hardware, shipped it to a professionally managed data center that we didn’t even actually touch. This is the other misconception people have about moving out of the Cloud, that we have a bunch of people who are constantly driving to a data center somewhere to rack new boxes and change dead RAM; that’s not how things happen in the modern world at all. We have a company called Summit, previously Deft, that does what we call white gloves; they work in the data center.

(03:31:54) When we need something like, “Hey, Deft, can you go down and swap the dead SSD in box number six?” They do it, and what we see is akin to what someone working with the Cloud would see. You see IP addresses coming online, you see drives coming online, it’s not that different, but it is a whole heck of a lot cheaper when you are operating at our scale. And of course it is, of course it’s cheaper to own things if you need those things for years rather than it is to rent them. In no other domain would we confuse those two things; it’s cheaper to own for the long duration than it is to rent.
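A minimal sketch of the rent-versus-own arithmetic being gestured at here. The prices, lifetime, and colocation fees below are made-up placeholder assumptions purely to show the shape of the comparison, not 37signals’ actual numbers.

```python
# Hypothetical placeholder numbers, chosen only to show the shape of the rent-vs-own comparison.
server_purchase = 20_000       # one-time hardware cost per box (assumed)
colo_per_year = 3_000          # white-glove hosting, power, and remote hands per year (assumed)
cloud_rent_per_year = 15_000   # comparable cloud capacity rented per year (assumed)
lifetime_years = 5             # how long the owned box stays in service (assumed)

own_total = server_purchase + colo_per_year * lifetime_years
rent_total = cloud_rent_per_year * lifetime_years
payback_years = server_purchase / (cloud_rent_per_year - colo_per_year)

print(f"Own over {lifetime_years} years:  ${own_total:,}")
print(f"Rent over {lifetime_years} years: ${rent_total:,}")
print(f"Owning comes out ahead once the hardware outlives its payback point (~{payback_years:.1f} years here).")
```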

Lex Fridman (03:32:29) There is some gray area, I’ve gotten a chance to interact with the XAI team a bunch, I’m probably going back out there in Memphis to do a big podcast associated with the Grok release. And those folks, in order to achieve the speed of building up the cluster and to solve some of the novel aspects that have to do with the GPU, with the training, they have to be a little bit more hands-on, it’s less white glove.

DHH (03:32:54) Oh, and I love that. They’re dealing with a frontier problem and they’re dealing with it not by renting a bunch of GPUs at a huge markup from their main competitor, they’re going like, “No, screw that. We’re going to put 100,000 GPUs in our own tents and build it in absolute record time.” So, I think, if anything, this is testament to the idea that owning hardware can give you an advantage both at the small scale, at the medium scale and at the pioneer levels of computing.

Elon Musk

Lex Fridman (03:33:20) By the way, speaking of teams, xAI and Tesla are large companies, but all those folks … I don’t know what it is. You said Jeff is really good at finding good people, at seeing strength in people. Elon is also extremely … I don’t know what that is. Actually, I’ve never actually seen it, maybe you could speak to that, he’s good at finding greatness.

DHH (03:33:48) I don’t think he’s finding as much as he’s attracting. He’s attracting the talent because of the audaciousness of his goals and his mission, the clarity with which he states it. He doesn’t have to go scour the earth to find the best people, the best people come to him because he is, talking about Elon here, one of the most singularly invigorating figures around, for haters and lovers alike. He’s having such an impact at such a scale that of course he’s got to have literally millions of people think he’s the worst person in the world, and he’s also going to have millions of people thinking he’s the greatest gift to humanity. Depending on the day, I’m somewhere in between, but I’m more on the greatest-gift-to-humanity end of the scale than I am on the other end of the scale. And I think that really inspires people in a way that we’ve almost forgotten; that level of audacity is so rare that, when we see it, we don’t fully know how to analyze it.

(03:34:48) We think of Elon as finding great talent, and I’m sure he is also good at that, but I also think it’s this beacon of the mission. We’re going to fucking Mars, we’re going to transform transportation to run on electricity, we’re going to cover the earth in internet. It’s so grand that there are days where I wake up and go like, “What the fuck am I doing with these to-do lists?” Like, “Jesus, should I go sign up for something like that?”

DHH (03:35:17) That sounds invigorating in a sense I can only imagine a Viking back in 1050 going, “Should we go to Normandy? You may die along the way but, oh, boy, does that sound like a journey and an adventure.”

Lex Fridman (03:35:31) There’s a few components there. One is definitely this bigger-than-life mission and really believing it. Every other sentence is about Mars, really believing it. It doesn’t really matter what anybody else says, the criticism, anything; there’s a very singular, focused, big mission. But I think it also has to do with a bunch of the other components, like being able to hire well once the beacon attracts the people. And I’ve seen people who don’t necessarily have a resume with a track record on paper, who turned out to be legendary, because he basically tosses them the ball of leadership, sees something in them, gives them the ownership, and they run with it. And that happens at every scale; there’s a real meritocracy.

(03:36:23) And you could just see the flourishing of human intellect in these meetings, in these groups getting together, where the energy is palpable. It’s exciting for me to just be around that because there are not many companies I’ve seen that in; when a company becomes successful and larger, it somehow suffocates that energy that, I guess, you see in start-ups at the early stages, but it’s cool to see it at a large company that’s actually able to achieve scale.

DHH (03:37:01) I think part of the secret there is that Elon actually knows things and, when you know things, you can evaluate the quality of work products. And when you can evaluate the quality of work products, you can very quickly tell who’s full of shit and who will actually take you to Mars, and you can fire the people who are full of shit and you can bet on the people who’ll get us to Mars. That capacity to directly evaluate the competency of individuals is actually a little bit rare. It’s not widely distributed amongst managers, hiring managers. It’s not something you can easily delegate to people who are not very skilled at the work itself. And Elon obviously knows a lot about a lot and he can smell who knows stuff for real.

(03:37:51) And this is, at our tiny scale, something I’ve tried to do in the same manner where, when we hire programmers, for example, it’s going to be interesting now with AI as the new challenge, but up until this point, the main pivot point for getting hired was not your resume, was not the schooling you’ve had, it was not your grades, it was not your pedigree, it was how well you did on two things. A, your cover letter, because I can only work with people remotely if they’re good writers. So, if you can’t pen a proper cover letter and can’t bother to put in the effort to write it specifically for us, you’re out. Two, you have to be able to program really well, to the degree that I can look at your code and go like, “Yeah, I want to work with that person.” Not only do I want to work with that person, I want to work on that person’s code when I have to see it again in five years to fix some damn bug.

(03:38:44) So, we’re going to give you a programming test that simulates the way we work for real and we’re going to see how you do. And I’ve been surprised time and again where I thought for sure this candidate is a shoo-in, they sound just right, the CV is just right, and then you see the code getting turned in and I’m like, “No way. No way are we hiring this person.” And the other way has been true as well. I’d go like, “I don’t know about this guy or this woman. Eh, I don’t know,” and then they turn in their code stuff and I’m like, “Holy shit, can that person be on my team tomorrow, preferably?” The capacity to evaluate work product is a superpower when it comes to hiring.

Lex Fridman (03:39:24) There’s a step that I’ve seen Elon do really well which is be able to show up and say this can be done simpler.

Lex Fridman (03:39:32) But he knows what he’s talking about, and then the engineer, because Elon knows enough, the engineer’s first reaction, you can tell, it’s almost like rolling your eyes when your parent tells you something: no, I’ve been working on this for a month, you don’t … But then, when you have that conversation a little more, you realize, no, it can be done simpler, find the way. So, there’s a good … When two engineers are talking, one might not have perfect information, but if the senior engineer has good instincts that have been battle-earned, then you can say simplify and it actually will result in simplification.

DHH (03:40:17) And I think this is the hallmark of the true greats, that they not only have the insight into what’s required to do the work, but they also have the transcendent vision to go beyond what the engineer would do, the programmer would do. I think if we are looking at these rarities, obviously, the myth of Steve Jobs was also this. Even though perhaps he was less technical than Elon is in many ways, he had the same capacity to show up to a product team and really challenge them to look harder for the simplification or for making things greater in a way that would garner disbelief from the people who are supposed to do it. This guy is full of it, this is crazy, we can never … And then, two months later, this.

(03:41:05) So, there is something of this where you need the vision, you need it anchored by the reality of knowing enough about what’s possible, knowing enough about physics, knowing enough about software that you’re not just building bullshit. There are plenty of people who can tell a group of engineers, “No, just do it faster,” but that’s not a skill, it’s got to be anchored in something real. But it’s also got to be anchored in, it’s a tired word, but a passion for the outcome to a degree where you get personally insulted if a bad job is done. This is what I’ve been writing about lately with Apple, they’ve lost that asshole who would show up and tell engineers that what they did was not good enough in ways that would actually perhaps make them feel a little small in the moment but would spark that zest to really fix it. Now they have a logistics person who’s very good at sourcing components and lining up production Gantt charts but you’re not getting that magic.

(03:42:12) Now, what’s interesting with that whole scenario was that I actually thought, given how well Tim Cook ran things and has run things at Apple for so long, that maybe we were wrong, maybe we were wrong about the criticality of Steve Jobs to the whole mission, maybe you could get away with not having it. I think the bill was just going to come later and now it has. Apple is failing in all these ways that someone who would invoke Steve’s ghost and really exalt him would say like, “See, this is what’s happening now.” So, the other thing here too, of course, is it’s impossible to divorce your perception of what’s a critical component of the system from the messy reality of a million different moving parts in the reality of life, and you should be skeptical about your own analysis and your own thesis at all times.

Apple

Lex Fridman (03:43:02) Since you mentioned Apple, I have to ask, somebody on the internet submitted the question. Does DHH still hate Apple? I believe that’s the question. So, there was a time when Basecamp went to war with Apple over the 30%, can you tell the saga of that battle?

DHH (03:43:25) Yes, but first I’ll tell you how I fell in love with Apple, which was all the way back in the early 2000s. When Microsoft was dominating the industry in a way we now see Apple and Google dominate mobile phones, Microsoft was just everything when it came to personal computers and I really did not like the Microsoft of the ’90s. The Microsoft of the ’90s was the cut-off-the-air-supply-to-Netscape kind of character, was the Bill Gates sitting defiant in an interview with the DOJ, quibbling over what the definition of “is” is, and just overall unpleasant, I think. You can have respect for what was achieved but I certainly didn’t like it. And as we’ve talked about, I came begrudgingly to the PC after Commodore fell apart and I couldn’t continue to use the Amiga, so I already had a bit of a bone to pick with PCs just over the fact that I loved my Amiga so much.

(03:44:23) But then in the early 2000s, Apple emerged as a credible alternative because they bet the new generation of Macs on Unix underpinnings and that allowed me to escape from Microsoft, and suddenly I became one of the biggest boosters of Apple. In my graduating class at the Copenhagen Business School, I started with the first white iBook, the first person using a Mac, and, by the time we were graduating, I had basically converted half the class to using Apple computers because I would evangelize them so hard and demonstrate them and do all the things that a super fan would do, and I continued that work over many years.

(03:45:07) Jason and I actually in, I think, 2004, 2005, did an ad for Apple that they posted on the developer site where we were all about how Apple is so integral to everything that we do and we look up to them and we are inspired by them. And that love relationship actually continued for a very long time, I basically just became a Mac person for 20 years. I didn’t even care about looking at PCs, it seemed irrelevant to me whatever Microsoft was doing, which felt like such a relief because in the ’90s I felt like I couldn’t escape Microsoft and suddenly I had found my escape. And now I was with Apple and it was glorious and they shared so many of my sensibilities and my aesthetics and they kept pushing the envelope and there was so much to be proud of, so much to look up to.

(03:45:53) And then that started to change with the iPhone, which is weird because the iPhone is what made modern Apple. It’s what I lined up for in 2007, together with Jason, for five hours to buy a first generation product where Apple staff would clap at you when you walked out of the store, I don’t know if you remember that. It was a whole ceremony and it was part of that myth and mystique and awe of Apple. So, I wasn’t in the market for other computers, I wasn’t in the market for other computer ideas, I thought perhaps I’d be with the Mac until the end of days. But as Apple discovered the gold mine it is to operate a toll booth where you don’t have to innovate, where you don’t actually even have to make anything, where you can just take 30% of other people’s business, there was a rot that crept into the foundation of Apple, and that started all the way back from the initial launch of the app store.

(03:46:55) But I don’t think we saw at the time, I didn’t see at the time, just how critical the mobile phone would become to computing in general. I thought when the iPhone came out that like, “Oh, it’s like a mobile phone, I’ve had a mobile phone since the early ’90s.” Well, it wasn’t a mobile phone, it was a mobile computer and, even more than that, it was the most important computer, or it would become the most important computer, for most people around the world, which meant that, if you liked to make software and wanted to sell it to people, you had to go through that computer. And going through that computer meant going through Apple’s toll booth, and not just having to ask them for permission, which in and of itself was just an indignity. When you’re used to the internet where you don’t have to ask anyone for permission about anything, you buy a domain and you launch a business and, if customers show up, boom, you’re a success and, if they don’t, well, you’re a failure.

(03:47:47) Now, suddenly, before you could even launch, you’d have to ask Apple for permission? That always sat wrong with me. But it wasn’t until we launched HEY in 2020 that I saw the full extent of the rot that had snuck into Apple’s apple.

Lex Fridman (03:48:05) For people who don’t know and we’ll talk about it, HEY is this amazing attempt to solve the email problem.

DHH (03:48:14) Yes. I like to pitch it as what Gmail would’ve been with 20 years of lessons applied in a way where they could actually ship. Gmail was incredible when it launched in 2004 and it still is a great product but it’s also trapped in its initial success. You can’t redesign Gmail today, it just has way too many users. So, if you want fresh thinking on email, I wanted fresh thinking on email, I needed to build my own email system. And not just my own email client, that’s what a lot of people have done over the years, they build a client for Gmail but you’re severely constrained if you don’t control the email server as well. If you really want to move the ball forward with email, you have to control both the server and the client and that was the audacious mission we set out to do with HEY.

(03:49:00) And that was what’s funny, I thought our main obstacle here would be Gmail, it’s the 800-pound gorilla in the email space. Something like 70% of all email in the US is sent through Gmail, and I think their worldwide rates are probably in that neighborhood as well, they’re just absolutely huge. And trying to attack an enormous established competitor like that, who’s actually still loved by plenty of people and is free, seems like a suicide mission. And it was only a mission we signed up for because we had grown ambitious enough after making Basecamp for 20 years that we thought we could tackle that problem. So, I thought, hey, this is dumb, I would not advise anyone to go head to head with Gmail, that seems like a suicide mission. We’re going to try anyway because, you know what, if we fail, it’s going to be fine, we’re just going to build a better email experience for me and Jason and the people at the company and our cat, and that’ll be okay because we can afford to do so.

(03:50:03) But when we got ready to launch after spending two years building this product, millions of dollars in investment into it, we obviously needed mobile apps. You’re not going to be a serious contender with email if you’re not on a mobile phone and you need to be there with a native client. So, we had built a great native client for both iOS and for Android and, as we were getting ready to launch, we submitted both of them to the app stores, got both of them approved on, I think, Friday afternoon for the iOS app, and we then went live on Monday and we were so excited. Hey, world, we’ve been working on this new thing, I’d love for you to check it out. And of course, as with anything when you launch a new product, there are some bugs, so we quickly found a few in the iOS client and submitted a new build to Apple. Hey, here are our bug fixes, can you please update? And that’s when all hell broke loose.

(03:50:56) Not only were they not going to approve our update, they said, “Oh, wait a minute, we gave you permission to be in the app store but, I’m sorry, that was a mistake. We see that you’re not using our in-app payment system which means that we don’t get 30% of your business, you will have to rectify that or you can’t be in the app store.” And first I thought, well, it got approved already, we’re running on the same model we’ve run Basecamp on in the app store for a decade: you’re not signing up through the app, we’re signing up our own customers on our own website, and they’re just going to the app store to download their companion app, so we’re going to be fine. That was the truth, right? That was why I never got so fired up about the app store. Even as Apple started tightening the screws, it was like, “My business was okay.”

(03:51:42) Now, suddenly, my business wasn’t okay. Apple was willing to destroy HEY if we did not agree to give them 30% of all the signups that came through the iOS app. And it wasn’t just about the 30%, it was also about splitting up and no longer having a direct relationship with our customers. When you sell an app in the app store, you’re not selling an app to a customer, you’re selling an app into inventory at Apple and then Apple sells an app to that customer. That customer has a purchasing relationship with Apple so, if you want to give discounts or refunds or whatever, it’s complete hell. If you want to easily support multi-platform, that’s complete hell. If someone signs up for HEY on their iPhone and they want to switch to Android, but that billing relationship is tied to Apple, it’s complete hell. For a million reasons, I did not want to hand my business over to Apple, I did not want to hand 30% of our revenue over to Apple, so we decided to do something that seemingly Apple had never heard before, we said no.

(03:52:48) We’re not going to add the in-app payment. I don’t care if you’re threatening us, this is not fair, this is not reasonable, please approve. And of course they didn’t and it escalated and, after a couple of days, we realized, you know what, this isn’t a mistake, this isn’t going away, we’re going to be dead if they go through with this. If we’re not going to yield and give them the 30%, they’re going to kick us off unless we make such a racket, such noise that they will regret it, and that’s exactly what then happened. We were blessed by the fact that we launched HEY one week before WWDC, the Worldwide Developer Conference, where Apple loves to get up on stage and harp on how much they do for developers, how much they love them and why you should build for their new devices and so on and so forth.

(03:53:44) And then we also just happened to have a platform on the internet which is very convenient when you need to go to war with a $3 trillion company. So, I started kicking and screaming-

DHH (03:53:55) … and essentially turning it up to 11 in terms of the fight and going public with our being denied a place in the app store. And that turned into a prolonged two-week battle with Apple that essentially ended in the best possible outcome we could have gotten as David fighting Goliath, which was a bit of a truce. We wouldn’t hand 30% over to Apple, they wouldn’t kick us out of the app store, but we had to build some bullshit dummy accounts such that the app did something when you downloaded it. That was a rule that Phil Schiller seemingly made up on the fly when pressed for the fifth time by the media about why we couldn’t be in the app store when a million other companion apps could. But we just happened to be able to create so much pain and noise for Apple that it was easier for them to just let us be than to keep on fighting.

Tim Sweeney

Lex Fridman (03:54:48) What do you think about Tim Sweeney’s victory with Epic over Apple?

DHH (03:54:54) I think it is incredible and the entire developer ecosystem, not just on iOS but on Android as well, owes Epic, Tim Sweeney and Mark Rein an enormous debt of gratitude for taking on the only battle that has ever inflicted a serious wound on Apple in this entire sordid campaign of monopoly enforcement, and that is Epic’s fight versus them. Tim recently revealed that it has cost well over $100 million in legal fees to carry on this battle against Apple. We, for a hot moment, considered suing Apple when they were threatening to kick us out. We shopped the case around with a few law firms and, of course, they would tell us we had a good case, they’re trying to sell a product here, but they would also tell us it’s going to cost a minimum of $10 million and it’s going to take five to seven years through all the appeals.

(03:55:54) Now we learn the actual price tag was 10 times higher, right? Epic spent over $100 million. It would’ve destroyed us to take on Apple in the legal realm, only a company like Epic could do it. And only a company run by founders like Tim, like Mark, could risk the business in the way that they did, the audacity they had to provoke the fight in the first place, which I thought was just incredible, and to stick with it for the long term. No board would’ve signed off on this lawsuit for a professional CEO, no freaking way. So, the fact that they’ve been able to beat Apple in also the most hilarious way possible, I think it’s just incredible. Because, remember, their first victory in the case was actually not much of a victory, there were about 11 counts in the trial, Apple basically won 10 of them, and the judge awarded Epic this one little win that Apple couldn’t tell them not to link out to the internet to be able to do the payment processing.

(03:57:04) So, they won this one little thing and Apple, instead of just taking the 10 out of 11 wins and going, fine, you can have your little links but all these other rules stay in place, decided to essentially commit criminal contempt of court, as they’ve now been referred for prosecution, and angered the judge to such a degree that the rule of law in the US now is that you can launch an app in the app store and you don’t have to use in-app payment, you can have a direct billing relationship with a customer, if you just link out to the open internet when you take the credit card and then hop back into the app. And we owe all of that to Tim and Mark, we owe all of that to Epic. We’re going to launch new apps any minute now, I hope, actually, in the next week, to take advantage of this, that revamp the HEY app so that people who download the HEY app off the Apple app store can sign up in the app and can then use the web to put in their credit card so we don’t-

DHH (03:58:00) And can then use the web to put in their credit cards so we don’t have to pay the 30%. We have a direct billing relationship, and such that they can take that subscription to Android, to PCs, whatever, without any hassle. And we have Tim and Mark to thank for it.

Lex Fridman (03:58:16) Yeah, Tim … I mean, like you said, founders, but also a specific kind of founder, because I think … Maybe you can educate me on this, but Tim is somebody who maintains to this day the unreasonableness of principles.

Lex Fridman (03:58:33) I think sometimes maybe even with founders, you can get worn down. It’s a large company.

Lex Fridman (03:58:38) There’s a lot of smart “people” around you, lawyers, just whispering in your ear over time, and you’re like, “Well, just be reasonable.” This is a different thing to maintain … I mean, Steve Jobs did this. To still be the asshole.

Lex Fridman (03:58:57) Who says, “No, this whole company, I’ll sink this whole fucking company over this.”

DHH (03:59:02) That’s the exact language, basically, I used in our original campaign. I will burn this business down before I hand over 30% of it to Apple. And that indignation, that actual rage, is something I try to be a little careful about tapping into because it is a little bit of a volatile compound because, I mean, I have a bunch of employees, we have a bunch of customers. It would be pretty sad if the journey of 37signals after 25 years would come to an end because Apple would burn us down or I would burn the business down over this fight with Apple. But I think you also need that level of conviction to be able to even drive the day-to-day decisions.

(03:59:42) One of the other Apple examples … And I know we’re ragging on Apple a little bit here, and I don’t actually hate them. I really don’t. I am tremendously disappointed at the squandered relationship that did not need to be sold away for so little. Now I understand that the app store toll booth is actually a pretty big business. It’s multiple billions, but Apple is a trillion-dollar company. And I think in the lens of history, this is going to come off as a tremendous mistake, and I think it’s already coming off as a tremendous mistake. The flop that was the Vision Pro was partly because Apple had pissed off every other developer.

(04:00:20) No one was eager to come build the kind of experiences for their new hardware that would perhaps have made it a success. So when you’re on top and you have all the cards, you can delude yourself into thinking that you can dictate all terms at all times and there are no long-term consequences. Apple is learning, finally, the fact that there are long-term consequences and that developers actually are important to Apple’s business and the relationship is not entirely one-sided. We don’t owe our existence to Apple and Apple alone. We’ve built our own customer bases.

(04:00:53) Apple has been beneficial to the industry. I’m glad the iPhone exists, da da da da. It’s not that it doesn’t go both ways, but Apple wants it only one way. And I think that is a mistake, and it’s a mistake that was avoidable, and that’s disappointing. Certainly disappointing for me. I’ve literally spent 20 years evangelizing this shit, right? I’ve spent so much money buying Apple hardware, excusing a bunch of things they’ve done over the years, and then for what? For the fact that you wanted 30% of something that I created, in the most unreasonable way possible. Couldn’t we have found a better way to do this? I think they’re going to get forced into a better way. But did you also have to go through the indignity of having a criminal contempt charge against you getting referred for prosecution? It just seems so beneath Apple, but it also seems so in line with what happens to huge companies who are run by “professional managers” rather than founders and unreasonable people.

Lex Fridman (04:02:01) Well, we should probably also say that the thing you love about Apple, the great spirit of Apple, I think, still persists and there’s a case to be made that this 30% thing’s a particular slice of a company, not a defining aspect of the company and that Apple is still on top in the hardware that it makes and a lot of things that it makes. And this is … That could be just a hiccup in a long story of a great company that does a lot of awesome stuff for humanity. So Apple is a truly special company. We mentioned Amazon. There is no company like Apple.

DHH (04:02:40) I agree. This is why the disappointment is all greater.

DHH (04:02:44) Because we had such high aspirations and expectations of Apple, that they were the shining city on the hill and they were guiding the industry in a million positive ways. I think, as we talked about earlier, hardware is exciting again in large part because Apple bought PA Semi and pursued an against-all-odds mission to get ARM up to the level it is today. And we have these incredible M chips now because of it. And the design sensibilities that Apple brings to the table are unparalleled. No one has taste, certainly at the hardware level, like Apple does. Even at the software level, I’d say there’s a lot of taste left in Apple, but there’s also some real sour taste now.

(04:03:34) So they have to wash that off first, I think, before they find their way back. But Apple’s been in a morass before. I mean, Wozniak and Steve Jobs started this thing in the garage, had great success with the Apple II. He hands the company over to a sugar drink salesman who tanks the company into the ’90s. He doesn’t learn the lesson, spends the next 20 years building up this amazing company, then hands the company over again to a logistics person who presumably had more redeeming qualities than the first guy he put in charge, but still ends up leading the company astray.

(04:04:13) Now this is the norm. The norm is that great companies don’t last forever. In the long arc of history, almost no company lasts forever. There are very few companies around that were here a hundred years ago, even fewer from 200 years ago, and virtually none that are a thousand years old outside of a handful of Japanese sword makers or something like that, right? So you can get deluded into thinking that something is forever when you’re in the moment and they seem so large.

(04:04:43) Apple could absolutely stumble and I think they have more reason to stumble now than ever. They’re behind on AI, terribly behind. Their software quality is faltering in a bunch of ways. The competition is catching up on the hardware game, in part because TSMC is not an Apple subsidiary, but a foundry that services AMD and Nvidia and others who are now able to use the same kind of advanced processes. This is something I learned after not looking at PC hardware for the longest time, that holy smokes, AMD actually makes CPUs that are just as fast, if not faster, than Apple’s. They’re not quite as efficient yet because ARM has some fundamental efficiencies over x86, but they’re still pretty good.

(04:05:27) So Apple should have reason to worry. Apple shareholders should have reason to be concerned, not just about all these stumbles, but also by the fact that Apple is run by old people. Apple’s board has an average age of, I think, 75. Their entire executive team is above 60. Now, that sounds horribly ageist. And in some ways, it a little bit is, in the same way I’m ageist against myself. I’m 45 now. And I have to force myself to really get into AI because it is such a paradigm shift and a lot of people, when they reach a certain age, are just happy to stay with what they know. They don’t want to go back to being a beginner. They don’t want to go back to having to relearn everything. And I think this is a little hard for me at 45. How the hell do you do that at 75?

Fatherhood

Lex Fridman (04:06:22) I have to come back to it. You mentioned it earlier, you’re a parent. Can you speak to the impact that becoming a father has had on your life?

DHH (04:06:32) I think what’s funny about fatherhood is that, for me, I wasn’t even sure it’s something I wanted. It took meeting the right woman and letting her convince me that this was the right idea before we even got started. I didn’t have starting my own family on the list of priorities in my late 20s or even early 30s. It was really the impetus of meeting my wife, Jamie, and her telling me, “This is what I want. I want to have a family, I want to get married, I want to have kids. I want to have three.” And me going for a second like, “Whoa, whoa, whoa.” And then, “All right, let’s do it.” And I think that’s the kind of happy accident where some parts of my life have been very driven, where I knew exactly what I wanted and how to push forward to it, and what the payoff was going to be. But when it comes to having a family, that always felt like a very fuzzy, abstract idea that, sure, someday maybe. And then it became very concrete because I met a woman who knew what she wanted.

(04:07:55) And looking back on it now, it almost seems crazy, like there’s this fork in the road of reality where, if that hadn’t happened and I had been sitting here now not being a father, not having a family, the level of regret, knowing what I know now about the joys of having that family, would have been existential. I don’t know if it would have been devastating. I think men have a little bit of a longer window to pursue these things than women do. There are just certain biological facts, but ending up with the family I have now, ending up with my three boys, has been just a transformative experience in the sense that here’s something that turned out to be the most important thing. And it was an open secret. Not even an open secret. It was an open truth through all of history.

(04:08:59) You listen to anyone who’s ever had children, they will all say, “My children are the most important to me.” Yet somehow that wisdom couldn’t sink in until you were in the situation yourself. I find those truths fascinating when you can’t actually relay them with words. I can tell you, “Hey, Lex, what are you doing? Get a wife, make some kids, get a move on it.” And these are just words. They’re not communicating the gravity of what it actually feels to go through the experience. And you can’t really learn it without going through it.

(04:09:33) Now, of course, you can be influenced and whatever, we can all help contribute and little sparks and little seeds can grow in your mind about it, but it still has to happen. And now that I am in this situation and just the sheer joy on a daily basis where you think your level of life satisfaction is on a scale of one to 10.

DHH (04:09:57) And then the satisfaction of seeing your children understand something, accomplish something, learn something, do something, just be, just goes like, oh my God, the scale doesn’t go from one to 10, it goes from one to a hundred. And I’ve been playing down here in the one to 10 range all this time and there’s a one to a hundred. That has been humbling in a way that is impactful in and of itself. This whole idea that I thought I had a fair understanding of the boundaries of life in my early 30s, like what is this about? I mean, I’ve been on this earth long enough now here to know something.

(04:10:39) And you realize, “I don’t know.” I did not know. I did not know that the scale was much broader. And I’ve often talked about the joys of having kids and just seeing your own DNA, which is remarkable to me because literally that’s been the pursuit of humans since the dawn of time. I am here today because, whatever, 30,000 years ago, some Neanderthal had the same realization that I should procreate and I should continue my bloodline. And that all amounts to me sitting here now, but it didn’t become a practical reality to me before meeting the right woman. And I think that that’s sometimes not part of the conversation enough that there’s something broken at the moment about how people pair up in the western world.

DHH (04:11:33) And it’s at the source of why we’re not having enough children, because there are not enough couples, there’s not enough marriage, there’s not enough of all these traditional values that even 50, 60, 70 years ago were just taken for granted. We’re in this grand experiment of what happens if we just remove a bunch of institutions? What happens if we no longer value marriage as something to aspire to? What happens if parenthood is now seen in some camps as almost something weird or against your own self-expression? It’s a grand experiment that I’m curious how it turns out. I’d prefer to watch it as a movie, like The Children of Men, that was a good show. I wish that wasn’t reality, but we’re seeing that reality play out while I’m sitting here in a very traditional two-parent loving household with three children and going, “This is now at the top.”

(04:12:38) I’ve done a lot of things in my life. I’ve built software, I’ve built companies, I’ve raced cars, I’ve done all sorts of things, and I would trade all of it in a heartbeat for my kids. That’s just a really fascinating human experience, that the depth of that bond is something you can’t appreciate before you have it. But I also think there is a role to play to talk it up because we’re being bombarded constantly with reasons why not to. Oh, it’s too expensive.

(04:13:14) Well, you could get divorced and then you might lose half. There’s all these voices constantly articulating the case against marriage, the case against having children, that those of us who’ve chosen to do the traditional thing, to get married and to have children, have an obligation to talk it up a little bit, which would have seemed ridiculous 50 years ago, that you’d have to talk up something as fundamental as that.

(04:13:42) But I have become obligated in that sense to do just that, to talk it up, to say, “You know what? You can look at everything that I’ve done and if you like some of those parts, realize that to me, in the situation, the kids, the family, the wife is more important than all of it.” And it sounds like a cliche because you’ve heard it a thousand times before, and by becoming a cliché, maybe you start believing it’s not true, that it’s just something people say, but it is reality.

(04:14:16) I know almost no parents that I have personal relationships with that don’t consider their children to be the most important thing in their life.

Lex Fridman (04:14:23) So there’s a lot of interesting things you said. So one, it does seem to be … I know a lot of parents, perhaps more interestingly, I know a lot of super successful people who are parents who really love their kids and who say that the kids even help them to be more successful. Now, the interesting thing, speaking to what you’re saying, is it does seem for us humans, it’s easier to articulate the negatives because they’re concrete, pragmatic. It costs more, it takes some time. They can be crying all over the place. They’re tiny narcissists running around or whatever.

DHH (04:15:07) Which is all true, by the way.

Lex Fridman (04:15:08) Yeah, pooping everywhere, that kind of stuff. But to articulate the thing you were speaking to, of there’s this little creature that you love more than anything you’ve ever loved in your life, it’s hard to convert that into words. You have to really experience it. But I believe it and I want to experience that, because, just from a scientific method standpoint, I have seen a lot of people who are honestly not very capable of love fall completely in love with their kids.

Lex Fridman (04:15:40) Very sort of, let’s just call it what it is, engineers that are very like beep boop bop.

Lex Fridman (04:15:47) They just fall in love and it’s like, all right. People who, just like you said, they don’t really care or don’t really think about having kids, that kind of stuff, once they do, it changes everything. But it’s hard to convert into words.

DHH (04:16:03) One of the reasons I think it’s also difficult is … I mean, I like kids, it’s not that I actively dislike them, but when I was around other people’s kids, I didn’t have an emotional reaction. Some women do. They see a baby and they go, “Oh.” I never had any emotion of that kind. I mean, I could appreciate it, I’m glad for you that you have children. It did not provoke anything in me. The emotions that are provoked in me when I look at my own children don’t exist in the same universe, so you don’t have a complete parallel, or at least a lot of men, or at least me, I didn’t have a framework to put it into: what would it be like to have my own child?

(04:16:41) And then you experience it. It’s like, poof. And it happened so quickly, too. This is what I found fascinating. It happens before that little human is even able to return any words to you, that the love you develop for an infant, it happens quite quickly, not necessarily immediately. I don’t know, different people have different experiences, but it took me a little bit. But then once it hit, it just hit like the kick of a horse. And I love that it’s also just such a universal experience, that you can be the most successful person in the world, you can be the poorest person in the world, you can be somewhere in the middle, and we share this experience that being a parent, for most of us, turns out to be the most important thing in our lives.

Lex Fridman (04:17:33) But it is really nice to have that kind of experience with the right partner. But I think because I’m such an empath, the cost of having the wrong partner is high for me. But then I also realized, man … I have a friend who’s happily divorced and he still loves the shit out of his kids and it’s still beautiful. It’s a mess, but all of that love is still there and you just have to make it work. It’s just that, I don’t know, that kind of divorce would destroy me.

DHH (04:18:02) You should listen to The School of Life. He has this great bit on YouTube, you’ll marry the wrong person. If you accept upfront that you will marry the wrong person, that every potential person you can marry is going to be the wrong person on some dimension. They’re going to annoy you. They’re going to be not what you hoped in certain dimensions. The romantic ideal that everything’s just perfect all the time is not very conducive to the reality of hitching up and making babies. Because I think, as you just recounted, even when it turns to shit, I find that most of the people I personally know where things have fallen apart and have turned to shit, never in a million years would they go, “I regret it. I would rather my children did not exist because a relationship turned sour.” I mean, I think you should try very hard, and I think this is also one of those things where we didn’t fully understand those fences, and when we pulled them up and celebrated how easy it is to get divorced, for example, that that wasn’t going to have some negative consequences.

(04:19:12) I’m not saying you shouldn’t have divorces. I’m not saying return to times past. I am saying, though, that civilization over thousands of years developed certain technologies for ensuring the continuation of its own institutions and its own life that perhaps we didn’t fully appreciate. I mean, again, this is something Jordan Peterson and others are far more articulate about, and that I’ve learned a lot from just analyzing my own situation. Why is it that this incredible burden, being responsible for someone else’s life that you brought into this world, is also the most rewarding part of existence? That’s just curious. Before I heard Peterson articulate the value of taking on the greatest burden you know how to carry, I always thought about burdens as negative things. Why would I want the burden of a child? I might screw it up. I might be a bad parent. They might have bad … All this stuff, right? All the reasons why you shouldn’t. And so few voices articulating why you should.

Lex Fridman (04:20:21) Yeah, but I should also add on top of that, the thing you mentioned currently, perhaps in the West, the matchmaking process …

Lex Fridman (04:20:29) … is broken and technology made it worse. It’s fascinating, this whole thing that hasn’t been solved. So hiring great teams, that’s probably been solved the best out of matchmaking, finding great people to hire.

Lex Fridman (04:20:45) Second, finding great friends. That also hasn’t been solved.

Lex Fridman (04:20:50) It’s breaking down. And the third is matchmaking for relationships. That’s the worst. And in fact, technology made it even worse.

DHH (04:20:59) It is. It’s a great example again of how all the greatest intentions still led us straight to hell. I really enjoyed Louise Perry’s analysis of the sexual revolution not being an unqualified good, which was something I hadn’t thought about at all before she articulated it, that, of course, women should be able to have freedom and self-determination and abortions, and all of these things. And Louise Perry is not arguing against that either, of course. But there are second order effects that we don’t appreciate at the time, and we may not have ready-made solutions for, and that’s just interesting.

(04:21:40) You make life better in a million different ways and somehow we end up more miserable. Why is that? Why is it that humans find meaning in hardship? And I think some of that is that it’s a difficult question to answer through science. And again, Peterson articulates well this idea that you have to find some of it through art, some of it through authors, some of it through different … I was just about to say modes of knowing before I stopped myself because that sounds like woo bullshit. But there are different ways to acquire those deep lessons that paper is not going to tell you.

Lex Fridman (04:22:33) I mean, this is really … The point also applies to religion, for example. If you remove from society the software of religion, you better have a good replacement.

DHH (04:22:45) And we’ve had a bunch of bad replacements, especially over the last few decades. Religion is one of those things I’ve struggled with a lot because I’m not religious, but I wish I was. I can now fully appreciate the enormous value having an operating system like that brings, not just at the individual level, but rather at a societal level. And it’s not clear at all what the answer is. I think we’ve tried a lot of dead ends when it came to replacements, and people have been filling that void in a million different ways that seem worse than what all the religions, despite their faults, have in a myriad of ways been able to deliver.

Lex Fridman (04:23:28) Yeah, religion is like the COBOL code. It’s just-

DHH (04:23:33) Yes. It’s the institutions where we don’t fully understand the rules and why they’re there and what’s going to happen if we remove them. Some of them seem obvious to me to be just bullshit of their time. Oh, you shouldn’t eat, whatever, shellfish, because in that region of the world, there was something, something, something. Okay, fine. But there’s a bunch of other things that are pivotal to keeping society functioning for the long term, and we don’t fully understand which is which. What’s the bullshit and what are the load-bearing pillars of society?

Lex Fridman (04:24:04) Can you speak to the hit on productivity that kids have? Did they increase your productivity, decrease it, or is that even the wrong question to ask?

DHH (04:24:13) I think it’s one of the reasons why ambitious people are often afraid of having children, because they think, I have so much more to do and I barely have enough time now. How would I possibly be able to accomplish the things I want to accomplish if I add another human into the mix? Now, A, we’ve always worked 40 hours a week, not 80 or a hundred or 120. I think that’s very beneficial. B, kids don’t exist in this vacuum of just them alone being entered into your life. Hopefully, there’s a partner. And in my life, I’m married to a wonderful woman who decided to stop working her corporate job when we got together and has been able to carry a huge part of that responsibility.

(04:25:02) I was just about to say burden, and I think that’s exactly how it often gets presented, especially from a feminist perspective, that caring for your own children is some unpaid labor that has to be compensated for in some specific way beyond the compensation of bringing life into this world and raising wonderful humans. There’s something screwy about that analysis that I actually think the modern trad movement is a reply against. Whether they have all the answers, I’m certainly not sure of either, but there’s something that’s just not right in the analysis that children are a burden and that, if a woman chooses to stay at home with the kids, that’s some failure mode of feminist ambition. I think that’s actually a complete dead end. Now, it depends on different people, different circumstances. I can just speak to my life, being married to a wonderful woman who has decided to be home with the kids, at least at their early age, and taken on a lot of those responsibilities. Now, it doesn’t mean there aren’t plenty of ways that I have to be part of that and have to chip in, but it’s allowed me to continue to work the 40 hours a week that I’ve always worked. But it’s made the 40 hours more strict. I have a schedule where I wake up, whatever, 6:30, and we have to get out of the door a little before 8:00. I usually have to play at least one or two rounds of Fortnite with my youngest and sometimes middle child.

(04:26:48) Then take the kids to school, get in, start work at, I don’t know, 8:30, 9:00, then work until 5:00, 5:30, sometimes 6:00, but then it’s dinner and I have to be there for that, and then I have to read to the kids. And by the time that’s done, I don’t want to go back to work. So my work time really is 9:00 to 5:00, 9:00 to 6:00, depending on whatever is going on. Sometimes there’s emergencies and you have to tend to them, but it’s made it more structured and I found some benefit in that and I found some productivity in that, that I can’t goof around quite as much, that the day will end at around 5:30, 6:00. That’s just it: if I didn’t accomplish what I wanted to do today, once I get to that time, it’s done. It’s over. I have to try again tomorrow. Whereas before having a family and before having kids, I could just not do it and just make it up in the evening.

(04:27:45) So in that way, it’s made me more structured, but it hasn’t really changed my volume of work all that much. I still work about the same amount of hours. And that’s, by the way, enough. This is one of the key points we make in It Doesn’t Have to Be Crazy at Work, the latest book we wrote: there’s enough time. 40 hours a week is actually a ton if you don’t piss it away. Most people do piss it away. They piss it away in meetings, they piss it away on just stuff that doesn’t matter, when even three hours, four hours of concentrated uninterrupted time every day would move the goals they truly care about way down the field.

Lex Fridman (04:28:26) I think kids do make you more productive in that way for people who need it, especially people like me; they create that urgency.

Lex Fridman (04:28:34) If you have to be done by 5:00, it’s maybe a counterintuitive notion, but for people like me who like to work, you can really fill the day with the fluff of work. And if you have to be done by 5:00, you’re going to have to do the deep work and get it done, really focused, singular work. And then you’re just going to cut off all the pressure-

DHH (04:29:02) It just keeps you honest. It keeps you honest because you can squander one day, you can squander two days, but if I squander a whole week, I feel terrible. Now, that’s just some drive I have in me where I feel content and full of meaning if I actually do stuff that matters, if I can look back upon the week and go like, “That was a nice week.” Really, we moved forward. Maybe we didn’t get it all done, but we moved forward and everything got better. And I think kids really help just time-box things in that way. And a lot of people need that because I find just so much of the celebration of overwork to be so tiresome. Oh, I work 60 hours or 80 hours, 100 hours a week, and it’s just like, first of all, no, you don’t. No, you don’t.

(04:29:50) Those 80 hours are full of all sorts of fluff that you label work, but that I would laugh at, and that most people laugh at, that you would laugh at if you actually did the analysis of where that time is going. Most of the important stuff that has to be done is done in these uninterrupted chunks of two hours here or four hours there or five hours there. The hard part is making sure you get them in whole pieces. So don't give me that. There's time enough. And also, what's so important that it ranks above continuing your lineage? I think there's just some ancient honor in the fact that, again, this DNA that's sitting on this chair traveled 30,000 years to get here, and you're going to squander all that away just so you can send a few more emails.

Lex Fridman (04:30:41) There's something that's also hard to put into words about the kind of fun you can have just playing with your kids. I don't know, on the surface it's like, I could have that kind of fun just playing video games by myself, but no, there's something magical about it, right?

DHH (04:31:00) I have a thousand hours logged in Fortnite since '19, I think, all of it with my kids. I'd never be playing Fortnite. Well, I don't know if I never would be. I wouldn't be playing a thousand hours of Fortnite if it wasn't for my kids. The enjoyment for me is to do something with them that I also happen to enjoy. I really love Fortnite. It's a phenomenal game. I don't have to force myself to play that with them. I often ask like, "Hey, do you want to play Fortnite?" But still, it's an activity that I get to share with them. It's a passion that I get to share with them. I've started doing go-karting with my oldest. I've been driving race cars for a long time, and now they're getting into go-karting, and just being at the go-kart track, seeing them go around, seeing them get faster, seeing them learn that skill, you just go, look at what else would I be doing with my life. At my age, 45, I'm standing here truly enjoying the life I brought into this world. What else was so important at this stage that I would otherwise be spending my time on?

Racing

Lex Fridman (04:32:04) All right. Like you mentioned, you like to race cars and you do it at a world-class competitive level, which is incredible. So how’d you get into it? What attracts you to racing? What do you love about it?

DHH (04:32:17) The funny thing about getting into racing is I did not get my driver’s license until I was 25. I grew up in Copenhagen, Denmark where the tax on cars is basically over 200%. So you pay for three cars and you get one, and I didn’t even have the money for one car, let alone three. So I could not afford a car growing up. We did not have a car growing up, but Copenhagen is a nice city to be able to get around on a bike or with a bus or as I did for a long period of time, on rollerblades.

(04:32:53) But when I was 25, I realized I wanted to spend more time in the U.S. I wasn’t sure yet that I was going to move there. That turned out later to be true, but I knew that if I wanted to spend time in the U.S., I needed to have a driver’s license. I was not going to get around very well if I didn’t know how to drive a car.

(04:33:10) So I got a driver's license at 25. Then ended up moving to the U.S. later that year, and I'd always been into video games, racing video games. Metropolis Street Racer on the Dreamcast was one of those games that really sucked me in … It was the precursor to Project Gotham, which was the precursor to essentially, Forza Horizon, I think.

DHH (04:33:37) I think that’s how the lineage goes. It’s just a great game. I actually just fired it up on an emulator a few weeks ago and it still sort of, kind of holds up because it has enough real car dynamics that it smells a little bit like driving a real car. It’s not just like an arcade racer like Sega Rally or something like that, but I’d always been into that.

(04:33:57) Then I got my driver's license at 25 and moved to the U.S., and then two years later a friend that I'd met in Chicago took me to the Autobahn Country Club, which is this great track about 45 minutes from Chicago. And I sat in a race car and I drove a race car for the first time, and I had the same kind of pseudo-religious experience as I did when I started working on Ruby, where I did maybe 20 laps in this, basically, a Mazda race car from, I think, the '90s or something, a pretty cheap race car, but a real race car. Single-seater, manual gearbox, exposed slick wheels, all the stuff.

(04:34:42) And after having had that experience, first of all, it was just the most amazing thing ever. The physical sensation of driving a race car is really unique. And I think if you've driven a car fast, you have maybe a 2% taste of it. The exposure to the elements that you get in a single-seat race car, especially one like that where your head is actually out in the elements, you can see the individual wheels, and the sensation of speed is just so much higher. It's at a completely different level.

Lex Fridman (04:35:13) So can you actually speak to that? So even in that Mazda, so you can feel … What, can you feel the track reverberating? You feel the grip?

DHH (04:35:22) Oh, yeah. Not only can you see the bumps because you're literally looking straight at the wheels, you can feel all the bumps because you're running a slick tire and it's a really stiff setup. It's nothing like taking a fast street car out on a racetrack and trying to drive it around a little bit.

Lex Fridman (04:35:37) So can you feel the slipping, the traction?

DHH (04:35:38) Yeah, you feel the slipping. That's a huge part of the satisfaction of driving a race car, driving at the edge of adhesion, as we call it, where the car's actually sliding a little bit. A couple of percent of slip angle is the fastest way to drive a race car. You don't want to slide it too much. That looks great, lots of smoke, but it's not fast.

(04:35:58) How you want to drive it is just at the limit of adhesion, where you're rotating the car as much as your tires can manage and then slightly more than that. And playing with it, keeping it just at that level, because when you're at the limit of adhesion, you're essentially just a tiny movement away from spinning out. I mean, it doesn't take much. Then the car starts rotating. Once it starts rotating, you lose grip and you're going for the wall.

(04:36:28) That balance of danger and skill is what’s so intoxicating, and it’s so much better than racing video games too because the criticality is taken up two notches. I often think about people who really like gambling, where I think, “Aren’t you just playing poker? No, the point is not poker. Poker is maybe part of it, but the point is that I could lose my house.” Right? That’s the addiction that some people get to gambling, that there’s something real on the line.

(04:36:58) When you’re in a race car, there’s something very real on the line. If you get it wrong, at the very least you’re going to spin out and probably hit a wall and it’s going to be expensive. At the very worst, you’re not getting out alive. And even if modern race cars have gotten way safer than they used to be, there is that element of danger that’s real, that there are people who still get seriously hurt or even killed in a race car.

(04:37:25) It’s mercifully rare compared to what it used to be when those maniacs in the ’60s would do Formula 1 and whatever, 13% of the grid wouldn’t make it to the end of the year because they’d just die in a fiery flaming fireball, but there’s still some of it there.

(04:37:42) And I think that sense that there's something on the line really contributes to it, but it's more than that. It's not just a physical sensation. There's an activation of all your forces. There's the flow, and I think that really cements why I got addicted, because I love that flow I got out of programming, but getting flow out of programming is a very inconsistent process.

(04:38:06) I can’t just sit down in front of a keyboard and go like, “All right, let’s get the flow going.” It doesn’t happen like that. The problem has to be just right. It has to meet my skills in just the right moment. It’s a bit of a lottery.

(04:38:19) In a race car, it's not a lottery at all. You sit down in that car, you turn the ignition, you go out on track, and I get flow virtually guaranteed because you need, or I need, at least 100% of my brain processing power to be able to go at the speed I go without crashing. So there's no time to think about dinner tonight or the meeting next week or the product launch. It's completely zen, in the literal sense of the word, actually.

(04:38:49) I think of someone who's really good at meditation, that's probably the kind of state they get into, where it's just clear, you're in the now, there's nothing but you and the next corner. That's a really addictive experience.

(04:39:02) So after I’ve had that, I couldn’t get enough. I kept going to the track every opportunity I got. Every single weekend for about four years, I would go to the track. And by the end of that time, I’d finally worked up enough skill and enough success with the company that I could afford to go “real racing.”

(04:39:20) So I started doing that. I started driving these Porsches, and then as soon as I got into that, as soon as I got into “real competition,” I was like, “I wonder how far you can take this?” And it didn’t take that long before I decided, “You know what? I can take this all the way.”

(04:39:34) My great hero in racing is Tom Kristensen, fellow Dane. Mr. Le Mans, as they call him. The 24 Hours of Le Mans, the greatest endurance race in the world, has been won more times by Tom Kristensen than by anyone else. He won the race nine times. So Tom just really turned me on to Le Mans. I'd been watching Le Mans since, I think, the '80s. I have my earliest memories of watching that on TV. The race has been going since, I think, the '20s, but in the '80s I got kind of into it.

(04:40:07) And then in the late '90s, early 2000s when Tom started winning, I, like pretty much every other Dane, started watching the race almost religiously. So I thought, "You know what? I want to get to Le Mans."

(04:40:18) This is the magic thing about racing, that if I get into basketball, I can’t set a realistic expectation that I’m going to play in the NBA, that I’m going to go to the finals, or I get into tennis and I’m going to play at Wimbledon. That just doesn’t happen. But racing is special in this way because it requires a fair amount of money to keep these cars running. It’s really expensive. It’s like having a small startup. You need to fly a bunch of people around the world and buy expensive equipment and so forth. So you need a bunch of capital, and I had some through the success of the company so I could do it, which meant that I could get to Le Mans.

(04:40:50) So I set that as my goal. "I want to get to Le Mans," and I started racing in real competition in 2009, and three years later, in 2012, I was on the grid at Le Mans for the first time.

Lex Fridman (04:41:02) We should say, so Le Mans, 24-hour race, endurance. I mean, this is insane.

DHH (04:41:10) There are three drivers, mind you. So it’s not like one guy just drives for 24 hours straight, but still it’s a pretty tough race, both physically and mentally, especially mentally. When you’ve been up for 24 plus hours, you’re not quite as sharp as when you first wake up.

(04:41:28) And this is funny about Le Mans too, it starts at around 4:00 in the afternoon, so you’ve already been up for half a day by the time the race starts and then there’s 24 hours to go before you’re done, and you’ll be in the car for anywhere from usually an hour and a half to a maximum of four hours. The regulations say four out of six is the max you can do.

(04:41:46) I've spent perhaps two and a half hours in a single stint at Le Mans. It's pretty taxing. You're going 200 miles an hour into some of these turns and there's another 60 cars on track. Whenever I'm in my normal category, which is the LMP2 category, I have GT cars, which are more like a Ferrari and a Porsche, that I have to overtake, and then I have these hypercars, which are the top class, that are overtaking me.

(04:42:14) So you got a lot going on and you got to stay sharp for two and a half hours straight to do that. That is just a guaranteed way to get incredible flow for long, long stretches of time. That’s why you get addicted to it. That was why I got addicted.

Lex Fridman (04:42:27) You got to talk me through this video, this video of you in these LMP2s.

Lex Fridman (04:42:31) This is such a cool … This is so cool.

DHH (04:42:34) Yeah, this was probably my favorite battle of my career.

Speaker 1 (04:42:41) And Heinemeier Hansson has beat past to add five-

DHH (04:42:42) Yeah, so this is me driving against Nico Müller at the Shanghai International Circuit.

Lex Fridman (04:42:47) You’re on the outside here?

DHH (04:42:48) I'm on the outside in the blue and white, and we go around the whole track with basically a piece of paper between us. See, down this back straight, I get so close to him because I want to force him over on the other side of the track so that he can't just box me in, and we've been fighting already at this point for basically 40 minutes straight.

(04:43:06) I've been managing to keep this professional driver behind me for 40 minutes, and he finally passes me, but we just keep the battle on the whole time. And it really just shows what both these kinds of cars, the Le Mans Prototypes, can do. We don't actually ever touch. We get within about an inch and keep going around the Shanghai Circuit to-

Lex Fridman (04:43:26) How did you get so good? I mean, that’s a fascinating story, right, that you are able to get so good?

DHH (04:43:34) I'm pretty good for the kind of driver I am, which is called a gentleman driver, which means I'm not a professional driver. And like many good gentleman drivers, when we're at our really best, we can be quite competitive with even professional drivers who have been doing this their whole life.

(04:43:50) The difference between us and the professionals is the professionals can do it every time, or more or less every time. So I can’t be this good all the time. When everything is just right, I can be competitive with professional drivers, but that’s not how you win championships. That’s not how you get paid by factories to drive. You got to be good every time you go out.

(04:44:07) So that's a huge difference. But some of it was also just, I really put my mind to it. Once I realized race cars were what I wanted to do as my serious hobby, I put in thousands of hours.

Lex Fridman (04:44:21) Have you crashed? What’s the worst crash?

DHH (04:44:23) I’ve had a lot of crashes, but thankfully, knock on wood, I haven’t had any crashes where I’ve gotten really seriously hurt.

Lex Fridman (04:44:30) Have you wrecked the car?

DHH (04:44:31) Oh, yes. Oh, yes. I've wrecked many a car.

Lex Fridman (04:44:34) So what does that feel like, when you wreck a car? How do you get-

DHH (04:44:37) It feels like total shit if you're in a real race and other people depend on you. It's not even so much the car, although it's also sometimes that these cars are expensive to repair, and that sucks, and it feels so wasteful in a way when you crash some of these cars, but it's the sense that you're letting a team down.

(04:44:55) Endurance racing is a team sport. Not only do you have your mechanics, you usually have co-drivers. So when I crash, I just feel like, "Damn it, I could have avoided this."

Lex Fridman (04:45:05) Yeah, but also you could have died.

DHH (04:45:08) Do you know what’s funny? I never think about that. I don’t think you can because I think the moment you start thinking about being able to die, you can’t do it. You can’t go fast.

Lex Fridman (04:45:18) Well, I’m sure, not to go all Carl Jung and Freud here, but I’m sure that’s always present in the back of your mind somewhere. You’re not just bringing it to the surface.

DHH (04:45:31) It is in the sense that it’s part of the appeal. It’s part of the sense that there’s something on the line, that this isn’t just virtual. I can’t just hit reset, restart, reboot. If I crash this car, we’re going to be out, or we’re going to be disadvantaged, or it’s going to get destroyed, or I might get hurt.

(04:45:49) I've gotten lightly hurt a few times. The year we won the 24 Hours of Le Mans in our class, I'd been training in this Formula 3.5 car. It's a really fast car, it's a really nice exercise to do, but it also doesn't have power steering. Some of these race cars, especially the single-seaters, don't have power steering, which means that the steering wheel is basically directly connected to the front wheels.

(04:46:19) So if you crash one of those cars and the front wheels suddenly turn, you're really going to hurt your hands if you don't get your hands off the wheel. I hadn't raced enough of those cars to have developed the instinct that I had to get my hands off the wheel, so I didn't, and I really hurt my hand.

(04:46:36) This was, I think, just a month before the 24 Hours of Le Mans. So I thought, "Oh man, I'm going to have to miss it this year." I didn't have a cast. It was just seriously sprained. And then somehow, miraculously, a week before the event, I was like, "Oh yeah, actually it's okay now." So I got to do it.

(04:46:51) And it would've been a grave regret if I'd had to sit on the sidelines and watch my team go on to win the race. But I really have been quite fortunate in the sense that most of my crashes have just been expensive or a sporting inconvenience. They've never been something where I got seriously hurt, but I've seen plenty of people who have.

(04:47:13) In fact, my co-driver this year, and for several years, Pietro Fittipaldi drove a race car at Spa. Spa is one of the great racetracks of all time and it has this iconic corner called Eau Rouge, which is probably the most famous corner in all of Motorsports that has a great compression before you climb uphill.

(04:47:34) It's an extremely fast, very difficult corner. And just as he does the compression, his car basically steps out, he loses his power steering, and he drives straight into the wall and breaks both his legs, and basically faced the prospect that maybe his career was over. I've had other teammates and people I know suffer serious injuries that really hurt them.

(04:47:57) And yet, what's funny, as you say, you'd think that would sink in. The year before we won in 2014, that same car had a Danish driver in it at Le Mans, at the race I was driving in, who died. He lost control of the car when there was a bit of rain on the track, and the track was unfortunately designed in such a poor way that there was a very big tree right behind the railing. And he hit that tree at full speed, pulled 90 Gs, and was dead on the spot, which was just such an extremely awful experience to go through.

(04:48:42) I finished second that year, which should have been cause for a bunch of celebration, but it was just tainted by the fact that not only did a driver die, a fellow Dane died, a guy I knew died. That was pretty tough.

Lex Fridman (04:49:01) So throw that into the pile of things that have to be considered: the weather conditions, like you mentioned, of the track, whether it's dry or wet.

DHH (04:49:12) It’s a huge part of it. Even just last year at Le Mans, it was raining and I was out and I hadn’t made a serious mistake at 24 Hours of Le Mans since I did the first race in 2012, where I put it in the sand trap with four hours to go. And we lost a couple of laps getting pulled out, but it didn’t actually change anything for our result because that was just how the field was spread out.

(04:49:41) I'd made minor mistakes over the years, but nothing that really set us back. And at the race last year when it was raining, I first clobbered a Ford Mustang when I made an overambitious pass on a damp part of the track and couldn't stop in time, and then felt absolutely awful as I sat in the gravel pit for two laps and knew that our race was over, a race where we were highly competitive.

(04:50:07) You're not blessed with a competitive car, a competitive team, and a competitive setup every year. I know how rare that is. So to know that we had had a chance that year and I sort of squandered it felt really bad. But that got compounded when I got back on track, barely made it another stint, and then put it into the gravel trap again when it started raining on the entrance into the Porsche Curves.

(04:50:29) So this is part of why racing is so addicting too because the highs are very, very high. When you win a race like the 24 Hours of Le Mans, it feels just incredible. There’s so much emotion, but if you fuck it up, the lows are very, very low.

Lex Fridman (04:50:44) What are the things you're paying attention to when you're driving? What are the parameters? What are you loading in? Are you feeling the grip? Are you basically increasing the speed and watching, as a constant feedback system, the effect it has on the grip, and you're trying to manage that and trying to find that optimal slip angle?

(04:51:09) Are you looking around using your eyes? Are you smelling things? Are you listening, just feeling the wind or are you looking at the field, too? How’d you not hit that guy at all? You get close within inches, right? So you have to pay attention to that, too.

DHH (04:51:26) It’s really interesting about that specific battle where we’re literally a few inches apart. I can’t fully explain it, but humans can develop an incredible sense of space where I can’t see the edge of the back of my car, but I can know exactly where it is. I can have a mental model in my head that gives me the exact dimensions of this car such that I can run within a few inches of a competitor car or within a few inches of the wall and not hit either when things go well.

(04:51:57) The car is about two meters wide and it’s quite long, five meters and you can’t see everything. The mirrors are actually kind of shit. There’s no rear-view mirror in these cars. You can’t see out the back. You can only see through your two side mirrors, but you form this intuitive mental model when you get good enough at this.

(04:52:14) But what I actually pay attention to most is I run a program. What I try to do when I go to a racetrack is I try to load up the best program I know how for every single corner. What’s my brake point? What’s my acceleration point? What’s my brake trailing curve? And I try to pick up that program in part just by finding it myself and how fast I can go. But even more so than that by copying my professional competitors, or not competitors, co-drivers.

(04:52:45) So I usually always race with a pro, and modern race cars produce an absolutely enormous amount of data, and you can analyze all that data after each outing. You can see an exact trace of how much you pushed the brake pedal, how much you did in terms of steering inputs, when you got on the gas. Every millisecond you're losing is evident in those charts.

(04:53:09) So what I try to do is I try to look at the chart and then I try to load that in, and that's what I've got to do: "Oh, in corner 17, I have to be 10 bar lighter on the brake," so I try to load that program in and then I try to repeat it.

(04:53:23) Now, then there are all the things that change. Your tires change quite a lot. These tires are made to only last 40 minutes in many cases. Sometimes at Le Mans we can go longer, but at some racetracks they'll last as little as 40 minutes before they really fall off. So you got to manage that, that the grip is constantly changing, so your program has to suddenly fit those changing circumstances.

(04:53:45) And then in endurance racing, you’re constantly interacting with other cars because you’re passing slower classes or you’re getting passed by a faster class. So that’s part of the equation. And then you’re trying to dance the car around the limit of adhesion.

(04:53:59) So you got all those factors playing at the same time. But above all else for me is to try to become a robot. How can I repeat this set of steps exactly as I’m supposed to for two and a half hours straight without making 100 milliseconds worth of mistakes?

Lex Fridman (04:54:17) Yeah. Low latency algorithm.

DHH (04:54:20) That’s really a huge part of it actually. Your latency is enormously important in terms of being able to catch when the car starts slipping. You get this sensation in your body that the G-forces are a little off, the slip angle is a little off and then you have to counter steer.

(04:54:38) And obviously, the best race car drivers just feel it, like an intuition. I have some intuition. I don't have all of it, so I do occasionally spin my car, but that's the challenge.

Lex Fridman (04:54:48) From everything you’ve studied and understand, what does it take to achieve mastery in racing? What does it take to become the best race car driver in the world?

DHH (04:54:58) Obsession is part of it. When I read and hear about Senna and the other greats, they were just singularly focused. Max Verstappen is the current champion of the world and he is the same kind. Max has been fascinating to watch. I mean, he’s a phenomenal race car driver, but he also literally does nothing else. When he’s not at the racetrack, he’s driving sim racing. He’s literally in video games doing more racing when he’s not doing all the racing he’s already doing.

Lex Fridman (04:55:30) Is there a specific skill they have that stands out to you as supernatural through all of that obsession? Is it a bunch of factors or are they actually able to, like you said, develop a sense? Is it, they’re able to get to the very edge of the slip?

DHH (04:55:45) They're able to develop very fine-tuned sensibilities for when the car is sliding. They can feel just these tiny movements in the chassis that transmit up, usually through their ass. That's why you call it a butt meter. It goes up and you feel like the car is loose, or you feel like you're just about to lock up. You can really hone that tuning.

(04:56:10) Then the other thing is you have to have really good reaction time. And when you look at great Formula 1 drivers, they can generally have a reaction time of just under 200 milliseconds, which is awesome, and even 10 milliseconds’ difference makes a huge difference.

(04:56:26) You'll see it with the Formula 1 grid, for example, when they do a standing start and you see the five red lights come on. And when the last light goes out, they're supposed to release the clutch and get going, and they can time this. So you can see exactly who has the better reaction time.

(04:56:40) And even being off by 20 milliseconds can make the difference of whether you’re in front or behind at the first corner.

Lex Fridman (04:56:48) How much of winning is also just the strategy of jostling for position?

DHH (04:56:53) There’s some of that, and some of it is also just nerve. Who wants it more? That’s exactly when that sense of danger comes in. There’s a great quote from Fernando Alonso when he was driving at Suzuka against Schumacher, I think.

(04:57:09) They're coming up to this incredibly fast corner. It's very dangerous, and Alonso basically recounts, "I was going to make the pass because I knew he had a wife and kids at home."

Lex Fridman (04:57:22) That’s so gangster.

DHH (04:57:23) Just absolutely ruthless, right?

DHH (04:57:26) That, "I knew he valued life more than I did." So there's a bit of poker sometimes in that, who's going to yield? There's a bit of a chicken race in that regard, and sometimes it doesn't work. No one yields and you both crash, but very often one person will blink first.

Lex Fridman (04:57:41) Can the pass be both on the inside and the outside or is it-

DHH (04:57:44) You can pass wherever you want as long as you have just a slight part of the car on the racetrack.

Lex Fridman (04:57:50) And then you just improvise and take risks. What a sport. And then Senna, of course is a legendary risk-taker.

DHH (04:58:00) Yes. And even before him. By the time … I mean, he died in the '90s, but by the time we got to the '90s, racing was already a lot safer than it was when Niki Lauda raced in the '60s. That level of danger is no longer there. There's still just a remnant of it and it is still dangerous, but nothing like that.

(04:58:21) And it’s a little hard to compare through the ages who’s the greatest driver of all time. I think there’s a fair argument that Senna is, but we don’t have the data. We don’t know who he was up against. How would he fare if we pitted him against Max Verstappen today?

(04:58:35) I do think sometimes that you can have a bit of a nostalgia for the all-time greats, but the world moves forward and new records are being set all the time and the professionalism keeps improving, sometimes to the detriment of the sport, I think.

(04:58:48) There's a lot of professional drivers who are not only just very good at driving, but are very good at being corporate spokespeople, and it used to be quite different. There used to be more characters in racing, with a bit more personality, who were allowed to shine because there weren't a billion sponsorships on the line that they were afraid to lose.

Cars

Lex Fridman (04:59:06) Ridiculous question, what’s the greatest car ever made, or maybe what’s the funnest one to drive?

DHH (04:59:11) The greatest car for me of all time is the Pagani Zonda.

Lex Fridman (04:59:15) Okay, I’m looking this up, Pagani Zonda.

DHH (04:59:18) So the Pagani Zonda was made by this wonderful Argentinian called Horacio Pagani.

Lex Fridman (04:59:25) My God, that’s a beautiful car. Wow.

DHH (04:59:26) It's a gorgeous car. You can look up mine. It's the Pagani Zonda HH. Yep. So, that's a car I had made in 2010 after we visited the factory in Modena, and by sheer accident ended up with this car, but it had become my favorite car in the world basically when I watched an episode of Top Gear, I think in 2005, where one of the presenters was driving the Pagani Zonda F around, and I just thought, "That's the most beautiful car in the world. It is the most incredible-sounding car in the world. If I one day have the option, this is what I want."

(05:00:14) And then I had the option in 2010. I’ve had the car ever since. I’m never ever going to sell it. It’s truly a masterpiece that’s stood the test of time. There’s some great cars from history that are recognized as being great in their time. This car is still great.

Lex Fridman (05:00:30) Have you taken it on the racetrack?

DHH (05:00:32) I have. It's terrible at that. But I don't want to say it's terrible at that. That's not what it's designed for. It's designed for the road and that's why it's great. There are a lot of fast cars that are straddling being a race car for the road. You don't actually want a race car for the road. A race car for the road is a pain in the ass. It's way too stiff. It's way too loud. It's way too uncomfortable. You can't actually take it on a road trip.

Lex Fridman (05:00:55) So this actually feels good driving on normal roads?

Lex Fridman (05:00:59) And you, of course, always go the speed limit?

DHH (05:01:00) Always. This is why I love having this car in Spain because they’re a little more relaxed. Not entirely relaxed, but more relaxed than they are in a lot of places. In Denmark, I kid you not, if you are on the highway and you go more than twice the speed limit, they confiscate your car and keep it. You’re not getting it back. They don’t even care if it’s your car or not. If you were borrowing my car and you went twice the speed limit, it’s gone.

(05:01:26) So they don't do that in Spain. I mean, in most places, except for the German Autobahn, they get pissy if you go twice the speed limit, for all sorts of fair reasons. I'm not advocating that you should be going much more than that, but there are certain special roads where you can open things up and no one's in harm's way, and that's an incredible sensation. And I do think that some of those speed limits actually are kind of silly, and I'm not just saying that in a vacuum.

(05:01:50) In Germany, they have the glorious Autobahn, and on the Autobahn there is no speed limit in a bunch of segments. And they’re so committed to their speed-limitless Autobahn, which is by the way, very weird of Germans. They usually love rules. They’re usually very precise about it, and then they have this glorious thing called the Autobahn.

(05:02:09) There was a great case a couple of years ago where a guy took out a Bugatti Chiron, went 400 kilometers an hour on the Autobahn, and he filmed it and put it on YouTube and a case was brought against him because even though they don’t have a speed limit, they do have rules that you can’t drive recklessly, and he won the case. He wasn’t driving recklessly. He was just going very, very fast.

(05:02:32) I’ve done the Autobahn a couple of times. My wife and I went on a road trip in Europe in 2009, and I got the Lamborghini Gallardo we were driving up to 200 miles an hour. And I’d driven 200 miles an hour or close to it on a racetrack before. That feels like one thing. Driving on a public road 200 miles an hour feels really, really fast.

DHH (05:02:54) Actually a little scary, yes, because you constantly think, on a racetrack you know the road, you know the surface. You can walk the track most of the time. You can know if there's a dip. On a public road you can't know if there's suddenly a pothole. Presumably there's not going to be a pothole on the German Autobahn, but it does feel a little scary, but also exhilarating.

(05:03:13) Speed is just intrinsically, really fun. I don’t know anyone I’ve taken out in a fast car … Well, actually I do know a few people. Most people I take out in a fast car, they grin. It’s a human reaction to grin when you go really fast.

Lex Fridman (05:03:28) Do you know what’s the fastest you’ve ever gone?

DHH (05:03:31) It was probably at Le Mans, I think when the LMP2s were at their maximum power and had 600 horsepower and really sticky tires, we were going 340 kilometers an hour, which is just over 200 miles an hour, a bit over 200 miles an hour. That does feel fast.

(05:03:47) And it’s really interesting with speed, is that the difference between going, let’s say 150 and 160 doesn’t feel that much actually, those 10 miles an hour. But the difference between going 190 and 200 feels crazy faster, which as a percentage change is actually less than going from 150 to 160, but there’s some sense of exponentiality once you get up to those limits, where it’s just on a completely different level.

Lex Fridman (05:04:16) Yeah, because to me, 110, 120 feels fast. 200, that’s crazy.

Programming setup

Lex Fridman (05:04:26) I got to ask you about the details of your programming setup, the IDE, all that kind of stuff. Let’s paint the picture of the perfect programming setup. Do you have a programming setup that you enjoy? Are you very flexible? How many monitors? What kind of keyboard? What kind of chair? What kind of desk?

DHH (05:04:51) It's funny because if you'd asked me, let's see, a year and a half ago, I would've given you the same answer as I would've given anyone for basically 20 years. I want a Mac. I like the Magic Keyboard. I like the single monitor. Apple makes an awesome 6K 32-inch XDR screen that I still haven't found anyone who's beaten, and that I still use. Even though I switched away from Apple computers, I still use their monitor because it's just fantastic. But I've always been a single-screen kind of guy.

(05:05:25) I do like a big screen, but I don't want multiple screens. I've never found that that really works with my perception. I want to be able to just focus on a single thing. I don't want it all over the place, and I've always used multiple virtual desktops, being able to switch back and forth between those things.

(05:05:41) But the setup I have today is Linux, which I switched to a little over a year ago after I finally got fed up with Apple enough that I couldn't do that anymore. And then I use this low-profile mechanical keyboard called the Lofree Flow84, which is just a …

DHH (05:06:01) … Flow84, which is just the most glorious-sounding keyboard I’ve ever heard. I know there are a lot of connoisseurs of mechanical keyboards that’ll probably contest me on this. This is too thocky or too clicky or too clacky or whatever. But for me, the Lofree Flow84 is just a delight that I did not even know existed, which is so funny because I’ve been programming for a long time. Mechanical keyboards have been a thing for a long time.

(05:06:31) And the keyboard, when you look at it like this, it looks plain. It doesn’t look extravagant. But the tactile sensation you get out of pushing those keys, the thocky sound that you hear when the keys hit the board, it’s just sublime. And I’m kicking myself that I was in this Mac bubble for so long that I wasn’t even in the market to find this.

(05:06:57) I knew mechanical keyboards existed, but to be blunt, I thought it was a bit of a nerd thing that only real nerds that were much more nerdy than me would ever care about. And then I got out of the Apple bubble and suddenly, I had to find everything again. I had to find a new mouse, I had to find a new keyboard, I had to find everything. And I thought, “All right. Let me give mechanical keyboards a try.” And I gave quite a few of them a try.

(05:07:19) The Keychron is one of the big brands in that. I didn’t like that at all. I tried a bunch of other keyboards. And then I finally found this keyboard and I just went like… Angels are singing. Where have you been my whole life? We spend, as programmers, so much of our time interacting with those keys. It really kind of matters.

(05:07:36) In a way, I didn’t fully appreciate it. I used to defend the Apple Magic Keyboard like, “Hey, it’s great. It’s actually a great keyboard.” And I think for what it is, this ultra-low profile, ultra-low travel, it’s actually a really nice keyboard. But once you’ve tried a longer-travel mechanical keyboard, there’s no going back.

Lex Fridman (05:07:54) You do have to remember, in many ways, both on the software side and the hardware side, that you do spend a lot of hours-

Lex Fridman (05:08:01) … behind the computer. It’s worth-

Lex Fridman (05:08:04) And also worth exploring until you find the thing where the angels start singing, whatever.

DHH (05:08:09) That’s exactly right. And I actually do regret that a little bit, especially with this damn keyboard. I could have been listening to these beautiful thocky keys for years and years. But sometimes you have to get really pissed off before you open your eyes and see that something else exists.

(05:08:26) I feel the same way about Linux. I've been using Linux on the server since the late '90s probably. We ran servers on Linux back then. I never seriously considered it as a desktop option. I never ran Linux directly myself before. I always thought, "Do you know what? I want to focus on programming. I don't have time for all these configuration files and all this setup bullshit and whatnot. And Apple is close enough. It's built on Unix underpinnings. Why do I need to bother with Linux?"

(05:08:56) And again, it was one of those things. I needed to try new things and try something else to realize that there are things other than Apple. And again, it's not because I hate Apple. I think they still make good computers. I think a lot of the software is still also pretty okay. But I have come to realize that as a web developer, Linux is just better.

DHH (05:09:20) Linux is just better. It's closer to what I deploy on. The tooling is actually phenomenal. And if you spend a bit of time setting it up, you can create a reproducible environment, which I've now done with this Omakub project, such that I can set up a new Linux machine in less than 30 minutes and it's perfect.

(05:09:41) It’s not pretty good. It’s not like I still need to spend two hours on it. It’s perfect. Because you can encode all aspects of the development environment into this. And I didn’t know. I didn’t even know, to be fair, that Linux could look as good as it can.

(05:09:56) If you look at a stock Ubuntu or Fedora boot, I mean, not that it’s ugly, but I’d pick the Mac any day of the week. You look at Omakub, I mean, I’m biased here, of course, because I built it with my own sensibilities, but I look at that and go like, “This is better. This is beautiful.”

(05:10:13) And then you look at some of those true Linux ricing setups where people go nuts with everything. And you go, "Oh, yeah, I remember when computers used to be fun in this way," when there was this individuality in the setup, and it wasn't just all bland sameness. And I think that's the flip side sometimes of something like Apple, where they have really strong opinions and they have really good opinions and they have very good taste, and it looks very nice, and it also looks totally the same.

(05:10:40) And Linux has far more variety and far more texture and flavor, sometimes also annoyances and bugs and whatever. But I run Linux now. It’s Ubuntu-based with the Omakub stuff on top, the Lofree keyboard. I use a Logitech. What’s it called? The MX 3 mouse, which I love how it feels in my hand. I don’t love how it looks.

(05:11:03) I actually was a Magic Mouse stan for the longest time. I thought it was genius that Apple integrated the trackpad into a mouse, and I used that. And I always thought it was ridiculous that people would slag it just because you had to charge it by flipping it over because the battery would last for three months and then you’d charge it for half an hour.

(05:11:23) I thought that was a perfect compatibility with my sensibilities. I don’t mind giving up a little inconvenience if something is beautiful, and that Magic Mouse is beautiful. But it wasn’t going to work on Linux, so I found something else. The MX 3 is nice, but I sometimes do wish the Magic Mouse… That’s pretty good.

Lex Fridman (05:11:40) Yeah. Linux is really great for customizing everything, for tiling, for macros, for all of that. I also do the same in Windows with AutoHotKey, where you just customize the whole thing to your preferences.

DHH (05:11:52) If you’re a developer, you should learn how to control your environment with the keyboard. It’s faster, it’s more fluid. I think one of those silly things I’ve come to truly appreciate about my Omakub setup is that I can, in whatever time it takes to refresh the screen, probably five milliseconds, switch from one virtual desktop to another.

(05:12:14) Even on Windows, you can’t get it that smooth. You can get close. You can’t get it that smooth. On macOS, for whatever reason, Apple insists on having this infuriating animation when you switch between virtual desktops, which makes it just that you don’t want to. You don’t want to run full-screen apps because it’s too cumbersome to switch between the virtual desktops. The kind of immediacy that you can get from a wonderful Linux setup in that regard is just next-level.

Lex Fridman (05:12:43) Yeah. And it seems like a subtle thing, but a difference of milliseconds in latency between switching the virtual desktops, for example, I don't know, it changes-

DHH (05:12:53) It changes how you use the computer. It really does.

Lex Fridman (05:12:55) Similar thing with VR, right? If there’s some kind of latency, it just completely takes you out of it. Yeah.

DHH (05:13:01) And it’s funny. I actually had to watch… I think it was ThePrimeagen on YouTube when he was showing off his setup, and I was seeing how quickly he was switching between those virtual desktops. And I’d always been using virtual desktops, but I didn’t like switching too much because just of that latency. And it’s like, “Oh, you can do that on Linux? Oh, that’s pretty cool.”

DHH (05:13:21) So I run that. And then my editor of choice now is Neovim.

Lex Fridman (05:13:24) Oh, good. All right. Well, we're out of time. No. All right. You did, for many, many years, use, what is it? TextMate.

DHH (05:13:34) TextMate. That was the main blocker of moving away from Apple. Everything else, I thought, “Do you know what? I can swing it.” But TextMate was and is a wonderful editor, one I helped birth into this world. The programmer, Allan Odgaard, is a good friend of mine, all the way back from the party days when we were lugging our computers around.

DHH (05:13:55) And he was a big Mac guy. And in 2005, he was writing this editor, and I helped him with the project management of keeping him on track, keeping him focused, and getting something released because I really wanted it for myself. And I thought this was the last editor. I thought I was never going to switch.

Lex Fridman (05:14:14) Forgive me for not knowing, but how featureful is this editor?

DHH (05:14:20) It’s quite featureful, but it’s a GUI-driven editor in some regards. It was really early on with ways of recording macros and having sophisticated syntax highlighting, and it did a bunch of firsts. And it was just a really pleasant editing experience.

(05:14:40) I think these days, a lot of people would just use VS Code. VS Code exists in the same universe as TextMate in some ways. And actually, I think it's compatible with the original TextMate bundles, the original TextMate format. So it really blazed a trail there, but it also just didn't evolve.

(05:14:58) Now, a lot of people saw a huge problem with that. They were like, "Oh, it needs to have more features. It needs to have all these things." I was like, I'm happy with this text editor that hasn't changed at all, basically, since Allan stopped working on it a decade or more ago. I don't need anything else. Because as our original discussion went, I don't want an IDE. I don't want the editor to write code for me. I want a text editor. I want to interact with characters directly.

(05:15:25) And Neovim allows me to do that in some ways that are even better than TextMate, and I love TextMate. But Vi, as you know, once you learn the commands, and it sounds… I sometimes feel like Vi fans overplay how difficult it is to learn because it makes them perhaps seem kind of more awesome that they were able to do it. It’s not that difficult. And it doesn’t take that long, in my opinion, to learn just enough combo moves to get that high of, “Holy shit. I could not do this in any other editor.”

Lex Fridman (05:15:56) How long did it take you? And by the way, I don't know. I haven't yet… Well, I know intellectually, but just like with kids, I haven't gone all the way in. I haven't used Vim.

DHH (05:16:08) You have a treat in store. Well, when I switched here about a year ago, I had three days of cursing, where I thought it was absolutely terrible and it was never going to happen, and I had three days of annoyance. And already, the next week, I was like, "This is sweet. I'm not going anywhere."

DHH (05:16:26) But I also had a bit of a headstart. About 20 years ago in the early 2000s, I tried Vim for a summer and it didn’t stick. I didn’t, for whatever reason, love it at the time. But Neovim is really good.

(05:16:40) The key to Neovim is to realize that you don’t have to build the whole damn editor yourself. So a lot of Neovim stans are like, “Here’s how to write the config from scratch.” Over 17 episodes, that’s going to take you three weeks. I don’t care that much.

(05:16:54) I love a great editor, I love to tailor it a little bit, but not that much. So you have to pair Neovim with this thing called LazyVim. LazyVim.org is a distribution for Neovim that takes all the drudgery out of getting an amazing editor experience right out of the box.

Lex Fridman (05:17:14) Ridiculous question. We talked about a bunch of programming languages. You told us how much you love JavaScript. It’s your second favorite programming language. Would TypeScript be the third then?

DHH (05:17:26) TypeScript wouldn’t even be in this universe. I hate TypeScript as much as I like JavaScript.

Lex Fridman (05:17:33) You hate… Oh, man. I’m not smart enough to understand the math of that. Okay. Before I ask about other programming languages, if you can encapsulate your hatred of TypeScript into something that could be human-interpretable, what would be the reasoning?

DHH (05:17:50) JavaScript smells a lot like Ruby when it comes to some aspects of its metaprogramming, and TypeScript just complicates that to an infuriating degree when you’re trying to write that kind of code. And even when you’re trying to write the normal kind of code, none of the benefits that accrue to people who like it, like auto-completion, is something I care about. I don’t care about auto-completion because I’m not using an IDE.

(05:18:14) Now, I understand that that is part of what separates it and why I don’t see the benefits. I only see the costs. I see the extra typing, I see the type gymnastics that you sometimes have to do and where a bunch of people give up and just do any instead, right? That they don’t actually use the type system because it’s just too frustrating to use.

(05:18:35) So I've only ever felt the frustration of TypeScript and the obfuscation of TypeScript in the code, and it gave me no payoff. Again, I understand that there is a payoff. I don't want the payoff. So for my situation, I'm not willing to make the trade, and I'm not willing to take a language that underneath is as dynamic a language as Ruby is and then turn it into this pretend statically typed language. I find that just intellectually insulting.
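
A minimal sketch, in Ruby, of the kind of runtime metaprogramming DHH is pointing at here (the `Settings` class and its attributes are invented for illustration, not taken from the conversation): methods are generated from data while the program runs, which is exactly the style of code a static type layer struggles to describe.

```ruby
# Illustrative only: generating methods at runtime, Ruby-style.
class Settings
  DEFAULTS = { theme: "dark", editor: "neovim", font_size: 14 }

  def initialize
    @values = {}
  end

  # Define a reader and a writer for every default, at class-load time.
  DEFAULTS.each do |name, default|
    define_method(name) { @values.fetch(name, default) }          # reader
    define_method("#{name}=") { |value| @values[name] = value }   # writer
  end
end

settings = Settings.new
settings.font_size = 16
puts settings.font_size  # => 16
puts settings.theme      # => "dark" (falls back to the default)
```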

Lex Fridman (05:19:08) Do you think it will and do you think it should die, TypeScript?

DHH (05:19:12) I don't want to take something away from people who enjoy it. So if you like TypeScript, all the power to you. If you're using TypeScript because you think that's what a professional programmer is supposed to do, here's my permission; you don't have to use TypeScript.

Lex Fridman (05:19:24) There’s something deeply enjoyable about a brilliant programmer such as yourself, DHH, talking shit. It’s one of my favorite things in life. What are the top three programming languages everyone should learn if you’re talking to a beginner?

Programming language for beginners

DHH (05:19:41) I would 100% start with Ruby. It is magic for beginners in terms of just understanding the core concepts of conditionals and loops and whatever, because it makes it so easy. Even if you’re just making a shell program that’s outputting to the terminal, getting hello-world running in Ruby is basically puts, P-U-T-S, space, start quotes, “Hello world,” end quotes, you’re done, right? There’s no fluff, there’s nothing to wrap it into.
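
Written out, the hello-world he is spelling letter by letter is literally this one line of Ruby (save it as, say, `hello.rb` and run it with `ruby hello.rb`):

```ruby
puts "Hello world"  # prints the string; this is the entire program
```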

(05:20:10) There are other languages that do that. Perl or Python would be rather similar, but Go would not, Java would not. There's a lot of other languages that have a lot more ceremony and boilerplate. Ruby has none of it. So it's a wonderful starting language.

(05:20:26) There’s a book called Learn to Program by Pine that uses Ruby essentially to just teach basic programming principles that I’ve seen heavily recommended. So that’s a great language.

Lex Fridman (05:20:38) How quickly would you go to Rails?

DHH (05:20:39) It depends on what you want to do. If you want to build web applications, go to Rails right away, learn Ruby along with Rails. Because I think what really helps power through learning programming is to build programs that you want. Right? If you’re just learning it in the abstract, it’s difficult to motivate yourself to actually do it well.

(05:20:56) Some people learn languages just for the fun of them. Most people do not. Most people learn it because they have a mission; they want to build a program, they want to become a programmer. So you got to use it for something real. And I actually find that it’s easier to learn programming that way too because it drives your learning process.

(05:21:12) You can’t just learn the whole thing upfront. You can’t just sit down and read the language specification and then go like, “Ooh,” like Neo, “Now I know kung fu. Now I know Ruby.” It doesn’t download that way. You actually have to type it out in anger on a real program.

Lex Fridman (05:21:29) Yeah. Yeah, for sure.

DHH (05:21:30) So I would start there. But then number two probably would be JavaScript because JavaScript just is the language you need to know if you want to work with the web, and the web is the greatest application platform of all time if you’re making business software or collaboration software, all this kind of stuff.

(05:21:47) If you’re making video games, you should probably go off and learn C++ or C or something else like that. But if you’re in the realm of web applications, you got to learn JavaScript. Regardless of what else you learn, you got to learn JavaScript.

Lex Fridman (05:21:58) So if you’re learning Ruby, what does Ruby not have in terms of programming concepts that you would need other languages for?

DHH (05:22:09) I don’t know if there’s any concepts missing, but it doesn’t have the speed or the low-level access of memory manipulation-

DHH (05:22:17) … that you would need to build a 3D gaming engine, for example. No one’s going to build that in Ruby. You could build quite low-level stuff when it comes to web technologies in Ruby, but at some point, you’re going to hit the limit and you should use something else.

(05:22:32) I’m not someone who prescribes just Ruby for everything. Just once you reach the level of abstraction that’s involved with web applications, Ruby is superb. But if you’re writing, for example, a HTTP proxy, Go is great for that. We’ve written quite a few HTTP proxies lately at the company for various reasons, including our cloud exit and so forth.

(05:22:54) And Kevin, one of the programmers I'm working with, he writes all of that in Go. Go just has the primitives and it has the pace and the speed to do that really well. I highly recommend it. If you're writing a general HTTP proxy, do it in Go. Great language for that. Don't write your business logic in Go. I know people do, but I don't see the point in that.

Lex Fridman (05:23:14) So what would you say are the three? So, Go, Ruby, plus Rails, JavaScript.

DHH (05:23:19) Yeah. If you’re interested in working with the web, I’d probably pick those three. Go, Ruby, and JavaScript.

Lex Fridman (05:23:25) Go, Ruby, and JavaScript. Okay. Functional languages.

DHH (05:23:28) Someone’s talking about OCaml.

Lex Fridman (05:23:30) They are always going to show up. It must be some kind of OCaml industrial complex or something like this, but they always say, “Mention OCaml.”

DHH (05:23:41) I love that there are people who love functional languages to that degree. Those people are not me. I don't care at all. I care about functional principles when they help me in these isolated cases where that's just better than everything else. But at heart, I'm an object-oriented guy. That's just how I think about programs. That's how I like to think about programs. That's how I carve up a big problem space into a domain language. Objects are my jam.

Lex Fridman (05:24:10) Yeah, me too. So I program in Lisp a bunch for AI applications for basic… So, Othello, chess engines, that kind of stuff. And I did try OCaml just to force myself to program just a very basic Game of Life, a little simulation. Lisp is just parentheses everywhere. It’s actually not readable at all.

DHH (05:24:34) That’s the problem I’ve had with Lisp.

Lex Fridman (05:24:38) OCaml is very intuitive, very readable. It’s nice.

DHH (05:24:40) I really should pick up a language like that at some point. I’ve been programming long enough that it’s a little embarrassing that I haven’t actually done anything real in anger in a fully functional programming language.

Lex Fridman (05:24:50) Yeah. But I have to figure out, I’m sure there’s an answer to this, what can I do that would be useful for me that I actually want to build?

Lex Fridman (05:25:00) That a functional language is better suited for.

Lex Fridman (05:25:03) Because I really want to experience the language properly.

Lex Fridman (05:25:06) Yeah. Because at this point, I’m very object-oriented-brained.

DHH (05:25:12) And that’s my problem too. I don’t care as much about these low-level problems in computer science. I care about the high-level. I care about writing software. I care about the abstraction layer that really floats well with web applications and business logic.

(05:25:29) And I’ve come to accept that about myself, even though, as we talked about, when I was a kid, I really wanted to become a games programmer. And then I saw what it took to write a collision-detection engine, and I go like, “Yeah, that’s not me at all.” I’m never going to be into vector matrix manipulation or any of that stuff. It’s way too much math. And I’m more of a writing person than of a math person.

Lex Fridman (05:25:54) I mean, just in the way you were speaking today, you have a poetic, literary approach to programming.

Lex Fridman (05:26:04) Yeah. It’s interesting.

DHH (05:26:04) That’s actually exactly right. So I did actually a keynote at RailsConf 10 years ago, where I called myself a software writer. I mean, I’m not the first person to say that. “Software writer” has been in the vernacular for a long time.

(05:26:16) But the modern identity that most programmers adopt when they’re trying to be serious is software engineer, and I reject that label. I’m not an engineer. Occasionally, I dabble in some engineering, but the vast majority of the time, I’m a software writer. I write software for human consumption and for my own delight.

(05:26:40) I can get away with that because I’m working in a high-level language like Ruby, working on collaboration software and to-do lists and all the other stuff. Again, if I was trying to apply my talent to writing 3D game engines, no, that’s not the right mindset. That’s not the right identity.

(05:26:58) But I find that the software engineering identity flattens things a little bit. I’d like to think that we have software writers and software mathematicians, for example, and then those are actually richer ways of describing the abstraction level that you’re working at than “engineer.”

Lex Fridman (05:27:16) Yeah. And I think if AI becomes more and more successful, I think we’ll need the software writer skill more and more because it feels like that’s the realm of which… Because it’s not writer. You’re going to have to do the software, you’re going to have to be a computer person, but there’s a more… I don’t know. I just don’t want to romanticize it, but it’s more poetic, it’s more literary. It more feels like writing a good blog post than-

DHH (05:27:48) I actually wish that AI had a bit higher standards for writing. I find the fact that it accepts my sloppy, incomplete sentences a little offensive. I wish there was a strict mode for AI where it would slap my fingers if I was just feeding it keywords and say, “Speak properly. Use proper punctuation.” Because I love that. I love crafting a just-right sentence that hasn’t been boiled down to where it has no meat on it, no character in it. It’s succinct, it’s not overly flowery. It’s just right.

(05:28:26) That writing phase to me is just addictive. And I find that when programming is at its best, it’s almost exactly equivalent to that. You also have to solve a problem. You’re not just communicating a solution. You have to actually figure out what you’re trying to say. But even writing has that.

(05:28:45) Half the time when I start writing a blog post, I don’t know exactly which arguments I’m going to use; they develop as part of the writing process. And that’s how writing software happens too. You know roughly the kind of problem you’re trying to solve. You don’t know exactly how you’re going to solve it. And as you start typing, the solution emerges.

Lex Fridman (05:29:05) And actually, as far as I understand, you and Jason are working on a new book, in its early days, on that kind of topic. I think he tweeted that it’s going to be titled something like, “We don’t know what we’re doing upfront,” or something like that. That kind of topic, where you figure it out along the way.

DHH (05:29:22) That’s a big part of it; trying to give more people the permission to trust their own instincts and their own gut, and realizing that developing that supercomputer in your stomach is actually the work of a career, and that you should not discard those feelings in preference to analytics, to intellectualism.

(05:29:50) Very often when we look at the big decisions we’ve had to make, they’ve come from the gut, where you cannot fully articulate why do I think this is the right thing. Well, because I’ve been in this business for 20 years and I’ve seen a bunch of things and I’ve talked to a bunch of people, and that is percolating into this being the right answer.

(05:30:08) A lot of people are very skeptical about that in business, or unable to trust it, because it feels like they can’t rationalize it. Why are we doing something? Well, because I feel like it, damn it. That’s a great privilege of being a bootstrapped, independent founder who doesn’t owe their business to someone else and doesn’t have to produce a return, because I feel like a lot of the bullshit really creeps in when you’re trying to rationalize to other people why you do the things you do and why you take the decisions that you do.

(05:30:34) If you don’t have anyone to answer to, you are free to follow your gut, and that’s a hell of an enjoyable way to work, and it’s also, very often, the correct way to work. Your gut knows a lot. You can’t articulate it, but it’s spot-on more often than not.

Lex Fridman (05:30:54) Yeah. Having to make a plan can be a paralyzing thing. I suppose there’s different kinds of brains. And first of all, I can’t wait to read that book if it materializes.

(05:31:06) I often feel like in the more interesting things I do in my life, I really don’t know what I’m doing upfront. And I think there’s a lot of people around me that care for me that really want me to know what I’m doing. They’re like, “What’s the plan? Why are you doing this crazy thing?”

(05:31:24) And if I had to wait until I have a plan, I’m not going to do it. People have different brains on this kind of stuff. Some people really are planners and it maybe energizes them, but I think most creative pursuits, most really interesting, most novel pursuits are like that: you kind of have to just take the leap and then figure it out as you go.

DHH (05:31:45) My favorite essay in Rework is the last one, and it’s entitled, “Inspiration is perishable.” And I think that captures a lot of it, that if you take the time to do a detailed plan, you may very well have lost the inspiration by the time you’re done.

(05:32:02) If you follow the inspiration in that moment and trust your gut, trust your own competence that you will figure it out, you’re going to get so much more back. You’re going to go on the adventure you otherwise wouldn’t have, whether that’s just the business decisions or life decision. You have to seize that inspiration.

(05:32:21) There’s a great set of children’s books written by this Japanese author about chasing an idea and trying to get a hold of it. It’s beautifully illustrated: an idea is something that’s floating around, something you have to catch and latch onto. I really feel it captures this notion that inspiration is perishable; it’ll disappear. If you just put it back on the shelf and say, “Well, I got to be diligent about this, I got to line up a plan,” it may run out, and then there’s no steam to keep going.

Open source

Lex Fridman (05:32:54) I have to ask you about open source. What does it take to run a successful open source project? You’ve spoken about the misconception that open source is democratic; it’s actually meritocratic. That’s a beautiful way to put it. So there often is a benevolent dictator at the top. Can you just speak to that, having run successful open source projects yourself and being a benevolent dictator yourself?

DHH (05:33:26) Which is going to be a bit of a biased piece of evidence here, but-

Lex Fridman (05:33:31) Why monarchy is best.

DHH (05:33:33) It’s great. We should definitely have dictators and they should control everything, especially when the dictator is me. Now, well, I think I learned very early on that a quick way to burn out in open source is to treat it as a business, as though your users are customers, as though they have claims of legitimacy on your time and your attention and your direction.

(05:33:56) Because I faced this almost immediately with Ruby on Rails. As soon as it was released, there were a million people who had all sorts of opinions about where I ought to take it. And not just opinions, but actually demands. “Unless you implement an Oracle database adapter, this is always going to be a toy.” It was actually more or less that exact demand that prompted me to have a slide at one of the early Rails conferences that just said, “Fuck you.”

DHH (05:34:27) I’m not going to do what you tell me to. I’m here as a bringer of gifts. I am sharing code that I wrote on my own time, of my own volition. And you don’t have to say thank you. I mean, it’d be nice if you did. You can take the code and do whatever you want with it, you can contribute back if you want, but you can’t tell me what to do or where to go or how to act.

(05:34:51) I’m not a vendor. This is a fundamental misconception that users of open source occasionally step into because they’re used to buying software from companies who really care about their business. I care about people using my software, I think it’s great, but we don’t have a transactional relationship. I don’t get something back when you tell me what to do, except grief, and I don’t want it, so you can keep it.

(05:35:18) So my open source philosophy from the start has been I got to do this primarily for me. I love when other people find use in my open source. It’s not my primary motivation. I’m not primarily doing it for other people. I’m primarily doing it for me and my own objectives.

(05:35:35) Because as Adam Smith said, it’s not for the benevolence of the butcher that we expect our daily meat. It’s for his self-interest. And I actually find that to be a beautiful thought that our commons increase in value when we all pursue our self-interest, certainly in the realm of open source.

(05:35:57) This is also why I reject this notion that open source is in some sort of crisis, that there’s a funding crisis, that we have to spend more. No, we don’t. Open source has never been doing better. Open source has never controlled more domains in software than it has right now. There is no crisis.

(05:36:14) There’s a misconception from some people making open source and from a lot of people using open source that open source is primarily like commercial software; something you buy and something where you can then make demands as a customer and that the customer is always right. The customer is not always right, not even in business, but certainly not in open source.

(05:36:35) In open source, the customer, as it were, is a receiver of gifts. We are having a gift exchange. I show up and give you my code. If you like it, you can use it. And if you have some code that fits in with where I’m going with this, I would love to get those gifts back. And we can keep trading like that.

(05:36:54) I give you more gifts. You give me some of your gifts. Together, we pool all the gifts such that someone showing up brand new just gets a mountain of gifts. This is the magic of open source: it increases the total sum value of what’s in the commons when we all pursue our own self-interest.

(05:37:10) So I’m building things for Rails that I need. And you know what? You want me to do that. You do not want me to build things that I don’t need on behalf of other people because I’ll do a crap job. I build much better software when I can evaluate the quality of that software by my own use.

(05:37:28) I need this feature. I’m going to build a good version of that feature, and I’m going to build just enough just for me. So I’m not going to bloat it. I’m not trying to attract the customer here. I’m not trying to see some angle. I’m just building what I need. And if you go into open source with that mentality that you’re building for you and everything else is a bonus, I think you have all the ingredients to go the distance.

(05:37:53) I think the people who burn out in open source are the ones who go in thinking, “I’m making all these gifts. I don’t really need them myself, but I’m hoping someone else does and maybe they’ll also give me some money.” That’s a losing proposition. It basically never works.

(05:38:08) If you want money for your software, you should just sell it. We have a perfectly fine model for that: commercial software that people can make and then sell. But I find a lot of confusion, let’s just call it that politely, in open source contributors who want to have their cake and eat it too.

(05:38:26) They like the mode of working with open source, they maybe even like the status that comes from open source, but they also would like to earn a living for making that open source. And therefore, they occasionally end up with the kind of grievances that someone who feels underappreciated at work will develop when others aren’t doing enough to recognize their great gifts.

Lex Fridman (05:38:47) And then they might walk away. I wish I had more insight into the mind state of the individual people that are running these projects, whether they’re feeling sad or they need more money. It’s just such a black box.

Lex Fridman (05:39:05) I mean, of course, there’s some communication, but I just sadly see too often they just walk away.

DHH (05:39:11) Right. And I think that’s actually part of the beauty of open source.

DHH (05:39:16) You are not obligated to do this code forever. You’re obligated to do this for as long as you want to do it. That’s basically your own obligation.

Lex Fridman (05:39:26) Okay, so you might criticize this and push back. You did write a blog post on forever, “Until the end of the internet,” with [inaudible 05:39:32]. There is a beautiful aspect, and you found a good balance there. But I don’t know, you’re bringing so much joy to people with this thing you created. It’s not an obligation, but there’s a real beauty to taking care of this thing you’ve created.

Lex Fridman (05:39:49) And not forgetting… I think what the open source creator is not seeing enough is how many-

Lex Fridman (05:40:00) … lives you’re making better. There’s certain pieces of software that I just quietly use a lot and they bring my life joy and I wish I could communicate that well. There’s ways to donate, but it’s inefficient. It’s usually hard to donate.

DHH (05:40:16) It is. There’s some ways for some people that made it easier. GitHub donations is one way of doing it. I donate to a few people even though I don’t love the paradigm. I also accept that we can have multiple paradigms. I accept that I can do open source for one set of motivations and other people can do open source for other motivations. We don’t all have to do it the same way, but I do want to counter the misconception that open source is somehow in a crisis unless we all start paying for open source. That model already exists. It’s commercial software. It works very well and plenty of great companies have been built off the back of it and the expectations are very clear. I pay you this amount and I get this software.

(05:40:55) Open source, once you start mixing money into it, it gets real muddy real fast, and a lot of it’s just from those misaligned expectations. If you feel like you’re a starving artist as an open source developer and you are owed X amount of money because your software is popular, you’re delusional and you need to knock that off. Just get back on track where you realize that you’re putting gifts into the world, and if you get something back in terms of monetary compensation, okay, that’s a bonus. But if you need that money back in terms of monetary compensation, just charge for the software or go work for a software company that will employ you to do open source. There’s tons of that. That is probably actually the primary mode that open source software is being developed in the world today: commercial companies making open source that they need themselves and then contributing it back.

WordPress drama

Lex Fridman (05:41:46) So I’m glad you drew some hard lines. Here is a good moment to bring up what I think is maybe one of the greatest open source projects ever, WordPress. You spoke up in October 2024 about some of the stuff that’s been going on with WordPress’s founder, Matt Mullenweg, in a blog post, “Open source royalty and mad kings,” which is a really good blog post on the idea of Benevolent Dictators For Life, this model for open source projects. And the basic implication was that Matt, as the BDFL of WordPress, has lost his way a bit with this battle with WP Engine. So I should also say that I really love WordPress. It brings me joy. I think it’s a beacon of what open source could be. I think it’s made the internet better, allowed a lot of people to create wonderful websites. And I also think, now you might disagree with this, but from everything I’ve seen, WP Engine just gives me bad vibes.

(05:43:03) I think they’re not the good guy in this. I don’t like it. I understand the frustration, I understand all of it, but I don’t think that excuses the behavior. And this kind of counters a little bit of what you said: when you have an open source project of that size, when you’re the king of a kingdom that large, there’s a bit of responsibility. Anyway, could you speak, maybe, to your empathy for Matt and to your criticism? And maybe paint a path for how he and WordPress can be winning again.

DHH (05:43:52) First, I echo what you said about what a wonderful thing WordPress’s success is. There are not many projects in the open source world, or in the world at large, that have had as big of an impact on the internet as WordPress has. He deserves a ton of accolades for that work. So that was my engagement, essentially my premise: do you know what? I had tremendous respect for what Matt has built with WordPress, what that entire ecosystem has built around itself. It’s a true marvel, but there are some principles that are larger than my personal sympathies to the characters involved. I agree, the Silver Lake private equity company that’s involved with WP Engine is not my natural ally. I’m not the natural ally of private equity playing some game with WP Engine. That’s not my interest in the case. My interest is essentially a set of principles, and the principles are: if you release something as open source, people are free to use it as they see fit, and they’re free to donate code, or resources, or money back to the community as they see fit.

(05:45:10) You may disagree about whether they’ve done enough, whether they should do more, but you can’t show up after you’ve given the gift of free software to the world and then say, “Now that you’ve used that gift, you actually owe me a huge slice of your business because you got too successful using the thing I gave you for free.” You don’t get to take a gift back. That’s why we have open source licenses. They stipulate exactly what the obligations are on both sides of the equation. The users of open source don’t get to demand what the makers of open source do and how they act, and the makers of open source don’t get to suddenly show up with a ransom note to the users and say, “Actually, you owe me for all sorts of use.” I’m 100% allergic to that kind of interaction. And I think Matt, unfortunately, for whatever reason, got so wrapped up in what he was owed that he failed to realize what he was destroying. WordPress and Automattic already make a ton of money.

(05:46:19) This is part of the wonder of WordPress. This is a project that generates hundreds of millions of dollars, and Matt didn’t feel like he was getting enough of that. That’s not a good argument, bro. You can’t just violate the spirit and the letter of these open source licenses and just start showing up with demand letters, even to characters that are not particularly sympathetic. This goes to the root of my interpretation of open source in general. The GPL is a particular license that actually demands code from people who use it under certain circumstances. I’ve never liked the GPL. I don’t want your shitty code. If you don’t want to give it to me, what am I going to do with that? Some code dump that you’ve… I’m not on board with that part of Stallman’s vision at all. I love the MIT license. To me that is the perfect license because it is mercilessly short.

(05:47:17) I think it’s two paragraphs, three paragraphs, really short, and it basically says, “Here’s some software. It comes with no warranty. You can’t sue me. You can’t demand anything, but you can do whatever the hell you want with it. Have a nice life.” That’s a perfect open source interaction in my opinion, and that license needs to be upheld. These licenses in general, even the GPL, even if I don’t like it, we have to abide by them, because if we just set aside those licenses whenever we, at a moment’s notice, feel like something’s slightly unfair, we’ve lost everything. We’ve lost the entire framework that allowed open source to prosper and allowed open source to become such an integral part of commerce too. I mean, back when open source was initially finding its feet, it was at war with commercial software. Stallman is at war with commercial software and always has been.

(05:48:11) Bill Gates was in return at war with open source for the longest time. The open source licenses and the clarity that they provide allowed us to end that war. Today, commercial software and open source software can peacefully coexist. I make commercial software, I sell Basecamp, I sell HEY, and then I also make a bunch of open source software that I give away as free gifts. That can’t happen if we start violating these contracts. No commercial company is going to go, “Let me base my next project off this piece of open source if I’m also running the liability that some mad maker is going to show up seven years in and demand I give them $50 million.” That’s not an environment conducive to commerce, collaboration, or anything else, and it’s just basically wrong. I think there’s one analysis that’s all about the practical outcomes of this, which I think are bad.

(05:49:05) There’s also an argument that’s simply about ethics. This is not right. You can’t just show up afterwards and demand something. This is not too dissimilar, in my opinion, to the whole Apple thing we talked about earlier, Apple just showing up and feeling like they’re entitled to 30% of everyone’s business. No, that’s not right. That’s not fair. So I think Matt, unfortunately, stared himself blind at the indignity he thought was being perpetrated against him, because there was all this money being made by WP Engine making a good product and not giving quite enough back, in Matt’s opinion. Tough cookie.

Lex Fridman (05:49:49) I think there, maybe I’m reading too much into it, but there might be some personal stuff too, where it wasn’t only about not giving enough, but probably implicitly promising that they would give and then taking advantage of him, in that way, in his mind. It’s like an interpersonal interaction, and then you get interpersonally frustrated.

Lex Fridman (05:50:11) You forget the bigger-picture ethics of it. It’s like when a guy keeps promising he’ll do something, and then you wake up one day, a year or two later, and realize, “Wait a minute, I was being lied to this whole time.” And then I don’t even know if it’s about money.

DHH (05:50:29) I’d get mad too. It’s totally fine to get mad when people disappoint you. That’s not justification for upending decades of open source licenses and the essential de facto case law we’ve established around them. This is why I chose to even weigh in on this, because I like WordPress. I don’t use WordPress. I’m not a part of that community. I don’t actually have a dog in this fight. I’m biased, if anything, towards Matt, just as a fellow BDFL. I would like to see him do well with this, but I also think there are some principles at stake here that ring much louder. I don’t want Rails to suddenly be tainted, with companies questioning whether they can rely on it and build businesses on it because, wait, maybe one day I’m going to turn mad, I’m going to turn mad king, and I’m going to show up with a ransom demand letter. No, screw that. We have way more to protect here. There’s way more at stake than your personal beef with someone or your perceived grievance over what you’re owed.

Lex Fridman (05:51:31) What would you recommend? What do you think he should do, or can do, to walk it back, to heal?

DHH (05:51:40) Decide. This is the curious thing. He could decide to give this up. That’s very, very difficult for driven, ambitious people to do, to accept that they’re wrong and to give up and lay down their sword. So I had a hope earlier on in this that that was possible. I haven’t seen any evidence that Matt is interested in that, and I find that deeply regrettable, but that’s his prerogative. I continue to speak out when he’s violating the spirit and ethics of open source, but I wish he would just accept that this was a really bad idea. He made a bad bet, and I think he thought he’d just get away with it, that they’d just pay up and that he could put on pressure.

(05:52:24) I mean, I know that temptation. When you sit as the head of a very important project, you know that comes with a great degree of power, and you really need a great degree of discipline to rein that in and not exercise that power at every step where you feel aggrieved. I’ve felt aggrieved a million times over in the 20-plus years of Ruby on Rails. I’ve really tried very hard not to let those sometimes petty, sometimes substantial grievances seep, over time, into the foundation of the ecosystem and risk ruining everything.

Money and happiness

Lex Fridman (05:53:03) As the king of the Rails kingdom, has the power gotten to your head over the years?

DHH (05:53:07) I’m sure it has. I mean, who wouldn’t?

Lex Fridman (05:53:10) Do you pace around in your chamber? [inaudible 05:53:12]-

DHH (05:53:11) I do, occasionally, and I do marvel at both what’s been built and what’s been possible. Over a million applications have been made with Ruby on Rails, by one estimate that I’ve seen. Businesses like Shopify and GitHub and a million others have been built on top of something that I started. That’s very gratifying. But you really have to be careful not to smell your own exhaust too much, and you have to be just as careful not to listen too much to the haters and not to listen too much to the super fans either, so that you assess the value and the principles of what you’re working towards on their own merits, on your own scoreboard. I try to block that out and then just go, “Well, I’m working on Rails because I love to write Ruby. I love to use Ruby to make web applications. That’s my North Star, and I’ll continue to do that, and I’ll continue to share all of the open source gifts that I uncover along the way,” and that’s it. That’s enough too.

(05:54:23) I don’t have to get all of it out of it. This is sometimes just like with the guy who thought I’d given up on being Jira or something instead of doing Basecamp: there are people over the years who’ve asked, “Why didn’t you charge for Rails? Don’t you know how much money has been made off Rails?” If we just look at something like Shopify, it’s worth billions of dollars. I’m not a billionaire, and so freaking what? I got more than enough. I got plenty of my share.

(05:54:51) I will say though, I’m also introspective enough to realize that if it hadn’t panned out as well as it did for me with my own business, maybe I would’ve been more tempted. Maybe if you see other people build huge successful companies off the back of your work and you really don’t have a pot to piss in, you might be tempted to get a little upset about that. I’ve seen that in the Rails world as well, where there are people who contributed substantial bodies of work and then got really miffed when they didn’t feel like they got enough back. I was fortunate enough that the business that Jason and I built with Ruby on Rails was as successful as it was, and I made the money I needed to make, so that I didn’t need to chase the rest of it.

Lex Fridman (05:55:36) But we should also just make explicit that many people in your position chase the money. It’s not that difficult to chase. Basically you turned away money, you made a lot of decisions that just turned away money.

DHH (05:55:53) Maybe. I also think of this example with Matt. He probably thought there was easy money for the taking, and it wasn’t so easy, was it? It looked like low-hanging dollar bills, and they turned out to be some really sour grapes. It turned out he probably destroyed vast sums of money by undermining trust in the whole WordPress ecosystem and putting question marks in the heads of folks deciding whether to use WordPress or something else going forward. So I often think, when people say, “Oh, you left money on the table”: first of all, so what? I don’t have to have all the money, but second of all, maybe the money wasn’t on the table at all.

Lex Fridman (05:56:33) And maybe the cost, even if you got the money, maybe the cost in other ways, like we’ve talked about, would outweigh all the money that you could have possibly gotten. I think you said that the thing that makes you happy is flow and tranquility. Those two things. Really beautifully put. And gaining money might saddle you with the responsibility of running a larger thing that takes away the flow that you gain from being… Fundamentally, for you, what flow means is programming, and then tranquility is like… I think you also have a beautiful post, like, “Nirvana is an empty schedule.”

DHH (05:57:17) When I look at an upcoming week and I see that I have no scheduled meetings at all, which is quite common, or maybe I just have one thing for one hour on one day, I think to myself, “Do you know what? This could very easily have been very different. We could have been running a company of hundreds of people or thousands of people, and my entire calendar would’ve been packed solid with little Tetris blocks of other people’s demands on my attention and time, and I would’ve been miserable as fuck.” And I look at that and go, “What more can I ask for?” Which is a really nice state of being, I’d actually say. I didn’t have this always. I did have, early on in my career, some sense of I need a little more, a little more security. And I remember this really interesting study where a bunch of researchers asked people who had made certain amounts of money, “How much money would it take for you to feel secure?”

(05:58:14) They’d ask people who had a million dollars net worth, “How much money do you need?” “Probably need $2 million. $2 million, then I’d be good.” Then they asked people with a net worth of $5 million, “How much do you need?” “10. I need 10.” Ask people with $10 million, “What do you need?” “20.” Every single time, people would need double what they had. I did that for a couple of doublings until I realized, “You know what? This is silly. I’m already where I wished I would be, a million times over, so what more is there to pursue?” Now that doesn’t mean that if more money is coming my way, I’m going to say no to it. Of course not. But it does mean that I’m free to set other things higher. And I also do think you realize, as Jim Carrey would say, “I wish everyone would get all the money that they wished for and they’d realize it wasn’t the answer.”

(05:59:01) That money solves a whole host of problems and anxieties, and then it creates a bunch of new ones, and then it also doesn’t touch a huge swath of the human experience at all. The world is full of miserable, anxious, hurt, rich people. It’s also full of miserable, anxious, poor people, and I’d rather be a miserable, anxious, rich person than a poor person. But it isn’t this magic wand that makes everything go away, and that’s again one of those insights, just like having children, that you cannot communicate in words. I’ve never been able to persuade a person who’s not wealthy that wealth wasn’t going to solve all their problems.

Lex Fridman (05:59:42) One quote you’ve returned to often that I enjoy a lot is the Coco Chanel quote of, “The best things in life are free and the second-best things are very, very expensive.” And I guess the task is to focus on surrounding yourself with the best things in life like family and all of this and not caring about the other stuff.

DHH (06:00:07) I would easily say you can care about the other stuff. Just know the order of priority. If you are blessed with a partner that you love, some children that you adore, you’ve already won the greatest prize that most humans are able to achieve. Most humans in this world, if they are of marital age and they have children, if you ask them what’s the most important thing they would all say that, they would all say that, no matter whether they’re rich or poor. It’s easy to lose sight of that when you’re chasing the second-best things because do you know what? They’re also very nice.

(06:00:45) I really like that Pagani Zonda. It was a very expensive car, and I would’ve had no chance of acquiring it if I hadn’t become rather successful in business. So I don’t want to dismiss it either. It’s great fun to have money. It’s just not as fun for quite as long or as deep as you think it is. And these other things, having an occupation and a pursuit that you enjoy, being able to carry burdens with a stiff upper lip and, again, with a sense of meaning, is incredible. To have family, to have friends, to have hobbies, to have all these things that are actually available to most people around the world, that’s winning. And it doesn’t mean you have to discount your ambitions. It doesn’t mean you can’t reach for more, but it does mean it’s pretty dumb if you don’t realize that making more is not going to complete you in some hocus-pocus woo sense. It really isn’t.

Hope

Lex Fridman (06:01:56) What gives you hope about the future of this whole thing we have going on here, human civilization?

DHH (06:02:04) I find it easier to be optimistic than pessimistic because I don’t know either way. So if I get to choose, why not just choose to believe it’s going to pan out? “We suffer more in our imagination than we do in reality,” that’s one of the quotes out of Stoicism. And I also think we have a tendency, a lot of humans have a tendency, to be pessimistic in advance about things when they don’t know how it’s going to pan out. Climate change, for example, is making a lot of people very anxious and very pessimistic about the future. You know nothing. 40 years ago, we thought the problem was that the planet was going to be too cool. I happen to believe that it’s probably correct that the planet is getting too hot and that CO2 has something to do with it. Whether we have the right measures to fix it in time, if that’s even possible or not, is completely up in the air and we don’t know.

(06:03:03) If you convince yourself with such certainty that the world is going to turn to shit, then it already has, right up here in your head, today. Climate change might wipe out this entire species in 200 years. It’s not next year. It’s not 10 years from now. Life might become more unpleasant and there might be more negative effects and so on. Yes, okay, but then deal with that hardship when it arrives. Don’t take it on in advance. How are you helping Earth by just walking around being depressed?

Lex Fridman (06:03:36) I think our whole conversation today is also an indication, it’s just two humans talking. There’s billions of us and there is something about us that wants to solve problems and build cool stuff and so we’re going to build our way out of whatever shit we get ourselves into. This is what humans do. We create problems for ourselves and figure out how to build rocket ships to get out of those problems. And sometimes, the rocket ships create other problems like nuclear warheads and then we’ll, I hope, figure out ways how to avoid those problems. And then, there’ll be nanobots and then the aliens will come and it’ll be a massive war between the nanobots and the aliens and that will bring all of us humans together.

DHH (06:04:24) The funny thing, just to pick up one of the points you mentioned, the atom bomb, for example. When that was first invented, a lot of people thought we had essentially ended life on earth. Or maybe we prevented World War III from happening in the past 80 years, because mutually assured annihilation kept the superpowers from attacking each other, at least head-on, and kept their fighting to proxy wars. You know what? Proxy wars are not great, but they’re probably better than World War III with nuclear weapons. So it’s quite difficult in the moment to tell what’s actually a benefit and what’s not, and I think we should be a bit more humble. I’ve certainly become more humble over time about thinking I know which way it’s going to turn. I think the pandemic was a huge moment for a lot of people, where there was so much certainty about whether this intervention worked or that intervention didn’t work, and most people were wrong.

(06:05:25) Certainly a lot of very smart people, very qualified people, got that just utterly and catastrophically wrong. So just a little intellectual humility. I think back upon that and go, “You know what? I’m not a PhD in virology,” and I don’t claim that I somehow saw how it was always going to play out, but the people who were really experts in it, they got a bunch of it wrong. Nobody knows anything. I keep reminding myself of that every day. No one knows anything. We can’t predict the economy a month out. We can’t predict world affairs a month out… The world is just too complicated.

Lex Fridman (06:06:03) When I watched the Netflix documentary, Chimp Empire, and how there’s a hierarchy of chimps, all of that looks eerily similar to us humans. We’re recent descendants. So these experts, it’s as if some of the chimps got a PhD and others didn’t. Others are really muscular. Others are the beta-male kind, sucking up to the alpha. There are a lot of interesting dynamics going on that map pretty cleanly to the geopolitics of the day. They don’t have nuclear weapons, but the nature of their behavior is similar to ours. So I think we barely know what’s going on, but I do think there’s a basic will to cooperate, a basic compassion, that underlies the human spirit. And maybe that is just me being optimistic, but if that is indeed there, then we’re going to be okay.

DHH (06:07:03) The capacity is certainly there. Whether we choose that capacity or not, who knows, and in what situation. I think accepting that we all have the capacity for both ways, for both incredible generosity and kindness and also cruelty. I think Jung, with this whole theory of the shadow, was really spot-on that we all have that capacity in us, and accepting that it’s our job to attempt to cultivate the better parts of our human nature, weighed against our propensity to sometimes be the worst of ourselves.

Lex Fridman (06:07:41) I’m excited to find out what’s going to happen. It’s so awesome to be human. I don’t want to die. I want to be alive for a while to see all the cool shit we do. And one of the cool things I want to see is all the software you create and all the things you tweet, all the trouble you get yourself into on Twitter. David, I’m a huge fan. Like I said, thank you for everything you’ve done for the world, for the millions of developers you’ve inspired and one of whom is me, and thank you for this awesome conversation, brother.

DHH (06:08:11) Thanks so much for having me.

Lex Fridman (06:08:14) Thanks for listening to this conversation with DHH. To support this podcast, please check out our sponsors in the description and consider subscribing to this channel. And now, let me leave you with some words from Rework by DHH and Jason Fried, “What you do is what matters, not what you think, or say, or plan.” Thank you for listening and hope to see you next time.

陶哲轩:数学与物理的最难问题及 AI 的未来 (2025-06-15)

Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI (2025-06-15)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:主持人 Lex Fridman 对话菲尔兹奖得主、被誉为“数学界的莫扎特”的陶哲轩 (Terence Tao),探讨了数学前沿的艰深问题、数学家的思维方式,以及人工智能将如何重塑数学研究的未来。
  • 核心论点:本次对话的核心,是揭示现代数学研究的本质——一场在“结构”与“随机”之间寻找深刻联系的智力探索。陶哲轩通过剖析纳维-斯托克斯方程、孪生素数猜想等具体问题,阐述了数学家如何通过类比、简化(“策略性作弊”)和跨领域工具(“狐狸”视角)来攻克难题。他认为,看似随机的现象背后(如素数分布)可能隐藏着无法用现有统计工具证伪的“阴谋”(精巧结构),而真正的突破往往源于发现不同领域(如流体力学与计算理论)间的意外连接。对话进一步推演,认为人工智能和形式化证明系统(如 Lean)正引发一场范式革命,它不仅将验证过程变得可信和规模化,更有可能成为未来发现新结构、新猜想的强大引擎,从而改变数学家个体的角色和整个学科的协作模式。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:将流体力学问题转化为计算理论问题

  • 核心观点:解决纳维-斯托克斯(Navier-Stokes)方程的“爆破”(Blowup)问题——即流体速度是否可能在有限时间内变为无穷大——或许可以通过构建一个“液体图灵机”来实现。
  • 原理解构:该观点是一个极富创造性的类比。纳维-斯托克斯方程之所以困难,在于其“超临界”(supercritical)特性:在小尺度上,非线性的能量传输效应远强于线性的粘性耗散效应。这为能量在越来越小的尺度上“恶意”汇集、最终形成奇点(即“爆破”)提供了理论可能。陶哲轩的思路是,与其直接对抗这种复杂性,不如反向工程,刻意设计一个能够实现能量自持、自我复制并向更小尺度传递的系统。他借鉴了冯·诺依曼的“自我复制自动机”和康威的“生命游戏”中的思想。如果能证明水流本身(如涡环)可以被设计成逻辑门(与门、或门),那么原则上就可以构建出一个完全由流体组成的、能够进行计算的“冯·诺依曼机器”。这个机器的程序就是“制造一个更小、更快的自己,然后将所有能量转移给它并自我消亡”。这个过程如果能无限迭代,就构成了有限时间内的能量爆破,从而为纳维-斯托克斯问题提供一个否定的解决方案。
  • 证据/案例
    • 自身研究:陶哲轩为简化的“平均纳维-斯托克斯方程”构建了爆破模型,其中关键就是设计了一个类似电路中“气闸”的机制,通过编程延时来确保能量逐级、集中地传递,避免了能量过早分散而被粘性耗散。
    • 类比:康威的生命游戏 (Conway’s Game of Life)。在这个简单的元胞自动机系统中,人们通过精心设计初始条件,构建出了“滑翔机”、“滑翔机枪”乃至可以实现逻辑运算和自我复制的复杂结构。这证明了简单的局部规则可以在宏观上涌现出计算能力(本节末尾附有一段演示该规则的代码示意)。
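
下面是一段用 Go 写的生命游戏单步演化示意(网格尺寸、初始图案与输出方式均为为说明而设的示例,并非播客原文内容),用来直观展示“简单局部规则也能涌现复杂行为”这一论点:

```go
// 生命游戏(Conway's Game of Life)单步演化的极简示意。
// 规则:活细胞邻居数为 2 或 3 时存活;死细胞邻居数恰为 3 时诞生。
package main

import "fmt"

// step 根据标准 Conway 规则计算下一代网格。
func step(g [][]bool) [][]bool {
	rows, cols := len(g), len(g[0])
	next := make([][]bool, rows)
	for r := range next {
		next[r] = make([]bool, cols)
		for c := range next[r] {
			n := 0
			for dr := -1; dr <= 1; dr++ {
				for dc := -1; dc <= 1; dc++ {
					if dr == 0 && dc == 0 {
						continue
					}
					rr, cc := r+dr, c+dc
					if rr >= 0 && rr < rows && cc >= 0 && cc < cols && g[rr][cc] {
						n++
					}
				}
			}
			next[r][c] = n == 3 || (g[r][c] && n == 2)
		}
	}
	return next
}

func main() {
	// 初始图案为一个“滑翔机”(glider),每 4 步整体向对角方向平移一格。
	grid := make([][]bool, 6)
	for i := range grid {
		grid[i] = make([]bool, 6)
	}
	for _, p := range [][2]int{{0, 1}, {1, 2}, {2, 0}, {2, 1}, {2, 2}} {
		grid[p[0]][p[1]] = true
	}
	for i := 0; i < 4; i++ {
		grid = step(grid)
	}
	for _, row := range grid {
		for _, alive := range row {
			if alive {
				fmt.Print("■")
			} else {
				fmt.Print("·")
			}
		}
		fmt.Println()
	}
}
```

在这套规则之上,社区已经用“滑翔机枪”等构型搭建出逻辑门乃至自我复制结构,这正是文中“流体计算机”类比的出发点。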

维度二:结构与随机的二分法 (The Dichotomy between Structure and Randomness)

  • 核心观点:数学中绝大多数对象表现出随机性,但真正的难题和深刻的理论往往在于区分伪随机和真随机,并证明某些看似随机的集合中必然存在某种结构。
  • 原理解构:这是贯穿数论和组合数学的核心思想。一方面,像圆周率 π 的数字序列,我们相信它是随机的(任何数字出现概率均等),但无法证明。另一方面,一些对象是高度结构化的(如奇数序列)。陶哲轩的工作,特别是“逆定理”(Inverse Theorems),旨在建立一个框架:一个对象如果不是“完全随机”的,那么它必然在某种意义上“接近”一个结构化对象。这个“二分法”使得问题可以被拆解:要么利用随机性的统计工具,要么利用结构性的代数或几何工具。
  • 证据/案例
    • 塞迈雷迪定理 (Szemerédi’s Theorem):任何具有“正密度”的整数集(无论其是结构化的如奇数,还是随机生成的)都必然包含任意长度的等差数列。这是结构必然在足够大的“近随机”集合中涌现的典范(定理的标准表述见本节末尾的公式)。
    • 格林-陶定理 (Green-Tao Theorem):证明了素数集合中存在任意长度的等差数列。素数被认为是“伪随机”的,该定理的证明就精妙地结合了分析学(处理随机性)和代数学(处理结构性)的工具。
    • 孪生素数猜想的困难:与等差数列不同,孪生素数这种模式非常“脆弱”。可以通过微调(删除极少数素数)来破坏这个模式,而不影响素数的整体统计分布。这说明证明孪生素数猜想需要找到素数中某种非常精巧、无法被轻易篡改的深层结构。
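
作为参考,塞迈雷迪定理的标准表述可以写成如下形式(符号为通用记法,并非播客原文):若整数集 $A \subseteq \mathbb{N}$ 具有正的上密度,即

$$
\limsup_{N\to\infty}\frac{\lvert A\cap\{1,\dots,N\}\rvert}{N} > 0,
$$

则对任意 $k$,存在 $a$ 与 $d\ge 1$,使得 $a,\ a+d,\ \dots,\ a+(k-1)d$ 全部属于 $A$。需要注意的是,素数集合的自然密度为零,并不满足该定理的前提,因此格林-陶定理必须额外引入“伪随机性”论证,这也是上文强调其证明精妙的原因。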

维度三:形式化证明系统 (Lean) 作为数学研究的新基础设施

  • 核心观点:以 Lean 语言为代表的形式化证明系统正在从根本上改变数学的协作、验证和探索模式,其影响堪比当年 LaTeX 对学术写作的革命。
  • 原理解构:传统数学依赖于同行评审,这是一个耗时且可能出错的过程。形式化证明系统要求数学家将自然语言的证明翻译成计算机可严格验证的代码。这带来了几个根本性转变:
    1. 绝对可信:一旦证明被 Lean 编译器接受,其逻辑正确性就得到了近乎 100% 的保证。
    2. 原子级协作:证明被分解为成千上万个可独立验证的引理(lemmas)。这使得大规模、分布式的“无信任”协作成为可能。一个贡献者无需理解整个证明的全貌,只需解决其中一个被精确定义的子问题。
    3. 可维护与可重构:当证明中的某个参数需要修改时(例如,将一个常数从12优化到11),编译器能立刻精确定位所有受影响的代码行,大大降低了维护和迭代的成本。
    4. 知识库的诞生 (Mathlib):所有被形式化的定理和引理共同构成一个庞大的、可搜索、可复用的数学知识库,加速了新研究的进程(本节末尾附有一个极简的 Lean 片段以作直观示意)。
  • 证据/案例
    • 陶哲轩的亲身经历:他组织了一个由约 50 人参与的众包项目,旨在解决抽象代数中 4000 个定律之间的 2200 万个蕴含关系。这种规模的项目在没有 Lean 之前是不可想象的。
    • 费马大定理的形式化:Kevin Buzzard 正在领导一个项目,目标是用 Lean 形式化安德鲁·怀尔斯的费马大定理证明。这本身就是一个巨大的工程,展示了形式化系统的能力边界。
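
作为直观示意,下面是一个极简的 Lean 4 片段(定理名与内容均为虚构的示例,与 Mathlib 或上述项目中的实际代码无关):

```lean
-- 一个可被 Lean 4 编译器完整验证的小引理:自然数加法交换律。
-- 编译通过即意味着该证明在逻辑上被机器确认,无需人工逐步复核。
theorem add_comm_demo (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

#check add_comm_demo  -- 输出该定理的类型,确认其已进入当前环境
```

真实项目中的证明由成千上万个这样的引理组成,每一个都可以独立编译、独立分配给不同贡献者,这正是上文所说“原子级协作”的含义。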

维度四:人工智能在数学领域的未来:从“助手”到“合作者”

  • 核心观点:当前 AI(特别是 LLMs)在数学上更像一个“花哨的自动补全工具”,能力有限且不可靠。但其终极潜力在于发展出一种“数学嗅觉”(mathematical smell),成为能够提出深刻猜想、探索全新联系的真正合作者。
  • 原理解构:AI 目前面临两大障碍:
    1. 可靠性问题:LLMs 会“一本正经地胡说八道”,生成的证明看似完美无瑕,但可能包含微妙而愚蠢的逻辑错误,比人类的“坏代码”更难调试。
    2. 训练数据缺失:AI 的训练数据主要来自成功的、已发表的论文。它缺乏人类数学家在探索过程中大量的失败尝试、错误转向和修正过程的“负面数据”。陶哲轩风趣地比喻说:“AI 需要去读个研究生,在办公室被导师指导,才能学会真正的研究。” 未来的突破点在于让 AI 具备评估一个研究路径是否“有前途”的直觉,即“数学嗅觉”,类似于 AlphaGo 对围棋棋局的评估能力。
  • 证据/案例
    • DeepMind AlphaProof:在国际数学奥林匹克(IMO)级别的问题上达到了银牌水平,证明了 AI 在解决高度结构化、有明确目标的数学问题上的潜力。但其巨大的计算成本(3天谷歌服务器时间解一道题)和对人类前期形式化工作的依赖,显示其离规模化应用还很遥远。
    • 陶哲轩的预测:他预测在本十年内(this decade),AI 就有可能提出一个连接两个看似无关领域的、有意义且可能正确的新猜想。这将是 AI 从“解题”到“提出问题”的里程碑式跨越。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识
    • 进步在于证明“此路不通”:在数学中,构建一个简化的、可控的“反例”(如陶哲轩对平均纳维-斯托克斯方程的研究),来证明某些证明路径注定失败,本身就是一种重大的智力贡献。它为整个领域节省了时间和精力,指明了必须利用哪些特定属性才能继续前进。
    • 脆弱模式 vs. 稳健模式:并非所有数学模式都是平等的。等差数列(如格林-陶定理)是“像蟑螂一样”稳健的,即使在数据被大量破坏后依然存在。而孪生素数则是“脆弱”的,微小的扰动就可能使其消失。这种对模式“鲁棒性”的认知,解释了为何某些看似相似的猜想难度天差地别。
  • 盲点与局限
    • AI 幻觉的陷阱:陶哲轩指出,大型语言模型生成的数学证明,其最大的危险不是错误本身,而是错误被“完美地”伪装起来。它们学会了模仿正确证明的“文体”,使得错误极其隐蔽,对人类专家的审查构成了新的挑战。
    • 发表偏见限制了 AI 训练:学术界“只发表成功结果”的文化,导致 AI 无法学习到人类研究中至关重要的“试错”过程。这指出了当前 AI 训练范式的一个根本局限——它只能学习到最终的“地图”,而无法学习到绘制地图过程中的探索和修正。
  • 未解之谜
    • 奇偶性障碍 (Parity Barrier):在数论中,现有技术(如筛法)无法区分一个数有奇数个素因子还是偶数个素因子。这个看似技术性的障碍,是阻碍孪生素数猜想、哥德巴赫猜想等一系列核心问题取得突破的根本性壁垒,陶哲轩称其为“长期梦想”之一。
    • P vs NP:这不仅是计算机科学问题,也是一个元数学问题。如果 P=NP,许多数学问题的求解和证明过程将被颠覆。目前学界普遍倾向于 P≠NP,但同样缺乏决定性的证明方法。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “The beauty of mathematics is that you get to change the problem and change the rules as you wish… It’s like trying to solve a computer game where there’s unlimited cheat codes available.”

    • 中文意译:“数学的美妙之处在于,你可以随心所欲地改变问题和规则……这就像玩一个有无限作弊码的电脑游戏。”
    • 语境:解释数学家如何通过简化问题(例如,将高维问题降至一维)来理解困难的核心,这与其他学科(如工程学)的严格约束形成鲜明对比。
  2. “You can think of a theory… as a compression of the universe, and a data compression… The more compression that you make, the better your theory.”

    • 中文意译:“你可以把一个理论看作是……对宇宙的压缩,一种数据压缩……你实现的压缩率越高,你的理论就越好。”
    • 语境:用信息论的视角解释物理理论的价值。一个好的理论(如暗物质模型)能用极少的参数解释海量(PB级)的观测数据,这正是其力量所在。
  3. “A fox knows many things a little bit, but a hedgehog knows one thing very, very well… I identify mostly as a fox, certainly. I like arbitrage, somehow.”

    • 中文意译:“狐狸知道很多事情,但都懂一点;而刺猬则对一件事情了如指掌……我当然主要认同自己是只狐狸。我喜欢某种意义上的‘套利’。”
    • 语境:借用以赛亚·伯林的著名比喻,描述自己在广度(连接不同领域)和深度(深耕单一领域)之间的研究风格,并坦言自己更喜欢将在一个领域学到的技巧应用到另一个看似无关的领域。
  4. “We don’t have data on things that were proposed… and then people quickly realized that it was the wrong conjecture… There’s a trial and error process… which we don’t record because it’s embarrassing… And the AI has no access to this data to train on.”

    • 中文意译:“我们没有那些被提出、然后人们很快意识到是错误的猜想的数据……这个试错过程……我们不会记录下来,因为这很尴尬……而 AI 无法获取这些数据进行训练。”
    • 语境:深刻地指出了当前训练 AI 进行数学研究的根本数据瓶颈——缺乏对人类真实探索过程中“失败路径”的记录。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈:Lean + AI Copilot 将成为前沿数学研究者的标准配置,极大降低形式化证明的门槛,加速 Mathlib 这类公共知识库的扩张。
    • 产品形态:会出现更多众包数学研究平台,将艰深问题分解为可通过形式化系统验证的“微任务”,吸引更广泛的贡献者(包括程序员和业余爱好者)。
    • 竞争格局:拥有强大算力和顶尖人才的科技巨头(如 Google DeepMind)将在“AI for Math”领域继续保持领先,但开源社区(Lean Community)将成为不可或缺的底层基础设施和人才池。
  • 长期终局 (5-10年)

    • 范式转变:数学研究将迎来一个“相变”(Phase Shift)。当形式化一个证明的成本低于用传统方式书写时,形式化将成为默认的研究和发表方式。期刊可能会为经过形式化验证的论文开设“快速通道”。
    • 数学家角色的演变:数学家的核心工作将从“计算和验证”更多地转向“提出深刻问题、设计研究框架、引导 AI 探索、以及诠释 AI 发现的模式”。人机协作将成为常态,数学家更像是一位拥有超级智能助手的“首席科学家”。
    • 新发现的涌现:AI 将不再仅仅是验证工具,而会成为猜想的生成引擎。通过分析整个 Mathlib 知识图谱,AI 有可能发现人类因认知局限而忽略的、跨越遥远数学分支的深刻联系,并提出全新的、惊人的猜想。
  • 行动建议

    • 开发者:投身于形式化证明系统和科学 AI 领域。为 Lean 等工具开发更智能的“策略”(tactics),或者利用 LLMs 构建从自然语言到形式化代码的更优转换器,都是高价值方向。
    • 投资者:关注那些致力于构建“科学发现”基础模型的公司。不仅仅是通用 LLM,而是专注于特定科学领域(数学、物理、生物)的 AI 工具和平台,它们拥有巨大的长期潜力。
    • 创业者:围绕“AI 辅助的科学协作”创造产品。例如,构建一个集成了版本控制(如 Git)、形式化验证(如 Lean)和 AI 助手(如 Copilot)的平台,服务于大规模、分布式的科研项目。
    • 研究者/学生:现在就开始学习一门形式化证明语言(如 Lean)。这在未来十年可能成为如同掌握 LaTeX 或一门编程语言一样基础而关键的技能。同时,保持“狐狸”般的好奇心,主动探索不同学科的交叉点,因为这正是未来最可能产生突破的地方。

总结 (gemini-3-flash-preview)

这是一份关于资深数学家、菲尔兹奖得主陶哲轩(Terence Tao)与 Lex Fridman 对话的深度分析报告。


1. 核心摘要 (Summary)

这场对话聚焦于数学作为一种“宇宙语言”的本质,以及它在人工智能时代正经历的深刻变革。陶哲轩详细探讨了他在纳维-斯托克斯方程(Navier-Stokes)和素数理论等领域的突破性思维,并重点介绍了数学研究如何从“纸笔时代的个人英雄主义”向“基于 Lean 形式化语言的大规模协作”转型。对话展示了一位顶尖数学家如何通过类比、战略性简化(Cheating strategically)以及拥抱 AI 工具来探索真理的边界。

2. 关键信息点 (Key Takeaways)

  • 纳维-斯托克斯方程与“流体计算机”:陶哲轩提出了一个惊人的思路,即通过构建某种“水力图灵机”(Water-punk computer)来证明流体方程的奇异性(Blowup)。如果流体能够模拟计算,那么计算中的“停机问题”可能意味着流体在有限时间内会产生无限速度。
  • 数学家的分类:狐狸与刺猬:陶哲轩将自己归类为“狐狸”,擅长跨领域寻找类比和联系(Arbitrage),而“刺猬”则深耕单一领域。他认为理想的科研合作需要这两种特质的互补。
  • Lean 形式化语言的革命:陶哲轩正在积极推广 Lean 编程语言,它能将数学证明转化为计算机可验证的代码。这实现了“无须信任的数学”(Trustless mathematics),允许全球数百名数学家在原子级粒度上协作,而不再依赖单一权威的审核。
  • 战略性简化(Strategic Cheating):面对极端复杂的难题,陶哲轩的策略是先通过“作弊”关掉 90% 的难度(如假设维度为 1,或者假设没有非线性干扰),解决简化版问题后,再逐一找回丢失的复杂性。
  • 素数研究中的“奇偶性障碍”(Parity Barrier):在讨论孪生素数猜想和哥德巴赫猜想时,陶哲轩指出当前数学工具在区分“只有两个素因子的合数”与“真正的素数”时存在瓶颈,打破这一障碍是通向最终证明的关键。
  • AI 在数学中的角色:目前的 LLM(大语言模型)在数学证明中面临“组合爆炸”和“嗅觉缺失”的问题(无法预判哪条路径更有前途),但陶哲轩预测到 2026 年,AI 将在研究级数学论文中成为不可或缺的协作伙伴。

3. 行业洞察 (Industry Insights)

  • 数学研究的“形式化”浪潮:传统数学论文正面临“越来越长、难以审稿”的危机。陶哲轩对 Lean 的拥抱预示着数学正从“自然语言描述”转向“代码化验证”。这类似于软件工程中的单元测试和版本控制进入了纯数学领域。
  • AI 辅助证明(AlphaProof 等)的地位:DeepMind 的 AlphaProof 虽然在 IMO(国际奥数)水平上取得进展,但陶哲轩认为其当前的局限在于能量消耗比和对人工形式化输入的依赖。未来的关键在于 AI 如何像人类一样产生“数学美感”或“直觉嗅觉”,从而在数百万条可能的证明路径中筛选出正确的方向。
  • 跨学科的普遍性(Universality):数学不仅是计算工具,更是数据压缩的手段。陶哲轩指出,从气体动力学到金融模型,宏观规律的简洁性往往源于微观互动的统计一致性(如中心极限定理)。
  • 未来 2-3 年的变化:
    • 协作模式的范式转移:我们将看到更多像“等式理论项目”(Equational Theories Project)这样拥有 50 名以上作者的论文。数学研究将告别“孤独天才”模式,转向“众包式、模块化”的工程模式。
    • AI 协作门槛降低:随着数学库(Mathlib)的完善,AI 将能辅助数学家完成繁琐的引理搜索和初级逻辑推导。陶哲轩预言的“相变”——即用 Lean 写证明比用纸笔更快——可能会在 5 年内发生。
  • 潜在挑战:
    • 学术评估体系的滞后:当论文由 50 人或 AI 辅助完成时,传统的终身教职(Tenure)评审如何衡量个人贡献?陶哲轩提到的“贡献矩阵”将成为学术界必须面对的评价难题。
    • AI 的“幻觉”与正确性:尽管形式化语言可以确保最终结果正确,但如何利用 LLM 产生“有意义”而非“琐碎且正确”的数学见解,仍是未解决的挑战。
  • 长远预测:
    • 数学将成为 AI 训练的最佳阵地。不同于法律或文学,数学有明确的“真理证书”(Lean 证明)。这种闭环反馈系统(RL + Formal Verification)可能让数学领域率先实现 AGI(通用人工智能)在逻辑推理上的超越。

总结 (glm-4.7-flash)

1. 🎯 核心论题与背景

  • 对话背景:本段深度研报基于著名科技播客主播 Lex Fridman 与菲尔兹奖得主、当代最杰出的数学家陶哲轩的对话。语境触及了从流体动力学方程到人工智能证明系统的前沿交叉领域。
  • 核心论点:陶哲轩通过剖析数学解决策略、流体力学中的混沌理论以及数学与物理的对撞,重塑了我们对“难题”的认知框架。他指出,人类在面对极其复杂的系统(无论是流体、素数还是证明路径)时,最有效的策略并非死磕单一变量,而是通过**“降维简化”(Cheating)和“结构-随机性二分法”**来排除不可能的路径。同时,随着形式化证明与 AI 工具的崛起,数学正从孤独的独角戏转变为类似软件产业链的规模化协作工程,而人类在这些协作中的独特价值在于提供直觉、品味和识别“伪真”的能力。

2. 🧠 深度观点解析

2.1 为什么我们需要“空气闸”结构来解决 Navier-Stokes 方程?

  • 核心观点:陶哲轩指出,直接将水流动的能量从大尺度强行转移到小尺度会导致能量过度扩散,从而被粘滞力“扼杀”。因此,构建一个具有“延迟”和“空气闸”机制的非线性系统是产生奇点(Blowup)的关键。
  • 原理解构:这是一个关于能量守恒与粘滞耗散之间博弈的问题。对于三维纳维-斯托克斯方程,由于超临界性,在极小尺度上,能量传输项主导了粘滞项。解决该问题的关键在于控制能量的传输速率,防止其在到达临界点前被平均化或耗散掉。
  • 证据/案例:陶引用了他在 2016 年构建的“平均”偏微分方程,通过仅允许能量单向(从大向小)流动且同时保持自相似性的方式,成功强制制造了有限时间奇点。这证明了某些常规证明策略在超临界问题上是无效的。

2.2 物理即编程:流体博弈

  • 核心观点:陶哲轩提出将纳维-斯托克斯方程视为流体计算机的可能。如果流体动力学支持数字逻辑门(如 AND、OR 门)的构建,那么方程的不存在不均匀奇点就意味着必须存在阻碍这种计算的“构型”。
  • 原理解构:通过类比康威的生命游戏,生命游戏中简单的局部规则能涌现出复杂、甚至是自我复制的结构。类似地,如果流体中的涡环干涉能产生逻辑门,那么流体本身就可以执行计算。这依赖于波的局部化相位控制
  • 证据/案例:生命游戏中的“滑翔机”和“滑翔机枪”构成了基础的计算单元。陶的猜想是,流体如果遵循特定的 Navier-Stokes 边界条件,可以构建出类似 von Neumann 自复制机器(一个机器复制自己更小的版本)的流体构型,从而在数学上证明方程可能产生奇点。

2.3 “作弊”(Cheating)策略:现代数学问题解决论

  • 核心观点:处理极其困难问题的标准流程不是一步登天,而是先通过修改问题条件(即“作弊”)来隔离关键难点,逐个击破。
  • 原理解构:面对 10 个阻碍因素,不要尝试同时处理它们。相反,建立一个只保留这 1 个困难因素但移除其他 9 个的简化问题进行求解。一旦掌握了处理这种困难的方法,可以分阶段重新开启其他因素。
  • 证据/案例:陶将其比作香港电影里的武打戏,主角总是被小怪包围,但他采取逐个击破的策略。在数学上,这意味着把维数从高维降到一维,或者忽略误差项,最后再“合二为一”。这是处理超临界偏微分方程的标准方法论。

2.4 结构与随机性的二分法

  • 核心观点:大多数数学对象是随机噪声,但也有极少数具有结构。问题的关键在于证明对于特定对象(如素数),我们无法构造出一个精心设计的“阴谋”来排除某种模式。
  • 原理解构:对于素数,如果它是“随机”的,就很容易证明包含任意长度的等差数列。对于孪生素数猜想,难点在于证明素数的分布不能被一个“有序的阴谋”发生器所支配。如果局部推进的规则存在异常,可能会形成一个像“里氏病毒”一样的特殊构型影响了周围。
  • 证据/案例Szemerédi 定理表明,即使在随机抛硬币生成的数集中,也存在等差数列。陶解释说,孪生素数之所以难,是因为如果素数表现得像“随机集合”,结论成立;但如果它表现得像“有结构的集合”,我们甚至可能编造一个“阴谋”来消除所有孪生素数。证明随机性存在比证明阴谋不存在要难得多。

2.5 形式化数学与 AI 验证作为基础设施

  • 核心观点:计算机辅助证明系统(如 Lean)正在将数学研究从理论验证转变为类似“软件供应链”的可验证流程,而不仅仅是寻找答案。
  • 原理解构:传统的数学论文是“蓝图”,其中很多步骤未由人类详细写出。形式化数学要求将每一个逻辑步骤编码为计算机可读的代码。这允许极高规模的协作,因为每个微小的逻辑节点都可以独立验证和分配给不同的贡献者。
  • 证据/案例:陶参与的方程理论项目利用 50 名贡献者,在 Lean 系统下验证了约 2200 万个抽象代数问题(即 A 是否推出 B),这是一个通常需要孤军奋战数千年的领域,现在通过形式化工具实现了规模化。

3. 💡 反直觉与批判性视角

打破共识:甚至“无限”也是模糊的

  • 挑战主流:陶哲轩指出,虽然数学家习惯使用无穷大和零来简化问题,但在处理物理现实时,这种理想化是有陷阱的。
  • 盲点/局限:无穷级数的重排可导致不同的极限值。为了真正理解数学或物理本质,必须将“无限”的概念 “有限化”(Finitize)。将“无限猴子定理”从“无限时间内产出《哈姆雷特》”转化为“在 10^23 个猴子中需要多少时间”,能直接体现指数级的复杂度和现实的荒谬感。这是一种直觉回溯技术。

对 AI 的过度乐观不仅是错,更是危险的

  • 批判性审视:虽然形式化证明需要 AI 帮忙做 auto-complete,但陶认为目前的 LLM(大型语言模型)在数学上能力较弱,因为它们容易产生**“具有欺骗性的完美”**。
  • 盲点/局限:人类新手写的垃圾证明一眼就能看出错误(比如“3+5=8”错了)。但 LLM 生成的证明每一步都看似合理,甚至引用了你以为存在的定理,但合起来完全自相矛盾。这被称为“数学领域的代码味”谜题。目前的 AI 缺乏数学家直觉中的“嗅觉”来识别这种虚假的确定性。

“超级博士”的陷阱

  • 未解之谜/反思:虽然膜拜像 Perelman 这样独自钻研 7 年并成功解谜的数学家,暗示了孤军奋战的有效性,但陶强调这种模式对普通人极其危险。
  • 策略失误:对于大多数数学家和专业人士,单一项目的过度投入会导致职业脆断和心理崩溃。利用有限的时间,保持敏捷,不断在不同问题间切换,比孤注一掷更具生产力和可持续性。

4. 💎 金句与高光时刻

  1. “Infinity absorbs a lot of sins.”
    • 语境:关于无限猴子定理和无穷级数。即使是随机的事,如果时间足够长(无限),也可能发生看似不可能的事。
  2. “Some problems are just on the boundary between what we can do rather easily and what are hopeless; knowing that 90% of the job is done and we just need that remaining 10%.”
    • 语境:论及 Kakeya 问题和对 Navier-Stokes 的研究动力。数学家的快乐往往源于看轻了困难,只专注于那最后 10% 的缺失拼图。
  3. “If you have a model with 10 parameters that explains 10 observations, that is a completely useless model… but if you have a model with two parameters and it explains a trillion observations, that’s beautiful.”
    • 语境:讨论物理理论为何有效。美在于数据压缩的效率——越少的参数能解释越多的现象,说明该理论捕捉到了潜在的规律,而非拟合了数据。
  4. “I call it cheating strategically. You get to change the problem and change the rules as you wish… There’s unlimited cheat codes available.”
    • 语境:描述数学家处理难题的独特优势——为了研究本质,可以人为改变变量或忽略干扰因素,这在物理和工程中是不可行的。
  5. “Goodbye to the notion of lead author… let’s just have everyone be an author, but we will have an appendix with this matrix.”
    • 语境:关于大规模协作(如 Polymath II.5 项目)。在对数学贡献进行量化评估时,抛弃传统的“大佬挂名”文化,转向透明、可追溯的贡献矩阵。

5. 🚀 行业启示与未来推演

  • 短期影响 (1-3年)

    • 验证科学的兴起:期刊将开始接受“Lean 格式”的论文作为审稿加速标准。只要代码能证明结论正确,审稿人将只负责评估“意义”和“相关性”,大大加快论文发表速度。
    • 数学界的梅特卡夫定律:类似于网络效应,Mathlib(Lean 知识库)将成为越来越有价值的资产,使贡献者更倾向于向已有的生态中添加代码,实现正反馈循环。
  • 长期终局 (5-10年)

    • 数学的规模化生产:数学研究可能像写代码一样,变成数千人分工协作的流水线工程。AI 将负责高强度的初步搜索和 corroborate(佐证),而人类专注于设计证明路径和提出猜想。
    • “感觉”的不可替代性:阻碍 AI 达到 Fields Medal 级别精度的核心壁垒不是计算能力,而是直觉,即对证明“路径”是否有前途的嗅觉。未来最优秀的数学家不仅是解题者,更是“AI 训练师”,负责理解 AI 的幻觉,并告诉它“这只是作弊”。
  • 行动建议

    • 对于开发者/创业者的启示:在面对复杂的“超临界”业务问题时(如时间敏感性极高的系统、病毒式传播的算法),不要试图计算所有变量的完美交互。寻找关键的“影响变量”,建立隔离的测试环境,假装其他变量不存在,先解决主要矛盾。
    • 对于教育者的启示:必须打破“通过应试刷题来学习数学”的模式。陶主张利用像 Lean 这样的工具,让公众(即便是高中生)通过做“合法的数学证明的微小片段”参与研究,让数学从精英特权变为一种“可执行的、有反馈的构建活动”。

逐字稿

Introduction

Lex Fridman (00:00:00) The following is a conversation with Terence Tao, widely considered to be one of the greatest mathematicians in history, often referred to as The Mozart of Math. He won the Fields Medal and the Breakthrough Prize in Mathematics, and has contributed groundbreaking work to a truly astonishing range of fields in mathematics and physics. This was a huge honor for me for many reasons, including the humility and kindness that Terry showed to me throughout all our interactions. It means the world. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description or at LexFridman.com/sponsors. And now, dear friends, here’s Terence Tao.

First hard problem

Lex Fridman (00:00:49) What was the first really difficult research-level math problem that you encountered, one that gave you pause maybe?

Terence Tao (00:00:57) Well, in your undergraduate education you learn about the really hard, impossible problems like the Riemann Hypothesis, the Twin Primes Conjecture. You can make problems arbitrarily difficult. That’s not really a problem. In fact, there are even problems that we know to be unsolvable. What’s really interesting are the problems just on the boundary between what we can do rather easily and what are hopeless; problems where existing techniques can do 90% of the job and then you just need that remaining 10%. I think as a PhD student, the Kakeya Problem certainly caught my eye. And it just got solved, actually. It’s a problem I worked on a lot in my early research. Historically, it came from a little puzzle by the Japanese mathematician Soichi Kakeya in 1918 or so. So, the puzzle is that you have a needle in the plane, or think of driving on a road or something, and you want it to execute a U-turn. You want to turn the needle around, but you want to do it in as little space as possible, so you want to use as little area as possible in order to turn it around, and the needle is infinitely maneuverable. So, you can imagine just spinning it around. For a unit needle, you can spin it around its center, and I think that gives you a disc of area pi over four. Or you can do a three-point U-turn, which is what we teach people in driving schools to do. And that actually takes an area of pi over eight, so it’s a little bit more efficient than a rotation. And so for a while people thought that was the most efficient way to turn things around, but Besicovitch showed that in fact you could actually turn the needle around using as little area as you wanted. So, 0.01. There was some really fancy multi back-and-forth U-turn thing that you could do to turn a needle around, and in so doing it would pass through every intermediate direction.
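
As a quick check on the two figures just quoted, for a needle of unit length: spinning it about its midpoint sweeps a disc of radius 1/2, while the classical three-cusped deltoid used for the three-point turn has the smaller area Tao cites,

$$
A_{\text{rotation}} = \pi \left(\tfrac{1}{2}\right)^{2} = \frac{\pi}{4} \approx 0.785,
\qquad
A_{\text{deltoid}} = \frac{\pi}{8} \approx 0.393,
$$

and Besicovitch’s construction then drives the area below any fixed positive bound.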

Lex Fridman (00:02:51) Is this in the two-dimensional plane?

Terence Tao (00:02:52) This is in the two-dimensional plane. So, we understand everything in two dimensions. So, the next question is: what happens in three dimensions? So, suppose the Hubble Space Telescope, which is a tube in space, and you want to observe every single star in the universe, so you want to rotate the telescope to reach every single direction. And here’s the unrealistic part: suppose that space is at a premium, which it totally is not, and you want to occupy as little volume as possible in order to rotate your needle around, in order to see every single star in the sky. How small a volume do you need to do that? And so you can modify Besicovitch’s construction. And so if your telescope has zero thickness, then you can use as little volume as you need. That’s a simple modification of the two-dimensional construction. But the question is: if your telescope is not zero thickness, but just very, very thin, some thickness delta, what is the minimum volume needed to be able to see every single direction, as a function of delta?

(00:03:45) So, as delta gets smaller, as the needle gets thinner, the volume should go down. But how fast does it go down? And the conjecture was that it goes down very, very slowly like logarithmically roughly speaking, and that was proved after a lot of work. So, this seems like a puzzle. Why is it interesting? So, it turns out to be surprisingly connected to a lot of problems in partial differential equations, in number theory, in geometry, combinatorics. For example, in wave propagation, you splash some water around, you create water waves and they travel in various directions, but waves exhibit both particle and wave-type behavior. So, you can have what’s called a wave packet, which is a very localized wave that is localized in space and moving a certain direction in time. And so if you plot it in both space and time, it occupies a region which looks like a tube. What can happen is that you can have a wave which initially is very dispersed, but it all focuses at a single point later in time. You can imagine dropping a pebble into a pond and the ripples spread out, but then if you time-reverse that scenario, and the equations of wave motion are time-reversible, you can imagine ripples that are converging to a single point and then a big splash occurs, maybe even a singularity. And so it’s possible to do that. And geometrically what’s going on is that there’s also light rays, so if this wave represents light, for example, you can imagine this wave as a superposition of photons all traveling at the speed of light.
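As a rough formula, this is the "logarithmic" decay being described (a paraphrase for orientation, not a quote of the precise statement): if K_delta denotes a set containing a tube of thickness delta in every direction, its volume can shrink as delta goes to zero, but conjecturally only very slowly, roughly like

```latex
\operatorname{vol}(K_\delta) \;\gtrsim\; \frac{c}{\log(1/\delta)} \qquad \text{as } \delta \to 0 .
```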

(00:05:15) They all travel on these light rays and they’re all focusing at this one point. So, you can have a very dispersed wave focus into a very concentrated wave at one point in space and time, but then it de-focuses again, it separates. But potentially if the conjecture had a negative solution, so what that meant is that there’s a very efficient way to pack tubes pointing different directions to a very, very narrow region of a very narrow volume. Then you would also be able to create waves that start out some… There’ll be some arrangement of waves that start out very, very dispersed, but they would concentrate, not just at a single point, but there’ll be a lot of concentrations in space and time. And you could create what’s called a blowup, where these waves amplitude becomes so great that the laws of physics that they’re governed by are no longer wave equations, but something more complicated and nonlinear.

(00:06:08) And so in mathematical physics, we care a lot about whether certain equations and wave equations are stable or not, whether they can create these singularities. There’s a famous unsolved problem called the Navier-Stokes regularity problem. So, the Navier-Stokes equations, equations that govern the fluid flow for incompressible fluids like water. The question asks: if you start with a smooth velocity field of water, can it ever concentrate so much that the velocity becomes infinite at some point? That’s called a singularity. We don’t see that in real life. If you splash around water in the bathtub, it won’t explode on you or have water leaving at the speed of light or anything, but potentially it is possible.
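For readers who want the equations being discussed, this is the incompressible Navier-Stokes system in its standard form (u is the velocity field, p the pressure, nu the viscosity):

```latex
\partial_t u + (u \cdot \nabla)u = -\nabla p + \nu\,\Delta u,
\qquad
\nabla \cdot u = 0 .
```

The regularity problem asks whether smooth, finite-energy initial data can ever produce a solution whose velocity becomes infinite in finite time.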

(00:06:49) And in fact, in recent years, the consensus has drifted towards the belief that, in fact, for certain very special initial configurations of, say, water, singularities can form, but people have not yet been able to actually establish this. The Clay Foundation has these seven Millennium Prize Problems, with a $1 million prize for solving one of these problems, and this is one of them. Of these seven, only one of them has been solved, the Poincaré Conjecture [inaudible 00:07:18]. So, the Kakeya Conjecture is not directly related to the Navier-Stokes Problem, but understanding it would help us understand some aspects of things like wave concentration, which would indirectly probably help us understand the Navier-Stokes Problem better.

Lex Fridman (00:07:32) Can you speak to the Navier-Stokes? So, existence and smoothness, like you said, a Millennium Prize Problem. You’ve made a lot of progress on this one. In 2016, you published a paper, Finite Time Blowup For An Averaged Three-Dimensional Navier-Stokes Equation. So, we’re trying to figure out if this thing… Usually it doesn’t blow up, but can we say for sure it never blows up?

Terence Tao (00:07:56) Right, yeah. So yeah, that is literally the $1 million question. So, this is what distinguishes mathematicians from pretty much everybody else. If something holds 99.99% of the time, that’s good enough for most things. But mathematicians are one of the few people who really care about whether really 100% of all situations are covered by it. So, most fluid, most of the time water does not blow up, but could you design a very special initial state that does this?

Lex Fridman (00:08:29) And maybe we should say that this is a set of equations, in the field of fluid dynamics, that govern how fluid behaves. And it actually turns out to be a really… Fluid is an extremely complicated thing to try to model.

Terence Tao (00:08:43) Yeah, so it has practical importance. So this Clay Prize problem concerns what’s called the incompressible Navier-Stokes, which governs things like water. There’s something called the compressible Navier-Stokes, which governs things like air, and that’s particularly important for weather prediction. Weather prediction does a lot of computational fluid dynamics. A lot of it is actually just trying to solve the Navier-Stokes equations as best they can, and also gathering a lot of data so that they can initialize the equation. There’s a lot of moving parts, so it’s very important practically.

Lex Fridman (00:09:09) Why is it difficult to prove general things about this set of equations, like it not blowing up?

Terence Tao (00:09:17) Short answer is Maxwell’s Demon. So, Maxwell’s Demon is a concept in thermodynamics. If you have a box of two gases, oxygen and nitrogen, and maybe you start with all the oxygen on one side and nitrogen on the other side, but there’s no barrier between them, then they will mix and they should stay mixed. There’s no reason why they should un-mix. But in principle, because of all the collisions between them, there could be some sort of weird conspiracy, maybe there’s a microscopic demon called Maxwell’s Demon that will… every time an oxygen and nitrogen atom collide, they’ll bounce off in such a way that the oxygen sort of drifts onto one side and the nitrogen goes to the other. And you could have an extremely improbable configuration emerge, which we never see, and which statistically is extremely unlikely, but mathematically it’s possible that this can happen and we can’t rule that out.

(00:10:06) And this is a situation that shows up a lot in mathematics. A basic example is the digits of pi 3.14159 and so forth. The digits look like they have no pattern, and we believe they have no pattern. On the long-term, you should see as many ones and twos and threes as fours and fives and sixes, there should be no preference in the digits of pi to favor, let’s say seven over eight. But maybe there’s some demon in the digits of pi that every time you compute more and more digits, it biases one digit to another. And this is a conspiracy that should not happen. There’s no reason it should happen, but there’s no way to prove it with our current technology. So, getting back to Navier-Stokes, a fluid has a certain amount of energy, and because the fluid is in motion, the energy gets transported around.
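The "no preferred digit" claim above is easy to probe numerically. A minimal sketch, assuming the third-party mpmath library is available (the digit counts are illustrative and not part of the conversation):

```python
from collections import Counter
from mpmath import mp, nstr

mp.dps = 10_010                       # working precision: about 10,000 digits of pi
digits = nstr(mp.pi, 10_000)          # "3.14159..." as a string
digits = digits.replace("3.", "", 1)  # keep only the digits after the decimal point

counts = Counter(digits)
for d in "0123456789":
    # Each digit should appear roughly 1,000 times if pi's digits are equidistributed.
    print(d, counts[d])
```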

(00:10:53) And water is also viscous, so if the energy is spread out over many different locations, the natural viscosity of the fluid will just damp out the energy and will go to zero. And this is what happens when we actually experiment with water. You splash around, there’s some turbulence and waves and so forth, but eventually it settles down and the lower the amplitude, the smaller velocity, the more calm it gets. But potentially there is some sort of demon that keeps pushing the energy of the fluid into a smaller and smaller scale, and it’ll move faster and faster. And at faster speeds, the effect of viscosity is relatively less. And so it could happen that it creates some sort of what’s called a self-similar blob scenario where the energy of the fluid starts off at some large scale and then it all sort of transfers energy into a smaller region of the fluid, which then at a much faster rate moves into an even smaller region and so forth.

(00:11:55) And each time it does this, it takes maybe half as long as the previous one, and then you could actually converge to all the energy concentrating in one point in a finite amount of time. And that scenario is called finite time blowup. So, in practice, this doesn’t happen. Water is what’s called turbulent. So, it is true that if you have a big eddy of water, it will tend to break up into smaller eddies, but it won’t transfer all its energy from one big eddy into one smaller eddy. It will transfer into maybe three or four, and then those ones split up into maybe three or four small eddies of their own. So the energy gets dispersed to the point where the viscosity can then keep everything under control. But if it can somehow concentrate all the energy, keep it all together, and do it fast enough that the viscous effects don’t have enough time to calm everything down, then this blowup can occur.
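The "half as long each time" remark is exactly what makes the blowup finite in time: the step durations form a geometric series, so infinitely many cascade steps fit inside a bounded time interval,

```latex
T_{\text{total}} \;=\; \sum_{k=0}^{\infty} \frac{T_0}{2^{k}} \;=\; 2\,T_0 \;<\; \infty .
```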

(00:12:51) So, there were papers that claimed, “Oh, you just need to take into account conservation of energy and just carefully use the viscosity and you can keep everything under control,” not just for Navier-Stokes, but for many, many types of equations like this. And so in the past there have been many attempts to try to obtain what’s called global regularity for Navier-Stokes, which is the opposite of finite time blowup, that the velocity stays smooth. And they all failed. There was always some sign error or some subtle mistake, and it couldn’t be salvaged.

(00:13:17) So, what I was interested in doing was trying to explain why we were not able to disprove finite time blowup. I couldn’t do it for the actual equations of fluids, which are too complicated, but I could average the equations of motion of Navier-Stokes, basically turning off certain types of ways in which water interacts and only keeping the ones that I want. So, in particular, if there’s a fluid and it could transfer its energy from a large eddy into this small eddy or this other small eddy, I would turn off the energy channel that would transfer energy to this one and direct it only into this smaller eddy, while still preserving the law of conservation of energy.

Lex Fridman (00:13:58) So, you’re trying to make a blowup?

Terence Tao (00:14:00) Yeah, yeah. So, I basically engineer a blowup by changing rules of physics, which is one thing that mathematicians are allowed to do. We can change the equation.

Lex Fridman (00:14:08) How does that help you get closer to the proof of something?

Terence Tao (00:14:11) Right. So, it provides what’s called an obstruction in mathematics. So, what I did was that basically if I turned off the certain parts of the equation, which usually when you turn off certain interactions, make it less nonlinear, it makes it more regular and less likely to blow up. But I find that by turning off a very well-designed set of interactions, I could force all the energy to blow up in finite time. So, what that means is that if you wanted to prove the regularity for Navier-Stokes for the actual equation, you must use some feature of the true equation, which my artificial equation does not satisfy. So, it rules out certain approaches.

(00:14:55) So, the thing about math, it’s not just about taking a technique that is going to work and applying it, but you need to not take the techniques that don’t work. And for the problems that are really hard, often there are dozens of ways that you might think might apply to solve the problem, but it’s only after a lot of experience that you realize there’s no way that these methods are going to work. So, having these counterexamples for nearby problems rules out… it saves you a lot of time, because you’re not wasting energy on things that you now know cannot possibly ever work.

Lex Fridman (00:15:30) How deeply connected is it to that specific problem of fluid dynamics or is this some more general intuition you build up about mathematics?

Terence Tao (00:15:38) Right. Yeah. So, the key phenomenon that my technique exploits is what’s called super-criticality. So, in partial differential equations, often these equations are like a tug of war between different forces. So, in Navier-Stokes, there’s the dissipation force coming from viscosity, and it’s very well understood. It’s linear, it calms things down. If viscosity was all there was, then nothing bad would ever happen, but there’s also transport: energy in one location of space can get transported, because the fluid is in motion, to other locations. And that’s a nonlinear effect, and that causes all the problems. So, there are these two competing terms in the Navier-Stokes Equation, the dissipation term and the transport term. If the dissipation term dominates, if it’s large, then basically you get regularity. And if the transport term dominates, then we don’t know what’s going on. It’s a very nonlinear situation, it’s unpredictable, it’s turbulent.

(00:16:32) So, sometimes these forces are in balance at small scales but not in balance at large scales, or vice versa. Navier-Stokes is what’s called supercritical. So at smaller and smaller scales, the transport terms are much stronger than the viscosity terms, and the viscosity terms are the things that calm things down. And so this is why the problem is hard. In two dimensions, the Soviet mathematician Ladyzhenskaya showed in the ’60s that there was no blowup. And in two dimensions, the Navier-Stokes Equation is what’s called critical: the effect of transport and the effect of viscosity are about the same strength even at very, very small scales. And we have a lot of technology to handle critical and also subcritical equations and prove regularity. But for supercritical equations, it was not clear what was going on, and I did a lot of work, and then there’s been a lot of follow-up, showing that for many other types of supercritical equations, you can create all kinds of blowup examples.
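One standard way to see the critical/supercritical distinction, added here for illustration rather than quoted from the conversation: Navier-Stokes has a scaling symmetry, and the conserved energy transforms under it as

```latex
u_\lambda(x,t) = \lambda\, u(\lambda x, \lambda^{2} t),
\qquad
\int |u_\lambda(x,0)|^{2}\,dx \;=\; \lambda^{\,2-d} \int |u(x,0)|^{2}\,dx .
```

In dimension d = 2 the exponent is zero (critical: the energy gives the same control at every scale), while in d = 3 it is minus one (supercritical: the energy gives less and less control as you zoom into finer scales).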

(00:17:27) Once the nonlinear effects dominate the linear effects at small scales, you can have all kinds of bad things happen. So, this is sort of one of the main insights of this line of work is that super-criticality versus criticality and subcriticality, this makes a big difference. That’s a key qualitative feature that distinguishes some equations for being sort of nice and predictable and… Like planetary motion, there’s certain equations that you can predict for millions of years or thousands at least. Again, it’s not really a problem, but there’s a reason why we can’t predict the weather past two weeks into the future because it’s a supercritical equation. Lots of really strange things are going on at very fine scales.

Lex Fridman (00:18:04) So, whenever there is some huge source of nonlinearity, that can create a huge problem for predicting what’s going to happen?

Terence Tao (00:18:13) Yeah. And if non-linearity is somehow more and more featured and interesting at small scales. There’s many equations that are nonlinear, but in many equations you can approximate things by the bulk. So, for example, planetary motion, if you want to understand the orbit of the Moon or Mars or something, you don’t really need the microstructure of the seismology of the Moon or exactly how the mass is distributed. Basically, you can almost approximate these planets by point masses, and it’s just the aggregate behavior is important. But if you want to model a fluid, like the weather, you can’t just say, “In Los Angeles the temperature is this, the wind speed is this.” For supercritical equations, the fine scale information is really important.

Lex Fridman (00:18:54) If we can just linger on the Navier-Stokes Equations a little bit. So, you’ve suggested, maybe you can describe it, that one of the ways to solve it, or to negatively resolve it, would be to construct a kind of liquid computer, and then show that the halting problem from computation theory has consequences for fluid dynamics, so show it in that way. Can you describe this idea?

Terence Tao (00:19:22) Right, yeah. So, this came out of this work of constructing this average equation that blew up. So, as part of how I had to do this, so there’s this naive way to do it, you just keep pushing. Every time you get one scale, you push it immediately to the next scale as fast as possible. This is sort of the naive way to force blowup. It turns out in five and higher dimensions, this works, but in three dimensions there was this funny phenomenon that I discovered, that if you change laws of physics, you just always keep trying to push the energy into smaller and smaller scales, what happens is that the energy starts getting spread out into many scales at once, so that you have energy at one scale. You’re pushing it into the next scale, and then as soon as it enters that scale, you also push it to the next scale, but there’s still some energy left over from the previous scale.

(00:20:16) You’re trying to do everything at once, and this spreads out the energy too much. And then it turns out that it makes it vulnerable for viscosity to come in and actually just damp out everything. So, it turns out this direct approach doesn’t actually work. There was a separate paper by some other authors that actually showed this in three dimensions. So, what I needed was to program a delay, kind of like airlocks. So, I needed an equation which would start with a fluid doing something at one scale, it would push this energy into the next scale, but it would stay there until all the energy from the larger scale got transferred. And only after you pushed all the energy in, then you open the next gate and then you push that in as well.

(00:21:01) So, by doing that, the energy inches forward, scale by scale, in such a way that it’s always localized at one scale at a time, and then it can resist the effects of viscosity because it’s not dispersed. So, in order to make that happen, I had to construct a rather complicated nonlinearity. And it was basically… It was constructed like an electronic circuit. So, I actually thank my wife for this, because she was trained as an electrical engineer, and she talked about how she had to design circuits and so forth. And if you want a circuit that does a certain thing, maybe you have a light that flashes on and then turns off and then on and off, you can build it from more primitive components, capacitors and resistors and so forth, and you have to build a diagram.

(00:21:47) And these diagrams, you can sort of follow with your eyeballs and say, “Oh yeah, the current will build up here and it will stop, and then it will do that.” So, I knew how to build analogs of basic electronic components, like resistors and capacitors and so forth. And I would stack them together in such a way that I would create something that would open one gate. And then there’d be a clock, and then once the clock hits a certain threshold, it would close it. It would become a Rube Goldberg type machine, but described mathematically. And this ended up working. So, what I realized is that if you could pull the same thing off for the actual equations, so if the equations of water support a computation… So, you can imagine a steampunk, but it’s really water-punk type of thing where… So, modern computers are electronic, they’re powered by electrons passing through very tiny wires and interacting with other electrons and so forth.

(00:22:39) But instead of electrons, you can imagine these pulses of water moving at a certain velocity. And maybe there are two different configurations corresponding to a bit being up or down. Potentially, if you had two of these moving bodies of water collide, they would come out with some new configuration, which would be something like an AND gate or OR gate: the output would depend in a very predictable way on the inputs. And you could chain these together and maybe create a Turing machine. And then you have computers which are made completely out of water. And if you have computers, then maybe you can do robotics, hydraulics and so forth. And so you could create some machine which is basically a fluid analog of what’s called a von Neumann machine.

(00:23:26) So, von Neumann proposed, if you want to colonize Mars, the sheer cost of transporting people and machines to Mars is just ridiculous, but if you could transport one machine to Mars, and this machine had the ability to mine the planet, create some more materials, smelt them and build more copies of the same machine, then you could colonize the whole planet over time. So, if you could build a fluid machine, which, yeah, is a fluid robot. And what it would do, its purpose in life, it’s programmed so that it would create a smaller version of itself in some sort of cold state. It wouldn’t start just yet. Once it’s ready, the big robot configuration of water would transfer all its energy into the smaller configuration and then power down. And then it would clean itself up, and what’s left is this new state, which would then turn on and do the same thing, but smaller and faster.

(00:24:19) And then the equation has a certain scaling symmetry. Once you do that, it can just keep iterating. So, this, in principle, would create a blowup for the actual Navier-Stokes. And this is what I managed to accomplish for this average Navier-Stokes. So, it provided this sort of roadmap to solve the problem. Now, this is a pipe dream because there are so many things that are missing for this to actually be a reality. So, I can’t create these basic logic gates. I don’t have these special configurations of water. There’s candidates, these include vortex rings that might possibly work. But also analog computing is really nasty compared to digital computing because there’s always errors. You have to do a lot of error correction along the way.

(00:25:05) I don’t know how to completely power down the big machine so it doesn’t interfere with the running of the smaller machine, but everything in principle can happen. It doesn’t contradict any of the laws of physics, so it’s sort of evidence that this thing is possible. There are other groups who are now pursuing ways to make Navier-Stokes blow up which are nowhere near as ridiculously complicated as this. They are actually pursuing something much closer to the direct self-similar model, which doesn’t quite work as is, but there could be some simpler scheme than the one I just described that would make this work.

Lex Fridman (00:25:40) There is a real leap of genius here to go from Navier-Stokes to this Turing machine. So, it goes from the self-similar blob scenario, where you’re trying to get the smaller and smaller blob, to now having a liquid Turing machine that gets smaller and smaller and smaller, and somehow seeing how that could be used to say something about a blowup. That’s a big leap.

Game of life

Terence Tao (00:26:08) So, there’s precedent. So, the thing about mathematics is that it’s really good at spotting connections between what you might think of as completely different problems, but if the mathematical form is the same, you can draw a connection. So, there’s a lot of previous work on what are called cellular automata, the most famous of which is Conway’s Game of Life. There’s this infinite discrete grid, and at any given time, each cell of the grid is either occupied or it’s empty. And there’s a very simple rule that tells you how these cells evolve. So, sometimes cells live and sometimes they die. And when I was a student, it was a very popular screen saver to actually just have these animations go on, and they look very chaotic. In fact, they look a little bit like turbulent flow sometimes, but at some point people discovered more and more interesting structures within this Game of Life. So, for example, they discovered this thing called a glider.

(00:27:00) So, a glider is a very tiny configuration of four or five cells which evolves and just moves in a certain direction. And that’s like these vortex rings [inaudible 00:27:09]. Yeah, so this is an analogy: the Game of Life is a discrete equation, and the fluid Navier-Stokes is a continuous equation, but mathematically they have some similar features. And so over time people discovered more and more interesting things that you could build within the Game of Life. The Game of Life is a very simple system. It only has like three or four rules to it, but you can design all kinds of interesting configurations inside it. There’s something called a glider gun that does nothing but spit out gliders one at a time. And then after a lot of effort, people managed to create AND gates and OR gates for gliders.

(00:27:48) There’s this massive ridiculous structure, which, if you have a stream of gliders coming in here and a stream of gliders coming in here, then you may produce a stream of gliders coming out. Maybe if both of the streams have gliders, then there’ll be an output stream, but if only one of them does, then nothing comes out. So, they could build something like that. And once you could build these basic gates, then just from software engineering, you can build almost anything. You can build a Turing machine. It’s an enormous steampunk type thing. They look ridiculous. But then people also generated self-replicating objects in the Game of Life, a massive machine, a [inaudible 00:28:31] machine, which over a huge period of time, with all these glider guns inside doing these very steampunk calculations, would create another version of itself which could replicate.

Lex Fridman (00:28:42) That’s so incredible.

Terence Tao (00:28:42) A lot of this was community crowdsourced by amateur mathematicians, actually. So, I knew about that work, and that is part of what inspired me to propose the same thing with Navier-Stokes. Seriously, analog is much worse than digital. It’s going to be… You can’t just directly take the constructions in the Game of Life and plunk them in. But again, it shows it’s possible.
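For concreteness, here is a minimal sketch of the rules and the glider mentioned above (plain Python, live cells stored as a set of coordinates; after four generations the glider reappears shifted one cell diagonally):

```python
from itertools import product

def step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """One generation of Conway's Game of Life (birth on 3 neighbours, survival on 2 or 3)."""
    neighbour_counts: dict[tuple[int, int], int] = {}
    for (x, y) in live:
        for dx, dy in product((-1, 0, 1), repeat=2):
            if (dx, dy) != (0, 0):
                cell = (x + dx, y + dy)
                neighbour_counts[cell] = neighbour_counts.get(cell, 0) + 1
    return {cell for cell, n in neighbour_counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A glider: five cells that translate diagonally.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After 4 generations the glider is the same shape, shifted by (+1, +1).
assert state == {(x + 1, y + 1) for (x, y) in glider}
print("glider moved one cell diagonally in 4 generations")
```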

Lex Fridman (00:29:06) There’s a kind of emergence that happens with these cellular automata local rules… maybe it’s similar to fluids, I don’t know, but local rules operating at scale can create these incredibly complex dynamic structures. Do you think any of that is amenable to mathematical analysis? Do we have the tools to say something profound about that?

Terence Tao (00:29:34) The thing is, you can get these emergent, very complicated structures, but only with very carefully prepared initial conditions. So, these glider guns and gates and self-propelled machines, if you just plunk down some cells at random and let them run, you will not see any of these. And that’s the analogous situation with Navier-Stokes again: with typical initial conditions, you will not have any of this weird computation going on. But basically through engineering, by specially designing things in a very special way, you can make clever constructions.

Lex Fridman (00:30:07) I wonder if it’s possible to prove the negative of… basically prove that only through engineering can you ever create something interesting.

Terence Tao (00:30:16) Yeah. This is a recurring challenge in mathematics that I call the dichotomy between structure and randomness: most objects that you can generate in mathematics are random. They look random; the digits of pi, we believe, are a good example. But there’s a very small number of things that have patterns. Now, you can prove something has a pattern by just constructing it… If something has a simple pattern, and you have a proof that it does something like repeat itself every so often, you can do that, and you can prove that… For example, you can prove that most sequences of digits have no pattern. So, if you just pick digits randomly, there’s something called the law of large numbers. It tells you you’re going to get as many ones as twos in the long run. But we have a lot fewer tools to…

(00:31:01) If I give you a specific pattern like the digits of pi, how can I show that this doesn’t have some weird pattern to it? Some other work that I spent a lot of time on is to prove what are called structure theorems or inverse theorems that give tests for when something is very structured. So, some functions are what’s called additive. If you have a function from the natural numbers to the natural numbers, so maybe two maps to four, three maps to six and so forth, some functions are what’s called additive, which means that if you add two inputs together, the output gets added as well. For example, multiplication by a constant. If you multiply a number by 10… If you multiply A plus B by 10, that’s the same as multiplying A by 10 and B by 10, and then adding them together. So, some functions are additive, some functions are kind of additive but not completely additive.

(00:31:47) So, for example, if I take a number, multiply it by the square root of two, and take the integer part of that. So 10 times the square root of two is like 14 point something, so 10 maps to 14, and 20 maps to 28. So, in that case, additivity is true: 10 plus 10 is 20, and 14 plus 14 is 28. But because of the rounding, sometimes there are round-off errors, and sometimes when you add A plus B, this function doesn’t quite give you the sum of the two individual outputs, but the sum plus or minus one. So, it’s almost additive, but not quite additive.
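A quick numerical check of the "additive up to plus or minus one" behaviour in this example (a minimal sketch, with f(n) = floor(n times the square root of two)):

```python
import math

def f(n: int) -> int:
    """The 'almost additive' function from the example: floor(n * sqrt(2))."""
    return math.floor(n * math.sqrt(2))

print(f(10), f(20))  # 14 and 28, as in the conversation

# f(a + b) never differs from f(a) + f(b) by more than 1.
assert all(abs(f(a + b) - f(a) - f(b)) <= 1
           for a in range(1, 500) for b in range(1, 500))
print("f(a+b) = f(a) + f(b) ± 1 for all a, b checked")
```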

(00:32:21) So, there’s a lot of useful results in mathematics, and I’ve worked a lot on developing things like this, to the effect that if a function exhibits some structure like this, then there’s basically a reason why it’s true. And the reason is that there’s some other nearby function, which is actually completely structured, which is explaining this sort of partial pattern that you have. And so if you have these inverse theorems, it creates this dichotomy: the objects that you study either have no structure at all, or they are somehow related to something kind of structured. And in either case, you can make progress. A good example of this is that there’s this old theorem in mathematics-

Infinity

Terence Tao (00:33:01) A good example of this is that there’s this old theorem in mathematics called Szemerédi’s Theorem, proven in the 1970s. It concerns trying to find a certain type of pattern in a set of numbers, the patterns of arithmetic progression. Things like three, five, and seven or 10, 15 and 20, and Szemerédi, Endre Szemerédi proved that any set of numbers that are sufficiently big, what’s called positive density, has arithmetic progressions in it of any length you wish.

(00:33:28) For example, the odd numbers have a density of one half, and they contain arithmetic progressions of any length. So in that case, it’s obvious, because the odd numbers are really, really structured. I can just take 11, 13, 15, 17, I can easily find arithmetic progressions in that set, but Szemerédi’s theorem also applies to random sets. If I take a set of odd numbers and I flip a coin for each number, and I only keep the numbers for which I got a heads… So I just flip coins, I just randomly take out half the numbers, I keep one half. That’s a set that has no patterns at all, but just from random fluctuations, you will still get a lot of arithmetic progressions in that set.
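This is easy to see experimentally. A minimal sketch with hypothetical parameters: keep each number in 1..N with probability one half, then brute-force the longest arithmetic progression inside the kept set.

```python
import random

N = 300
random.seed(0)
kept = {n for n in range(1, N + 1) if random.random() < 0.5}

def longest_progression(s: set[int], limit: int) -> int:
    """Length of the longest arithmetic progression contained in s."""
    best = 1
    for start in s:
        for step in range(1, limit):
            length = 1
            while start + length * step in s:
                length += 1
            best = max(best, length)
    return best

# Even a coin-flip subset, with no structure built in, contains
# reasonably long arithmetic progressions just by chance.
print("longest AP in the random set:", longest_progression(kept, N))
```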

Lex Fridman (00:34:10) Can you prove that there’s arithmetic progressions of arbitrary length within a random-

Terence Tao (00:34:17) Yes. Have you heard of the infinite monkey theorem? Usually, mathematicians give boring names to theorems, but occasionally they give colorful names.

Terence Tao (00:34:24) The popular version of the infinite monkey theorem is that if you have an infinite number of monkeys in a room, each with a typewriter, and they type out text randomly, then almost surely one of them is going to generate the entire script of Hamlet, or any other finite string of text. It’ll just take some time, quite a lot of time, actually, but if you have an infinite number, then it happens.

(00:34:44) So basically, the theorem is that if you take an infinite string of digits or whatever, eventually any finite pattern you wish will emerge. It may take a long time, but it will eventually happen. In particular, arithmetic progressions of any length will eventually happen, but you need an extremely long random sequence for this to happen.

Lex Fridman (00:35:04) I suppose that’s intuitive. It’s just infinity.

Terence Tao (00:35:08) Yeah, infinity absorbs a lot of sins.

Lex Fridman (00:35:11) Yeah. How are we humans supposed to deal with infinity?

Terence Tao (00:35:15) Well, you can think of infinity as an abstraction of a finite number of which you do not have a bound. So nothing in real life is truly infinite, but you can ask yourself questions like, “What if I had as much money as I wanted?”, or, “What if I could go as fast as I wanted?”, and the way mathematicians formalize that is that mathematics has found a formalism to idealize: instead of something being extremely large or extremely small, it is taken to be exactly infinite or zero, and often the mathematics becomes a lot cleaner when you do that. I mean, in physics, we joke about assuming spherical cows. Real world problems have all kinds of real world effects, but you can idealize, send some things to infinity, send some things to zero, and the mathematics becomes a lot simpler to work with.

Lex Fridman (00:36:06) I wonder how often using infinity forces us to deviate from the physics of reality.

Terence Tao (00:36:17) So there’s a lot of pitfalls. So we spend a lot of time in undergraduate math classes teaching analysis, and analysis is often about how to take limits and whether…

(00:36:28) So for example, A plus B is always B plus A. When you have a finite number of terms and you add them, you can swap them and there’s no problem, but when you have an infinite number of terms, there are these sort of shell games you can play, where you can have a series which converges to one value, but you rearrange it, and it suddenly converges to another value, and so you can make mistakes. You have to know what you’re doing when you allow infinity. You have to introduce these epsilons and deltas, and there’s a certain type of way of reasoning that helps you avoid mistakes.

(00:36:58) In more recent years, people have started taking results that are true in infinite limits and what’s called finitizing them. So you know that something’s true eventually, but you don’t know when. Now give me a rate. So such… If I don’t have an infinite number of monkeys, but a large finite number of monkeys, how long do I have to wait for Hamlet to come out? That’s a more quantitative question, and this is something that you can attack by purely finite methods, and you can use your finite intuition, and in this case, it turns out to be exponential in the length of the text that you’re trying to generate.
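As a rough back-of-the-envelope version of "exponential in the length of the text" (my own illustration, ignoring spaces, punctuation, and pattern overlaps): on a 26-key typewriter, a specific string of length L takes on the order of 26^L keystrokes to appear by chance,

```latex
\mathbb{E}[\text{keystrokes}] \;\approx\; 26^{L},
\qquad
26^{4} \approx 4.6 \times 10^{5},
\qquad
26^{30} \approx 2.8 \times 10^{42}.
```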

(00:37:36) So this is why you never see the monkeys create Hamlet. You can maybe see them create a four letter word, but nothing that big, and so I personally find that once you finitize an infinite statement, it becomes much more intuitive, and it’s no longer so weird.

Lex Fridman (00:37:51) So even if you’re working with infinity, it’s good to finitize so that you can have some intuition?

Terence Tao (00:37:57) Yeah, the downside is that the finitized versions are just much, much messier. So the infinite ones are found first usually, decades earlier, and then later on, people finitize them.

Math vs Physics

Lex Fridman (00:38:07) So since we mentioned a lot of math and a lot of physics, what is the difference between mathematics and physics as disciplines, as ways of understanding, of seeing the world? Maybe we can throw engineering in there; you mentioned your wife is an engineer, which gave you a new perspective on circuits. So these are different ways of looking at the world, and given that you’ve done mathematical physics, you’ve worn all the hats.

Terence Tao (00:38:30) Right. So I think science in general is interaction between three things. There’s the real world, there’s what we observe of the real world, observations, and then our mental models as to how we think the world works.

(00:38:46) We can’t directly access reality. All we have are the observations, which are incomplete and they have errors, and there are many, many cases where we want to know, for example, what is the weather like tomorrow, and we don’t yet have the observation, but we’d like to. A prediction.

(00:39:04) Then we have these simplified models, sometimes making unrealistic assumptions, spherical cow type things. Those are the mathematical models.

(00:39:11) Mathematics is concerned with the models. Science collects the observations and proposes the models that might explain these observations. What mathematics does is we stay within the model and ask what the consequences of that model are: what predictions would the model make of future observations, or past observations? Does it fit the observed data?

(00:39:35) So there’s definitely a symbiosis. I guess what makes mathematics unusual among other disciplines is that we start from hypotheses, like the axioms of a model, and ask what conclusions come out of that model. In almost any other discipline, you start with the conclusions. “I want to do this. I want to build a bridge, I want to make money, I want to do this,” and then you find the paths to get there. There’s a lot less sort of speculation about, “Suppose I did this, what would happen?” Planning and modeling. Speculative fiction maybe is one other place, but that’s about it, actually. Most of the things we do in life are conclusions-driven, including physics and science. I mean, they want to know, “Where is this asteroid going to go? What is the weather going to be tomorrow?”, but mathematics also has this other direction of going from the axioms.

Lex Fridman (00:40:32) What do you think… There is this tension in physics between theory and experiment. What do you think is the more powerful way of discovering truly novel ideas about reality?

Terence Tao (00:40:42) Well, you need both, top down and bottom up. It’s really an interaction between all these… So over time, the observations and the theory and the modeling should both get closer to reality, but initially, and this is always the case out there, they’re always far apart to begin with, but you need one to figure out where to push the other.

(00:41:04) So if your model is predicting anomalies that are not predicted by experiment, that tells experimenters where to look to find more data to refine the models. So it goes back and forth.

(00:41:21) Within mathematics itself, there’s also a theory and experimental component. It’s just that until very recently, theory has dominated almost completely. 99% of mathematics is theoretical mathematics, and there’s a very tiny amount of experimental mathematics. People do do it. If they want to study prime numbers or whatever, they can just generate large data sets.

(00:41:41) So once we had computers, we did it a little bit more. Although even before… Well, Gauss, for example, he discovered, or conjectured, the most basic theorem in number theory, called the prime number theorem, which predicts how many primes there are up to a million, up to a trillion. It’s not an obvious question, and basically what he did was he computed, mostly by himself, but he also hired human computers, people whose professional job it was to do arithmetic, to compute the first hundred thousand primes or something, and made tables and made a prediction. That was an early example of experimental mathematics, but until very recently, it was not…
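Gauss's experiment is easy to repeat today. A minimal sketch: sieve the primes up to one million and compare with the simplest form of the prime number theorem, x / ln x.

```python
import math

def count_primes(limit: int) -> int:
    """Count primes <= limit with a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:limit + 1:p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(sieve)

x = 1_000_000
actual = count_primes(x)        # 78,498 primes below one million
estimate = x / math.log(x)      # ~72,382: the right order of magnitude, a few percent off
print(actual, round(estimate))
```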

(00:42:22) I mean, theoretical mathematics was just much more successful. Of course, doing complicated mathematical computations was just not feasible until very recently, and even nowadays, even though we have powerful computers, only some mathematical things can be explored numerically.

(00:42:37) There’s something called the combinatorial explosion. If you want to study, for example, Szemerédi’s theorem, you might want to study all possible subsets of the numbers one to a thousand. There are only 1,000 numbers. How bad could it be? It turns out the number of different subsets of one to a thousand is two to the power of 1,000, which is way bigger than anything any computer can currently enumerate.

(00:42:59) So there are certain math problems that very quickly become just intractable to attack by direct brute force computation. Chess is another famous example. The number of chess positions, we can’t get a computer to fully explore, but now we have AI, we have tools to explore this space, not with 100% guarantees of success, but with experiment. So we can empirically solve chess now. For example, we have very, very good AIs that don’t explore every single position in the game tree, but they have found some very good approximation, and people are using actually these chess engines to do experimental chess. They’re revisiting old chess theories about, “Oh, when you do this type of opening… This is a good type of move, this is not,” and they can use these chess engines to actually refine, and in some cases, overturn conventional wisdom about chess, and I do hope that that mathematics will have a larger experimental component in the future, perhaps powered by AI.

Lex Fridman (00:44:05) We’ll, of course, talk about that, but in the case of chess, and there’s a similar thing in mathematics, I don’t believe it’s providing a kind of formal explanation of the different positions. It’s just saying which position is better or not that you can intuit as a human being, and then from that, we humans can construct a theory of the matter.

Nature of reality

(00:44:27) You’ve mentioned the Plato’s cave allegory. In case people don’t know, it’s where people are observing shadows of reality, not reality itself, and they believe what they’re observing to be reality. Is that, in some sense, what mathematicians and maybe all humans are doing, is looking at shadows of reality? Is it possible for us to truly access reality?

Terence Tao (00:44:55) Well, there are these three ontological things. There’s actual reality, there’s observations and our models, and technically they are distinct, and I think they will always be distinct, but they can get closer over time, and the process of getting closer often means that you have to discard your initial intuitions. So astronomy provides great examples, like an initial model of the world is flat because it looks flat and it’s big, and the rest of the universe, the skies, is not. The sun, for example, looks really tiny.

(00:45:38) So you start off with a model, which is actually really far from reality, but it fits the observations that you have. So things look good, but over time, as you make more and more observations, bringing it closer to reality, the model gets dragged along with it, and so over time, we had to realize that the earth was round, that it spins, that it goes around the sun, the solar system goes around the galaxy, and so on and so forth, and that the universe is expanding. The expansion itself is accelerating, and in fact, very recently this year, there is evidence that even the acceleration of the universe itself is non-constant.

Lex Fridman (00:46:13) The explanation behind why that is…

Lex Fridman (00:46:18) It’s catching up. I mean, it’s still the dark matter, dark energy, this kind of thing.

Terence Tao (00:46:23) We have a model that explains, that fits the data really well. It just has a few parameters that you have to specify. So people say, “Oh, those are fudge factors. With enough fudge factors, you can explain anything,” but the mathematical point of the model is that you want to have fewer parameters in your model than data points in your observational set.

(00:46:43) So if you have a model with 10 parameters that explains 10 observations, that is a completely useless model; it’s what’s called overfitted. But if you have a model with two parameters and it explains a trillion observations… which is basically the dark matter model, I think it has 14 parameters, and it explains petabytes of data that the astronomers have.

(00:47:06) You can think of a theory. One way to think about a physical mathematical theory is it’s a compression of the universe, and a data compression. So you have these petabytes of observations, you like to compress it to a model which you can describe in five pages and specify a certain number of parameters, and if it can fit, to reasonable accuracy, almost all of your observations, the more compression that you make, the better your theory.

Lex Fridman (00:47:32) In fact, one of the great surprises of our universe and of everything in it is that it’s compressible at all. That’s the unreasonable effectiveness of mathematics

Terence Tao (00:47:40) Yeah, Einstein had a quote like that. “The most incomprehensible thing about the universe is that it is comprehensible.”

Lex Fridman (00:47:45) Right, and not just comprehensible. You can do an equation like E=mc².

Terence Tao (00:47:49) There is actually some possible explanation for that. So there’s this phenomenon in mathematics called universality. So, many complex systems at the macro scale are coming out of lots of tiny interactions at the micro scale, and normally, because of the combinatorial explosion, you would think that the macro scale equations must be infinitely, exponentially more complicated than the micro scale ones, and they are, if you want to solve them completely exactly. If you want to model all the atoms in a box of air…

(00:48:21) Like Avogadro’s number is humongous. There’s a huge number of particles. If you actually tried to track each one, it would be ridiculous, but certain laws emerge at the macroscopic scale that almost don’t depend on what’s going on at the micro scale, or only depend on a very small number of parameters.

(00:48:35) So if you want to model a gas of a quintillion particles in a box, you just need to know its temperature and pressure and volume, a few parameters, like five or six, and that models almost everything you need to know about these 10^23 or whatever particles. So we don’t understand universality anywhere near as well as we would like mathematically, but there are much simpler toy models where we do have a good understanding of why universality occurs. The most basic one is the central limit theorem, which explains why the bell curve shows up everywhere in nature, why so many things are distributed by what’s called a Gaussian distribution, the famous bell curve. There’s now even a meme with this curve.

Lex Fridman (00:49:18) And even the meme applies broadly. The universality to the meme.

Terence Tao (00:49:22) Yes, you can go meta if you like, but there are many, many processes. For example, you can take lots of independent random variables and average them together in various ways. You can take a simple average or more complicated average, and we can prove in various cases that these bell curves, these Gaussians, emerge, and it is a satisfying explanation.
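A minimal numerical sketch of that statement (my own illustration): average many independent random inputs, and the spread of the averages matches the Gaussian prediction of sigma over the square root of n, regardless of the fine details of the inputs.

```python
import random
import statistics

random.seed(0)
n = 1_000          # how many independent inputs go into each average
trials = 5_000     # how many averages we sample

averages = [statistics.fmean(random.random() for _ in range(n))
            for _ in range(trials)]

# A Uniform(0,1) variable has mean 1/2 and standard deviation 1/sqrt(12);
# the central limit theorem predicts the averages cluster around 1/2
# with standard deviation (1/sqrt(12)) / sqrt(n), which is about 0.0091 here.
print(statistics.fmean(averages), statistics.stdev(averages))
```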

(00:49:44) Sometimes they don’t. So if you have many different inputs and they’re all correlated in some systemic way, then you can get something very far from a bell curve to show up, and this is also important to know, when universality fails. So universality is not a 100% reliable thing to rely on. The global financial crisis was a famous example of this. People thought that mortgage defaults had this sort of Gaussian type behavior, that if you take a population of a hundred thousand Americans with mortgages and ask what proportion of them would default on their mortgages, if everything was de-correlated, it would be a nice bell curve, and you can manage the risk with options and derivatives and so forth, and there’s a very beautiful theory. But if there are systemic shocks in the economy that can push everybody to default at the same time, that’s very non-Gaussian behavior, and this wasn’t fully accounted for in 2008.

(00:50:45) Now I think there’s some more awareness that this systemic risk is actually a much bigger issue, and just because the model is pretty and nice, it may not match reality. So the mathematics of working out what models do is really important, but also the science of validating when the models fit reality and when they don’t… You need both, but mathematics can help, because for example, these central limit theorems, it tells you that if you have certain axioms like non-correlation, that if all the inputs were not correlated to each other, then you have this Gaussian behavior and things are fine. It tells you where to look for weaknesses in the model.

(00:51:25) So if you have a mathematical understanding of the central limit theorem, and someone proposes to use these Gaussian [inaudible 00:51:32] or whatever to model default risk, if you’re mathematically trained, you would say, “Okay, but what are the systemic correlations between all your inputs?”, and then you can ask the economists, “How much of a risk is that?”, and then you can go look for that. So there’s always this synergy between science and mathematics.

Lex Fridman (00:51:52) A little bit on the topic of universality, you’re known and celebrated for working across an incredible breadth of mathematics, reminiscent of Hilbert a century ago. In fact, the great Fields Medal winning mathematician Tim Gowers has said that you are the closest thing we get to Hilbert. He’s a colleague of yours.

Terence Tao (00:52:16) Oh yeah, good friend.

Lex Fridman (00:52:16) But anyway, so you are known for this ability to go both deep and broad in mathematics. So you’re the perfect person to ask. Do you think there are threads that connect all the disparate areas of mathematics? Is there a kind of a deep, underlying structure to all of mathematics?

Terence Tao (00:52:36) There’s certainly a lot of connecting threads, and a lot of the progress of mathematics can be represented by stories of two fields of mathematics that were previously not connected, and finding connections.

(00:52:50) An ancient example is geometry and number theory. So in the times of the ancient Greeks, these were considered different subjects. I mean, mathematicians worked on both. Euclid worked both on geometry, most famously, but also on numbers, but they were not really considered related. I mean, a little bit, like you could say that this length was five times this length because you could take five copies of this length and so forth, but it wasn’t until Descartes, who developed analytical geometry, that you can parameterize the plane, a geometric object, by two real numbers. So geometric problems can be turned into problems about numbers.

(00:53:35) Today this feels almost trivial. There’s no content to this. Of course, a plane is X and Y, because that’s what we teach and it’s internalized, but it was an important development that these two fields were unified, and this process has just gone on throughout mathematics over and over again. Algebra and geometry were separate, and now we have this field, algebraic geometry, that connects them, and so on, over and over again, and that’s certainly the type of mathematics that I enjoy the most.

(00:54:06) I think there are different styles of being a mathematician. I think of hedgehogs and foxes: a fox knows many things a little bit, but a hedgehog knows one thing very, very well, and in mathematics, there are definitely both hedgehogs and foxes, and then there are people who can play both roles, and I think ideal collaborations between mathematicians involve some diversity, like a fox working with many hedgehogs or vice versa, but I identify mostly as a fox, certainly. I like arbitrage, somehow. Learning how one field works, learning the tricks of that field, and then going to another field which people don’t think is related, but where I can adapt the tricks.

Lex Fridman (00:54:49) So see the connections between the fields.

Terence Tao (00:54:52) Yeah. So there are other mathematicians who are far deeper than I am. They’re really hedgehogs. They know everything about one field, and they’re much faster and more effective in that field, but I can give them these extra tools.

Lex Fridman (00:55:05) I mean, you’ve said that you can be both a hedgehog and the fox, depending on the context, depending on the collaboration. So can you, if it’s at all possible, speak to the difference between those two ways of thinking about a problem? Say you’re encountering a new problem, searching for the connections versus very singular focus.

Terence Tao (00:55:26) I’m much more comfortable with the fox paradigm. Yeah. So yeah, I like looking for analogies, narratives. I spend a lot of time… If there’s a result, I see it in one field, and I like the result, it’s a cool result, but I don’t like the proof, it uses types of mathematics that I’m not super familiar with, I often try to re-prove it myself using the tools that I favor.

(00:55:53) Often, my proof is worse, but by the exercise of doing so, I can say, “Oh, now I can see what the other proof was trying to do,” and from that, I can get some understanding of the tools that are used in that field. So it’s very exploratory, very… Doing crazy things in crazy fields and reinventing the wheel a lot, whereas the hedgehog style is, I think, much more scholarly. You’re very knowledge-based. You stay up to speed on all the developments in this field, you know all the history, you have a very good understanding of exactly the strengths and weaknesses of each particular technique. I think you rely a lot more on calculation than on trying to find narratives. So yeah, I can do that too, but other people are extremely good at that.

Lex Fridman (00:56:44) Let’s step back and maybe look at a bit of a romanticized version of mathematics. So I think you’ve said that early on in your life, math was more like a puzzle-solving activity when you were young. When did you first encounter a problem or proof where you realized math can have a kind of elegance and beauty to it?

Terence Tao (00:57:11) That’s a good question. When I came to graduate school in Princeton, so John Conway was there at the time, he passed away a few years ago, but I remember one of the very first research talks I went to was a talk by Conway on what he called extreme proof.

(00:57:28) So Conway just had this amazing way of thinking about all kinds of things in a way that you wouldn’t normally think of. So he thought proofs themselves as occupying some sort of space. So if you want to prove something, let’s say that there’s infinitely many primes, you have all different proofs, but you could rank them in different axes. Some proofs are elegant, some proofs are long, some proofs are elementary and so forth, and so there’s this cloud, so the space of all proofs itself has some sort of shape, and so he was interested in extreme points of this shape. Out of all these proofs, what is one of those, the shortest, at the expense of everything else, or the most elementary or whatever?

(00:58:09) So he gave some examples of well-known theorems, and then he would give what he thought was the extreme proof in these different aspects. I just found that really eye-opening, that it’s not just getting a proof for a result that was interesting, but once you have that proof, trying to optimize it in various ways, that proofing itself had some craftsmanship to it.

(00:58:40) It’s certainly informed my writing style, like when you do your math assignments and as you’re an undergraduate, your homework and so forth, you’re sort of encouraged to just write down any proof that works and hand it in, and as long as it gets a tick mark, you move on, but if you want your results to actually be influential and be read by people, it can’t just be correct. It should also be a pleasure to read, motivated, be adaptable to generalize to other things. It’s the same in many other disciplines, like coding. There’s a lot of analogies between math and coding. I like analogies, if you haven’t noticed. You can code something, spaghetti code, that works for a certain task, and it’s quick and dirty and it works, but there’s lots of good principles for writing code well so that other people can use it, build upon it so it has fewer bugs and whatever, and there’s similar things with mathematics.

Lex Fridman (00:59:37) Yeah, first of all, there’s so many beautiful things there, and Conway is one of the great minds in mathematics ever, and computer science — just even considering the space of proofs and saying, “Okay, what does this space look like, and what are the extremes?”

(00:59:56) Like you mentioned, coding as an analogy is interesting, because there’s also this activity called code golf, which I also find beautiful and fun, where people use different programming languages to try to write the shortest possible program that accomplishes a particular task, and I believe there are even competitions on this. It’s also a nice way to stress test not just the programs, or in this case the proofs, but also the different languages — maybe that’s a different notation or whatever — used to accomplish a particular task.

Terence Tao (01:00:31) Yeah, you learn a lot. I mean, it may seem like a frivolous exercise, but it can generate all these insights, which, if you didn’t have this artificial objective to pursue, you might not see…

Lex Fridman (01:00:43) What, to you, is the most beautiful or elegant equation in mathematics? I mean, one of the things that people often look to in beauty is simplicity. So if you look at E = mc²… So when a few concepts come together, that’s why the Euler identity is often considered the most beautiful equation in mathematics. Do you find beauty in that one, in the Euler identity?

Terence Tao (01:01:08) Yeah. Well, as I said, what I find most appealing is connections between different things that… So e to the pi i equals minus one. So yeah, people use all the fundamental constants. Okay. I mean, that’s cute, but to me…

(01:01:24) So the exponential function, which goes back to Euler, was created to measure exponential growth. So compound interest or decay — anything which is continuously growing or continuously decreasing, growth and decay, or dilation or contraction — is modeled by the exponential function, whereas pi comes from circles and rotation, right? If you want to rotate a needle, for example, 180 degrees, you need to rotate by pi radians. And i, the imaginary unit, represents a 90 degree rotation into the imaginary axis — a change in direction.

(01:01:53) So the exponential function represents growth and decay in the direction that you already are. When you stick an i in the exponent, now instead of motion in the same direction as your current position, the motion is at right angles to your current position — so rotation. And then e to the pi i equals minus one tells you that if you rotate for a time pi, you end up facing the other direction. So it unifies geometry and rotation with dilation, exponential growth and dynamics, through this act of complexification. It connects together all these areas of mathematics — dynamics, geometry, and complex numbers. They all become next-door neighbors in mathematics because of this identity.
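
Note: To put the identity Tao is describing in symbols (a standard derivation, added here for reference rather than quoted from the conversation):

```latex
% Euler's formula: e^{i\theta} is a rotation by angle \theta in the complex plane,
% so "rotating for a time \pi" lands you facing the opposite direction.
e^{i\theta} = \cos\theta + i\sin\theta,
\qquad
e^{i\pi} = \cos\pi + i\sin\pi = -1 .
```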

Lex Fridman (01:02:37) Do you think the thing you mentioned as cute, the collision of notations from these disparate fields, is just a frivolous side effect, or do you think there is legitimate value when notation, when all of our old friends, come together in the night?

Terence Tao (01:02:54) Well, it’s confirmation that you have the right concepts. So when you first study anything, you have to measure things, and give them names, and initially sometimes, because your model is, again, too far off from reality, you give the wrong things the best names, and you only find out later what’s really important.

Lex Fridman (01:03:14) Physicists can do this sometimes, but it turns out okay.

Terence Tao (01:03:18) So actually, physics [inaudible 01:03:19] E = mc². So one of the big things was the E, right? So when Aristotle first came up with his laws of motion, and then Galileo and Newton and so forth, they worked with the things they could measure — they could measure mass and acceleration and force and so forth — and so in Newtonian mechanics, for example, F = ma was the famous Newton’s second law of motion. So those were the primary objects, and they gave them the central billing in the theory.

(01:03:44) It was only later, after people started analyzing these equations, that there always seemed to be these quantities that were conserved — in particular, momentum and energy. And it’s not obvious that things have an energy. It’s not something you can directly measure the same way you can measure mass and velocity, but over time, people realized that this was actually a really fundamental concept.

(01:04:05) Hamilton, eventually in the 19th century, reformulated Newton’s laws of physics into what’s called Hamiltonian mechanics, where the energy, which is now called the Hamiltonian, was the dominant object. Once you know how to measure the Hamiltonian of any system, you can describe completely the dynamics — what happens to all the states. It really was a central actor, which was not obvious initially, and this change of perspective really helped when quantum mechanics came along. The early physicists who studied quantum mechanics had a lot of trouble trying to adapt their Newtonian thinking, where everything was a particle and so forth, to quantum mechanics, where everything was a wave. It just looked really, really weird.

(01:04:51) You ask, “What is the quantum version of F = ma?”, and it’s really, really hard to give an answer to that. But it turns out that the Hamiltonian, which was sort of secretly behind the scenes in classical mechanics, is also the key object in quantum mechanics — there’s also an object called the Hamiltonian. It’s a different type of object — what’s called an operator rather than a function — but again, once you specify it, you specify the entire dynamics.

(01:05:17) So there’s something called Schrodinger’s equation that tells you exactly how quantum systems evolve once you have a Hamiltonian. So side by side, they look like completely different objects — one involves particles, one involves waves and so forth — but with this centrality, you could start actually transferring a lot of intuition and facts from classical mechanics to quantum mechanics. So for example, in classical mechanics, there’s this thing called Noether’s theorem: every time there’s a symmetry in a physical system, there is a conservation law. So the laws of physics are translation invariant — if I move 10 steps to the left, I experience the same laws of physics as if I was here — and that corresponds to conservation of momentum. If I turn around by some angle, again, I experience the same laws of physics. This corresponds to the conservation of angular momentum.

Terence Tao (01:06:00) If I wait for 10 minutes, I still have the same laws of physics. So there’s time translation invariance. This corresponds to the law of conservation of energy. So there’s this fundamental connection between symmetry and conservation. And that’s also true in quantum mechanics, even though the equations are completely different. Because they’re both coming from the Hamiltonian, and the Hamiltonian controls everything, every time the Hamiltonian has a symmetry, the equations will have a conservation law. Once you have the right language, it actually makes things a lot cleaner.
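
Note: For reference, the two equations Tao is contrasting, in their standard textbook forms (not quoted from the conversation): in classical mechanics the Hamiltonian H(q, p) generates the dynamics, and in quantum mechanics the Hamiltonian operator plays the same organizing role in the Schrodinger equation.

```latex
% Classical (Hamiltonian) mechanics: H(q, p) drives positions q and momenta p.
\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}
% Quantum mechanics: the Hamiltonian operator \hat{H} drives the wave function \psi.
i\hbar\,\frac{\partial \psi}{\partial t} = \hat{H}\,\psi
```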

(01:06:32) One of the problems why we can’t unify quantum mechanics and general relativity yet is that we haven’t figured out what the fundamental objects are. For example, we have to give up the notion of space and time being these almost Euclidean-type spaces — we know that at very tiny scales there are going to be quantum fluctuations, space-time foam — and trying to use Cartesian coordinates X, Y, Z is a non-starter. But we don’t know what to replace them with. We don’t actually have the concepts, the analog of the Hamiltonian, that would sort of organize everything.

Theory of everything

Lex Fridman (01:07:09) Does your gut say that there is a theory of everything, so this is even possible to unify, to find this language that unifies general relativity and quantum mechanics?

Terence Tao (01:07:19) I believe so. The history of physics has been one of unification, much like mathematics over the years. Electricity and magnetism were separate theories, and then Maxwell unified them. Newton unified the motions of the heavens with the motions of objects on the Earth and so forth. So it should happen. It’s just that, to go back to this model of observations and theory, part of our problem is that physics is a victim of its own success. Our two big theories of physics, general relativity and quantum mechanics, are so good now that together they cover 99.9% of all the observations we can make. And you have to go to either insanely high particle accelerator energies, or the early universe, or things that are really hard to measure, in order to get any deviation from either of these two theories, to the point where you can actually figure out how to combine them together. But I have faith — we’ve been doing this for centuries and we’ve made progress before. There’s no reason why we should stop.

Lex Fridman (01:08:18) Do you think you’ll be a mathematician that develops a theory of everything?

Terence Tao (01:08:24) What often happens is that when the physicists need some theory of mathematics, there’s often some precursor that the mathematicians worked out earlier. So when Einstein started realizing that space was curved, he went to some mathematician and asked, “Is there some theory of curved space that mathematicians already came up with that could be useful?” And he was told, “Oh yeah, I think Riemann came up with something.” And so Riemann had developed Riemannian geometry, which is precisely a theory of spaces that are curved in various general ways, which turned out to be almost exactly what was needed for Einstein’s theory. This goes back to Wigner’s unreasonable effectiveness of mathematics. I think the theories that work well, that explain the universe, tend to also involve the same mathematical objects that work well to solve mathematical problems. Ultimately, they’re both just ways of organizing data in useful ways.

Lex Fridman (01:09:17) It just feels like you might need to go to some weird land that’s very hard to intuit. You have string theory.

Terence Tao (01:09:25) Yeah, that was a leading candidate for many decades. I think it’s slowly falling out of fashion. It’s not matching experiment.

Lex Fridman (01:09:33) So one of the big challenges, of course, like you said, is that experiment is very tough because of how effective both theories are. But the other is that you’re not just deviating from space-time — you’re going into some crazy number of dimensions, you’re doing all kinds of weird stuff. We’ve gone so far from the flat earth that we started at, like you mentioned, and now it’s very hard to use our limited, ape-descended cognition to intuit what that reality really is.

Terence Tao (01:10:10) This is why analogies are so important. So yeah, the round earth is not intuitive because we’re stuck on it. But round objects in general, we have pretty good intuition about, and we have intuition about how light works and so forth. And it’s actually a good exercise to work out how eclipses and the phases of the sun and the moon can be really easily explained by round-earth and round-moon models. You can just take a basketball and a golf ball and a light source and actually do these things yourself. So the intuition is there, but you have to transfer it.

Lex Fridman (01:10:47) That is a big leap intellectually for us, to go from flat to round earth, because our life is mostly lived in flatland. We take so many things for granted because science has established a lot of evidence for this kind of thing, but we’re on a round rock flying through space — that’s a big leap. And you have to take a chain of those leaps, the more and more we progress.

Terence Tao (01:11:15) Right, yeah. So modern science is maybe, again, a victim of its own success, in that in order to be more accurate, it has to move further and further away from your initial intuition. And so for someone who hasn’t gone through the whole process of science education, it looks more suspicious because of that. So we need more grounding. There are scientists who do excellent outreach, and there are lots of science things that you can do at home, lots of YouTube videos. I did a YouTube video recently with Grant Sanderson — we talked about this earlier — about how the ancient Greeks were able to measure things like the distance to the moon, the size of the earth, and so forth, using techniques that you could also replicate yourself. It doesn’t all have to be fancy space telescopes and very intimidating mathematics.

Lex Fridman (01:12:01) Yeah, I highly recommend that. I believe you gave a lecture, and you also did an incredible video with Grant. It’s a beautiful experience to try to put yourself in the mind of a person from that time, shrouded in mystery. You’re on this planet, you don’t know the shape of it, the size of it. You see some stars, you see some things, and you try to localize yourself in this world and try to make some kind of general statements about distant places.

Terence Tao (01:12:29) Change of perspective is really important. They say travel broadens the mind — this is intellectual travel. Put yourself in the mind of the ancient Greeks or a person from some other time period, make hypotheses, a spherical [inaudible 01:12:41], whatever, speculate. This is what mathematicians do, and it’s what artists do too, actually.

Lex Fridman (01:12:48) It’s just incredible that given the extreme constraints, you could still say very powerful things. That’s why it’s inspiring — looking back in history, how much could be figured out when you didn’t have much to figure it out with.

Terence Tao (01:13:01) You propose axioms, and then the mathematics takes over. You follow those axioms to their conclusions, and sometimes you can get quite a long way from the initial hypotheses.

General relativity

Lex Fridman (01:13:10) If we can stay in the land of the weird: you mentioned general relativity. You’ve contributed to the mathematical understanding of Einstein’s field equations. Can you explain this work, and from a mathematical standpoint, what aspects of general relativity are intriguing to you, challenging to you?

Terence Tao (01:13:31) I have worked on some equations — there’s something called the wave maps equation, or the sigma model, which is not quite the equation of space-time gravity itself, but of certain fields that might exist on top of space-time. So Einstein’s equations of relativity describe space and time itself, but then there are other fields that live on top of that: there’s the electromagnetic field, there are things called Yang-Mills fields, and there’s this whole hierarchy of different equations, of which Einstein’s is considered one of the most nonlinear and difficult. Relatively low on the hierarchy is this thing called the wave maps equation. It’s a wave which at any given point is constrained to lie on a sphere. So you can think of a bunch of arrows in space and time, pointing in different directions, but they propagate like waves. If you wiggle an arrow, it propagates and makes all the arrows move, kind of like sheaves of wheat in a wheat field.
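
Note: For the curious, the wave maps equation into the sphere can be written schematically as below (sign and metric conventions vary; this is a standard textbook form, not something stated in the conversation):

```latex
% A wave map \phi : \mathbb{R}^{1+d} \to S^2 takes values on the unit sphere
% and satisfies a wave equation whose nonlinearity comes from the sphere's curvature:
\Box\,\phi = -\big(\partial^\alpha\phi \cdot \partial_\alpha\phi\big)\,\phi,
\qquad |\phi| = 1 .
```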

(01:14:27) And I was interested in the global regularity problem — again, for this equation, is it possible for the energy to collect at a point? The equation I considered was what’s called a critical equation, where the behavior at all scales is roughly the same. And I was barely able to show that you couldn’t actually force a scenario where all the energy concentrated at one point: the energy had to disperse a little bit, and the moment it dispersed just a little bit, it would stay regular. This was back in 2000. That was part of why I got interested in [inaudible 01:14:58] afterwards, actually. So I developed some techniques to solve that problem. This problem is really nonlinear because of the curvature of the sphere — there was a certain nonlinear effect which was non-perturbative. When you looked at it naively, it looked larger than the linear effects of the wave equation, and so it was hard to keep things under control even when your energy was small.

(01:15:23) But I developed what’s called a gauge transformation. So the equation is kind of like an evolution of sheaves of wheat, and they’re all bending back and forth, so there’s a lot of motion. But imagine stabilizing the flow by attaching little cameras at different points in space, which try to move in a way that captures most of the motion; under this stabilized view, the flow becomes a lot more linear. I discovered a way to transform the equation to reduce the amount of nonlinear effects, and then I was able to solve the equation. I found the transformation while visiting my aunt in Australia. I was trying to understand the dynamics of all these fields, and I couldn’t do it with pen and paper, and I didn’t have enough facility with computers to do any computer simulations.

(01:16:08) So I ended up closing my eyes, lying on the floor, and just imagining myself to actually be this vector field, rolling around to try to see how to change coordinates in such a way that things in all directions would behave in a reasonably linear fashion. And yeah, my aunt walked in on me while I was doing that, and she asked why I was doing this.

Lex Fridman (01:16:28) “It’s complicated” is the answer.

Terence Tao (01:16:30) “Yeah, yeah. And okay, fine. You are a young man. I don’t ask questions.”

Solving difficult problems

Lex Fridman (01:16:34) I have to ask about how you approach solving difficult problems, if it’s possible to go inside your mind when you’re thinking. Are you visualizing in your mind the mathematical objects, the symbols? What are you usually visualizing in your mind when you’re thinking?

Terence Tao (01:16:57) A lot of pen and paper. One thing you pick up as a mathematician is what I call cheating strategically. The beauty of mathematics is that you get to change the problem, change the rules as you wish. You don’t get to do this in any other field. If you’re an engineer and someone says, “Build a bridge over this river,” you can’t say, “I want to build this bridge over here instead,” or, “I want to build it out of paper instead of steel.” But as a mathematician, you can do whatever you want. It’s like trying to solve a computer game where there are unlimited cheat codes available. So you can say, this dimension is large — I’ll set it to one and solve the one-dimensional problem first. Or, there’s a main term and an error term — I’m going to make a spherical-cow assumption that the error term is zero.

(01:17:45) And so the way you should solve these problems is not in this Iron Man mode where you make things maximally difficult. The way you should approach any reasonable math problem is: if there are 10 things that are making your life difficult, find a version of the problem that turns off nine of the difficulties but keeps only one of them, and solve that. If you use all 10 cheats, the game is trivial, but if you use nine cheats, you solve one problem that teaches you how to deal with that particular difficulty. Then you turn that one off, turn something else on, and solve that one. And after you know how to deal with the 10 difficulties separately, you start merging them a few at a time.

(01:18:26) As a kid, I watched a lot of these Hong Kong action movies from our culture, and one thing about them is that every time there’s a fight scene — maybe the hero gets swarmed by a hundred bad-guy goons or whatever — it’ll always be choreographed so that he’s only ever fighting one person at a time; he defeats that person and moves on, and because of that, he can defeat all of them. Whereas if they had fought a bit more intelligently and just swarmed the guy at once, it would make for much worse cinema, but they would win.

Lex Fridman (01:19:02) Are you usually pen and paper? Are you working with computer and LaTeX?

Terence Tao (01:19:08) Mostly pen and paper, actually. In my office I have four giant blackboards, and sometimes I just have to write everything I know about the problem on the four blackboards and then sit on my couch and just look at the whole thing.

Lex Fridman (01:19:20) Is it all symbols like notation or is there some drawings?

Terence Tao (01:19:23) Oh, there’s a lot of drawing and a lot of bespoke doodles that only make sense to me. And the beauty of a blackboard is you can erase; it’s a very organic thing. I’m beginning to use computers more and more, partly because AI makes it much easier to do simple coding things. If I wanted to plot a function before — something moderately complicated, with some iteration or something — I’d have to remember how to set up a Python program, how a for loop works, debug it, and it would take two hours and so forth. Now I can do it in 10, 15 minutes. So I’m using computers more and more to do simple explorations.
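
Note: As a minimal illustration of the kind of quick exploration Tao describes (the logistic map below is just an arbitrary stand-in for “a moderately complicated function with some iteration”; this is not his code):

```python
# Iterate a function and plot the orbit -- the sort of throwaway exploration
# that used to take a couple of hours of setup and debugging.
import numpy as np
import matplotlib.pyplot as plt

def iterate(f, x0, n):
    """Return the orbit x0, f(x0), f(f(x0)), ... of length n + 1."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return np.array(xs)

logistic = lambda x: 3.7 * x * (1 - x)   # illustrative map with interesting dynamics
orbit = iterate(logistic, 0.2, 100)

plt.plot(orbit, marker=".")
plt.xlabel("iteration n")
plt.ylabel("x_n")
plt.title("Orbit of an iterated map (illustrative)")
plt.show()
```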

AI-assisted theorem proving

Lex Fridman (01:20:01) Let’s talk about AI a little bit if we could. So maybe a good entry point is just talking about computer-assisted proofs in general. Can you describe the Lean formal proof programming language and how it can help as a proof assistant and maybe how you started using it and how it has helped you?

Terence Tao (01:20:25) So Lean is a computer language, much like standard languages like Python and C and so forth, except that in most languages the focus is on producing executable code. Lines of code do things: they flip bits, or they make a robot move, or they deliver your text on the internet or something. Lean is a language that can also do that — it can be run as a standard traditional language — but it can also produce certificates. So a language like Python might do a computation and give you the answer: seven, the sum of three plus four is equal to seven.

(01:20:59) But Lean can produce not just the answer, but a proof of how it got the answer — that seven is three plus four — and all the steps involved. So it creates these more complicated objects: not just statements, but statements with proofs attached to them. And every line of code is just a way of piecing together previous statements to create new ones. The idea is not new. These things are called proof assistants, and they provide languages in which you can create quite complicated mathematical proofs. They produce these certificates that give a 100% guarantee that your arguments are correct, if you trust the compiler of Lean — but they made the compiler really small, and there are several different compilers available for it.
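
Note: A tiny Lean 4 sketch of what a certificate looks like (a hypothetical toy example, not taken from the conversation): the statement carries a proof term that the compiler checks.

```lean
-- Lean only accepts this file if the proof actually establishes the claim;
-- the checked file is itself the certificate.
theorem three_plus_four : 3 + 4 = 7 := by
  rfl   -- both sides evaluate to the same natural number, so reflexivity closes the goal
```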

Lex Fridman (01:21:45) Can you give people some intuition about the difference between writing on pen and paper versus using the Lean programming language? How hard is it to formalize a statement?

Terence Tao (01:21:56) So with Lean, a lot of mathematicians were involved in the design, so it’s designed so that individual lines of code resemble individual lines of mathematical argument. You might want to introduce a variable, or you want to prove by contradiction — there are various standard moves you can make. So ideally there should be a one-to-one correspondence. In practice there isn’t, because writing Lean is like explaining a proof to an extremely pedantic colleague who will point out, “Okay, did you really mean this? What happens if this is zero? How do you justify this?” Lean has a lot of automation in it to try to be less annoying. For example, every mathematical object has to come with a type. If I talk about X, is X a real number, or a natural number, or a function, or something? If you write things informally, it’s often clear from context: you say, “Let X be the sum of Y and Z,” and Y and Z were already real numbers, so X should also be a real number. Lean can do a lot of that, but every so often it says, “Wait a minute, can you tell me more about what this object is? What type of object is it?” So you have to think more at a philosophical level — not just the computations you’re doing, but what each object actually is, in some sense.
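
Note: A minimal sketch of the type discipline Tao describes (Lean 4 syntax; natural numbers are used to keep the example dependency-free, whereas real numbers would need Mathlib):

```lean
-- y and z are declared as natural numbers; Lean infers that x = y + z is one too,
-- with no annotation needed -- but the type must always be resolvable.
example (y z : Nat) : Nat :=
  let x := y + z
  x
```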

Lex Fridman (01:23:17) Is it using something like LLMs to do the type inference, to match X with the real numbers?

Terence Tao (01:23:23) It’s using much more traditional, what’s called good old-fashioned AI. You can represent all these things as trees, and there are algorithms to match one tree to another tree.

Lex Fridman (01:23:30) So it’s actually doable to figure out if something is a real number or a natural number.

Terence Tao (01:23:36) Every object comes with a history of where it came from, and you can kind of trace it.

Terence Tao (01:23:41) Yeah. So it’s designed for reliability. Modern AIs are not used in it — it’s a disjoint technology — but people are beginning to use AIs on top of Lean. When a mathematician tries to write a proof in Lean, often there’s a step: okay, now I want to use the fundamental theorem of calculus, say, to do the next step. So the Lean developers have built this massive project called Mathlib, a collection of tens of thousands of useful facts about mathematical objects.

(01:24:09) And somewhere in there is the fundamental theorem of calculus, but you need to find it. So a lot of the bottleneck now is actually lemma search: there’s a lemma that you know is in there somewhere, and you need to find it. There are various search engines specialized for Mathlib that you can use, but there are now also these large language models where you can say, “I need the fundamental theorem of calculus at this point.” For example, when I code, I have GitHub Copilot installed as a plugin to my IDE, and it scans my text and sees what I need — I might even type, “Now I need to use the fundamental theorem of calculus” — and then it might suggest, “Okay, try this.” Maybe 25% of the time it works exactly, and then another 10, 15 percent of the time it doesn’t quite work, but it’s close enough that I can say, oh yeah, if I just change it here and here, it’ll work. And then half the time it gives me complete rubbish. But people are beginning to use AIs a little bit on top, mostly at the level of fancy autocomplete: you can type half of one line of a proof and it will fill in the rest.
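
Note: A hedged sketch of what lemma search looks like from the user’s side; the `exact?` tactic name is from current Lean/Mathlib tooling and should be treated as an assumption here, not as something named in the conversation.

```lean
import Mathlib

-- You know a commutativity lemma is in Mathlib somewhere but not its name;
-- `exact?` searches the library and suggests (and applies) `Nat.add_comm a b`.
example (a b : ℕ) : a + b = b + a := by
  exact?
```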

Lex Fridman (01:25:11) Yeah, but fancy — especially fancy with a capital F — removes some of the friction a mathematician might feel when they move from pen and paper to formalizing.

Terence Tao (01:25:23) Yes. Yeah. So right now I estimate that the time and effort taken to formalize a proof is about 10 times the amount taken to write it out. So it’s doable, but it’s annoying.

Lex Fridman (01:25:36) But doesn’t it kill the whole vibe of being a mathematician? Having a pedantic coworker?

Terence Tao (01:25:42) Right? Yeah, if that was the only aspect of it — but there are some cases where it’s actually more pleasant to do things formally. So there’s a theorem I formalized, and there was a certain constant, 12, that came out in the final statement. This 12 had been carried all through the proof, and there were all these other numbers that had to be checked for consistency with this final number 12. So we wrote a paper proving this theorem with this number 12. And then a few weeks later someone said, “Oh, we can actually improve this 12 to an 11 by reworking some of these steps.” When this happens with pen and paper, every time you change a parameter, you have to check line by line that every single line of your proof still works, and there can be subtle things that you didn’t quite realize — some property of the number 12 that you didn’t even realize you were taking advantage of. So a proof can break down at a subtle place.

(01:26:29) So we had formalized the proof with this constant 12 — it had taken three weeks and about 20 people to formalize the original proof — and when this new paper came out, I said, “Now let’s update the proof to 11.” What you can do with Lean is: in your headline theorem, you change your 12 to 11, you run the compiler, and of the thousands of lines of code, 90% of them still work and there are a couple that are highlighted in red — “Now I can’t justify these steps.” It immediately isolates which steps you need to change, and you can skip over everything that works just fine.

(01:27:04) And if you program things correctly with good programming practices, most of your lines will not be red. If you don’t hard-code your constants but use smart tactics and so forth, you can localize the things you need to change to a very small part of the proof. So within a day or two, we had updated our proof, because it’s this very quick process: you make a change, there are 10 things that now don’t work; for each one you make a change, and now there are five more things that don’t work; but the process converges much more smoothly than with pen and paper.
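
Note: A toy sketch of the “don’t hard-code your constants” point (hypothetical, and far simpler than the actual paper): when the constant lives in one definition, changing 12 to 11 only re-breaks the lines that genuinely used its value.

```lean
-- The constant appears in exactly one place.
def C : Nat := 12

-- This lemma does not depend on the value of C, so it survives a change to 11 untouched.
theorem le_add_C (n : Nat) : n ≤ n + C :=
  Nat.le_add_right n C

-- This lemma does use the value of C, so the compiler flags it in red if a change to C
-- breaks it (it still holds for 11, but a claim like C = 12 would stop compiling).
theorem C_pos : 0 < C := by
  decide
```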

Lex Fridman (01:27:40) So that’s for writing. Are you able to read it? If somebody else has a Lean proof, how does reading it compare with reading a paper?

Terence Tao (01:27:48) Yeah, so the proofs are longer, but each individual piece is easier to read. If you take a math paper and you jump to page 27 and look at paragraph six, and you have a line of mathematical text, I often can’t read it immediately, because it assumes various definitions which I have to go back for — maybe 10 pages earlier this was defined — and the proof is scattered all over the place, so you’re basically forced to read fairly sequentially. It’s not like, say, a novel, where in theory you could open it up halfway through and start reading; there’s a lot of context. But with Lean, if you put your cursor on a line of code, every single object there, you can hover over it and it will say what it is, where it came from, where it’s justified. You can trace things back much more easily than flipping through a math paper.

(01:28:34) So one thing that Lean really enables is actually collaborating on proofs at a really atomic scale that you really couldn’t do in the past. So traditionally with pen and paper, when you want to collaborate with another mathematician, either you do it at a blackboard where you can really interact, but if you’re doing it sort of by email or something, basically, yeah, you have to segment it. Say, “I’m going to finish section three, you do section four,” but you can’t really work on the same thing, collaborate at the same time.

(01:29:03) But with Lean, you can be trying to formalize some portion of a proof and say, “I got stuck at line 67 here. I need to prove this thing, but it doesn’t quite work. Here are the three lines of code I’m having trouble with.” And because all the context is there, someone else can say, “Oh, okay, I recognize what you need to do. You need to apply this trick or this tool,” and you can have these extremely atomic-level conversations. So because of Lean, I can collaborate with dozens of people across the world, most of whom I have never met in person, and I may not actually even know how reliable they are at proof-writing, but Lean gives me a certificate of trust, so I can do trustless mathematics.

Lex Fridman (01:29:43) So there’s so many interesting questions there. So one, you’re known for being a great collaborator. So what is the right way to approach solving a difficult problem in mathematics when you’re collaborating? Are you doing a divide and conquer type of thing? Or are you focused in on a particular part and you’re brainstorming?

Terence Tao (01:30:05) There’s always a brainstorming process first. Math research projects, by their nature, are such that when you start, you don’t really know how to do the problem. It’s not like an engineering project, where the theory has been established for decades and the implementation is the main difficulty. You have to figure out even what the right path is. This is what I said about cheating first — to go back to the bridge-building analogy: first assume you have an infinite budget and an unlimited workforce and so forth. Now can you build this bridge? Okay, now you have an infinite budget but only a finite workforce. Now can you do that? And so forth. Of course, no engineer can actually do this — they have fixed requirements. But yes, there are these jam sessions at the beginning where you try all kinds of crazy things, and you make all these assumptions that are unrealistic but that you plan to fix later.

(01:30:57) And you try to see if there’s even some skeleton of an approach that might work. Then hopefully that breaks up the problem into smaller sub-problems, which you don’t know how to do either, but then you focus on those. And sometimes different collaborators are better at working on certain things. So one of the theorems I’m known for is a theorem with Ben Green, now called the Green-Tao theorem. It’s the statement that the primes contain arithmetic progressions of any length. So it was a modification of his [inaudible 01:31:26] already. The way we collaborated was that Ben had already proven a similar result for progressions of length three. He showed that sets like the primes contain lots and lots of progressions of length three — even certain subsets of the primes do — but his techniques only worked for progressions of length three. They didn’t work for longer ones.

(01:31:46) But I had these techniques coming from [inaudible 01:31:48] theory, which is something that I had been playing with and knew better than Ben at the time. If I could justify certain randomness properties of some set related to the primes — there was a certain technical condition which, if Ben could supply me with it, would let me conclude the theorem. But what I asked for was a really difficult question in number theory, and he said, “There’s no way we can prove this.” So he said, “Can you prove your part of the theorem using a weaker hypothesis that I have a chance of proving?” And he proposed something which he could prove, but it was too weak for me — I couldn’t use it. So there was this conversation going back and forth, a haggling—

Lex Fridman (01:32:29) Different cheats to-

Terence Tao (01:32:31) Yeah, yeah, I want to cheat more. He wants to cheat less, but eventually we found a property which A, he could prove, and B, I could use, and then we could prove our theorem. So there are all kinds of dynamics. Every collaboration has some story. No two are the same.

Lean programming language

Lex Fridman (01:32:51) And then on the flip side of that, like you mentioned with Lean programming, that’s almost a different story, because you can create — I think you’ve mentioned — a blueprint for a problem, and then you can really do divide and conquer with Lean, where you’re working on separate parts and using the computer proof checker to make sure that everything is correct along the way.

Terence Tao (01:33:17) So it makes everything compatible and trustable. Currently only a few mathematical projects can be cut up in this way. At the current state of the art, most of the Lean activity is in formalizing proofs that have already been proven by humans. A math paper basically is a blueprint, in a sense — it takes a difficult statement, a big theorem, and breaks it up into maybe a hundred little lemmas, but often not all written with enough detail that each one can be directly formalized.

(01:33:46) A blueprint is a really pedantically written version of a paper, where every step is explained in as much detail as possible, trying to make each step self-contained, or depending only on a very specific set of previous statements that have been proven, so that each node of the blueprint graph that gets generated can be tackled independently of all the others. You don’t even need to know how the whole thing works. It’s like a modern supply chain: if you want to create an iPhone or some other complicated object, no one person can build the whole thing, but you can have specialists who, given some widgets from some other company, can combine them together to form a slightly bigger widget.

Lex Fridman (01:34:27) I think that’s a really exciting possibility, because if you can find problems that can be broken down in this way, then you could have thousands of contributors, right? Completely distributed.

Terence Tao (01:34:39) So I told you before about this split between theoretical and experimental mathematics. Right now most mathematics is theoretical, and only a tiny bit is experimental. I think the platform that Lean and other software tools — GitHub and things like that — provide will allow experimental mathematics to scale up to a much greater degree than we can do now. Right now, if you want to do any mathematical exploration of some mathematical pattern or something, you need some code to work out the pattern. Sometimes there are computer algebra packages that can help, but often it’s just one mathematician coding lots and lots of Python or whatever. And because coding is such an error-prone activity, it’s not practical to let other people collaborate with you on writing modules for your code, because if one of the modules has a bug in it, the whole thing is unreliable. So you get this bespoke spaghetti code written by non-professional programmers — mathematicians — and it’s clunky and slow. Because of that, it’s hard to really mass-produce experimental results.

(01:35:45) But I think with Lean — I’m already starting some projects where we are not just experimenting with data, but experimenting with proofs. So I have this project called the Equational Theories Project. Basically, we generated about 22 million little problems in abstract algebra. Maybe I should back up and tell you what the project is. Abstract algebra studies operations like multiplication and addition and their abstract properties. Multiplication, for example, is commutative: X times Y is always Y times X, at least for numbers. And it’s also associative: (X times Y) times Z is the same as X times (Y times Z). So these operations obey some laws and not others. For example, X times X is not always equal to X, so that law is not always true. Given any operation, it obeys some laws and not others. And so we generated about 4,000 of these possible laws of algebra that operations can satisfy.

(01:36:38) And our question is: which laws imply which other ones? So, for example, does commutativity imply associativity? The answer is no, because it turns out you can describe an operation which obeys the commutative law but doesn’t obey the associative law. So by producing an example, you can show that commutativity does not imply associativity. But some laws do imply other laws, by substitution and so forth, and you can write down an algebraic proof. So we look at all the pairs between these 4,000 laws — that’s about 22 million pairs — and for each pair we ask: does this law imply that law? If so, give a proof; if not, give a counterexample. So 22 million problems, each one of which you could give to an undergraduate algebra student, and they’d have a decent chance of solving the problem — although there are a few, out of the 22 million, maybe a hundred or so, that are really quite hard, but a lot are easy. And the project was just to determine the entire graph — which ones imply which other ones.
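
Note: In the spirit of those little problems, here is a toy Lean sketch (hypothetical, not taken from the project) of one implication failing: the operation x ∘ y = x·y + 1 on the natural numbers obeys the commutative law but not the associative law.

```lean
-- A commutative operation...
def op (x y : Nat) : Nat := x * y + 1

theorem op_comm (x y : Nat) : op x y = op y x := by
  unfold op
  rw [Nat.mul_comm]

-- ...that is not associative: a single numerical counterexample settles it.
example : op (op 0 0) 1 ≠ op 0 (op 0 1) := by
  decide   -- op (op 0 0) 1 = 2, while op 0 (op 0 1) = 1
```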

Lex Fridman (01:37:31) That’s an incredible project, by the way. Such a good idea, such a good test that the very thing we’ve been talking about at a scale that’s remarkable.

Terence Tao (01:37:38) So it would not have been feasible before. The state of the art in the literature was like 15 equations and how they implied each other — that’s at the limit of what a human with pen and paper can do. So you need to scale that up, so you need to crowdsource. But you also need to trust all the contributions; no one person can check 22 million of these proofs. It needed to be computerized, and so it only became possible with Lean. We were hoping to use a lot of AI as well. The project is almost complete: of these 22 million, all but two have been settled.

Terence Tao (01:38:12) Well, actually, of those two, we now have a pen-and-paper proof, and we’re formalizing it. In fact, this morning I was working on finishing it, so we’re almost done with this.

Lex Fridman (01:38:25) How many people were you able to get?

Terence Tao (01:38:26) About 50, which in mathematics is considered a huge number.

Lex Fridman (01:38:30) It’s a huge number. That’s crazy.

Terence Tao (01:38:32) Yeah. So we’re going to have a paper of 50 authors and a big appendix of who contributed what.

Lex Fridman (01:38:38) Here’s a question, maybe to speak even more generally about it: when you have this pool of people, is there a way to organize the contributions by the level of expertise of the contributors? Okay, I’m asking a lot of pothead questions here, but I’m imagining a bunch of humans, and maybe in the future some AIs — can there be an Elo-rating type of situation?

Lex Fridman (01:39:00) Can there be an Elo-rating type of situation, with a gamification of this?

Terence Tao (01:39:07) The beauty of these Lean projects is that you automatically get all this data — everything’s being uploaded to GitHub, and GitHub tracks who contributed what. So you could generate statistics at any later point in time. You could say, “Oh, this person contributed this many lines of code” or whatever. These are very crude metrics. I would definitely not want this to become part of your tenure review or something. But I think already in enterprise computing, people do use some of these metrics as part of the assessment of performance of an employee. Again, this is a direction which is a bit scary for academics to go down. We don’t like metrics so much.

Lex Fridman (01:39:49) And yet academics use metrics. They just use old ones, number of papers.

Terence Tao (01:39:56) Yeah, it’s true that…

Lex Fridman (01:39:59) It feels like this metric, while flawed, is going more in the right direction. Right?

Lex Fridman (01:40:06) It’s interesting. At least it’s a very interesting metric.

Terence Tao (01:40:08) Yeah, I think it’s interesting to study. I think you can do studies of whether these are better predictors. There’s this problem called Goodhart’s Law. If a statistic is actually used to incentivize performance, it becomes gamed, and then it’s no longer a useful measure.

Lex Fridman (01:40:22) Oh, humans. Always gaming the…

Terence Tao (01:40:25) It’s rational. So what we’ve done for this project is self-reporting. There are actually standard categories from the sciences for the types of contributions people make: there’s conceptualization, validation, resources, coding, and so forth. There’s a standard list of twelve or so categories, and we just ask each contributor — there’s a big matrix of all the authors and all the categories — to tick the boxes where they think they contributed, and just give a rough idea: you did some coding and you provided some compute, but you didn’t do any of the pen-and-paper verification, or whatever.

(01:41:02) And I think that works out. Traditionally, mathematicians just order authors alphabetically by surname, so we don’t have this tradition, as in the sciences, of “lead author” and “second author” and so forth — which we’re proud of; we make all the authors equal status — but it doesn’t quite scale to this size. So a decade ago I was involved in these things called Polymath projects. It was crowdsourcing mathematics, but without the Lean component. It was limited in that you needed a human moderator to actually check that all the contributions coming in were valid, and this was a huge bottleneck, actually, but still we had projects with 10 authors or so. We had decided, at the time, not to try to decide who did what, but to have a single pseudonym. So we created this fictional character called D.H.J. Polymath, in the spirit of Bourbaki, the pseudonym for a famous group of mathematicians in the 20th century.

(01:41:56) And so the paper was authored under the pseudonym, so none of us got the author credit. This actually turned out to be not so great, for a couple of reasons. One is that if you actually wanted to be considered for tenure or whatever, you could not submit this paper as one of your publications, because it didn’t have the formal author credit. But the other thing that we recognized much later is that when people referred to these projects, they naturally referred to the most famous person who was involved: “This was Tim Gowers’s Polymath project,” “This was Terence Tao’s Polymath project,” without mentioning the other 19 or however many people that were involved.

Terence Tao (01:42:37) So we’re trying something different this time around, where everyone’s an author, but we will have an appendix with this matrix, and we’ll see how that works.

DeepMind’s AlphaProof

Lex Fridman (01:42:45) So both projects are incredible — just the fact that you’re involved in such huge collaborations. I saw a talk from Kevin Buzzard about the Lean programming language just a few years ago, where he was saying that this might be the future of mathematics. So it’s also exciting to see you, one of the greatest mathematicians in the world, embracing what seems like the paving of the future of mathematics.

(01:43:12) So I have to ask you here about the integration of AI into this whole process. DeepMind’s AlphaProof was trained using reinforcement learning on both failed and successful formal Lean proofs of IMO problems. So this is sort of high-level high school?

Terence Tao (01:43:32) Oh, very high-level, yes.

Lex Fridman (01:43:33) Very high-level, high-school-level mathematics problems. What do you think about the system, and maybe, what is the gap between this system that is able to prove the high-school-level problems versus graduate-level problems?

Terence Tao (01:43:47) Yeah, the difficulty increases exponentially with the number of steps involved in the proof — it’s a combinatorial explosion. The thing about large language models is that they make mistakes, and so if a proof has got 20 steps and your model has a 10% failure rate at each step of going in the wrong direction, it’s extremely unlikely to actually reach the end.
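
Note: The compounding-error point can be made concrete with a back-of-the-envelope calculation (my illustration of the exponential decay, using the 10% per-step figure Tao mentions):

```latex
% With a 10% chance of going wrong at each step, the chance of surviving all n steps
% unaided decays exponentially in the length of the proof:
\Pr[\text{all } n \text{ steps correct}] = 0.9^{\,n},
\qquad 0.9^{20} \approx 0.12,
\qquad 0.9^{100} \approx 3\times 10^{-5}.
```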

Lex Fridman (01:44:09) Actually, just to take a small tangent here, how hard is the problem of mapping from natural language to the formal program?

Terence Tao (01:44:19) Oh yeah. It’s extremely hard, actually. Natural language is very fault-tolerant: you can make a few minor grammatical errors, and someone speaking a second language can still get some idea of what you’re saying. But in a formal language, if you get one little thing wrong, the whole thing is nonsense. Even formal-to-formal is very hard. There are different, incompatible proof assistants and languages — there’s Lean, but also Coq and Isabelle and so forth — and even converting from one formal language to another formal language is an unsolved problem.

Lex Fridman (01:44:52) That is fascinating. Okay. But once you have the formal language, they’re using their RL-trained model — something akin to the AlphaZero they used for Go — to then try to come up with proofs. They also have a model, I believe it’s a separate model, for geometric problems.

Lex Fridman (01:45:12) So what impresses you about the system, and what do you think is the gap?

Terence Tao (01:45:18) Yeah, we talked earlier about how things that are amazing, over time, become kind of normalized. So now somehow it’s, of course, geometry is a solved problem.

Lex Fridman (01:45:27) Right. That’s true, that’s true. I mean, it’s still beautiful to…

Terence Tao (01:45:31) Yeah, it’s great work that shows what’s possible. The approach doesn’t scale currently — three days of Google’s server time to solve one high-school math problem. This is not a scalable prospect, especially with the exponential increase as the complexity increases.

Lex Fridman (01:45:49) We should mention that they got a silver medal performance. The equivalent of the silver medal performance.

Terence Tao (01:45:55) So first of all, they took way more time than was allotted, and they had assistance — the humans helped by formalizing the problems first — but also they gave themselves full marks for the solutions, which I guess are formally verified, so I guess that’s fair. There are efforts — there will be a proposal at some point — to actually have an AI Math Olympiad, where at the same time as the human contestants get the actual Olympiad problems, AIs will also be given the same problems and the same time period, and the outputs will have to be graded by the same judges, which means they’ll have to be written in natural language rather than formal language.

Lex Fridman (01:46:37) Oh, I hope that happens. I hope it happens at this IMO — or the next one.

Terence Tao (01:46:41) It won’t happen at this IMO. The performance is not good enough in the time period. But there are smaller competitions — there are competitions where the answer is a number rather than a long-form proof — and AI is actually a lot better at problems where there’s a specific numerical answer, because it’s easy to do reinforcement learning on: you’ve got the right answer, or you’ve got the wrong answer. It’s a very clear signal. But a long-form proof either has to be formal, and then Lean can give it a thumbs up or down, or it’s informal, but then you need a human to grade it. And if you’re trying to do billions of reinforcement learning runs, you can’t hire enough humans to grade those. It’s already hard enough for the large language models to do reinforcement learning on just the regular text that people generate. To actually hire people, not just to give a thumbs up or down, but to actually check the output mathematically — yeah, that’s too expensive.

Human mathematicians vs AI

Lex Fridman (01:47:45) So if we just explore this possible future: what is the thing that humans do that’s most special in mathematics, that you could see AI not cracking for a while? Inventing new theories? Coming up with new conjectures versus proving the conjectures? Building new abstractions, new representations? Maybe seeing new connections between disparate fields?

Terence Tao (01:48:17) That’s a good question. I think the nature of what mathematicians do has changed a lot over time. A thousand years ago, mathematicians had to compute the date of Easter — really complicated calculations — but that’s all been automated over the centuries; we don’t need it anymore. They used to do spherical trigonometry for navigation, to work out how to get from the Old World to the New World or something — very complicated calculations — and again, those have been automated. Even a lot of undergraduate mathematics was automated before AI — Wolfram Alpha, for example. It’s not a language model, but it can solve a lot of undergraduate-level math tasks. So on the computational side: verifying routine things, like having a problem and saying, “Here’s a problem in partial differential equations, could you solve it using any of the 20 standard techniques?”, and getting back, “Yes, I’ve tried all 20, and here are the 100 different permutations and my results.”

(01:49:12) That type of thing I think AI will do very well — the type of scaling where, once you solve one problem, you make the AI attack a hundred adjacent problems. The things that humans still do… Where the AI really struggles right now is knowing when it’s made a wrong turn. It can say, “Oh, I’m going to solve this problem. I’m going to split it up into these two cases. I’m going to try this technique.” And sometimes, if you’re lucky and it’s a simple problem, it’s the right technique and you solve the problem; and sometimes it will propose an approach which is just complete nonsense, but it looks like a proof.

(01:49:53) So this is one annoying thing about LLM-generated mathematics. We’ve had human-generated mathematics that’s very low quality — submissions from people who don’t have the formal training and so forth — but if a human proof is bad, you can tell it’s bad pretty quickly; it makes really basic mistakes. The AI-generated proofs can look superficially flawless. That’s partly because what the reinforcement learning has actually trained them to do is to produce text that looks like it is correct, which for many applications is good enough. So the errors are often really subtle, and then when you spot them, they’re really stupid — no human would’ve actually made that mistake.

Lex Fridman (01:50:36) Yeah, it’s actually really frustrating in the programming context, because I program a lot, and when a human writes low-quality code, there’s something called “code smell,” right? You can tell immediately — there are signs. But with the AI-generated code…

Terence Tao (01:50:53) [inaudible 01:50:53].

Lex Fridman (01:50:52) And you’re right, eventually you find an obvious dumb thing, but it just looks like good code.

Lex Fridman (01:51:00) It’s very tricky, too, and frustrating, for some reason, to have to work with.

Terence Tao (01:51:05) So the sense of smell — this is one thing that humans have, and there’s a metaphorical mathematical smell that it’s not clear how to get the AI to duplicate eventually. The way AlphaZero and so forth made progress on Go and chess is, in some sense, that they developed a sense of smell for Go and chess positions — this position is good for white, this is good for black. They can’t articulate why, but just having that sense of smell lets them strategize. So if AIs gain that ability, a sense of the viability of certain proof strategies — say, I’m going to try to break up this problem into two smaller subtasks, and they can say, “Oh, this looks good; the two tasks look simpler than your main task and they’ve still got a good chance of being true, so this is good to try,” or, “No, you’ve made the problem worse, because each of the two subproblems is actually harder than your original problem,” which is what normally happens if you try a random thing — normally it’s very easy to transform a problem into an even harder problem, and very rarely do you transform it into a simpler one — if they can pick up that sense of smell, then they could maybe start competing with human-level mathematicians.

Lex Fridman (01:52:24) So this is a hard question — not competing, but collaborating. Okay, hypothetical: if I gave you an Oracle that was able to do some aspect of what you do, and you could just collaborate with it, what would you like that Oracle to be able to do? Would you like it to maybe be a verifier, checking your work — “Yes, Professor Tao, correct, this is a promising, fruitful direction”? Or would you like it to generate possible proofs, and then you see which one is the right one? Or would you like it to maybe generate different representations, totally different ways of seeing this problem?

Terence Tao (01:53:10) Yeah, I think all of the above. A lot of it is that we don’t know how to use these tools, because it’s a paradigm that we have not had in the past: systems that are competent enough to understand complex instructions and can work at massive scale, but are also unreliable — unreliable in subtle ways, while still producing superficially convincing output. It’s an interesting combination. I mean, you have graduate students that you work with who are kind of like this, but not at scale, and we’ve had previous software tools that can work at scale but are very narrow. So we have to figure out how to use them. Tim Gowers, whom you mentioned, actually foresaw this in 2000 — he was envisioning what mathematics would look like two and a half decades out.

Terence Tao (01:54:09) Yeah, he wrote this article, a hypothetical conversation between a mathematical assistant of the future and himself. He’s trying to solve a problem and they have a conversation. Sometimes the human would propose an idea and the AI would evaluate it, and sometimes the AI would propose an idea. Sometimes a computation was required, and the AI would just go and say, “Okay, I’ve checked the 100 cases needed here,” or “You asserted this is true for all N; I’ve checked N up to 100 and it looks good so far,” or “Hang on, there’s a problem at N equals 46.” So just a freeform conversation where you don’t know in advance where things are going to go, but ideas could be proposed on both sides, and calculations could be done on both sides.

(01:54:53) I’ve had conversations with AI where I say, “Okay, we’re going to collaborate to solve this math problem,” and it’s a problem that I already know the solution to, so I try to prompt it: “Okay, so here’s the problem. I suggest using this tool.” And it might start using it, and then it’ll go back to the tool that it wanted to use before. You have to keep railroading it onto the path you want, and I could eventually force it to give the proof I wanted, but it was like herding cats. And the amount of personal effort I had to take to not just prompt it but also check its output, because a lot of it looked like it was going to work but I knew there was a problem on line 17, and basically arguing with it, it was more exhausting than doing it unassisted. But that’s the current state of the art.

Lex Fridman (01:55:44) I wonder if there’s a phase shift that happens where it no longer feels like herding cats. And maybe it’ll surprise us how quickly that comes.

Terence Tao (01:55:54) I believe so. In formalization, I mentioned before that it takes 10 times longer to formalize a proof than to write it by hand. With these modern AI tools and also just better tooling, the Lean developers are doing a great job adding more and more features and making it user-friendly, that ratio is going from nine to eight to seven… Okay, no big deal, but one day it’ll drop below one. And that’s a phase shift, because suddenly it makes sense, when you write a paper, to write it in Lean first, or through a conversation with AI which formalizes it on the fly with you, and it becomes natural for journals to accept it. Maybe they’ll offer expedited refereeing. If a paper has already been formalized in Lean, they’ll just ask the referee to comment on the significance of the results and how it connects to the literature, and not worry so much about the correctness, because that’s been certified. Papers are getting longer and longer in mathematics, and it’s harder and harder to get good refereeing for the really long ones unless they’re really important. It is actually an issue, and formalization is coming in at just the right time for this to be feasible.
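
For readers who have never seen Lean, here is a toy illustration, not from the conversation, of what a machine-checked statement looks like. Real formalizations build on the Mathlib library and are vastly larger, but the principle is the same: once the kernel accepts the proof, its correctness is certified.

```lean
-- Both statements are checked mechanically by the Lean kernel.
-- `rfl` asserts that the two sides compute to the same value.
example : 2 + 2 = 4 := rfl
example (n : Nat) : n + 0 = n := rfl
```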

Lex Fridman (01:57:04) And as it gets easier and easier because of the tooling and all the other factors, you’re going to see much more of it; Mathlib will grow, potentially exponentially, as it’s a virtuous cycle.

Terence Tao (01:57:16) I mean, one phase shift of this type that happened in the past was the adoption of LaTeX. So LaTeX is this typesetting language that all mathematicians use now. So in the past people used all kinds of word processors and typewriters and whatever, but at some point LaTeX became easier to use than all other competitors, and people switched within a few years. It was just a dramatic phase shift.

AI winning the Fields Medal

Lex Fridman (01:57:37) It’s a wild, out-there question, but what year, how far away are we from an AI system being a collaborator on a proof that wins the Fields Medal? So that level.

Terence Tao (01:57:55) Okay, well it depends on the level of collaboration, right?

Lex Fridman (01:57:58) No, it deserves to get the Fields Medal. So half-and-half.

Terence Tao (01:58:03) Already, I can imagine a medal-winning paper having some AI assistance in the writing. Just the autocomplete alone, I use it, it already speeds up my own writing. You can have a theorem and a proof, and the proof has three cases, and I write down the proof of the first case, and the autocomplete just suggests, now here’s how the proof of the second case could work. And it was exactly correct. That was great. Saved me like five, ten minutes of typing.

Lex Fridman (01:58:30) But in that case, the AI system doesn’t get the Fields Medal. Are we talking 20 years, 50 years, a hundred years? What do you think?

Terence Tao (01:58:42) Okay, so I gave a prediction in print by 2026, which is now next year, there will be math collaborations with the AI, so not Fields-Medal winning, but actual research-level papers.

Lex Fridman (01:58:54) Published ideas that are in part generated by AI.

Terence Tao (01:58:58) Maybe not the ideas, but at least some of the computations, the verifications.

Lex Fridman (01:59:03) Has that already happened?

Terence Tao (01:59:04) That already happened. There are problems that were solved by a complicated process of conversing with AI to propose things, and the human goes and tries it, and maybe that doesn’t work, but it prompts a different idea. It’s hard to disentangle exactly. There are certainly math results which could only have been accomplished because there was a human mathematician and an AI involved, but it’s hard to disentangle credit. I mean, these tools, they do not replicate all the skills needed to do mathematics, but they can replicate some non-trivial percentage of them, 30, 40%, so they can fill in gaps. So coding is a good example. It’s annoying for me to code in Python. It’s not native to me, I’m not a professional programmer, but with AI, the friction cost of doing it is much reduced. So it fills in that gap for me. AI is getting quite good at literature review.

(02:00:15) I mean, there’s still a problem with hallucinating references that don’t exist, but this, I think, is a solvable problem. If you train it in the right way and so forth and verify using the internet, you should, in a few years, get to the point where you have a lemma that you need and you say, “Has anyone proven this lemma before?” And it will do basically a fancy web search and say, yeah, there are these six papers where something similar has happened. I mean, you can ask it right now and it’ll give you six papers, of which maybe one is legitimate and relevant, one exists but is not relevant, and four are hallucinated. It has a non-zero success rate right now, but there’s so much garbage, the signal-to-noise ratio is so poor, that it’s most helpful when you already somewhat know the literature, and you just need to be prompted to be reminded of a paper that was already subconsciously in your memory.

Lex Fridman (02:01:14) Versus helping you discover something new you were not even aware of, but which is the correct citation.

Terence Tao (02:01:20) Yeah, that it can sometimes do, but when it does, it’s buried in a list of options for which the other-

Lex Fridman (02:01:26) That are bad. I mean, being able to automatically generate a related work section that is correct. That’s actually a beautiful thing. That might be another phase shift because it assigns credit correctly. It breaks you out of the silos of thought.

Terence Tao (02:01:42) Yeah, no, there’s a big hump to overcome right now. I mean, it’s like self-driving cars. The safety margin has to be really high for it to be feasible. So yeah, there’s a [inaudible 02:01:54] problem with a lot of AI applications, that they can develop tools that work 20%, 80% of the time, but it’s still not good enough. And in fact, in some ways that’s even worse than not working at all.

Lex Fridman (02:02:08) I mean, another way of asking the Fields Medal question is, what year do you think you’ll wake up and be really surprised? You read the headline, the news, that AI did something, a real breakthrough. Something like a Fields Medal-level result, even a famous hypothesis. It could really be like the AlphaZero moment for Go, that kind of thing.

Terence Tao (02:02:33) Yeah, this decade, I can see it making a conjecture connecting two things that people thought were unrelated.

Lex Fridman (02:02:42) Oh, interesting. Generating a conjecture. That’s a beautiful conjecture.

Terence Tao (02:02:45) Yeah. And actually has a real chance of being correct and meaningful.

Lex Fridman (02:02:50) Because that’s actually kind of doable, I suppose, but the word of the data is…

Lex Fridman (02:02:56) No, that would be truly amazing.

Terence Tao (02:02:59) The current models struggle a lot. I mean, a version of this… The physicists have a dream of getting the AI to discover new laws of physics. The dream is you just feed it all this data and it says, here is a new pattern that we didn’t see before. But actually, even the current state of the art struggles to discover old laws of physics from the data. Or if it does, there’s a big concern of contamination, that it did it only because it’s somewhere in its training data, that it somehow knew Boyle’s Law or whatever you’re trying to reconstruct.

(02:03:35) Part of it is we don’t have the right type of training data for this. So for laws of physics, we don’t have a million different universes with a million different laws of nature. And a lot of what we are missing in math is actually the negative space. So we have published records of things that people have been able to prove, and conjectures that end up being verified or for which counterexamples were produced, but we don’t have data on things that were proposed that were kind of a good thing to try, but then people quickly realized it was the wrong conjecture, and then said, “Oh, we should actually modify our claim in this way to actually make it more plausible.”

(02:04:16) There’s a trial and error process, which is a real integral part of human mathematical discovery, which we don’t record because it’s embarrassing. We make mistakes, and we only like to publish our wins. And the AI has no access to this data to train on. I sometimes joke that basically AI has to go through grad school and actually go to grad courses, do the assignments, go to office hours, make mistakes, get advice on how to correct the mistakes and learn from that.

Grigori Perelman

Lex Fridman (02:04:47) Let me ask you, if I may, about Grigori Perelman. You mentioned that you try to be careful in your work and not let a problem completely consume you, where you just really fall in love with the problem and cannot rest until you solve it. But you also hastened to add that sometimes this approach actually can be very successful, and an example you gave is Grigori Perelman, who proved the Poincare Conjecture and did so by working alone for seven years, with basically little contact with the outside world. Can you explain this one Millennium Prize problem that’s been solved, the Poincare Conjecture, and maybe speak to the journey that Grigori Perelman has been on?

Terence Tao (02:05:31) All right, so it’s a question about curved spaces. Earth is a good example. So think of the surface of the Earth as a 2-D surface. A priori, it could maybe be a torus with a hole in it, or it could have many holes; there are many different topologies, a priori, that a surface could have, even if you assume that it’s bounded and smooth and so forth. So we have figured out how to classify surfaces, as a first approximation. Everything is determined by something called the genus, how many holes it has. So a sphere has genus zero, a donut has genus one, and so forth. And one way you can tell surfaces apart is a property the sphere has, which is called being simply connected. If you take any closed loop on the sphere, like a big closed loop of rope, you can contract it to a point while staying on the surface. And the sphere has this property, but a torus doesn’t. If you’re on a torus and you take a rope that goes around, say, the outer diameter of the torus, there’s no way… It can’t get through the hole. There’s no way to contract it to a point.

(02:06:25) So it turns out that the sphere is the only surface with this property of contractibility, up to continuous deformations of the sphere, so up to things that are what’s called topologically equivalent to the sphere. So Poincare asked the same question in higher dimensions. It becomes hard to visualize, because a surface you can think of as embedded in three dimensions, but for a curved three-dimensional space, we don’t have good intuition of the four-dimensional space for it to live in. And there are also three-dimensional spaces that can’t even fit into four dimensions; you need five or six or higher. But anyway, mathematically you can still pose this question: if you have a bounded three-dimensional space now, which also has this simply connected property that every loop can be contracted, is it a three-dimensional version of the sphere? And so this is the Poincare conjecture.

(02:07:09) Weirdly, in higher dimensions, four and five were actually easier. So it was solved first in higher dimensions; there’s somehow more room to do the deformation, it is easier to move things around into a sphere. But three was really hard. So people tried many approaches. There are sort of combinatorial approaches, where you chop up the surface into little triangles or tetrahedra and you just try to argue based on how the faces interact with each other. There were algebraic approaches; there are various algebraic objects, like things called the fundamental group, and homology and cohomology, all these very fancy tools that you can attach to these spaces. They also didn’t quite work, but Richard Hamilton proposed a partial differential equations approach.

(02:07:52) So the problem is that you have this object, which secretly is a sphere, but it’s given to you in a weird way. So think of a ball that’s been crumpled up and twisted, and it’s not obvious that it’s a ball. But if you have some sort of surface which is a deformed sphere, you could, for example, think of it as the surface of a balloon. You could try to inflate it, you blow it up, and naturally, as you fill it with air, the wrinkles will sort of smooth out and it will turn into a nice round sphere, unless of course it was a torus or something, in which case it would get stuck at some point.

(02:08:32) If you inflate a torus, there will be a point when the inner ring shrinks to zero; you get a singularity, you can’t blow up any further, and you can’t flow further. So he created this flow, which is now called Ricci flow, which is a way of taking an arbitrary surface or space and smoothing it out to make it rounder and rounder, to make it look like a sphere. And he wanted to show that either this process would give you a sphere, or it would create a singularity, actually very much like how PDEs either have global regularity or finite-time blowup. Basically, it’s almost exactly the same thing. It’s all connected. And he showed that for two dimensions, two-dimensional surfaces, if you start with something simply connected, no singularities ever form, you never run into trouble, you can flow, and it will give you a sphere. So he got a new proof of the two-dimensional result.
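
For reference, the evolution equation Hamilton introduced can be written in its standard form (not quoted in the conversation) as

$$\frac{\partial g_{ij}}{\partial t} = -2\,R_{ij},$$

where $g$ is the metric and $R_{ij}$ its Ricci curvature: positively curved regions contract, so the space gradually becomes rounder.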

Lex Fridman (02:09:20) But by the way, that’s a beautiful explanation of Ricci flow and its application in this context. How difficult is the mathematics here, for the 2D case?

Terence Tao (02:09:27) Yeah, these are quite sophisticated equations, on par with the Einstein equations. Slightly simpler, but they were considered hard nonlinear equations to solve, and there are lots of special tricks in 2D that helped. But in 3D, the problem was that this equation was actually supercritical. The same problem as [inaudible 02:09:48]. As you blow up, maybe the curvature could get concentrated in smaller and smaller regions, and it looked more and more nonlinear, and things just looked worse and worse. And there could be all kinds of singularities that showed up. Some singularities, these things called neck pinches, where the surface behaves like a barbell and it pinches at a point, are simple enough that you can sort of see what to do next: you just make a snip, and then you can turn one surface into two and evolve them separately. But there was the prospect that some really nasty knotted singularities would show up, that you couldn’t see how to resolve in any way, that you couldn’t do any surgery to. So you need to classify all the singularities, like what are all the possible ways that things can go wrong? So what Perelman did was, first of all, he turned the problem from a supercritical problem into a critical problem. I said before about how the invention of energy, the Hamiltonian, really clarified Newtonian mechanics. So he introduced something which is now called Perelman’s reduced volume and Perelman’s entropy. He introduced new quantities, kind of like energy, that looked the same at every single scale, and turned the problem into a critical one where the nonlinearities actually suddenly looked a lot less scary than they did before. And then he still had to analyze the singularities of this critical problem. And that itself was a problem similar to this wave maps thing I worked on, actually. So on the level of difficulty of that.

(02:11:18) So he managed to classify all the singularities of this problem and show how to apply surgery to each of these, and through that was able to resolve the Poincare Conjecture. So quite a lot of really ambitious steps, and nothing that a large language model today, for example, could do. At best, I could imagine a model proposing this idea as one of hundreds of different things to try, but the other 99 would be complete dead ends. You’d only find out after months of work. He must have had some sense that this was the right track to pursue, and it still took years to get from A to B.

Lex Fridman (02:11:54) So you’ve done, like you said, not just strictly mathematically, but more broadly in terms of the process, you’ve done similar-

Lex Fridman (02:12:01) In terms of the process, you’ve done similarly difficult things. What can you infer about the process he was going through, given he was doing it alone? What are some low points in a process like that? You’ve mentioned hardship, and that AI doesn’t know when it’s failing. What happens to you? You’re sitting in your office when you realize the thing you did for the last few days, maybe weeks, is a failure?

Terence Tao (02:12:27) Well, for me, I switch to a different problem. So I’m a fox, I’m not a hedgehog.

Lex Fridman (02:12:33) But you’re generally, that is a break that you can take, is to step away and look at a different problem?

Terence Tao (02:12:37) Yeah, yeah. You can modify the problem too. I mean, you can use some cheats: if there’s a specific thing that’s blocking you, some bad case that keeps showing up for which your tool doesn’t work, you can just assume by fiat that this bad case doesn’t occur. So you do some magical thinking, but it’s strategically okay at that point, to see if the rest of the argument goes through. If there are multiple problems with your approach, then maybe you just give up. But if this is the only problem and everything else checks out, then it’s still worth fighting. So yeah, you have to do some forward reconnaissance sometimes too.

Lex Fridman (02:13:18) And that is sometimes productive to assume like, “Okay, we’ll figure it out eventually”?

Terence Tao (02:13:21) Oh, yeah, yeah. Sometimes actually it’s even productive to make mistakes. There was a project, which actually we won some prizes for, with four other people. We worked on this PDE problem, again actually this blowup regularity type problem, and it was considered very hard. Jean Bourgain was another Fields medalist who worked on a special case of this, but he could not solve the general case. And we worked on this problem for two months and we thought we solved it. We had this cute argument where everything fit, and we were excited. We were planning a celebration, to all get together and have champagne or something, and we started writing it up. And one of us, not me actually, but another co-author said, “Oh, in this lemma here, we have to estimate these 13 terms that show up in this expansion.

(02:14:13) And we estimated 12 of them, but in our notes, I can’t find the estimation of the 13th. Can someone supply that?” And I said, “Sure, I’ll look at this.” Yeah, we didn’t cover it. We had completely omitted this term, and this term turned out to be worse than the other 12 terms put together. In fact, we could not estimate this term. And we tried for a few more months, all different permutations, and there was always this one term that we could not control. And so this was very frustrating. But because we had already invested months and months of effort in this, we stuck at it, and we tried increasingly desperate and crazy things. And after two years we found an approach that was somewhat different, quite a bit different from our initial strategy, which didn’t generate these problematic terms and actually solved the problem.

(02:14:58) So we solved the problem after two years, but if we hadn’t had that initial false dawn of nearly solving the problem, we would’ve given up by month two or something and worked on an easier problem. If we had known it would take two years, I’m not sure we would’ve started the project. Sometimes having incorrect information actually helps. It’s like Columbus stumbling on the New World: he had an incorrect measurement of the size of the Earth. He thought he was going to find a new trade route to India, or at least that was how he sold it in his prospectus. I mean, it could be that he actually secretly knew, but.

Lex Fridman (02:15:31) Just from a psychological element, do you have emotional lows or self-doubt that just overwhelm you in moments like that? Because this stuff, it feels like math is so engrossing that it can break you when you invest so much of yourself in the problem and then it turns out wrong. You could start to… in a similar way that chess has broken some people.

Terence Tao (02:15:59) Yeah, I think different mathematicians have different levels of emotional investment in what they do. I mean, I think for some people it’s a job: you have a problem, and if it doesn’t work out, you go on to the next one. So the fact that you can always move on to another problem reduces the emotional connection. I mean, there are cases, there are certain problems that are what are called mathematical diseases, where people just latch onto that one problem and spend years and years thinking about nothing but that one problem. And maybe their career suffers and so forth, but they say, “Okay, once I finish this problem, I’ve got this big win, and it will make up for all the years of lost opportunity.” I mean, occasionally it works, but I really don’t recommend it for people without the right fortitude.

(02:16:54) So I’ve never been super invested in any one problem. One thing that helps is that we don’t need to call our shots in advance. Well, when we do grant proposals, we say we will study this set of problems, but we don’t promise that, definitely, within five years I will supply a proof of all these things. You promise to make some progress or discover some interesting phenomena. And maybe you don’t solve the problem, but you find some related problem that you can say something new about, and that’s a much more feasible task.

Twin Prime Conjecture

Lex Fridman (02:17:27) But I’m sure for you, there’s problems like this. You have made so much progress towards the hardest problems in the history of mathematics. So is there a problem that just haunts you? It sits there in the dark corners, twin prime conjecture, Riemann hypothesis, Goldbach’s conjecture?

Terence Tao (02:17:48) Twin prime, that sounds… Look, again, I mean, the problems like the Riemann hypothesis, those are so far out of reach.

Terence Tao (02:17:55) Yeah, there’s not even a viable strategy. Even if I activate all the cheats that I know of in the book, there’s just still no way to get from A to B. I think it needs a breakthrough in another area of mathematics to happen first, and for someone to recognize that it would be a useful thing to transport into this problem.

Lex Fridman (02:18:18) So we should maybe step back for a little bit and just talk about prime numbers.

Lex Fridman (02:18:23) So they’re often referred to as the atoms of mathematics. Can you just speak to the structure that these atoms provide?

Terence Tao (02:18:31) So the natural numbers have two basic operations: addition and multiplication. So if you want to generate the natural numbers, you can do one of two things. You can just start with one and add one to itself over and over again, and that generates the natural numbers. So additively, they’re very easy to generate: one, two, three, four, five. Or, if you want to generate them multiplicatively, you can take the prime numbers, two, three, five, seven, and multiply them together, and that gives you all the natural numbers, except maybe for one. So there are these two separate ways of thinking about the natural numbers, from an additive point of view and a multiplicative point of view. And separately, they’re not so bad. So any question about the natural numbers that only involves addition is relatively easy to solve.

(02:19:11) And any question that only involves multiplication is relatively easy to solve. But what has been frustrating is that you combine the two together and suddenly you get something extremely rich. I mean, we know that there are statements in number theory that are actually undecidable: there are certain polynomials in some number of variables where the question of whether there is a solution in the natural numbers depends on an undecidable statement, like whether the axioms of mathematics are consistent or not. But even the simplest problems that combine something multiplicative, such as the primes, with something additive, such as shifting by two… Separately, we understand both of them well, but if you ask, when you shift a prime by two, how often can you get another prime? It’s been amazingly hard to relate the two.
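
A small Python sketch of the two modes of generation Tao describes; the function name and the example number 12 are our own illustrative choices, not from the conversation.

```python
# Additively, the naturals come from repeatedly adding 1; multiplicatively,
# every natural number > 1 is a product of primes (its "atoms").

def factorize(n: int) -> list[int]:
    """Return the prime factors of n, with multiplicity, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(sum(1 for _ in range(12)))  # 12 built additively: 1 + 1 + ... + 1
print(factorize(12))              # [2, 2, 3]: 12 built multiplicatively
```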

Lex Fridman (02:19:59) And we should say that the twin prime conjecture is just that: it posits that there are infinitely many pairs of prime numbers that differ by two. Now the interesting thing is that you have been very successful at pushing the field forward in answering complicated questions of this variety. Like you mentioned, the Green-Tao Theorem proves that the prime numbers contain arithmetic progressions of any length.

Lex Fridman (02:20:25) It’s just mind-boggling that you could prove something like that.

Terence Tao (02:20:27) Right. Yeah. So what we’ve realized because of this type of research is that different patterns have different levels of indestructibility. What makes the twin prime problem hard is that if you take all the primes in the world, three, five, seven, 11, and so forth, there are some twins in there; 11 and 13 is a pair of twin primes, and so forth. But you could easily, if you wanted to, redact the primes to get rid of these twins. The twins show up, and we believe there are infinitely many of them, but they’re actually reasonably sparse. I mean, initially there are quite a few, but once you get to the millions, the trillions, they become rarer and rarer. And you could actually just, if someone was given access to the database of primes, edit out a few primes here and there.

(02:21:15) They could make the twin prime conjecture false by just removing 0.01% of the primes or something, well-chosen to do this. And so you could present a censored database of the primes which passes all of these statistical tests of the primes. It obeys things like the prime number theorem and other facts about the primes, but doesn’t contain any twin primes anymore. And this is a real obstacle to the twin prime conjecture. It means that any proof strategy to actually find twin primes in the actual primes must fail when applied to these slightly edited primes. And so it must use some very subtle, delicate feature of the primes that you can’t just get from aggregate statistical analysis.

Lex Fridman (02:22:01) Okay, so that’s out.

Terence Tao (02:22:02) Yeah. On the other hand, progressions have turned out to be much more robust. You can take the primes and you can eliminate 99% of them, any 99% you want, and it turns out, another thing we proved, that you still get arithmetic progressions. Arithmetic progressions are much more robust; they’re like cockroaches.

Lex Fridman (02:22:21) Of arbitrary length though.

Lex Fridman (02:22:25) So for people who don’t know, arithmetic progressions is a sequence of numbers that differ by some fixed amount.

Terence Tao (02:22:32) Yeah. But again, it’s an infinite monkey type phenomenon. In any fixed portion of your set, you don’t get arbitrarily long progressions; you only get quite short progressions.
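
To make the two patterns concrete, here is a minimal sketch, with an arbitrary bound of 200 of our own choosing, that lists twin primes and 3-term arithmetic progressions inside the primes; it is illustrative only.

```python
def primes_up_to(n: int) -> list[int]:
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

ps = primes_up_to(200)
prime_set = set(ps)

twins = [(p, p + 2) for p in ps if p + 2 in prime_set]
aps = [(p, p + d, p + 2 * d) for p in ps for d in range(2, 100, 2)
       if p + d in prime_set and p + 2 * d in prime_set]

print(twins[:4])  # [(3, 5), (5, 7), (11, 13), (17, 19)]
print(aps[:4])    # 3-term progressions such as (3, 5, 7) and (3, 7, 11)
```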

Lex Fridman (02:22:40) But you’re saying twin prime is not an infinite monkey phenomenon. I mean, it’s a very subtle monkey. It’s still an infinite monkey phenomenon.

Terence Tao (02:22:48) Right. Yeah. If the primes were really genuinely random, if the primes were generated by monkeys, then yes, in fact the infinite monkey theorem would-

Lex Fridman (02:22:56) Oh, but you’re saying that twin prime, you can’t use the same tools. It doesn’t appear random almost.

Terence Tao (02:23:05) Well, we don’t know. We believe the primes behave like a random set. And the reason why we care about the twin prime conjecture is that it’s a test case for whether we can genuinely, confidently say, with 0% chance of error, that the primes behave like a random set. Random versions of the primes, we know, contain twins, at least with 100% probability, or probability tending to 100% as you go out further and further. So the primes, we believe, are random. The reason why arithmetic progressions are indestructible is that regardless of whether your set looks random or looks structured, like periodic, in both cases arithmetic progressions appear, but for different reasons. And this is basically how all the proofs work… There are many proofs of these sorts of arithmetic progression-type theorems.

(02:23:54) And they’re all proven by some sort of dichotomy where your set is either structured or random and in both cases you can say something and then you put the two together. But in twin primes, if the primes are random, then you are happy, you win. If the primes are structured, they could be structured in a specific way that eliminates the twins. And we can’t rule out that one conspiracy.

Lex Fridman (02:24:16) And yet you were able to make, as I understand, progress on the K-tuple version

Terence Tao (02:24:21) Right. Yeah. So the one funny thing about conspiracies is that any one conspiracy theory is really hard to disprove. If you believe the world is run by lizards, and here’s some evidence that it’s not [inaudible 02:24:32], well, that was just put out by the lizards. You might have encountered this kind of phenomenon.

Terence Tao (02:24:41) There’s almost no way to definitively rule out a conspiracy. And the same is true in mathematics. A conspiracy that is solely devoted to eliminating twin primes would have to also infiltrate other areas of mathematics, but it could be made consistent, at least as far as we know. But there’s a weird phenomenon that you can make one conspiracy rule out other conspiracies. So if the world is run by lizards, it can’t also be run by aliens, right?

Terence Tao (02:25:09) So one unreasonable thing is hard to disprove, but for more than one, there are tools. So yeah, for example, we know there are infinitely many pairs of primes which differ by at most 246; that’s actually the current record.

Lex Fridman (02:25:26) Oh, so there’s like a bound on the-

Terence Tao (02:25:28) Right. So there’s twin primes, there’s a thing called cousin primes that differ by four. There’s a thing called sexy primes that differ by six.

Lex Fridman (02:25:36) What are sexy primes?

Terence Tao (02:25:38) Primes that differ by six. They’re much less exciting than the name suggests.

Terence Tao (02:25:45) So you can make a conspiracy rule out one of these, but once you have 50 of them, it turns out that you can’t rule out all of them at once. It requires too much energy somehow in this conspiracy space.

Lex Fridman (02:25:55) How do you do the bound part? How do you develop a bound for the difference between pairs of primes-

Lex Fridman (02:26:01) … that there’s an infinite number of?

Terence Tao (02:26:03) So it’s ultimately based on what’s called the pigeonhole principle. The pigeonhole principle is the statement that if you have a number of pigeons, and they all have to go into pigeonholes, and you have more pigeons than pigeonholes, then one of the pigeonholes has to have at least two pigeons in it. So there have to be two pigeons that are close together. So for instance, if you have 101 numbers and they all range from one to 1,000, then two of them have to be a distance less than 10 apart, because you can divide up the numbers from one to 1,000 into 100 pigeonholes of length 10, and two of the 101 numbers have to belong to the same pigeonhole. It’s a basic principle in mathematics.
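
A quick numerical check of that example, assuming nothing beyond the standard library; the bucket width of 10 matches the 100 pigeonholes Tao describes.

```python
import random

# 101 distinct numbers from 1..1000 must put two into the same width-10 bucket,
# so those two numbers differ by less than 10.
numbers = random.sample(range(1, 1001), 101)
buckets = {}
for x in numbers:
    buckets.setdefault((x - 1) // 10, []).append(x)

crowded = next(v for v in buckets.values() if len(v) >= 2)
a, b = sorted(crowded)[:2]
print(a, b, b - a)  # two numbers less than 10 apart
```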

(02:26:45) So it doesn’t quite work with the primes directly, because the primes get sparser and sparser as you go out; fewer and fewer numbers are prime. But it turns out that there’s a way to assign weights to numbers. So there are numbers that are kind of almost prime: they don’t have literally no factors other than themselves and one, but they have very few factors. And it turns out that we understand almost primes a lot better than primes. And so for example, it was known for a long time that there were twin almost primes. This has been worked out. So almost primes are something we can understand, so you can actually restrict attention to a suitable set of almost primes. And whereas the primes are very sparse overall, relative to the almost primes they are much less sparse.

(02:27:33) You can set up a set of almost primes where the primes have density like, say, 1%, and that gives you a shot at proving, by applying some sort of pigeonhole principle, that there are pairs of primes that are only 100 apart. But in order to prove the twin prime conjecture, you need to get the density of primes inside the almost primes up to a threshold of 50%. Once you get up to 50%, you will get twin primes. But unfortunately, there are barriers. We know that no matter what kind of good set of almost primes you pick, the density of primes can never get above 50%. It’s called the parity barrier, and I would love to find a way past it. So one of my long-term dreams is to find a way to breach that barrier, because it would open up not only the twin prime conjecture but the Goldbach conjecture.

(02:28:12) And many other problems in number theory that are currently blocked, because our current techniques would require going beyond this theoretical parity barrier. It’s like going faster than the speed of light.
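
A rough sketch of why almost primes are a friendlier playground: counting pairs (n, n+2) where both numbers have at most two prime factors is easy numerically, and such pairs are far more plentiful than genuine twin primes. The threshold of two factors and the bound of 10,000 are arbitrary choices for this demo, not anything from the conversation.

```python
def num_prime_factors(n: int) -> int:
    """Count prime factors of n, with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            count += 1
            n //= d
        d += 1
    return count + (1 if n > 1 else 0)

N = 10_000
twin_almost = [(n, n + 2) for n in range(3, N)
               if num_prime_factors(n) <= 2 and num_prime_factors(n + 2) <= 2]
twin_primes = [(n, n + 2) for n in range(3, N)
               if num_prime_factors(n) == 1 and num_prime_factors(n + 2) == 1]
print(len(twin_almost), len(twin_primes))  # almost-prime twins vastly outnumber prime twins
```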

Lex Fridman (02:28:24) Yeah. So we should say the twin prime conjecture is one of the biggest problems in the history of mathematics. The Goldbach conjecture also. They feel like next-door neighbors. Have there been days when you felt you saw the path?

Terence Tao (02:28:37) Oh, yeah. Yeah. Sometimes you try something and it works super well. Again, you get the sense of mathematical smell we talked about earlier. You learn from experience when things are going too well, because there are certain difficulties that you sort of have to encounter. The way a colleague might put it is, if you are on the streets of New York and you’re blindfolded and put in a car, and after some hours the blindfold comes off and you’re in Beijing, that was too easy somehow. There was no ocean being crossed. Even if you don’t know exactly what was done, you suspect that something wasn’t right.

Lex Fridman (02:29:21) But is that still in the back of your head? Do you return to the prime numbers every once in a while to see?

Terence Tao (02:29:29) Yeah, when I have nothing better to do, which is less and less these days. I get busy with so many things. But when I have free time and I’m too frustrated to work on my real research projects, and I also don’t want to do my administrative stuff or errands for my family, I can play with these things for fun. And usually you get nowhere. You have to just say, “Okay, fine. Once again, nothing happened. I will move on.” Very occasionally, one of these problems actually gets solved. Well, sometimes, as you say, you think you solved it, and then you feel good for maybe 15 minutes, and then you think, “I should check this. This is too easy, too good to be true.” And it usually is.

Lex Fridman (02:30:11) What’s your gut say about when these problems would be solved, the twin prime and Goldbach?

Terence Tao (02:30:16) The twin prime, I think we’ll-

Terence Tao (02:30:19) … keep getting more partial results. It does need at least one… This parity barrier is the biggest remaining obstacle. There are simpler versions of the conjecture where we are getting really close. So I think in 10 years we will have many more, much closer results, but we may not have the whole thing. So twin primes is somewhat close. The Riemann hypothesis, I have no clue. It would have to happen by accident, I think.

Lex Fridman (02:30:47) So the Riemann hypothesis is a kind of more general conjecture about the distribution of prime numbers, right?

Terence Tao (02:30:53) Right. Yeah. It states that, viewed multiplicatively, for questions only involving multiplication, no addition, the primes really do behave as randomly as you could hope. So there’s a phenomenon in probability called square root cancellation: if you want to poll, say, America on some issue, and you ask only one or two voters, you may have a bad sample, and then you get a really imprecise measurement of the full average. But if you sample more and more people, the accuracy gets better and better, and the accuracy improves like the square root of the number of people you sample. So if you sample 1,000 people, you can get a 2 or 3% margin of error. So in the same sense, if you measure the primes in a certain multiplicative sense, there’s a certain type of statistic you can measure, it’s called the Riemann zeta function, and it fluctuates up and down.

(02:31:42) But in some sense, as you keep averaging more and more, if you sample more and more, the fluctuations should go down as if they were random. And there’s a very precise way to quantify that, and the Riemann hypothesis is a very elegant way to capture this. But as with many other things in mathematics, we have very few tools to show that something genuinely behaves really randomly. And this is asking not just that it’s a little bit random, but that it behaves as randomly as an actually random set, with this square root cancellation. And we know, because of things related to the parity problem actually, that most of our usual techniques cannot hope to settle this question. The proof has to come out of left field. But what that is, no one has any serious proposal. And as I said, you can modify the primes a little bit and destroy the Riemann hypothesis.

(02:32:37) So it has to be very delicate. You can’t apply something that has huge margins of error. It has to just barely work, and there are all these pitfalls that you have to dodge very adeptly.
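
The polling analogy is easy to simulate. This sketch, our own illustration in which the “true” support level of 0.5 is an arbitrary choice, shows the average polling error shrinking like one over the square root of the sample size, which is the kind of cancellation the Riemann hypothesis demands of the primes.

```python
import random

def poll_error(n_voters: int, true_p: float = 0.5) -> float:
    """Absolute error of one simulated poll of n_voters."""
    yes = sum(1 for _ in range(n_voters) if random.random() < true_p)
    return abs(yes / n_voters - true_p)

for n in (100, 1_000, 10_000, 100_000):
    avg_err = sum(poll_error(n) for _ in range(200)) / 200
    print(f"n={n:>6}  average error {avg_err:.4f}   1/sqrt(n) = {n ** -0.5:.4f}")
```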

Lex Fridman (02:32:50) The prime numbers are just fascinating.

Lex Fridman (02:32:53) What to you is most mysterious about the prime numbers?

Terence Tao (02:33:00) That’s a good question. Conjecturally, we have a good model of them. I mean, as I said, they have certain patterns, like the primes are usually odd, for instance. But apart from some obvious patterns, they behave very randomly, and a good model is just to assume that they behave randomly. So there’s something called the Cramér random model of the primes, that after a certain point, primes just behave like a random set. And there are various slight modifications to this model, but it has been a very good model. It matches the numerics. It tells us what to predict. I can tell you with complete certainty the twin prime conjecture is true; the random model gives overwhelming odds it is true; I just can’t prove it. Most of our mathematics is optimized for solving things with patterns in them.

(02:33:39) And the primes have this anti-pattern, as does almost everything really, but we can’t prove it. I guess it’s not mysterious that the primes would be random, because there’s no reason for them to have any kind of secret pattern. But what is mysterious is: what is the mechanism that really forces the randomness to happen? That is just absent.
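
A rough sketch of the Cramér-style model: pretend each n is “prime” independently with probability 1/log n, and count the twins that appear. In this random model, twins keep showing up forever; the conjecture is that the real primes are no less random. The bound is an arbitrary choice of ours and the sketch is illustrative only.

```python
import math
import random

N = 200_000
# Include each n with probability 1/log(n), mimicking the density of the primes.
pseudo_primes = {n for n in range(3, N) if random.random() < 1 / math.log(n)}
pseudo_twins = sum(1 for n in pseudo_primes if n + 2 in pseudo_primes)
print(len(pseudo_primes), pseudo_twins)  # a random "prime" set still contains many twins
```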

Collatz conjecture

Lex Fridman (02:34:04) Another incredibly surprisingly difficult problem is the Collatz conjecture.

Lex Fridman (02:34:10) Simple to state, beautiful to visualize in its simplicity, and yet extremely difficult to solve. And yet you have been able to make progress. Paul Erdos said about the Collatz conjecture that mathematics may not be ready for such problems. Others have stated that it is an extraordinarily difficult problem, completely out of reach, this was in 2010, out of reach of present-day mathematics. And yet you have made some progress. Why is it so difficult to make progress? Can you actually even explain what it is, what is the key to-

Terence Tao (02:34:41) Oh, yeah. So it’s a problem that you can explain. It helps with some visual aids. But yeah, so you take any natural number, like say 13, and you apply the following procedure to it. If it’s even, you divide it by two, and if it’s odd, you multiply it by three and add one. So even numbers get smaller, odd numbers get bigger. So 13 would become 40, because 13 times 3 is 39, add one you get 40. So it’s a simple process. For odd numbers and even numbers, they’re both very easy operations, and when you put them together, it’s still reasonably simple. But then you ask what happens when you iterate it: you take the output that you just got and feed it back in. So 13 becomes 40, 40 is now even, divide by two is 20, 20 is still even, divide by two, 10, five, and then five times three plus one is 16, and then eight, four, two, one. And then from one it goes one, four, two, one, four, two, one. It cycles forever. So this sequence I just described, 13, 40, 20, 10, and so on, these are what are known as hailstone sequences, because there’s an oversimplified model of hailstone formation, which is not actually quite correct but is still somehow taught to high school students as a first approximation: a little ice crystal forms in a cloud, and it goes up and down because of the wind. Sometimes when it’s cold it acquires a bit more mass, and maybe it melts a little bit, and this process of going up and down creates this partially melted ice, which eventually becomes a hailstone and falls down to the earth. So the conjecture is that no matter how high you start, you take a number which is in the millions or billions, this process, which goes up if you are odd and down if you are even, eventually comes down to Earth all the time.
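
Here is the iteration Tao just walked through, as a few lines of Python (illustrative only), reproducing the 13 → 40 → … → 1 trajectory.

```python
def collatz_trajectory(n: int) -> list[int]:
    """Iterate n -> n/2 if even, 3n+1 if odd, until reaching 1."""
    path = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        path.append(n)
    return path

print(collatz_trajectory(13))
# [13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
```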

Lex Fridman (02:36:23) No matter where you start, with this very simple algorithm, you end up at one. And you might climb for a while-

Terence Tao (02:36:29) Yeah. So yeah, if you plot these sequences, they look like Brownian motion. They look like the stock market. They just go up and down in a seemingly random pattern. And in fact, usually that’s what happens: if you plug in a random number, you can actually prove, at least initially, that it will look like a random walk, and actually a random walk with a downward drift. It’s like you’re always gambling on roulette at the casino with odds slightly weighted against you. So sometimes you win, sometimes you lose, but in the long run, you lose a bit more than you win. And so normally your wallet will go to zero if you just keep playing over and over again.

Lex Fridman (02:37:07) So statistically it makes sense that it goes down to one?

Terence Tao (02:37:11) Yes. So the result that I proved, roughly speaking, is that statistically, like 99% of all inputs will drift down, maybe not all the way to one, but to something much, much smaller than what you started with. It’s like if I told you that if you go to a casino and keep playing for long enough, most of the time you end up with a smaller amount in your wallet than when you started. That’s kind of like the result that I proved.
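
A quick numerical version of that statistical statement, with arbitrary cutoffs of our own choosing: sample large starting values and check how many orbits dip below where they began. Of course, this only checks a finite sample and proves nothing about all inputs.

```python
import random

def dips_below_start(n: int, max_steps: int = 10_000) -> bool:
    """Return True if the Collatz orbit of n drops below n within max_steps."""
    start = n
    for _ in range(max_steps):
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        if n < start:
            return True
    return False

starts = random.sample(range(10**6, 10**7), 1_000)
frac = sum(dips_below_start(n) for n in starts) / len(starts)
print(f"{frac:.3f} of sampled starting values dropped below their starting point")
```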

Lex Fridman (02:37:35) So why is that result… Can you continue down that thread to prove the full conjecture?

Terence Tao (02:37:42) Well, the problem is that I used arguments from probability theory, and there’s always this exceptional event. So in probability, we have this law of large numbers, which tells you things like: if you play a game at a casino with a losing expectation, then over time you are guaranteed, almost surely, with probability as close to 100% as you wish, to lose money. But there’s always this exceptional outlier. It is mathematically possible that even if the odds of the game are not in your favor, you could just keep winning slightly more often than you lose. Very much like how in Navier-Stokes, most of the time your waves can disperse, but there could be just one outlier choice of initial conditions that would lead to blowup. And there could be one outlier choice of a special number you stick in that shoots off to infinity while all other numbers crash to Earth, crash to one.

(02:38:40) In fact, there are some mathematicians, Alex Kontorovich for instance, who’ve proposed that these Collatz iterations are like cellular automata. Actually, if you look at what happens in binary, they do look a little bit like these Game of Life type patterns. And in analogy to how the Game of Life can create these massive self-replicating objects and so forth, possibly you could create some sort of heavier-than-air flying machine: a number which is actually encoding a machine whose job is to create a version of itself which is larger.

Lex Fridman (02:39:17) Heavier-than-air machine encoded in a number-

Lex Fridman (02:39:20) … that flies forever.

Terence Tao (02:39:22) So Conway in fact, worked on this problem as well.

Terence Tao (02:39:26) Conway, so similar, in fact, that it was one of the inspirations for the Navier-Stokes project. Conway studied generalizations of the Collatz problem where instead of multiplying by three and adding one or dividing by two, you have a more complicated branching rule. Instead of having two cases, maybe you have 17 cases, and then you go up and down. And he showed that once your iteration gets complicated enough, you can actually encode Turing machines, and you can actually make these problems undecidable and do things like this. In fact, he invented a programming language for these kinds of fractional linear transformations. He called it FRACTRAN, as a play on FORTRAN. And he showed it was Turing-complete: you could make a program such that, if the number you inserted encoded a prime, it would sink down.

(02:40:13) It would go down, and otherwise it would go up, and things like that. So the general class of problems is really as complicated as all of mathematics.
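
A FRACTRAN program is just a list of fractions: at each step, multiply the current integer by the first fraction that gives an integer, and halt when none does. The interpreter below is a minimal sketch; the one-fraction program [3/2] is Conway’s adder, which turns 2^a · 3^b into 3^(a+b). (Conway’s prime-generating program works the same way, just with fourteen fractions.)

```python
from fractions import Fraction

def run_fractran(program: list[Fraction], n: int, max_steps: int = 1_000) -> int:
    """Run a FRACTRAN program from the starting integer n."""
    for _ in range(max_steps):
        for f in program:
            if (n * f).denominator == 1:  # first fraction giving an integer
                n = int(n * f)
                break
        else:
            return n  # no fraction applies: the program halts
    return n

adder = [Fraction(3, 2)]
print(run_fractran(adder, 2**3 * 3**4))  # 2187 = 3**7, i.e. 3 + 4 = 7
```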

Lex Fridman (02:40:23) Some of the mystery of the cellular automata that we talked about, having a mathematical framework to say anything about cellular automata, maybe this same kind of framework is required. Yeah, Goldbach’s conjecture.

Terence Tao (02:40:35) Yeah. If you want to do it not statistically, but you really want 100% of all inputs to fall to earth… So what might be feasible is, yeah, showing statistically that 99% go to one, but getting everything, that looks hard.

P = NP

Lex Fridman (02:40:50) What would you say is out of these within reach famous problems is the hardest problem we have today? Is it the Riemann hypothesis?

Terence Tao (02:40:59) Well, it’s up there. P equals NP is a good one because that’s a meta problem. If you solve that in the positive sense that you can find a P equals NP algorithm, potentially, this solves a lot of other problems as well.

Lex Fridman (02:41:14) And we should mention some of the conjectures we’ve been talking about. A lot of stuff is built on top of them now. There’s ripple effects. P equals NP has more ripple effects than basically any other-

Terence Tao (02:41:24) Right. If the Riemann hypothesis is disproven, that’d be a big mental shock to the number theorists. But it would also have follow-on effects for cryptography, because a lot of cryptography uses number theory, uses number-theoretic constructions involving primes and so forth. And it relies very much on the intuition that number theorists have built over many, many years of which operations involving primes behave randomly and which ones don’t. And in particular, encryption methods are designed to turn text, written information, into something which is indistinguishable from random noise, and hence, we believe, almost impossible to crack, at least mathematically. But if something as core to our beliefs as the Riemann hypothesis is wrong, it means that there are actual patterns in the primes that we’re not aware of.

(02:42:21) And if there’s one, there’s probably going to be more. And suddenly a lot of our crypto systems are in doubt.
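
As a concrete illustration of that dependence, here is textbook RSA with the standard toy primes 61 and 53 (our own example, not from the conversation). Real deployments use primes hundreds of digits long, and their security leans exactly on the belief that the arithmetic of primes has no exploitable hidden pattern.

```python
p, q = 61, 53
n = p * q                 # 3233, the public modulus
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime to phi
d = pow(e, -1, phi)       # 2753, private exponent (modular inverse of e)

message = 65
ciphertext = pow(message, e, n)    # 2790
recovered = pow(ciphertext, d, n)  # 65 again
print(ciphertext, recovered)
```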

Lex Fridman (02:42:27) Yeah. But then how do you then say stuff about the primes-

Lex Fridman (02:42:34) … that you’re going towards, because of the Collatz conjecture again? Because you do want it to be random, right?

Lex Fridman (02:42:41) You want it to be random?

Terence Tao (02:42:43) Yeah. So more broadly, I’m just looking for more tools, more ways to show that things are random. How do you prove a conspiracy doesn’t happen?

Lex Fridman (02:42:49) Right. Is there any chance to you that P equals NP? Can you imagine a possible universe?

Terence Tao (02:42:57) It is possible. I mean, there are various scenarios. There’s one where it is technically true, but in fact never actually implementable. The evidence is sort of slightly pushing in favor of no, that probably P is not equal to NP.

Lex Fridman (02:43:11) I mean, it seems like it’s one of those cases similar to Riemann hypothesis. I think the evidence is leaning pretty heavily on the no.

Terence Tao (02:43:20) Certainly more on the no than on the yes. The funny thing about P equals NP is that we also have a lot more obstructions than we do for almost any other problem. So while there’s evidence, we also have a lot of results ruling out many, many types of approaches to the problem. This is one thing that computer science has actually been very good at: saying that certain approaches cannot work. No-go theorems. It could be undecidable, yeah, we don’t know.

Fields Medal

Lex Fridman (02:43:43) There’s a funny story I read that when you won the Fields Medal, somebody from the internet wrote you and asked, what are you going to do now that you’ve won this prestigious award? And you just quickly, very humbly said that a shiny medal is not going to solve any of the problems I’m currently working on, so I’m going to keep working on them. First of all, it’s funny to me that you would answer an email in that context, and second of all, it just shows your humility. But anyway, maybe you could speak to the Fields Medal, but it’s another way for me to ask about Grigori Perelman. What do you think about him famously declining the Fields Medal and the Millennium Prize, which came with $1 million of prize money? He stated that, “I’m not interested in money or fame. The prize is completely irrelevant for me. If the proof is correct, then no other recognition is needed.”

Terence Tao (02:44:40) Yeah, no, he’s somewhat of an outlier, even among mathematicians who tend to have somewhat idealistic views. I’ve never met him. I think I’d be interested to meet him one day, but I’ve never had the chance. I know people who met him. He’s always had strong views about certain things. I mean, it’s not like he was completely isolated from the math community. I mean, he would give talks and write papers and so forth, but at some point he just decided not.

Terence Tao (02:45:00) … he’d give talks and write papers and so forth, but at some point he just decided not to engage with the rest of the community. He was disillusioned or something, I don’t know. And he decided to peace out and collect mushrooms in St. Petersburg or something. And that’s fine, you can do that. That’s the other sort of flip side. A lot of the problems that we solve, some of them do have practical application, and that’s great. And if you stop thinking about a problem, that’s fine too; he hasn’t published since in this field, but that’s fine. There are many, many other people who’ve done so as well.

(02:45:39) Yeah. So I guess one thing I didn’t realize initially about the Fields Medal is that it sort of makes you part of the establishment. Most mathematicians are just career mathematicians: you focus on publishing the next paper, maybe getting promoted one rank, starting a few projects, maybe taking on some students or something. But then suddenly people want your opinion on things, and you have to think a little bit about things that you previously might just say foolishly, because back then you knew no one was going to listen to you; it’s more important now.

Lex Fridman (02:46:11) Is it constraining to you? Are you able to still have fun and be a rebel and try crazy stuff and play with ideas?

Terence Tao (02:46:19) I have a lot less free time than I had previously, mostly by choice. I always say I have the option to sort of decline, so I decline a lot of things. I could decline even more or I could acquire a reputation of being so unreliable that people don’t even ask anymore.

Lex Fridman (02:46:38) I love the different algorithms here. This is great.

Terence Tao (02:46:41) It’s always an option, but there are things I don’t spend as much time on as I did as a postdoc, like just working on one problem at a time or fooling around. I still do that a little bit. But yeah, as you advance in your career, you need more soft skills; math somehow front-loads all the technical skills into the early stages of your career. So as a postdoc, you publish or perish. You’re incentivized to basically focus on proving very technical theorems, to prove yourself as well as prove the theorems. But then as you get more senior, you have to start mentoring and giving interviews and trying to shape the direction of the field, both research-wise and sometimes through various administrative things. And it’s kind of the right social contract, because you need to have worked in the trenches to see what can help mathematicians.

Lex Fridman (02:47:40) The other side of the establishment, the really positive thing is that you get to be a light that’s an inspiration to a lot of young mathematicians or young people that are just interested in mathematics. It’s like-

Lex Fridman (02:47:52) … just how the human mind works. This is where I would probably say that I like the Fields Medal, that it does inspire a lot of young people somehow. This is just how human brains work. At the same time, I also want to give respect to somebody like Grigori Perelman, who is critical of awards. Those are his principles, and any human that’s able to stand by their principles and do the thing that most humans would not be able to do, it’s beautiful to see.

Terence Tao (02:48:25) Some recognition is necessary and important, but yeah, it’s also important to not let these things take over your life and only be concerned about getting the next big award or whatever. So again, you see these people try to only solve really big math problems and not work on things that are less sexy, if you wish, but actually still interesting and instructive. As you say, the way the human mind works, we understand things better when they’re attached to humans, and also if they’re attached to a small number of humans. The way our human mind is wired, we can comprehend the relationships between 10 or 20 people. But once you get beyond like 100 people, there’s a limit, I think there’s a name for it, beyond which it just becomes the other.

(02:49:18) And so you have to simplify the [inaudible 02:49:21] 99.9% of humanity becomes the other. Often these models are incorrect, and this causes all kinds of problems. So yeah, to humanize a subject, if you identify a small number of people and say these are representative people of a subject, role models, for example, that has some role, but it can also be too much of it can be harmful because I’ll be the first to say that my own career path is not that of a typical mathematician. The very accelerated education, I skipped a lot of classes. I think I always had very fortunate mentoring opportunities, and I think I was at the right place at the right time. Just because someone doesn’t have my trajectory, it doesn’t mean that they can’t be good mathematicians. They would be, but in a very different style, and we need people of a different style.

(02:50:16) And sometimes too much focus is given on the person who does the last step to complete a project in mathematics or elsewhere that’s really taken centuries or decades with lots and lots of, building on lots of previous work. But that’s a story that’s difficult to tell if you’re not an expert. It’s easier to just say one person did this one thing. It makes for a much simpler history.

Lex Fridman (02:50:40) I think on the whole, it is a hugely positive thing. To talk about Steve Jobs as a representative of Apple, when I personally know and of course everybody knows the incredible design, the incredible engineering teams, just the individual humans on those teams. They’re not a team. They’re individual humans on a team, and there’s a lot of brilliance there, but it’s just a nice shorthand, like π, Steve Jobs, π.

Terence Tao (02:51:08) Yeah, as a starting point, as a first approximation that’s how you-

Lex Fridman (02:51:13) And then read some biographies and look much deeper, beyond the first approximation.

Andrew Wiles and Fermat’s Last Theorem

Lex Fridman (02:51:17) That’s right. So you mentioned you were at Princeton too. Andrew Wiles at that time-

Lex Fridman (02:51:22) … he was a professor there. It’s a funny moment how history is just all interconnected, and at that time, he announced that he proved Fermat’s Last Theorem. What did you think, maybe looking back now with more context about that moment in math history?

Terence Tao (02:51:37) Yeah, so I was a graduate student at the time. I vaguely remember there was press attention and we all had the same, we had pigeonholes in the same mail room, so we all got mail and suddenly Andrew Wiles’ mailbox exploded to be overflowing.

Lex Fridman (02:51:53) That’s a good metric.

Terence Tao (02:51:54) Yeah. We all talked about it at tea and so forth. We didn’t understand… Most of us sort of didn’t understand the proof; we understood the high-level details. In fact, there’s an ongoing project to formalize it in Lean. Kevin Buzzard is actually-

Lex Fridman (02:52:09) Yeah. Can we take that small tangent? How difficult is that ’cause as I understand the proof for Fermat’s Last Theorem has super complicated objects?

Lex Fridman (02:52:21) It’s really difficult to formalize now.

Terence Tao (02:52:22) Yeah, I guess. Yeah, you’re right. The objects that they use, you can define them. So they’ve been defined in Lean; just defining what they are can be done. That’s really not trivial, but it’s been done. But there are a lot of really basic facts about these objects that have taken decades to prove, in all these different math papers, and so lots of these have to be formalized as well. Kevin Buzzard’s goal, actually he has a five-year grant to formalize Fermat’s Last Theorem, and his aim is this: he doesn’t think he will be able to get all the way down to the basic axioms, but he wants to formalize it to the point where the only things that he needs to rely on as black boxes are things that were known by 1980 to number theorists at the time, and then some other person or some other work would have to be done to get from there.

(02:53:13) So it’s a different area of mathematics than the type of mathematics I’m used to. In analysis, which is my area, the objects we study are kind of much closer to the ground. I study things like prime numbers and functions and things that are within the scope of a high school math education to at least define. But then, there’s this very advanced algebraic side of number theory where people have been building structures upon structures for quite a while, and it’s a very sturdy structure. It’s been very… At the base, at least, it’s extremely well-developed with textbooks and so forth. But it does get to the point where if you haven’t taken these years of study and you want to ask about what is going on at level six of this tower, you have to spend quite a bit of time before you can even get to the point where you see something that you recognize.
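
[Note: the bare statement of Fermat’s Last Theorem is easy to write down in a proof assistant; the difficulty described above lies in formalizing the decades of supporting theory behind Wiles’s proof. A minimal illustrative sketch in Lean follows, with a hypothetical theorem name (not the actual mathlib formalization) and the proof left as a placeholder.]

```lean
-- Illustrative sketch only: stating Fermat's Last Theorem over the natural numbers.
-- The name and phrasing here are hypothetical; the real formalization effort works
-- inside mathlib with its own definitions, and the hard part is everything that
-- would have to replace `sorry`.
theorem fermat_last_theorem :
    ∀ (n x y z : Nat), 2 < n → 0 < x → 0 < y → 0 < z →
      x ^ n + y ^ n ≠ z ^ n := by
  sorry
```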

Lex Fridman (02:54:07) What inspires you about his journey that was similar, as we talked about, seven years mostly working in secret?

Terence Tao (02:54:15) Yeah, so it kind of fits with the romantic image I think people have of mathematicians to the extent that they think of them all as these kind of eccentric wizards or something. So that’s certainly kind of accentuated that perspective. It is a great achievement. His style of solving problems is so different from my own, which is great. We need people like that.

Lex Fridman (02:54:46) Can you speak to it, like in terms of you like the collaborative?

Terence Tao (02:54:49) I like moving on from a problem if it’s giving too much difficulty.

Terence Tao (02:54:55) But you need the people who have the tenacity and the fearlessness. I’ve collaborated with people like that where I want to give up ’cause the first approach that we tried didn’t work and the second one didn’t work. But they’re convinced and they have third, fourth, and the fifth, which works. And I’d have to eat my words, “Okay. I didn’t think this was going to work, but yes, you were right all along.”

Productivity

Lex Fridman (02:55:16) And we should say for people who don’t know, not only are you known for the brilliance of your work, but the incredible productivity, just the number of papers, which are all very high quality. So there’s something to be said about being able to jump from topic to topic.

Terence Tao (02:55:31) Yeah, it works for me. But there are also people who are very productive and they focus very deeply. I think everyone has to find their own workflow. One thing which is a shame in mathematics is that we have a sort of one-size-fits-all approach to teaching mathematics, and so we have a certain curriculum and so forth. Maybe if you do math competitions or something, you get a slightly different experience. But I think many people don’t find their native math language until very late, or usually too late. So they stop doing mathematics; they have a bad experience with a teacher who’s trying to teach them one way to do mathematics, a way that they don’t like.

(02:56:12) My theory is that humans don’t come with… evolution has not given us a math center in the brain directly. We have a vision center and a language center and some other centers, which evolution has honed, but we don’t have an innate sense of mathematics. But our other centers are sophisticated enough that we can repurpose other areas of our brain to do mathematics. So some people have figured out how to use the visual center to do mathematics, and so they think very visually when they do mathematics. Some people have repurposed their language center and they think very symbolically. Some people, if they are very competitive and into gaming, there’s a part of your brain that’s very good at solving puzzles and games, and that can be repurposed.

(02:57:02) But when I talk to other mathematicians who don’t quite think the way I do, I can tell that they’re using different styles of thinking, not disjoint from mine, but they may prefer the visual. I don’t actually think that visually myself; I need lots of visual aids. Mathematics provides a common language, so we can still talk to each other even if we are thinking in different ways.

Lex Fridman (02:57:26) But you could tell there’s a different set of subsystems being used in the thinking process?

Terence Tao (02:57:32) Yeah, they take different paths. They’re very quick at things that I struggle with and vice versa, and yet they still get to the same goal.

Terence Tao (02:57:41) But the way we educate, unless you have a personalized tutor or something, education, sort of for financial reasons, has to be mass-produced; you have to teach 30 kids at a time. If they have 30 different styles, you can’t teach 30 different ways.

Advice for young people

Lex Fridman (02:57:55) On that topic, what advice would you give to students, young students who are struggling with math, but are interested in it and would like to get better? Is there something in this complicated educational context? What would you advise?

Terence Tao (02:58:10) Yeah, it’s a tricky problem. One nice thing is that there are now lots of sources for mathematical enrichment outside the classroom. So in my days, there were math competitions and there are also popular math books in the library. But now you have YouTube. There are forums just devoted to solving math puzzles. And math shows up in other places. For example, there are hobbyists who play poker for fun and they, for very specific reasons, are interested in very specific probability questions. And actually, there’s a community of amateur probabilists in poker, in chess, in baseball. There’s math all over the place, and I’m hoping actually with these new tools for Lean and so forth, that actually we can incorporate the broader public into math research projects. This almost doesn’t happen at all currently.

(02:59:13) So in the sciences, there’s some scope for citizen science: in astronomy, there are amateurs who discover comets, and in biology there are people who can identify butterflies and so forth. And in math, there are a small number of activities where amateur mathematicians can discover new primes and so forth. But previously, because we had to verify every single contribution, for most mathematical research projects it would not help to have input from the general public. In fact, it would just be time-consuming, because of the error checking and everything. But one thing about these formalisation projects is that they are bringing in more people. So I’m sure there are high school students who’ve already contributed to some of these formalization projects, who’ve contributed to mathlib. You don’t need to be a PhD holder to just work on one atomic thing.
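
[Note: to make “one atomic thing” concrete, a contribution can be as small as a single short lemma with a complete proof. A toy Lean sketch under a hypothetical name is shown below; this particular fact already exists in Lean’s library as Nat.zero_add and is restated only to illustrate the scale.]

```lean
-- Toy example of an "atomic" contribution: one tiny lemma, proved by structural
-- recursion. (Already in the library as Nat.zero_add; shown here only for scale.)
theorem zero_add_example : ∀ n : Nat, 0 + n = n
  | 0 => rfl
  | n + 1 => congrArg Nat.succ (zero_add_example n)
```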

Lex Fridman (03:00:03) There’s something about the formalisation here that also, as a very first step, opens it up to the programming community too, the people who are already comfortable with programming. It seems like programming somehow, maybe it’s just a feeling, but it feels more accessible to folks than math. Math, especially modern mathematics, is seen as this extremely difficult-to-enter area, and programming is not. So that could be just an entry point.

Terence Tao (03:00:31) You can execute code and you can get results. You can print “hello world” pretty quickly. If programming were taught as an almost entirely theoretical subject, where you’re just taught the computer science, the theory of functions and routines and so forth, and outside of some very specialized homework assignments you’re not actually programming, like on the weekend for fun, it would be considered as hard as math. So as I said, there are communities of non-mathematicians who are deploying math for some very specific purpose, like optimizing their poker game, and then math becomes fun for them.

Lex Fridman (03:01:13) What advice would you give in general to young people how to pick a career, how to find themselves, what they could be good at?

Terence Tao (03:01:25) That’s a tough, tough, tough question. Yeah, so there’s a lot of uncertainty now in the world. There was this period after the war where, at least in the West, if you came from a good demographic, there was a very stable path to a good career. You go to college, you get an education, you pick one profession and you stick to it. That’s becoming much more a thing of the past. So I think you just have to be adaptable and flexible. I think people will have to get skills that are transferable. Learning one specific programming language or one specific subject of mathematics is not itself a super transferable skill, but knowing how to reason with abstract concepts or how to problem-solve when things go wrong is. Anyway, these are things which I think we will still need even as our tools get better, and you’ll be working with AI support and so forth.

Lex Fridman (03:02:13) But actually you’re an interesting case study. You’re one of the great living mathematicians, and you had a way of doing things, and then all of a sudden you start learning. First of all, you kept learning new fields, but you also learned Lean. That’s a non-trivial thing to learn. For a lot of people, that’s an extremely uncomfortable leap to take, right?

Lex Fridman (03:02:41) A lot of mathematicians.

Terence Tao (03:02:42) First of all, I’ve always been interested in new ways to do mathematics. I feel like a lot of the ways we do things right now are inefficient. Many of my colleagues and I spend a lot of time doing very routine computations, or doing things that other mathematicians would instantly know how to do but that we don’t know how to do, with no way to just search and get a quick response and so forth. So that’s why I’ve always been interested in exploring new workflows.

(03:03:09) About four or five years ago, I was on a committee where we had to ask for ideas for interesting workshops to run at a math institute. And at the time, Peter Scholze had just formalized one of his new theorems, and there were some other developments in computer-assisted proof that looked quite interesting. And I said, “Oh, we should run a workshop on this. This would be a good idea.” And then I was a bit too enthusiastic about this idea, and so I got volun-told to actually run it. So I did, with Kevin Buzzard and Jordan Ellenberg and a bunch of other people, and it was a nice success. We pulled together a bunch of mathematicians and computer scientists and other people, and we got up to speed on the state of the art, and there were really interesting developments that most mathematicians didn’t know were going on, lots of nice proofs of concept, just hints of what was going to happen. This was just before ChatGPT, but even then there was one talk about language models and the potential capability of those in the future.

(03:04:11) So that got me excited about the subject. So I started giving talks saying this is something more of us should start looking at, now that I had arranged and run this conference. And then ChatGPT came out and suddenly AI was everywhere. And so I got interviewed a lot about this topic and in particular, the interaction between AI and [inaudible 03:04:33]. I said, “Yeah, they should be combined. There is perfect synergy to happen here.” And at some point I realized that I have to actually not just talk the talk, but walk the walk. I don’t work in machine learning and I don’t work in proof formalisation, and there’s a limit to how much I can just rely on authority and say, “I’m a mathematician. Just trust me when I say that this is going to change mathematics,” while I don’t do any of it myself. So I felt like I had to actually justify it.

(03:05:03) A lot of what I get into, actually, I don’t quite see in advance as how much time I’m going to spend on it, and it’s only after I’m sort of waist deep in a project that I realize, but at that point, I’m committed.

Lex Fridman (03:05:15) Well, that’s deeply admirable that you’re willing to go into the fray, be in some small way a beginner, or have some of the challenges that a beginner would, right?

Lex Fridman (03:05:27) New concepts, new ways of thinking, also sucking at a thing that others… I think in that talk, you could be a Fields Medal-winning mathematician and an undergrad knows something better than you.

Terence Tao (03:05:42) Yeah, I think mathematics inherently, mathematics is so huge these days that nobody knows all of modern mathematics. And inevitably, we make mistakes and you can’t cover up your mistakes with just bravado because people will ask for your proofs, and if you don’t have the proofs, you don’t have the proofs.

Terence Tao (03:06:04) Yeah, so it does keep us honest. It’s not a perfect panacea, but I think we do have more of a culture of admitting error because we’re forced to all the time.

The greatest mathematician of all time

Lex Fridman (03:06:17) Big ridiculous question. I’m sorry for it once again. Who is the greatest mathematician of all time, maybe one who’s no longer with us? Who are the candidates? Euler, Gauss, Newton, Ramanujan, Hilbert?

Terence Tao (03:06:32) So first of all, as mentioned before, there’s some time dependence.

Terence Tao (03:06:38) Yeah. Like if you plot cumulatively over time, for example, Euclid is one of the leading contenders, and then maybe some unnamed anonymous mathematicians before that, whoever came up with the concept of numbers.

Lex Fridman (03:06:53) Do mathematicians today still feel the impact of Hilbert, just-

Lex Fridman (03:06:58) Directly of what? Everything that’s happened in the 20th century?

Terence Tao (03:07:00) Yeah, Hilbert spaces, we have lots of things that are named after him of course. Just the arrangement of mathematics and the introduction of certain concepts; his 23 problems have been extremely influential.

Lex Fridman (03:07:12) There’s some strange power to declaring which problems are hard to solve, the statement of the open problems.

Terence Tao (03:07:19) Yeah, there’s a bystander effect everywhere. If no one says you should do X, everyone just mills around waiting for somebody else to do something, and nothing gets done. And the one thing that you actually have to teach undergraduates in mathematics is that you should always try something. So you see a lot of paralysis in an undergraduate trying a math problem. If they recognize that there’s a certain technique that can be applied, they will try it. But there are problems which they see where none of their standard techniques obviously applies, and the common reaction is then just paralysis: I don’t know what to do. I think there’s a quote from the Simpsons, “I’ve tried nothing and I’m all out of ideas.” So the next step then is to try anything, no matter how stupid, and in fact almost the stupider, the better, which technically is almost guaranteed to fail, but the way it fails is going to be instructive. It fails ’cause you are not at all taking into account this hypothesis. Oh, this hypothesis must be useful. That’s a clue.

Lex Fridman (03:08:26) I think you also suggested somewhere this fascinating approach, which really stuck with me, and I’ve been using it, and it really works. I think you said it’s called structured procrastination.

Lex Fridman (03:08:37) It’s when you really don’t want to do a thing that you imagine a thing you don’t want to do more that’s worse than that and then in that way, you procrastinate by not doing the thing that’s worse. It’s a nice hack, it actually works.

Terence Tao (03:08:51) Yeah, yeah. With anything, psychology is really important. You talk to athletes like marathon runners and so forth and they talk about what’s the most important thing, is it the training regimen or the diet and so forth? So much of it is psychology, just tricking yourself to think that the problem is feasible so that you’re motivated to do it.

Lex Fridman (03:09:15) Is there something our human mind will never be able to comprehend?

Terence Tao (03:09:21) Well, as a mathematician, [inaudible 03:09:23]. There must be some large number that you can’t understand. That was the first thing that came to mind.

Lex Fridman (03:09:31) So that, but even broadly, is there something about our mind that we’re going to be limited even with the help of mathematics?

Terence Tao (03:09:41) Well, okay, how much augmentation are you willing to allow? Like for example, if I didn’t even have pen and paper, if I had no technology whatsoever, so I’m not allowed a blackboard, pen and paper-

Lex Fridman (03:09:52) You’re already much more limited than you would be.

Terence Tao (03:09:55) … Incredibly limited. Even language, the English language is a technology. It’s one that’s been very internalized.

Lex Fridman (03:10:03) So you’re right, the formulation of the problem is incorrect, ’cause there really is no longer just a solo human; we’re already augmented in extremely complicated, intricate ways, right?

Lex Fridman (03:10:18) So like a collective intelligence?

Terence Tao (03:10:20) Yes. Yeah, I guess, so humanity plural has much more intelligence, in principle, on its good days than the individual humans put together. It can have less, but yeah, so the mathematical community, plural, is an incredibly superintelligent entity that no single human mathematician can come close to replicating. You see it a little bit on these question-and-answer sites. So MathOverflow, which is the math version of Stack Overflow, sometimes you get this very quick response to very difficult questions from the community, and it’s a pleasure to watch actually, as an expert.

Lex Fridman (03:11:01) I’m a fan spectator of that site, just seeing the brilliance of the different people, the depth and knowledge that people have. And the willingness to engage in the rigor and the nuance of the particular question, it’s pretty cool to watch. It’s almost like just fun to watch. What gives you hope about this whole thing we have going on with human civilization?

Terence Tao (03:11:25) I think the younger generation is always really creative and enthusiastic and inventive. It’s a pleasure working with young students. The progress of science tells us that the problems that used to be really difficult can become trivial to solve. Like navigation, just knowing where you were on the planet was this horrendous problem. People died or lost fortunes because they couldn’t navigate. And we have devices in our pockets that do this automatically for us, like it is a completely solved problem. So things that seem unfeasible for us now could maybe be just homework exercises.

Lex Fridman (03:12:13) Yeah. One of the things I find really sad about the finiteness of life is that I won’t get to see all the cool things we create as a civilization because in the next 100 years, 200 years, just imagine showing up in 200 years.

Terence Tao (03:12:27) Yeah, well, already plenty has happened. If you could go back in time and talk to your teenage self or something: the internet, and now AI. Again, we’re getting to internalize it, and yeah, of course, AI can understand our voice and give reasonable, slightly incorrect answers to any question. But yeah, this was mind-blowing even two years ago.

Lex Fridman (03:12:50) And in the moment, it’s hilarious to watch on the internet and so on, the drama, people take everything for granted very quickly, and then we humans seem to entertain ourselves with drama. Out of anything that’s created, somebody needs to take one opinion, another person needs to take an opposite opinion, argue with each other about it. But when you look at the arc of things, just even in the progress of robotics, just to take a step back and be like, “Wow, this is beautiful, that we humans are able to create this.”

Terence Tao (03:13:19) When the infrastructure and the culture is healthy, the community of humans can be so much more intelligent and mature and rational than the individuals within it.

Lex Fridman (03:13:31) Well, one place I can always count on rationality is the comment section of your blog, which I’m a big fan of. There are a lot of really smart people there. And thank you, of course, for putting those ideas out on the blog. And I can’t tell you how honored I am that you would spend your time with me today. I was looking forward to this for a long time. Terry, I’m a huge fan. You inspire me, you inspire millions of people. Thank you so much for your time.

Terence Tao (03:13:58) Thank you. It was a pleasure.

Lex Fridman (03:14:00) Thanks for listening to this conversation with Terence Tao. To support this podcast, please check out our sponsors in the description or at lexfridman.com/sponsors. And now, let me leave you with some words from Galileo Galilei, “Mathematics is a language with which God has written the universe.”

(03:14:21) Thank you for listening and hope to see you next time.

Sundar Pichai:谷歌与 Alphabet 首席执行官 (2025-06-05)

Sundar Pichai: CEO of Google and Alphabet (2025-06-05)

深度研报:Sundar Pichai 的科技乐观主义与谷歌的 AI 长征

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在谷歌(Alphabet)CEO Sundar Pichai 的领导下,公司经历了一年被外界广泛质疑“输掉 AI 竞赛”的舆论风暴后,通过一系列重磅产品发布(如 Gemini 1.5 Pro, Veo, AI Overviews)重新确立其领先地位。此次对话正是在这一“王者归来”的背景下,Pichai 对其个人哲学、领导力反思及谷歌未来 AI 战略的系统性阐述。

  • 核心论点:Pichai 的核心世界观源于其在印度资源匮乏的童年经历,这段经历让他深刻烙印下**“技术是改善人类生活质量的阶跃函数(step-function)”这一信念。他将此信念投射到人工智能上,认为 AI 并非简单的工具,而是超越火与电、能够递归式自我改进并加速“创造”本身的最深刻技术变革**。因此,谷歌的战略核心并非短期追赶,而是基于长期的基础技术投入(如 TPU)、组织架构重塑(合并 DeepMind 与 Brain)和产品哲学演进(将 AI 深度整合为所有服务的智能底层),旨在引领下一代计算范式。他坦然面对外界压力和 AI 风险,但坚信人类集体智慧有自我调节能力,最终能够驾驭这项技术,实现其巨大潜能。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:AI 作为终极通用技术(The Ultimate General-Purpose Technology)

  • 核心观点:Pichai 坚信 AI 将是“人类所从事的最深刻的技术”,其影响力将远超火、电力或互联网。
  • 原理解构:这一判断并非基于近期炒作,而是基于一个第一性原理的差异:AI 是第一项具备递归式自我改进能力的技术。传统的通用技术(如电力)赋能了所有行业,但它们本身不会变得更智能。AI,尤其是 AGI,能够通过自我研究和迭代,加速自身的进步。此外,AI 直接作用于“创造”和“智能”这两个最核心的生产要素,它不是制造工具的工具,而是加速知识发现和应用本身的元工具(meta-tool),这使其具备了改变所有事物发展速度的潜力。
  • 证据/案例
    • AlphaGo:他多次提及 AlphaGo 的学习过程,从零基础到超越人类顶尖棋手,直观地展示了 AI 惊人的自学习和进化能力。
    • 递归自我改进:他强调 AI 能够“dramatically accelerate creation itself”,即加速创造本身,这是它与以往所有技术的核心区别。

维度二:AI 引爆的“创造力平权” (Democratization of Creativity)

  • 核心观点:AI 最直接、最广泛的社会影响将是指数级地降低创造门槛,赋能数十亿人进行深度表达与创造
  • 原理解构:历史上,每一次信息媒介的革命(博客、YouTube)都扩大了创作者的基数。但这些工具仍需要专业技能(写作、拍摄、剪辑)。AI,特别是多模态生成模型,将这个过程简化为“思想的直接物化”。用户只需通过自然语言描述“vibe”(感觉、氛围),AI 就能将其转化为代码、视频、设计等复杂产物,这从根本上解锁了全球 80 亿人的认知与想象力盈余
  • 证据/案例
    • Veo 3:他提到用户已经能通过拼接 prompt 创作出令人惊叹的视频,并强调“This is the worst it’ll ever be”(此刻即最差),暗示未来工具的易用性将飞跃。
    • 类比 YouTube:他将 AI 的影响与 YouTube 对比,后者让成千上万的人成为视频创作者,而 AI 将使这个数字达到“数千万甚至十亿级别”。

维度三:危机领导力:“潜入水下一英尺的平静”

  • 核心观点:面对去年“谷歌已输”的舆论危机,Pichai 的领导策略是过滤噪音、识别信号,并专注于执行少数几个“决定性决策”(consequential decisions)
  • 原理解构:他将管理一家巨型科技公司比作执教巴塞罗那或皇马,外部的批评和赞誉是常态。真正的领导力在于能在“波涛汹涌的海面下,找到一英尺深处的平静”。这意味着不受短期市场情绪的干扰,而是基于对内部技术轨迹和长期战略的信心。这种信心来源于早已布局的基础设施和对核心人才的信任。
  • 证据/案例
    • 决定性决策:明确指出合并 Google Brain 和 DeepMind 是一个关键的决定性决策,旨在集中火力,应对 AI 发展的“the moment”。
    • 长期主义:强调十年前对 TPU(张量处理单元)的投资是今天能够训练大规模模型的基础,这体现了谷歌的战略耐心。
    • 数据佐证:提到内部 Gemini 服务的 token 处理量在过去 12 个月内增长了 50 倍(从 9.7T 到 480T),这是他能保持“内部平静”的硬数据支撑。

维度四:搜索的未来:从“链接列表”到“智能对话入口”

  • 核心观点:Google Search 正在从一个提供“10个蓝色链接”的工具,演变为一个以 AI 为核心的、提供上下文、促进对话式探索的智能层
  • 原理解构:这并非要取代开放网络,而是为了更高效地连接用户与网络。AI 的作用体现在三个层面:1)上下文聚合:通过“查询扇出”(query fan-out)技术,AI 能将一个复杂问题分解为多个子查询,整合网络信息,提供一个浓缩的“AI 概览”(AI Overviews)。2)多语言桥梁:AI 能跨越语言障碍进行推理,让非英语用户也能接触和理解全球的知识库。3)探索伴侣:在“AI 模式”下,搜索变成一场与 AI 的持续对话,激发用户更长、更复杂、更有深度的提问,从而提供质量更高的网络引流。
  • 证据/案例
    • 产品形态:明确了“AI 概览”将融入主搜索页面,而“AI 模式”则作为一个独立的、体验最前沿功能的标签页存在。
    • 核心原则:Pichai 强调,将用户引向开放网络(sending traffic to the web) 仍然是核心设计原则,AI 只是让这个过程更智能、更高效。
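
下面给出上文“查询扇出”流程的一个最小代码示意(仅为概念演示,并非 Google 的实际实现;decompose、search、summarize 等函数名与拆分逻辑均为本文假设的占位):

```python
# 概念示意:"查询扇出"(query fan-out)的最小草图:拆分子查询、并行检索、汇总为概览。
# 实际系统远比这复杂:拆分与汇总由大模型完成,并会在概览中附上原文链接以引流回开放网络。
from concurrent.futures import ThreadPoolExecutor


def decompose(query: str) -> list[str]:
    """把复杂问题拆成若干子查询(此处用固定模板示意,真实系统由模型生成)。"""
    return [f"{query} 的定义", f"{query} 的最新进展", f"{query} 的常见争议"]


def search(subquery: str) -> str:
    """检索单个子查询(占位实现,真实系统会查询搜索索引并返回网页摘要)。"""
    return f"[关于「{subquery}」的检索结果摘要]"


def summarize(query: str, snippets: list[str]) -> str:
    """把各子查询的结果汇总成一段概览文本。"""
    joined = "\n".join(f"- {s}" for s in snippets)
    return f"针对「{query}」的概览:\n{joined}"


def ai_overview(query: str) -> str:
    subqueries = decompose(query)
    with ThreadPoolExecutor() as pool:  # 并行"扇出"到多个子查询
        snippets = list(pool.map(search, subqueries))
    return summarize(query, snippets)


if __name__ == "__main__":
    print(ai_overview("量子计算"))
```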

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:p(doom) 的自我调节机制 Pichai 提出了一个非常深刻的观点:对 AI 末日论(p(doom))的高概率担忧,本身可能就是防止其发生的保险。当一个威胁足够大,大到能让全人类的目标空前一致时(即“确保不发生”),人类将以前所未有的协同效率来解决这个问题。因此,对风险的广泛认知反而会激发集体行动,形成一种“自我调节”的负反馈循环,这挑战了“技术失控不可避免”的悲观论调。

  • 盲点与局限:从 AGI 到 AJI(人工锯齿智能) Pichai 引用了“AJI - Artificial Jagged Intelligence”的概念,巧妙地指出了行业的一个认知盲区。我们常常被 AI 在某些任务上的超人表现所震撼,却忽略了它在另一些极其简单的任务上(如“数草莓中有几个 R”)的“锯齿状”失败。这提醒我们,当前 AI 的能力图谱是极不平滑的,存在大量“智能断崖”。过度宣传其通用性,而忽视其“锯齿”特性,是行业面临的风险之一。

  • 未解之谜:AI 时代的界面与表达 对话中承认,我们目前仍在用传统的 UI 范式“禁锢”AI。Pichai 指出,一个真正的突破将是让 AI 模型能够自主设计和生成最适合表达其思想的用户界面。这揭示了一个未解的难题:当智能体本身成为设计师,人机交互的终极形态会是怎样?这超越了简单的聊天框,指向一个动态、自适应、甚至由 AI 主导的交互未来。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “AI is the most profound technology humanity will ever work on. It’ll be more profound than fire or electricity.” 中文意译:“AI 是人类有史以来将从事的最深刻的技术。它将比火或电更加深远。” 语境:Pichai 阐述他对 AI 历史地位的根本判断,强调其递归自我改进的独特性。

  2. “Sometimes you jump in the ocean, it’s so choppy, but you go down one feet under, it’s the calmest thing in the entire universe. So there’s a version of that.” 中文意译:“有时你跳入波涛汹涌的大海,但只要下潜一英尺,那里就是全宇宙最宁静的地方。这是一种境界。” 语境:Pichai 描述作为 CEO 如何在巨大的外部压力和舆论噪音下保持内心的平静与专注。

  3. “The thing I always think: this is the worst it’ll ever be, at any given moment in time.” 中文意译:“我总是这样想:在任何一个时间点,眼前的(AI)就是它最糟糕的样子了。” 语境:在评论当前生成式 AI 的能力时,以此强调其未来进步的速度和潜力。

  4. “If p(doom) is actually high, at some point, all of humanity is aligned in making sure that’s not the case… so there is a self-modulating aspect there.” 中文意译:“如果‘末日概率’真的很高,那么在某个节点,全人类都会团结一致去确保它不会发生……所以这里存在一个自我调节的机制。” 语境:Pichai 对 AI 存在风险的乐观回应,认为人类的集体求生欲是最终的安全阀。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈竞争:竞争焦点将是全栈优化能力,从自研芯片(TPU)、模型架构到产品集成。仅仅拥有一个强大的模型是不够的,模型家族(如 Gemini 的 Pro, Flash, Nano) 和高效的推理服务能力将成为差异化关键。
    • 产品形态:主流应用(搜索、邮件、办公套件)的“AI 化”将从辅助功能(suggestions)演变为核心交互层(agentic layer)多模态对话将成为下一代操作系统的标配。
    • 竞争格局:拥有海量数据、强大基础设施和成熟分发渠道的巨头(如谷歌)在 AI 竞赛中展现出强大的韧性。创业公司的机会在于发掘“伟大的 UI”,将底层 AI 能力以创新的方式呈现给用户。
  • 长期终局 (5-10年)

    • 人机交互范式:如果 Pichai 的设想成真,我们将从**“应用时代”进入“智能体时代”**。操作系统(Android/iOS)将演变为主动理解用户意图、跨应用执行任务的个人智能体。交互将从点击图标变为与 AI 的自然语言对话。
    • 信息生态:互联网可能分化为**“人类可读网络”“机器可读的代理网络”(Agentic Web)**。内容生态将发生剧变,大量 AI 生成内容涌现,而“纯人类创作”可能成为一种稀缺、高价值的体验,类似于今天的现场音乐会或手工艺品。
    • 社会结构:AI 带来的生产力飞跃将迫使社会重新思考工作的意义。通用基本智能(Universal Basic Intelligence) 将像电力一样普及,人类价值将更多地体现在提出问题、定义目标、情感连接和复杂决策上。
  • 行动建议

    • 开发者:应将 AI 视为核心生产力工具而非替代者。谷歌内部工程效率提升 10% 的数据表明,善用 AI 辅助编程将是基本技能。未来的重点将从编写重复性代码转向系统架构设计和创造性问题解决。
    • 投资者:评估 AI 公司时,应超越短期模型跑分,关注其基础研究能力、计算基础设施规模和将技术转化为亿级用户产品的工程文化。Pichai 的经历表明,具备长期主义和战略耐心的公司,即使短期受挫,也可能后来居上。
    • 创业者:巨大的机会在于“创造力平权”。开发赋能新一代创作者的工具,无论是在视频、游戏、教育还是科学研究领域。同时,打造面向 AI 智能体的服务和 API,将是未来“代理网络”经济的基础。

这是一份基于 Google 首席执行官 Sundar Pichai 与 Lex Fridman 对话内容的深度行业分析报告。


🚀 Google 的 AGI 转型与“新火”时代:Sundar Pichai 对话深度解析

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在 Google 经历“落后于 AI 浪潮”的舆论风暴一年后,CEO Sundar Pichai 接受专访。这不仅是一次公关回应,更是 Google 战略重组(DeepMind 与 Brain 合并)后的核心愿景阐述。
  • 核心论点:Pichai 将 AI 定义为人类历史上比“火”或“电”更深远的变革性技术。他认为,当前的 AI 并非孤立的工具,而是一个类似“新石器时代工具包(Neolithic Package)”的文明乘数,将通过“递归自我改进”打破生产力天花板。对话旨在证明:Google 已从防御姿态转向进攻,通过整合全栈算力(TPU)、多模态世界模型(Gemini)和物理实体(Waymo/Robotics),正在构建一个 agentic(代理式)的未来生态。

2. 🧠 深度观点解析 (Deep Dive Analysis)

I. 技术本质观:AI 是文明级的“新火”

  • 核心观点:AI 是人类历史上最具深远意义的技术,其重要性超过火和电。
  • 原理解构:与以往单向提升效率的工具不同,AI 具有**递归自我改进(Recursively Self-improving)**的特性。它不仅加速了信息的获取,更直接加速了“创造”过程本身。Pichai 认为 AI 将缩短从“意图”到“产出”的回路,使人类从繁琐的细节(grunt work)中解放。
  • 案例:引用了 AlphaGo 从零开始在一天内超越人类,以及 Veo 3 模型在训练过程中物理理解能力的指数级跃迁。

II. 搜索的范式转移:从“链接”到“代理”

  • 核心观点:Google 搜索正在从“10 个蓝色链接”转向一个融合 AI 概览(AI Overviews)与 AI 模式(AI Mode)的连续体。
  • 原理解构:搜索不再仅仅是信息的索引,而是**查询扇出(Query Fan-out)**的过程。AI 层对全网信息进行实时处理、汇总并提供上下文,其核心设计目标是“满足好奇心”而非简单的答案提供。Pichai 强调,Google 仍将坚持将流量引向“人类创造的 Web 内容”,维持生态平衡。
  • 证据:提到 Gemini 每月产出 480 万亿个 Token,这一数字在一年内增长了 50 倍,证明了用户查询行为正变得更长、更复杂、更具交互性。

III. 组织架构重组:消除“技术噪音”的深海哲学

  • 核心观点:Google 的回击依赖于对 DeepMind 和 Brain 两个世界级团队的暴力整合。
  • 原理解构:面对“Google 迷失”的批评,Pichai 采用了**“潜水(Scuba Diving)”哲学**——忽略海面的波涛汹涌(舆论压力),关注深海的平静(底层技术进展)。通过将研究资源集中于 Google DeepMind,并由 Demis Hassabis 统一领导,解决了内部科研割裂的问题。
  • 案例:提到为了迎接 AI 浪潮,Google 在 10 年前就开始布局 TPU(张量处理器),这种长期主义的硬件投资是目前 Google 能够快速迭代 Ultra 和 Pro 系列模型的底气。

IV. 物理 AI 的终局:世界模型的统一

  • 核心观点:Gemini 既是语言模型,也是机器人和自动驾驶的世界模型。
  • 原理解构Project AstraWaymo 的成功证明了多模态能力的通用性。未来的 Android OS 将不再是“App 的排列”,而是一个主动感知的 Agentic OS。这种横向的一致性使得一次技术投资(训练大模型)可以同时驱动搜索、机器人和 XR(扩展现实)三大业务。
  • 证据:Waymo 已实现 1000 万次付费行程,且在没人看好的低谷期,Pichai 反直觉地决定加大对 L4 级自动驾驶的投入。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:AI 是 P(doom) 的解药 主流观点担心 AI 导致人类毁灭(P-doom),但 Pichai 提出了一个自调制(Self-modulating)机制:当技术风险越高,全人类的利益一致性就越强。他甚至认为,没有 AI 的世界风险更高,因为人类无法解决资源匮乏导致的冲突,而 AI 能通过提高效率将世界从“零和博弈”转变为“正向博弈”。
  • 盲点与局限:AJI(人工参差不齐智能) Pichai 坦诚地使用了 AJI (Artificial Jagged Intelligence) 这一术语。即 AI 在处理高难度科研(如 AlphaFold)时表现惊人,但在数单词中的字母(如 Strawberry 有几个 R)这种简单任务上却会犯错。这种“智能的参差不齐”是当前 scaling laws(规模法则)尚未完全解决的短板。
  • 未解之谜:人类本质的溢价 当 AI 能写出 100% 的代码并生成 50% 的顶级视频内容时,人类的价值何在?Pichai 承认,虽然机器可以模仿 Messi(梅西)的动作,但人类更在乎“人类在压力下的挣扎与艺术性”。他并未给出明确答案,但暗示未来价值将从“专业技能”转向“人类精神的链接”。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “AI will be more profound than fire or electricity.” (AI 将比火或电更深远地影响人类。)—— 语境:讨论技术对文明生产力的倍增效应。
  2. “It’s like coaching Barcelona or Real Madrid. You have a bad season… but you go down one feet under the ocean, it’s the calmest thing in the entire universe.” (这就像执教巴萨或皇马,你会经历糟糕的赛季……但只要你潜入水下一英尺,那里就是宇宙中最平静的地方。)—— 语境:回应媒体对他个人领导力的质疑。
  3. “The irony is there is a self-modulating aspect there… If humanity collectively puts their mind to solving a problem, whatever it is, I think we can get there.” (讽刺的是,这其中存在一种自调制机制……只要人类集体致力于解决某个问题,无论多难,我们都能做到。)—— 语境:讨论 AI 毁灭人类的概率(P-doom)。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 开发效率革命:Google 内部 30% 代码已由 AI 生成,工程速率提升 10%。软件开发将转向“Vibe Coding”(氛围编程/意图编程),程序员将从代码编写者进化为架构设计师。
    • XR 设备的重生:随着 Project Astra 这种低延迟多模态模型的成熟,轻量化眼镜将取代手机成为新的交互入口。
  • 长期终局 (5-10年)
    • AGI 时间线:Pichai 预测 2030 年前后将出现爆发式进展,但“定义” AGI 已不重要,因为其产生的正负外部性将迫使社会彻底重构信息真实性的验证机制。
    • 去专业化趋势:当 AI 成为所有领域的超级专家,人类的价值将体现在“通用性”和“整合能力”上,即能指挥多个 AI Agent 解决复杂跨学科问题的“通才”。
  • 行动建议
    • 对于开发者:不要排斥 AI 生成的代码,要学习如何在大规模代码库中通过 AI 进行重构(Refactoring)和迁移。
    • 对于创业者:关注“代理式 Web(Agentic Web)”的机会。未来的网站不是给人看的,而是给 AI Agent 读取并执行交易的。
    • 对于投资者:关注拥有底层算力基础(如 TPU/自研芯片)和闭环数据模型(如 YouTube/Waymo)的垂直巨头,它们的横向扩张能力将被 AI 放大。

分析师总结:Sundar Pichai 展现了一位成熟大厂统帅的定力。他不仅在技术上押注算力规模,更在哲学层面试图将 Google 的使命(组织全球信息)转化为 AI 时代的基石。Google 的未来不再是寻找答案的工具,而是作为人类好奇心的“联合处理器”。

深度研报:Sundar Pichai 与 Lex Fridman 对谈——AI 时代的引擎、人性与未来

1. 🎯 核心论题与背景

对话背景: 本次对话由科技分析师 Lex Fridman 与 Alphabet(谷歌)首席执行官 Sundar Pichai 深度展开。这是一场跨越数小时的访谈,不仅回顾了 Pichai 从印度贫困家庭到掌舵万亿科技帝国的个人历程,更重要的是,它发生在谷歌经历了一年重大的组织重构与公众质疑(被指在 AI 大战中掉队)之后。此次对话处于谷歌Gemini 2.5和Project Astra 等技术面世的紧要关口。

核心论点: Pichai 重申并深化了其长期观点:人工智能(AI)是生产力倍增器,其革命性远超电力或互联网。这段对话的核心在于重构对“AI 负面叙事”的认知——谷歌并未跌落神坛,而是正通过整合 Google Brain 与 DeepMind 打造“双引擎”,在“Artificial Jagged Intelligence”(人工锯齿智能,即 AJI 阶段)中稳步迈向 AGI。Pichai 强调,AI 的终极使命不是为了取代人类,而是通过解放繁琐劳动,让人类从“专才”进化为既能统筹 AI 智能又能追求“人性本质”(艺术、创造、情感连接)的“通才”,从而实现文明的跃迁。


2. 🧠 深度观点解析

维度一:AI 的本质是“递归性自我进化的创造者”

  • 核心观点:AI 绝不仅仅是“智能工具”,它是人类历史上第一个能够递归式自我改进并创造新事物的技术。它会成为加速器,让我们比过去几千年创造的一切总和都要多。
  • 原理解构:Pichai 指出,电力是通用的,但 AI 独特之处在于它正在撰写自己的代码,甚至探索科学的边界(如 AlphaGo 的自我博弈、AlphaFold 的蛋白质折叠)。
  • 证据/案例
    • Pichai 将 AI 与电力、火作类比,认为其深层逻辑在于加速创造本身。
    • 提到 Veo 3 模型在训练过程中的物理理解能力显著提升,甚至在 30% 和 60% 完成度时的生成效果就已显示出质变。
    • 类比:就像无麻醉手术是他(Pichai)当年认为的“最伟大发明”,AI 的能力正在突破当前认知的“天花板”。

维度二:当前的“AJI 阶段”(Artificial Jagged Intelligence)

  • 核心观点:我们正处于“人工锯齿智能”阶段。这意味着我们会经历指数级的进步,但同时也伴随着离散的、看似“掉链子”的错误
  • 原理解构:人类之所以容易产生对 AI 的不信任感,是因为这种技术具备能力上的跳跃性——在某些任务上表现出惊人的全知全能,却在最基础的常识逻辑(如数学计数)上依然笨拙。这是一个过渡期特征。
  • 证据/案例
    • Pichai 明确指出这种状态:“你会看到巨大的进步,但你也容易发现它们在做数学运算或计算字母数量时的数值错误。”
    • 引用 AlphaGo 的棋局(如著名的“第 37 手”),展示了 AI 能够通过自我对弈进行不可解释但完美的创新。

维度三:工程生产力的乘数效应

  • 核心观点:AI 不仅仅是写代码的工具,更是工程效率的倍增器。在使用之初,它提升了工程师的产出效率(约 10%),但长远来看,它允许工程师脱离重复性劳动,回归到设计和架构的核心价值。
  • 原理解构:关键不在于 AI 写了 30% 的代码,而在于 AI 帮助整个工程团队解决了旧代码库的维护、协作和重组问题。这是一个“杠杆效应”。
  • 证据/案例
    • Google 内部数据:30% 的代码使用了 AI 建议,但工程效率整体提升了 10%
    • Google 并没有因为 AI 产能提升而裁员,反而计划招聘更多工程师,因为“有机会空间在变大”。

维度四:AI 的边界与人本主义的共存

  • 核心观点:在处理“敏感内容”和“艺术自由”时,模型本身的智能水平本身就是最好的过滤器,而不是通过事后的人为审慎来扼杀。
  • 原理解构:随着 Gemini 2.5 模型敏锐度的提升,它比以往版本更敢于回答涉及暴力、痛苦或争议性历史的话题,因为它具备了足够的语境推理能力来进行客观、中性的描述,而不是为了避嫌而提供含糊其辞的答案。
  • 证据/案例
    • 日常讨论中,过往 Gemini 容易在涉及战争或历史暴力话题时变得过于谨慎。
    • 现在,它能够深度、客观地回应用户关于希特勒、成吉思汗或二战历史的好奇,而不仅仅是给出“中性”的回答。

3. 💡 反直觉与批判性视角

  • 打破共识(反 p(doom)的乌托邦主义)

    • 面对普遍存在的 AI 崩溃论(p(doom)),Pichai 提出了一个极具哲学意味的论点:人类本身具有能解决问题的自我调节能力。他认为,一旦“灭绝概率”变高,全人类将空前团结(像《雪崩预警》中的设定),这种集体意志本身就是一种自我修复机制。这与许多认为人类将因 AI 失控而毁灭的观点完全不同,将危机转化为团结的动力。
  • 盲点与局限(对旧媒体架构的误判)

    • Pichai 虽然承认“黑箱”机器能提供极佳的客观信息,但他有意忽视了主流媒体面临的生存危机。对话中虽然提到了“内容创作不容欺诈”和“记者需要被保护”,但他依然坚持 AI 会与人类媒体共存并实现互补。这可能是押注于广告商和平台终将遵守道德准则的乐观预测,在众包信息和算法极化分发的现实面前,这种“商业信息”的商业模式在未来可能会面临颠覆性挑战。
  • 未解之谜(通往 AGI 的硅基路径)

    • 对于 Chrome 等基础软件的演进(UI 变革),Pichai 意识到“我们可以利用 AI 来设计 UI”,但真正的挑战在于——当 AI 成为系统思考者时,人类如何用声控之外的“自由意念”来与它交互? 这一点在目前的讨论中虽然有暗示(Agent IQ),但在技术实现和交互伦理上,依然是巨大的未解之谜。

4. 💎 金句与高光时刻

  1. “If you watch AlphaGo start from scratch, be clueless, and become better through the course of a day, really, it hits you when you see that happen.”
    • 语境:描述 AI 自主学习时的震撼,那种从懵懂到精通的递归进化速度。
  2. “I think if p(doom) is actually high, at some point, all of humanity is aligned in making sure that’s not the case, and so we’ll actually make more progress against it…”
    • 语境:关于 AI 风险(毁灭概率)的哲学回答,强调人类的集体智慧和自我纠错能力。
  3. “The fundamental value of ads are it enables access to deploy the services to billions of people.”
    • 语境:解释 AI 模式下广告存在的必要性,认为商业信息依然是支撑免费服务的基础设施。
  4. “When you work on something very ambitious, it attracts the best people… you pretty much have the path to yourselves.”
    • 语境:在谈论为何应该坚持“登月计划”式的开发时,给出的战略建议。
  5. “P(doom) without AI is maybe higher than p(doom) with AI…”
    • 语境:Lex Fridman 的一句补充,指出 AI 有可能成为解决人类内部冲突和资源匮乏的救世主,而非毁灭者。

5. 🚀 行业启示与未来推演

短期影响 (1-3年)

  • 人机进化的“粘合剂”时代:开发者将看到工程效率的巨大提升,编程将从“手工作坊”转向“指挥 AI 作坊”。这将迫使教育体系重新思考“什么样的编程技能是值得学习的”。
  • 搜索模式的根本性重构:传统的“10个蓝链接”将逐渐被“AI 模式”取代。用户将习惯于通过多轮对话式提问来获取答案,搜索引擎将不再只是指向链接,而是直接综合、翻译、推理信息。
  • AR 硬件的实用化:Android XR 眼镜和 Project Astra 将不再是概念玩具,而是具备实用价值的翻译和助手工具,尤其是实时翻译将成为标配。

长期终局 (5-10年)

  • AGI 的时间窗口:谷歌内部预计距离像人类一样聪明的 AGI 可能要等到 2030 年之后,但在这之前,2025-2030 年间我们将面临的是“熵增与秩序并存”的剧烈变革期
  • “通才”的崛起:随着 AI 占据了所有专业领域的尖端,人类社会最稀缺的价值将从“专科专家”转向“跨界整合者”和“人性体验提供者”。人类的特殊性将回归到艺术、情感和复杂问题的决策上。
  • 从信息网络到物理机器:最令 Pichai 兴奋的是,用于理解世界的 AI 模型(如 Gemini)与物理世界的 Robotaxi 和机器人将共用同一个“世界模型”,最终实现 AI 指挥物理机械进入现实世界的终极形态。

逐字稿

Episode highlight

Sundar Pichai (00:00:00) It was a five-year waiting list, and we got a rotary telephone. But it dramatically changed our lives. People would come to our house to make calls to their loved ones. I would have to go all the way to the hospital to get blood test records and it would take two hours to go and they would say, “Sorry, it’s not ready. Come back the next day.”, two hours to come back. And that became a five-minute thing. So as a kid, this light bulb went in my head, this power of technology to change people’s lives.

(00:00:32) We had no running water. It was a massive drought, so they would get water in these trucks, maybe eight buckets per household. So me and my brother, sometimes my mom, we would wait in line, get that and bring it back home. Many years later, we had running water and we had a water heater, and you could get hot water to take a shower. For me, everything was discreet like that.

(00:01:02) So, I’ve always had this thing, first-hand feeling of how technology can dramatically change your life, and the opportunity it brings. I think if p(doom) is actually high, at some point, all of humanity is aligned in making sure that’s not the case, and so we’ll actually make more progress against it, I think. So the irony is there is a self-modulating aspect there. I think if humanity collectively puts their mind to solving a problem, whatever it is, I think we can get there.

(00:01:38) Because of that, I think I’m optimistic on the p(doom) scenarios, but that doesn’t mean I don’t think the underlying risk is actually pretty high. But I have a lot of faith in humanity rising up to meet that moment.

Lex Fridman (00:01:55) Take me through that experience, when there’s all these articles saying, ” You’re the wrong guy to lead Google through this. Google’s lost. It’s done. It’s over.”

Introduction

Lex Fridman (00:02:08) The following is a conversation with Sundar Pichai, the CEO of Google and Alphabet on this, the Lex Fridman podcast.

Growing up in India

Lex Fridman (00:02:18) Your life story is inspiring to a lot of people. It’s inspiring to me. You grew up in India, whole family living in a humble two-room apartment, very little, almost no access to technology. And from those humble beginnings, you rose to lead a $2 trillion technology company.

(00:02:41) If you could travel back in time and tell, let’s say, twelve-year-old Sundar that you’re now leading one of the largest companies in human history, what do you think that young kid would say?

Sundar Pichai (00:02:51) I would’ve probably laughed it off. Probably too far-fetched to imagine or believe at that time.

Lex Fridman (00:03:00) You would have to explain the internet first.

Sundar Pichai (00:03:02) For sure. Computers to me, at that time, I was 12 in 1984, so probably… By then, I’d started reading about them, but I hadn’t seen one.

Lex Fridman (00:03:16) What was that place like? Take me to your childhood.

Sundar Pichai (00:03:19) I grew up in Chennai. It’s in south of India. It’s a beautiful, bustling city, lots of people, lots of energy, simple life. Definitely fond memories of playing cricket outside the home. We just used to play on the streets. All the neighborhood kids would come out and we would play until it got dark and we couldn’t play anymore, barefoot. Traffic would come. We would just stop the game. Everything would drive through and you would just continue playing, just to get the visual in your head.

(00:03:51) Pre computers, there was a lot of free time, now that I think about it. Now you have to go and seek that quiet solitude or something. Newspapers and books were how I gained access to the world’s information at the time [inaudible 00:04:06].

(00:04:07) My grandfather was a big influence. He worked in the post office. He was so good with language. His English… His handwriting, till today, is the most beautiful handwriting I’ve ever seen. He would write so clearly. He was so articulate, and so he got me introduced into books. He loved politics. We could talk about anything.

(00:04:33) That was there in my family throughout. Lots of books, trashy books, good books, everything from Ayn Rand to books on philosophy to stupid crime novels. Books were a big part of my life, and at the soul of it, it’s not surprising I ended up at Google, because Google’s mission always resonated deeply with me. This access to knowledge, I was hungry for it.

(00:04:58) But definitely have fond memories of my childhood. Access to knowledge was there, so that’s the wealth we had. Every aspect of technology I had to wait for a while. I’ve obviously spoken before about how long it took for us to get a phone, about five years, but it’s not the only thing.

Sundar Pichai (00:05:16) There was a five-year waiting list, and we got a rotary telephone. But it dramatically changed our lives. People would come to our house to make calls to their loved ones. I would have to go all the way to the hospital to get blood test records, and it would take two hours to go and they would say, “Sorry, it’s not ready. Come back the next day.”, two hours to come back. And that became a five-minute thing. So as a kid, this light bulb went in my head, this power of technology to change people’s lives.

(00:05:48) We had no running water. It was a massive drought, so they would get water in these trucks, maybe eight buckets per household. So me and my brother, sometimes my mom, we would wait in line, get that and bring it back home. Many years later, we had running water and we had a water heater, and you could get hot water to take a shower. For me, everything was discreet like that. So, I’ve always had this thing, first-hand feeling of how technology can dramatically change your life, and the opportunity it brings. That was a subliminal takeaway for me throughout growing up. I actually observed it and felt it.

(00:06:41) We had to convince my dad for a long time to get a VCR. Do you know what a VCR is?

Sundar Pichai (00:06:49) I’m trying to date you now. Because before that, you only had one TV channel. That’s it. So, you can watch movies or something like that, but this was by the time I was in 12th grade, we got a VCR. It was a Panasonic, which we had to go to some shop which had smuggled it in, I guess, and that’s where we bought a VCR. But then being able to record a World Cup football game or get bootleg videotapes and watch movies, all that.

(00:07:26) So I had these discrete memories growing up, and so always left me with the feeling of how getting access to technology drives that step change in your life.

Lex Fridman (00:07:38) I don’t think you’ll ever be able to equal the first time you get hot water.

Sundar Pichai (00:07:42) To have that convenience of going and opening a tap and have hot water come out? Yeah.

Lex Fridman (00:07:47) It’s interesting. We take for granted the progress we’ve made. If you look at human history, just those plots that look at GDP across 2,000 years, and you see that exponential growth to where most of the progress happened since the Industrial Revolution, and we just take for granted, we forget how far we’ve gone. So, our ability to understand how great we have it and also how quickly technology can improve is quite poor.

Sundar Pichai (00:08:17) Oh. I mean, it’s extraordinary. I go back to India now, the power of mobile. It’s mind blowing to see the progress through the arc of time. It’s phenomenal.

Advice for young people

Lex Fridman (00:08:27) What advice would you give to young folks listening to this all over the world, who look up to you and find your story inspiring, who want to be maybe the next Sundar Pichai, who want to start, create companies, build something that has a lot of impact in the world?

Sundar Pichai (00:08:45) You have a lot of luck along the way, but you obviously have to make smart choices, you’re thinking about what you want to do, your brain is telling you something. But when you do things, I think it’s important to get that… Listen to your heart and see whether you actually enjoy doing it. That feeling of if you love what you do, it’s so much easier, and you’re going to see the best version of yourself. It’s easier said than done. I think it’s tough to find things you love doing. But I think listening to your heart a bit more than your mind in terms of figuring out what you want to do, I think is one of the best things I would tell people.

(00:09:26) The second thing is trying to work with people who you feel… At various points in my life I’ve worked with people who I felt were better than me. You’re almost sitting in a room talking to someone and you’re like, wow. You want that feeling a few times. Trying to get yourself in a position where you’re working with people who you feel are stretching your abilities is what helps you grow, I think, so putting yourself in uncomfortable situations. And I think often you’ll surprise yourself.

(00:10:01) So, I think being open minded enough to put yourself in those positions is maybe another thing I would say.

Styles of leadership

Lex Fridman (00:10:09) What lessons can we learn? Maybe from an outsider perspective, for me, looking at your story and gotten to know you a bit, you’re humble, you’re kind. Usually when I think of somebody who has had a journey like yours and climbs to the very top of leadership in a cutthroat world, they’re usually going to be a bit of an asshole. What wisdom are we supposed to draw from the fact that your general approach is of balance, of humility, of kindness, listening to everybody. What’s your secret?

Sundar Pichai (00:10:41) I do get angry. I do get frustrated. I have the same emotions all of us do in the context of work and everything. But a few things: I think I… Over time I figured out the best way to get the most out of people. You find mission-oriented people who are in the shared journey, who have this inner drive to excellence to do the best. You motivate people and you can achieve a lot that way. It often tends to work out that way.

(00:11:19) But have there been times I lose it? Yeah. Maybe less often than others, and maybe over the years less and less so, because I find it’s not needed to achieve what you need to do.

Lex Fridman (00:11:35) So, losing your shit has not been productive?

Sundar Pichai (00:11:38) Less often than not. I think people respond to that.

Sundar Pichai (00:11:41) They may do stuff to react to that. You actually want them to do the right thing. I’m a sports fan. In soccer, not football, people often talk about man management. Great coaches do. I think there is an element of that in our lives. How do you get the best out of the people you work with?

(00:12:08) At times, you’re working with people who are so committed to achieving, if they’ve done something wrong, they feel it more than you do, so you treat them differently than… Occasionally, there are people who you need to clearly let them know that wasn’t okay or whatever it is. But I’ve often found that not to be the case.

Lex Fridman (00:12:28) And sometimes the right words at the right time spoken firmly can reverberate through time.

Sundar Pichai (00:12:35) Also sometimes, the unspoken words. People can sometimes see that you’re unhappy without you saying it, and so sometimes the silence can deliver that message even more.

Lex Fridman (00:12:48) Sometimes less is more.

(00:12:50) Who’s the greatest soccer player of all time? Messi, Ronaldo or Pelé or Maradona?

Sundar Pichai (00:12:55) I’m going to make… In this question…

Lex Fridman (00:12:58) Is this going to be a political answer, Sundar?

Sundar Pichai (00:12:58) I’m not going to lie. I will tell the truthful answer, the truthful answer.

Lex Fridman (00:13:03) So it’s Messi, okay.

Sundar Pichai (00:13:05) It is. It’s been interesting. Because my son is a big Cristiano Ronaldo fan, and so we’ve had to watch El Clasicos together with that dynamic in there. I so admire CR7. I mean, I’ve never seen an athlete more committed to that kind of excellence, and so he’s one of the all-time greats. But for me, Messi is it.

Lex Fridman (00:13:31) When I see Lionel Messi, you just are in awe that humans are able to achieve that level of greatness and genius and artistry. We’ll talk about AI, maybe robotics and this kind of stuff, that level of genius, I’m not sure you can possibly match by AI in a long time. It’s just an example of greatness. And you have that kind of greatness in other disciplines, but in sport, you get to visually see it, unlike anything else. Just the timing, the movement, there’s just genius.

Sundar Pichai (00:14:03) Had the chance to see him a couple of weeks ago. He played in San Jose against the Quakes, so I went to see the game. I had good seats, knew where he would play in the second half hopefully. And even at his age, just watching him when he gets the ball, that movement… You’re right, that special quality. It’s tough to describe, but you feel it when you see it, yeah.

Impact of AI in human history

Lex Fridman (00:14:27) He’s still got it. If we rank all the technological innovations throughout human history… Let’s go back maybe to the history of human civilizations, 12,000 years ago, and rank them by how much of a productivity multiplier they’ve been. We can go to electricity or the labor mechanization of the Industrial Revolution, or we can go back to the first Agricultural Revolution 12,000 years ago. In that long list of inventions, do you think AI… When history is written 1,000 years from now, do you think it has a chance to be the number one productivity multiplier?

Sundar Pichai (00:15:08) It’s a great question. Many years ago, I think it might’ve been 2017 or 2018, I said at the time, AI is the most profound technology humanity will ever work on. It’ll be more profound than fire or electricity. So, I have to back myself. I still think that’s the case.

(00:15:27) When you ask this question, I was thinking, do we have a recency bias? In sports, it’s very tempting to call the current person you’re seeing the greatest…

Sundar Pichai (00:15:36) … player. Is there a recency bias? I do think, from first principles I would argue, AI will be bigger than all of those. I didn’t live through those moments. Two years ago, I had to go through a surgery, and then I processed that. There was a point in time people didn’t have anesthesia when they went through these procedures. At that moment, I was like, that has got to be the greatest invention humanity has ever, ever done. We don’t know what it is to have lived through those times.

(00:16:12) Many of the things you’re talking about were these general things, which pretty much affected everything: electricity or the internet, et cetera. But I don’t think we’ve ever dealt with a technology which is both progressing so fast, becoming so capable that it’s not clear what the ceiling is, and the main, unique thing… It’s recursively self-improving, it’s capable of that.

(00:16:41) The fact that it is the first technology that will dramatically accelerate creation itself, like creating things, building new things, and can improve and achieve things on its own, I think puts it in a different category. So, I think the impact it’ll end up having will far surpass everything we’ve seen before. Obviously, with that comes a lot of important things to think about and wrestle with, but I definitely think that’ll end up being the case.

Lex Fridman (00:17:15) Especially if it gets to the point of where we can achieve superhuman performance on the AI research itself. So, it’s a technology that may… It’s an open question, but it may be able to achieve a level to where the technology itself can create itself better than it could yesterday.

Sundar Pichai (00:17:33) It’s like the move 37 of Alpha research or whatever it is.

Sundar Pichai (00:17:39) You’re right, when it can do novel, self-directed research. Obviously, for a long time we’ll hopefully always have humans in the loop and all that stuff. These are complex questions to talk about. But yes, I think the underlying technology… I’ve said this: if you watch AlphaGo start from scratch, be clueless, and become better through the course of a day, really, it hits you when you see that happen.

(00:18:13) Even the Veo 3 models, if you sample the models when they were 30% done and 60% done, and looked at what they were generating, and you see how it all comes together, I would say it’s inspiring, a little bit unsettling, as a human. So all of that is true, I think.

Lex Fridman (00:18:36) The interesting thing of the Industrial Revolution, electricity, like you mentioned. You can go back to, again, the first Agricultural Revolution, there’s what’s called the Neolithic package of the first Agricultural Revolution. It wasn’t just that the nomads settled down and started planting food, but all this other kinds of technology was born from that, and it’s included this package. So, it wasn’t one piece of technology.

(00:19:05) There’s these ripple effects, second- and third-order effects that happen, everything from something profound like pottery, it can store liquids and food, to something we take for granted: social hierarchies and political hierarchies. Early government was formed. Because it turns out if humans stop moving and have some surplus food, they get bored and they start coming up with interesting systems. And then trade emerges, which turns out to be a really profound thing, and like I said, government. Second- and third-order effects from that, including that package, is incredible and probably extremely difficult. If you ask one of the people in the nomadic tribes to predict that, it would be impossible, and it’s difficult to predict.

(00:19:56) But all that said, what do you think are some of the early things we might see in the, quote, unquote, “AI package”?

Sundar Pichai (00:20:07) Most of it probably we don’t know today, but the one thing which we can tangibly start seeing now is… Obviously with the coding progress, you got a sense of it. It’s going to be so easy to imagine… Thoughts in your head, translating that into things that exist. That’ll be part of the package. It’s going to empower almost all of humanity to express themselves.

(00:20:34) Maybe in the past you could have expressed with words, but you could build things into existence. Maybe not fully today, we are at the early stages of vibe coding. I’ve been amazed at what people have put out online with Veo 3. But it takes a bit of work, you have to stitch together a set of prompts. But all this is going to get better. The thing I always think: this is the worst it’ll ever be, at any given moment in time.

Lex Fridman (00:21:02) It’s interesting you went there as a first thought: an exponential increase of access to creativity.

Sundar Pichai (00:21:11) Software, creation… Are you creating a program, a piece of content to be shared with others, games down the line? All of that just becomes infinitely more possible.

Lex Fridman (00:21:25) I think the big thing is that it makes it accessible. It unlocks the cognitive capabilities of the entire 8 billion.

Sundar Pichai (00:21:33) I agree. Think about 40 years ago, maybe in the US there were five people who could do what you were doing.

Sundar Pichai (00:21:41) Go do an interview… But today, think about, with YouTube and other products, et cetera, how many more people are doing it. I think this is what technology does. When the internet created blogs, you heard from so many more people. But with AI, I think that number won’t be in the few hundreds of thousands. It’ll be tens of millions of people, maybe even a billion people putting out things into the world in a deeper way.

Lex Fridman (00:22:17) And I think it’ll change the landscape of creativity. And it makes a lot of people nervous. For example, whatever, Fox, MSNBC, CNN are really nervous about this podcast. You mean this dude in a suit could just do this? And YouTube and thousands of others, tens of thousands, millions of other creators can do the same kind of thing? That makes them nervous. And now you get a podcast from NotebookLM that’s about five to 10 times better than any podcast I’ve ever done.

Sundar Pichai (00:22:17) Not true, but yeah.

Lex Fridman (00:22:47) I’m joking at this time, but maybe not. And that changes. You have to evolve. On the podcasting front, I’m a fan of podcasts much more than I am a fan of being a host or whatever. If there are great podcasts where both hosts are AIs, I’ll just stop doing this podcast. I’ll listen to that podcast. But you have to evolve and you have to change, and that makes people really nervous, I think. But it’s also a really exciting future.

Sundar Pichai (00:23:11) The one thing I may say is, I do think in a world in which there are two AIs, I think people value and choose… Just like in chess, you and I would never watch Stockfish 10 or whatever and AlphaGo play against each other. It would be boring for us to watch. But Magnus Carlsen and Gukesh, that game would be much more fascinating to watch. So, it’s tough to say.

(00:23:36) One way to say it is you’ll have a lot more content, and so you will be listening to AI-generated content because sometimes it’s efficient, et cetera. But the premium experiences you value might be a version of the human essence wherever it comes through. Going back to what we talked about earlier, watching Messi dribble the ball, I don’t know, one day I’m sure a machine will dribble much better than Messi. But I don’t know whether it would evoke that same emotion in us, so I think that’ll be fascinating to see.

Lex Fridman (00:24:05) I think the element of podcasting or audio books that is about information gathering, that part might be removed, or that might be more efficiently and in a compelling way done by AI. But then it’ll be just nice to hear humans struggle with the information, contend with the information, try to internalize it, combine it with the complexity of our own emotions and consciousness and all that kind of stuff. But if you actually want to find out about a piece of history, you go to Gemini. If you want to see Lex struggle with that history, or other humans, you look at that.

(00:24:47) The point is, it’s going to continue to change the nature of how we discover information, how we consume the information, how we create that information, the same way that YouTube changed everything completely. It changed the news. And that’s something our society’s struggling with.

Sundar Pichai (00:25:04) YouTube enabled… You know this better than anyone else. It’s enabled so many creators. There is no doubt in me that we will enable more filmmakers than there have ever been. You’re going to empower a lot more people. So I think there is an expansionary aspect of this which is underestimated, I think. I think it’ll unleash human creativity in a way that hasn’t been seen before. It’s tough to internalize. The only way is if you brought someone from the ’50s or ’40s and just put them in front of YouTube; I think it would blow their minds. Similarly, I think we would get blown away by what’s possible in a 10- to 20-year timeframe.

Lex Fridman (00:25:45) Do you think there’s a future? How many years out is it that, let’s say… Let’s put a marker on it… 50% of good content is generated by Veo 4, 5, 6?

Sundar Pichai (00:25:59) I think it depends on what it is for. Maybe if you look at movies today with CGI, there are great filmmakers. You still look at who the directors are and who uses it. There are filmmakers who don’t use it at all. You value that. There are people who use it incredibly. Think about somebody like a James Cameron, what he would do with these tools in his hands.

(00:26:24) But I think there’ll be a lot more content created. Just like writers today use Google Docs and don’t think about the fact that they’re using a tool like that, people will be using the future versions of these things. It won’t be a big deal at all to them.

Veo 3 and future of video

Lex Fridman (00:26:40) I’ve gotten a chance to get to know Darren Aronofsky. He’s been really leaning in and trying to figure out… It’s fun to watch a genius who came up before any of this was even remotely possible. He created Pi, one of my favorite movies. And from there, he just continued to create a really interesting variety of movies. And now he’s trying to see how can AI be used to create compelling films. You have people like that.

(00:27:07) You have people I’ve gotten to know, edgier folks who are AI-first, like the Dor Brothers. Both Aronofsky and the Dor Brothers create at the edge of society’s Overton window. They push, whether it’s sexuality or violence. It’s edgy, like artists are, but it’s still classy. It doesn’t cross that line, whatever that line is. Hunter S. Thompson has this line, “The only way to find out where the edge, where the line, is, is by crossing it.” And I think for artists, that’s true. That’s their purpose sometimes. Comedians and artists just cross that line.

(00:27:49) I wonder if you can comment on the weird place that it puts Google. Because Google’s line is probably different than some of these artists. How do you think about, specifically Veo and Flow, how to allow artists to do crazy shit, but also the responsibility for it not to be too crazy?

Sundar Pichai (00:28:15) It’s a great question. You mentioned Darren. He’s a clear visionary. Part of the reason we started working with him early on Veo is, he’s one of those people who’s able to see that future, get inspired by it, and show the way for how creative people can express themselves with it. I think when it comes to allowing artistic free expression… It’s one of the most important values in a society, I think. Artists have always been the ones to push boundaries, expand the frontiers of thought.

(00:28:56) I think that’s going to be an important value we have, so I think we will provide tools and put them in the hands of artists for them to use and put out their work. Those APIs, I almost think of them as infrastructure. Just like when you provide electricity to people or something, you want them to use it, and you’re not thinking about the use cases on top of it.

Lex Fridman (00:29:20) It’s a paintbrush.

Sundar Pichai (00:29:20) Yeah. So, I think that’s how. Obviously, there have to be some things. And society needs to decide at a fundamental level what’s okay, what’s not, and we’ll be responsible with it. But I do think when it comes to artistic free expression, I think that’s one of those values we should work hard to defend.

Lex Fridman (00:29:44) I wonder if you can comment on how maybe earlier versions of Gemini were a little bit careful about the kinds of things they’d be willing to answer. I just want to say I was really surprised, pleasantly surprised, and enjoy the fact that Gemini 2.5 Pro is a lot less careful, in a good sense. Don’t ask me why, but I’ve been doing a lot of research on Genghis Khan and the Aztecs, so there’s a lot of violence in that history. It’s a very violent history. I’ve also been doing a lot of research on World War I and World War II.

(00:30:19) Earlier versions of Gemini were very… there was basically this sense of, are you sure you want to learn about this? And now, it’s actually very factual, objective, talks about very difficult parts of human history, and does so with nuance and depth. It’s been really nice. But there’s a line there that I guess Google has to walk. And it’s also an engineering challenge, how to do that at scale across all the weird queries that people ask.

(00:30:49) Can you just speak to that challenge? How do you allow Gemini to say… Again, forgive, pardon my French… crazy shit, but not too crazy?

Sundar Pichai (00:31:00) I think one of the good insights here has been as the models are getting more capable, the models are really good at this stuff. And so I think in some ways, maybe a year ago, the models weren’t fully there, so they would also do stupid things more often. So you’re trying to handle those edge cases, but then you make a mistake in how you handle those edge cases and it compounds. But I think with 2.5, what we particularly found is once the models cross a certain level of intelligence and sophistication, they are able to reason through these nuanced issues pretty well.

(00:31:37) And I think users really want that. You want as much access to the raw model as possible. I think it’s a great area to think about. Over time, we should allow closer and closer access to it. Obviously, let people use custom prompts if they want to and experiment with it, et cetera. I think that’s an important direction.

(00:32:04) The first principle we want to think about is, from a scientific standpoint, making sure the models… And I’m saying scientific in the sense of how you would approach math or physics or something like that. From first principles, having the models reason about the world, be nuanced, et cetera, from the ground up is the right way to build these things, not some subset of humans hard-coding things on top of it. I think that’s the direction we’ve been taking and I think you’ll see us continue to push in that direction.

Lex Fridman (00:32:43) I took extensive notes and I gave them to Gemini and said, “Can you ask a novel question that’s not in these notes?”, and it wrote… Gemini continues to really surprise me, really surprise me. It’s been really beautiful. It’s an incredible model. The question it generated was, “You…”, meaning Sundar, “… told the world Gemini is churning out 480 trillion tokens a month. What’s the most life-changing, five-word sentence hiding in that haystack?”. That’s a Gemini question.

(00:33:17) I don’t think you can answer that, but it woke me up to the fact that all of these tokens are providing little aha moments for people across the globe. So, that’s like learning. Those tokens are people being curious: they ask a question and they find something out, and it truly could be life-changing.

Sundar Pichai (00:33:37) Oh, it is. I had the same feeling about Search many, many years ago. That tokens-per-month number has grown 50 times in the last 12 months.

Lex Fridman (00:33:49) Is that accurate, by the way? The 4…

Sundar Pichai (00:33:49) Yeah, it is. It is. It is accurate. I’m glad it got it right. But that number was 9.7 trillion tokens per month, 12 months ago. It’s gone up to 480. It’s a 50x…

Sundar Pichai (00:34:00) … right, it’s gone up to 480, it’s a 50x increase. So there’s no limit to human curiosity. And I think it’s one of those moments… I don’t think it is there today, but maybe one day there’s a five-word phrase which says what the actual universe is or something like that, something very meaningful, but I don’t think we are quite there yet.
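
As a quick sanity check on the growth figure quoted above, taking the 9.7 trillion and 480 trillion tokens-per-month numbers at face value:

$$
\frac{480\ \text{trillion tokens/month}}{9.7\ \text{trillion tokens/month}} \approx 49.5 \approx 50\times
$$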

Scaling laws

Lex Fridman (00:34:25) Do you think the scaling laws are holding strong? There are a lot of ways to describe the scaling laws for AI, but on the pre-training and post-training fronts… And the flip side of that: do you anticipate AI progress will hit a wall? Is there a wall?

Sundar Pichai (00:34:42) It’s a cherished micro kitchen conversation, once in a while I have it, like when Demis is visiting or if Demis, Koray, Jeff, Norm, Sergey, a bunch of our people, we sit and talk about this. Look, we see a lot of headroom ahead, I think. We’ve been able to optimize and improve on all fronts, pre-training, post-training, test time compute, tool use, over time, making these more agentic. So getting these models to be more general world models in that direction.

(00:35:22) Like Veo 3, the physics understanding is dramatically better than what Veo 1 or something like that was. So you kind of see, on all those dimensions, I feel progress is very obvious to see and I feel like there is significant headroom. More importantly, I’m fortunate to work with some of the best researchers on the planet; they think there is more headroom to be had here. And so I think we have an exciting trajectory ahead. It’s tougher to say… Each year I sit and say, okay, we are going to throw 10x more compute at it over the course of next year, and will we see progress? Sitting here today, I feel like the year ahead will have a lot of progress.

Lex Fridman (00:36:11) And do you feel any limitations like the bottlenecks, compute limited, data limited, idea limited, do you feel any of those limitations or is it full steam ahead on all fronts?

Sundar Pichai (00:36:24) I think it’s compute-limited in this sense: part of the reason you’ve seen us do Nano, Flash, and Pro models, but not an Ultra model, is that for each generation we feel like we’ve been able to get the Pro model to, I don’t know, 80, 90% of Ultra’s capability, but Ultra would be a lot slower and a lot more expensive to serve. But what we’ve been able to do is go to the next generation and make the next generation’s Pro as good as the previous generation’s Ultra, but be able to serve it in a way that it’s fast and you can use it and so on. So I do think scaling laws are working, but at any given time, the models we all use the most are maybe a few months behind the maximum capability we can deliver, because the maximum capability won’t be the fastest, easiest to use, et cetera.

Lex Fridman (00:37:26) Also, that’s in terms of intelligence. It becomes harder and harder to measure "performance", because you could argue Gemini Flash is much more impactful than Pro just because of the latency; it’s super intelligent already. I mean, sometimes latency is maybe more important than intelligence, especially when the intelligence is just a little bit less, and Flash is still an incredibly smart model. And so you have to now start measuring impact, and then it feels like benchmarks are less and less capable of capturing the intelligence of models, the effectiveness of models, the usefulness, the real-world usefulness, of models.

AGI and ASI

(00:38:07) Another kitchen question. So lots of folks are talking about timelines for AGI or ASI, artificial superintelligence. AGI, loosely defined, is basically human expert level at a lot of the main fields of pursuit for humans. And then ASI is what AGI becomes, presumably quickly, by being able to self-improve, so becoming far superior in intelligence across all these disciplines than humans. When do you think we’ll have AGI? Is 2030 a possibility?

Sundar Pichai (00:38:41) There’s one other term we should throw in there. I don’t know who used it first, maybe Karpathy did: AJI. Have you heard AJI, artificial jagged intelligence? Sometimes it feels that way: you see the dramatic progress and what they can do, and then you can trivially find they make numerical errors, or trip up counting the R’s in strawberry or something, which seems to trip up most models, or whatever it is. So maybe we should throw that term in there. I feel like we are in the AJI phase: dramatic progress, some things don’t work well, but overall you’re seeing lots of progress.

(00:39:19) But if your question is will it happen by 2030? Look, we constantly move the line of what it means to be AGI. There are moments today, like sitting in a Waymo on a San Francisco street with all the crowds and the people, watching it work its way through, where I see glimpses of it. The car is sometimes impatient, trying to work its way through. Or using Astra, like in Gemini Live, asking questions about the world.

Speaker 1 (00:39:49) What’s a skinny building doing in my neighborhood?

Speaker 2 (00:39:51) It’s a street light, not a building.

Sundar Pichai (00:39:54) You see glimpses. That’s why I use the word AJI, because then you also see stuff where obviously we are far from AGI, so you have both experiences happening to you simultaneously. I’ll answer your question, but I’ll also throw out this: I almost feel the term doesn’t matter. What I know is by 2030 there’ll be such dramatic progress. We’ll be dealing with the consequences of that progress, both the positive externalities and the negative externalities that come with it, in a big way by 2030. So that I strongly feel.

(00:40:31) Whatever, we may be arguing about the term, or maybe Gemini can answer what that moment in time is in 2030, but I think the progress will be dramatic. So that I believe in. Will the AI think it has reached AGI by 2030? I would say we will just fall short of that timeline, so I think it’ll take a bit longer. It’s amazing: in the early days of DeepMind in 2010, they talked about a 20-year timeframe to achieve AGI, which is kind of fascinating to see. But for me, the whole thing was seeing what Google Brain did in 2012, and then we acquired DeepMind in 2014. Right close to where we are sitting, in 2012, Jeff Dean showed the image of when the neural networks could recognize a picture of a cat and identify it. These were the early versions of Brain.

(00:41:24) And so we all talked about a couple of decades. I don’t think we’ll quite get there by 2030, so my sense is it’s slightly after that, but I would stress it doesn’t matter what that definition is, because you will have mind-blowing progress on many dimensions. Maybe AI can create videos; we have to figure out as a society, we need some system by which we all agree that this is AI-generated and we have to disclose it in a certain way, because how do you distinguish reality otherwise?

Lex Fridman (00:41:58) Yeah, there’s so many interesting things you said. So first of all, just looking back at this recent, now seemingly distant, history with Google Brain, I mean that was before TensorFlow, before TensorFlow was made public and open-sourced. So the tooling matters too, combined with GitHub, the ability to share code. Then you have the ideas of transformers and diffusion, and now there might be a new idea that seems simple in retrospect but will change everything, and that could be the post-training, the inference-time innovations.

(00:42:28) And I think shadcn tweeted that Google is just one great UI away from completely winning the AI race, meaning UI is a huge part of it. How that intelligence… I think the [inaudible 00:42:45] Project likes to talk about this: right now it’s an LLM, but when is it going to become a system, where you’re talking about shipping systems versus shipping a particular model? Yeah, that matters too, how the system manifests itself and how it presents itself to the world. That really, really matters.

Sundar Pichai (00:43:02) Oh, hugely so. There are simple UI innovations which have changed the world, and I absolutely think so. We will see a lot more progress in the next couple of years, as I think AI itself is on a self-improving track for UI. Today, we are constraining the models; the models can’t quite express themselves in terms of the UI to people. But if you think about it, we’ve kind of boxed them in that way. But given these models can code, they should be able to write the best interfaces to express their ideas over time.

Lex Fridman (00:43:46) That is an incredible idea. So the API is already open, so you create a really nice agentic system that continuously improves the way you can be talking to an AI. But a lot of that is the interface. And then of course the incredible multimodal aspect of the interface that Google has been pushing.

Sundar Pichai (00:44:08) These models are natively multimodal. They can easily take content from any format, put it in any format, they can write a good user interface, they probably understand your preferences better over time. And so all this is the evolution ahead. And so that goes back to where we started the conversation, I think there’ll be dramatic evolutions in the years ahead.

P(doom)

Lex Fridman (00:44:34) Maybe one more kitchen question. This even more ridiculous concept of p(doom). So the philosophically minded folks in the AI community think about the probability that AGI and then ASI might destroy all of human civilization. I would say my p(doom) is about 10%. Do you ever think about this kind of long-term threat of ASI, and what would your p(doom) be?

Sundar Pichai (00:45:03) Look, I mean, for sure. Look, I’ve been very excited about AI, but I’ve always felt this is a technology where you have to actively think about the risks and work very, very hard to harness it in a way that it all works out well. On the p(doom) question, look, it wouldn’t surprise you to say that’s probably another micro kitchen conversation that pops up once in a while. And given how powerful the technology is… maybe stepping back: when you’re running a large organization, if you can align the incentives of the organization, you can achieve pretty much anything. If you can get people all marching towards a goal, in a very focused way, in a mission-driven way, you can pretty much achieve anything.

(00:45:50) But it’s very tough to organize all of humanity that way. But I think if p(doom) is actually high, at some point all of humanity gets aligned in making sure that’s not the case, and so we’ll actually make more progress against it, I think. So the irony is, there is a self-modulating aspect there. I think if humanity collectively puts its mind to solving a problem, whatever it is, I think we can get there. So because of that, I think I’m optimistic on the p(doom) scenarios. I think the underlying risk is actually pretty high, but I have a lot of faith in humanity kind of rising up to meet that moment.

Lex Fridman (00:46:39) That’s really, really well put. I mean, as the threat becomes more concrete and real, humans do really come together and get their shit together. Well, the other thing I think people don’t often talk about is the probability of doom without AI. So there’s all these other ways that humans can destroy themselves, and it’s very possible, at least I believe so, that AI will help us become smarter, kinder to each other, more efficient. It’ll help more parts of the world flourish and be less resource-constrained, which is often the source of military conflict and tensions and so on. So we also have to load into that, what’s the [inaudible 00:47:22] without AI? p(doom) with AI, p(doom) without AI, because it’s very possible that AI will be the thing that saves us, saves human civilization from all the other threats.

Sundar Pichai (00:47:32) I agree with you. I think it’s insightful. Look, I’ve felt that to make progress on some of the toughest problems, it would be good to have AI, like a peer, helping you, and so that resonates with me for sure. Yeah.

Lex Fridman (00:47:48) Quick pause, bathroom break? [inaudible 00:47:51].

Lex Fridman (00:47:53) If NotebookLM was the same… like what I saw today with Beam, if it was compelling in the same kind of way, it blew my mind. It was incredible. I didn’t think it was possible. I didn’t think it was [inaudible 00:48:06].

Sundar Pichai (00:48:05) Can you imagine the US president and the Chinese president being able to do something like Beam, with the live Meet translation working well, so they’re both sitting and talking, and can make a bit more progress?

Lex Fridman (00:48:20) Just for people listening, we took a quick bathroom break and now we’re talking about the demo I did. We’ll probably post it somewhere somehow maybe here. I got a chance to experience Beam and it’s hard to describe in words how real it felt with just, what is it, six cameras. It’s incredible. It’s incredible.

Sundar Pichai (00:48:42) It’s one of those products that’s toughest to… you can’t quite describe it to people. Even when we show it in slides, et cetera, you don’t know what it is. You have to kind of experience it.

Lex Fridman (00:48:54) On the world leaders front, on politics, geopolitics, there’s something really special, again with studying World War II, about how much could have been saved if Chamberlain had met Stalin in person. And I sometimes also struggle explaining to people, articulating, why I believe meeting in person is powerful for world leaders. It just seems naive to say that, but there is something there in person, and with Beam I felt that same thing, and I’m unable to explain it. All I kept doing is what a child does: “You look real.” And I don’t know if that makes meetings more productive or so on, but it certainly makes them more… the same reason you want to show up to work versus remote sometimes, that human connection. I don’t know what that is; it’s hard to put into words. There’s something beautiful about great teams collaborating on a thing that’s not captured by the productivity of that team or by whatever’s on paper. Some of the most beautiful moments you experience in life are at work. Pursuing a difficult thing together for many months, there’s nothing like it.

Sundar Pichai (00:50:13) You’re in the trenches. And yeah, you do form bonds that way, for sure.

Lex Fridman (00:50:17) And to be able to do that somewhat remotely with that same personal touch, I don’t know, that’s a deeply fulfilling thing. Like a lot of people, I personally hate meetings, because a significant percentage of meetings, when done poorly, don’t serve a clear purpose. But that’s a meeting problem, that’s not a communication problem. If you could improve the communication for the meetings that are useful, that’s just incredible. So yeah, I was blown away by the great engineering behind it. And then we get to see what impact that has, that’s really interesting, but just incredible engineering. Really impressive.

Sundar Pichai (00:50:51) No, it is. And obviously we’ll work hard over the years to make it more and more accessible. But yeah, even on a personal front, outside of work meetings: a grandmother who’s far away from her grandchild being able to have that kind of an interaction, all that I think will end up being very… Nothing substitutes being in person, but it’s not always possible. You could be a soldier deployed, trying to talk to your loved one. So I think that’s what inspires us.

Toughest leadership decisions

Lex Fridman (00:51:24) When you and I hung out last year and took a walk, I don’t think we talked about this, but I remember outside of that seeing dozens of articles written by analysts and experts and so on, that Sundar Pichai should step down, because the perception was that Google was definitively losing the AI race, had lost its magic touch in the rapidly evolving technological landscape. And now, a year later, it’s crazy. You showed this plot of all the things that were shipped over the past year. It’s incredible. And Gemini Pro is winning across many benchmarks and products as we sit here today. So take me through that experience, when there were all these articles saying you’re the wrong guy to lead Google through this, Google is lost, is done, it’s over, to today, where Google is winning again. What were some low points during that time?

Sundar Pichai (00:52:27) Look, lots to unpack. Obviously, the main bet I made as a CEO was to really make sure the company was approaching everything in an AI-first way, really setting ourselves up to develop AGI responsibly, and making sure we are putting out products which embody that, things that are very, very useful for people. So look, I knew even through moments like that last year, I had a good sense of what we were building internally. I’d already made many important decisions, bringing together teams of the caliber of Brain and DeepMind and setting up Google DeepMind. There were things like the decision we made to invest in TPUs 10 years ago, so we knew we were scaling up and building big models.

(00:53:33) Anytime you’re in a situation like that, a few aspects. I’m good at tuning out noise, separating signal from noise. Do you scuba dive? Have you…?

Sundar Pichai (00:53:47) It’s amazing. I’m not good at it, but I’ve done it a few times. But sometimes you jump in the ocean and it’s so choppy, but you go down one foot under, and it’s the calmest thing in the entire universe. So there’s a version of that. Running Google, you may as well be coaching Barcelona or Real Madrid; you have a bad season. So there are aspects to that. But look, I’m good at tuning out the noise. I do watch out for signals; it’s important to separate the signal from the noise. There are good people sometimes making good points outside, so you want to listen to it, you want to take that feedback in, but internally, you’re making a set of consequential decisions.

(00:54:39) As leaders, you’re making a lot of decisions; many of them feel inconsequential, but over time you learn that most of the decisions you’re making on a day-to-day basis don’t matter. You have to make them, and you’re making them just to keep things moving. But you have to make a few consequential decisions, and we had set up the right teams, the right leaders, we had world-class researchers, we were training Gemini.

(00:55:15) Internally, there were factors which, for example, outside people may not have appreciated. I mean, TPUs are amazing, but we had to ramp up TPUs too. That took time, actually having enough TPUs to get the compute needed. But I could see internally the trajectory we were on, and I was so excited internally about what was possible. To me this moment felt like one of the biggest opportunities ahead for us as a company, that the opportunity space ahead for the next decade, the next 20 years, is bigger than what has happened in the past. And I thought we were set up better than most companies in the world to go realize that vision.

Lex Fridman (00:56:04) I mean, you had to make some consequential, bold decisions, like you mentioned the merger of DeepMind and Brain. Maybe it’s my perspective, just knowing humans: I’m sure there were a lot of egos involved, it’s very difficult to merge teams, and I’m sure there were some hard decisions to be made. Can you take me through your process of how you think through that? How do you get to the point of pulling the trigger and making that decision? Maybe what were some painful points? How do you navigate those turbulent waters?

Sundar Pichai (00:56:36) Look, we were fortunate to have two world-class teams, but you’re right, it’s like somebody coming and telling you, take Stanford and MIT and put them together and create a great department; easier said than done. But we were fortunate in having phenomenal teams. Both had their strengths; they were run very differently. Brain was kind of a lot of diverse projects, bottom-up, and out of it came a lot of important research breakthroughs. DeepMind at the time had a strong vision of how you want to build AGI, and so they were pursuing their direction. But through those moments, luckily, Jeff had expressed a desire to go back to more of his scientific, individual-contributor roots. He felt like management was taking up too much of his time. And Demis, I think, was naturally running DeepMind and was a natural choice there.

(00:57:41) But I think, you are right, it took us a while to bring the teams together. Credit to Demis, Jeff, Koray, all the great people there; they worked super hard to combine the best of both worlds when we set up that team. A few sleepless nights here and there as we put that thing together. We were patient in how we did it so that it would work well for the long term, and some of that showed in that moment. I think, yes, with things moving fast, you definitely felt the pressure, but I think we pulled off that transition well, and they’re obviously doing incredible work, and there’s a lot more incredible things ahead coming from them.

Lex Fridman (00:58:26) Like we talked about, you have a very calm, even-tempered, respectful demeanor. During that time, whether it was the merger or just dealing with the noise, were there times where frustration boiled over? Did you have to go a bit more intense on everybody than you usually would?

Sundar Pichai (00:58:48) Probably. You’re right. I think in the sense that there was a moment where we were all driving hard, and when you’re in the trenches working with passion, you’re going to have days where you disagree, you argue. But all that is just part of the course of working intensely. And at the end of the day, all of us are doing what we are doing because of the impact it can have; we are motivated by it.

(00:59:21) For many of us, this has been a long-term journey, and so it’s been super exciting. The positive moments far outweigh the stressful moments. Just early this year, I had a chance to celebrate, back-to-back over two days, a Nobel Prize for Geoff Hinton and, the next day, a Nobel Prize for Demis and John Jumper. When you work with people like that, all of it is super inspiring.

Lex Fridman (00:59:48) Is there something like that with you, where you had to put your foot down, maybe with less versus more, or “I’m the CEO and we’re doing this”?

Sundar Pichai (01:00:01) To my earlier point about consequential decisions you make, there are decisions you make, people can disagree pretty vehemently, but at some point you make a clear decision and you just ask people to commit. You can disagree, but it’s time to disagree and commit so that we can get moving. And whether it’s putting the foot down, it’s a natural part of what all of us have to do. And I think you can do that calmly and be very firm in the direction you are making the decision, and I think if you’re clear actually people over time respect that, if you can make decisions with clarity.

(01:00:43) I find it very effective in meetings where you’re making such decisions to hear everyone out. I think it’s important, when you can, to hear everyone out. Sometimes what you’re hearing actually influences how you think about it, and you’re wrestling with it and making a decision. Sometimes you have a clear conviction and you state it: look, this is how I feel and this is my conviction, and you kind of place the bet and you move on.

Lex Fridman (01:01:13) Are there big decisions like that? I kind of intuitively assume the merger was the big one?

Sundar Pichai (01:01:19) I think that was a very important decision for the company to meet the moment. I think we had to make sure we were doing that and doing that well. I think that was a consequential decision. There were many other things. We set up an AI infrastructure team to really go meet the moment, to scale up the compute we needed, and we brought teams from disparate parts of the company together to move forward.

(01:01:51) Getting people to work together physically, both in London with DeepMind and at what we call Gradient Canopy, which is where the Mountain View Google DeepMind teams are. One of my favorite moments is that I routinely walk, multiple times per week, to the Gradient Canopy building where our top researchers are working on the models. Sergey is often there amongst them, just getting an update on the model, seeing the loss curves, all that. I think that cultural part of getting the teams back together with that energy ended up playing a big role too.

Lex Fridman (01:02:32) What about the decision to recently add AI mode? So Google Search is, as they say, the front page of the internet, it’s like a legendary minimalist thing with 10 blue links. When people think internet, they think that page and now you’re starting to mess with that. So the AI mode, which is a separate tab, and then integrating AI in the results, I’m sure there were some battles in meetings on that one.

Sundar Pichai (01:03:02) Look, in some ways when mobile came, people wanted answers to more questions, so we are kind of constantly evolving it. But you’re right, in this moment, that evolution is happening because the underlying technology is becoming much more capable. You can have AI give a lot of context, but one of our important design goals is that when you come to Google Search, you are going to get a lot of context, but you’re also going to go and find a lot of things out on the web. So that will be true in AI mode, in AI overviews, and so on.

(01:03:39) Pertaining to our earlier conversation, we’re still giving you access to links, but think of the AI as a layer which is giving you context, a summary; maybe in AI mode you can have a dialogue with it back and forth on your journey, but through it all, you’re kind of learning what’s out there in the world. So those core principles don’t change. But I think AI mode allows us to push the… We have our best models there, models that are using search as a deep tool, really for every query you’re asking, fanning out, doing multiple searches, and assembling that knowledge in a way so that you can go and consume what you want to. That’s how we think about it.
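
To make the query fan-out idea concrete, here is a minimal illustrative sketch in Python; the generate_subqueries and search helpers are hypothetical stand-ins (a real system would call an LLM and a search backend), and this is not Google's implementation:

```python
import asyncio

async def generate_subqueries(question: str) -> list[str]:
    # Hypothetical: an LLM would decompose the question; hard-coded here for illustration.
    return [f"{question} overview", f"{question} recent developments", f"{question} statistics"]

async def search(query: str) -> str:
    # Hypothetical stand-in for a search API call returning a snippet with a source link.
    await asyncio.sleep(0.1)  # simulate network latency
    return f"[snippet + link] top result for: {query}"

async def answer_with_fanout(question: str) -> str:
    subqueries = await generate_subqueries(question)
    # Fan out: issue all searches concurrently rather than one query at a time.
    snippets = await asyncio.gather(*(search(q) for q in subqueries))
    # A real system would have an LLM synthesize the snippets (with citations) into one answer.
    return "\n".join(snippets)

if __name__ == "__main__":
    print(asyncio.run(answer_with_fanout("history of the Neolithic package")))
```

The point is only the shape of the pattern: one user question becomes several concurrent searches whose results are assembled, with links, into a single contextual answer.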

Lex Fridman (01:04:25) I got a chance to listen to Elizabeth, Liz Reid, describe a bunch of this, and two things stood out to me. One, which you were just talking about, is the query fan-out, which I didn’t even think about before: the powerful aspect of integrating a bunch of stuff on the web for you in one place, so that, yes, it provides that context so that you can decide which page to then go onto. The other really, really big thing she mentioned, which speaks to the productivity multiplier we were talking about earlier, was language.

(01:05:01) So one of the things you don’t quite appreciate is that through AI mode, for non-English speakers, you make, let’s say, English-language websites accessible in the reasoning process as you try to figure out what you’re looking for. Of course, once you show up to a page, you can use a basic translate, but in that process of figuring it out, if you empathize with a large part of the world that doesn’t speak English, their web is much smaller in their original language. And so it, again, unlocks that huge cognitive capacity there. You take it for granted here, with all the bloggers and the journalists writing about AI mode; you forget what this now unlocks, because Gemini is really good at translation.

Sundar Pichai (01:05:54) Oh, it is. I mean the multimodality, the translation, its ability to reason, we’re dramatically improving tool use, and putting that power in the flow of Search. Look, I’m super excited with AI overviews. We’ve seen the product has gotten much better; we measure it using all kinds of user metrics. It’s obviously driven strong growth of the product, and we’ve been testing AI mode. It’s now in the hands of millions of people and the early metrics are very encouraging. So look, I’m excited about this next chapter of Search.

Lex Fridman (01:06:36) For people who are not thinking through or aware of this, so there’s the 10 blue links with the AI overview on top, that provides a nice summarization, you can expand it.

Sundar Pichai (01:06:45) And you have sources and links now embedded.

Lex Fridman (01:06:49) Yeah, I believe, at least Liz said so, I actually didn’t notice it, but there are ads in the AI overview also. I don’t think there are ads in AI mode. When are ads coming to AI mode, Sundar? When do you think…? Okay, we should say that in the nineties, I remember the animated GIFs, banner GIFs, that take you to some shady websites that have nothing to do with anything. AdSense revolutionized advertising. It’s one of the greatest inventions in recent history, because it allows us, for free, to have access to all these kinds of services. So ads fuel a lot of really powerful services. And at its best, it’s showing you relevant ads, but also, very importantly, in a way that’s not super annoying, in a classy way. So when do you think it’s possible to add ads into AI mode, and what does that look like from a classy, non-annoying perspective?

Sundar Pichai (01:07:52) Two things. In the early part of AI mode, we’ll obviously focus more on the organic experience to make sure we are getting it right. I think the fundamental value of ads are-

Sundar Pichai (01:08:00) I think the fundamental value of ads is that they enable access, to deploy the services to billions of people. Second, the reason we’ve always taken ads seriously is that we view ads as commercial information, but it’s still information. So we bring the same quality metrics to it. I think with AI mode, to our earlier conversation… I think AI itself will help us, over time, figure out the best way to do it. Given we are giving context around everything, I think it’ll give us more opportunities to also explain, “Okay, here’s some commercial information.” Like today, as a podcaster, you do it at certain spots, and you probably figure out what’s best in your podcast. So there are aspects of that, but the underlying need doesn’t change: people value commercial information, and businesses are trying to connect to users.

(01:08:58) All that doesn’t change in an AI moment, but look, we will rethink it. You’ve seen us in YouTube now do a mixture of subscriptions and ads. Obviously, we are now introducing subscription offerings across everything. So as part of that, the optimization point will end up being in a different place as well.

Lex Fridman (01:09:23) Do you see a trajectory in the possible future where AI mode completely replaces the 10 blue links plus AI overview?

Sundar Pichai (01:09:32) Our current plan is that AI mode is going to be there as a separate tab for people who really want to experience it, but it’s not yet at the level of our main search page. But as features work, we’ll keep migrating them to the main page, and so you can view it as a continuum. AI mode will offer you the bleeding-edge experience, but things that work will keep flowing over to AI overviews and the main experience.

Lex Fridman (01:10:02) And the idea is that AI mode will still take you to the web, to the human-created web?

Sundar Pichai (01:10:06) Yes, that’s going to be a core design principle for us.

Lex Fridman (01:10:08) So really, if users decide, right? They drive this.

Lex Fridman (01:10:13) It’s just exciting. A little bit scary that it might change the internet because Google has been dominating with a very specific look and idea of what it means to have the internet. As you move to AI mode, I mean, it’s just a different experience. I think Liz was talking about it. I think you’ve mentioned that you ask more questions. You ask longer questions.

Sundar Pichai (01:10:41) Dramatically different types of questions.

Lex Fridman (01:10:43) Yeah, it actually fuels curiosity. I think, for me, I’ve been asking just a much larger number of questions of this black box machine, let’s say, whatever it is, and with the AI overview, it’s interesting because I still value the human… I still ultimately want to end up on the human created web, but like you said, the context really helps.

Sundar Pichai (01:11:09) It helps us deliver higher-quality referrals, right? Where people, they have much higher likelihood of finding what they’re looking for. They’re exploring. They’re curious. Their intent is getting satisfied more. So that’s what all our metrics show.

Lex Fridman (01:11:25) It makes the humans that create the web nervous. The journalists are getting nervous. They’ve already been nervous. Like we mentioned, CNN is nervous because the podcasts… It makes people nervous.

Sundar Pichai (01:11:37) Look, I think news and journalism will play an important role in the future. We are pretty committed to it, right? So making sure that ecosystem thrives… in fact, I think we’ll be able to differentiate ourselves as a company over time because of our commitment there. So it’s something I definitely value a lot, and as we are designing, we’ll continue prioritizing approaches there.

Lex Fridman (01:12:05) I’m sure, for the people who want it, they can have a fine-tuned AI model that produces clickbait hit pieces, and that will replace current journalism. That’s a shot at journalism. Forgive me. But I find that if you’re looking for really strong criticism of things, Gemini is very good at providing that.

Sundar Pichai (01:12:23) Oh, absolutely.

Lex Fridman (01:12:24) It’s better than anything they… For now, I mean. People are concerned that bias would be introduced, that as the AI systems become more and more powerful, there’s an incentive for sponsors to roll in and try to control the output of the AI models. But for now, the objective criticism that’s provided is way better than journalism.

(01:12:46) Of course, the argument is the journalists are still valuable, but then, I don’t know, the crowdsourced journalism that we get on the open internet is also very, very powerful.

Sundar Pichai (01:12:56) I feel like they’re all super important things. I think it’s good that you get a lot of crowdsourced information coming in, but I feel like there is real value in high-quality journalism, right? I think these are all complementary. I find myself constantly seeking out, also trying to find, objective reporting on things too. Sometimes you get more context from the crowdsourced sources you read online, but I think both end up playing a super important role.

Lex Fridman (01:13:32) So you’ve spoken a little about this. Demis talked about this: the slice of the web that will increasingly become about providing information for agents. So we can think about it as two layers of the web, one for humans, one for agents. Do you see the one that’s for AI agents growing over time? Do you see there still being long-term, 5-, 10-year value for the web created for the purpose of human consumption, or will it all be agents in the end?

Sundar Pichai (01:14:09) Today, not everyone does, but you go to a big retail store and you love walking the aisles, you love shopping, or a grocery store, picking out food, et cetera, but you’re also shopping online, and they’re delivering, right? So both are complementary, and that’s true for restaurants, et cetera. So I do feel like, over time, websites will also get better for humans. They will be better designed. AI might actually design them better for humans.

(01:14:41) So I expect the web to get a lot richer, and more interesting, and better to use. At the same time, I think there’ll be an agentic web, which is also making a lot of progress, and you have to solve the business value and the incentives to make that work well, right? For people to participate in it.

(01:15:05) But I think both will coexist, and obviously, the agents may not need the same… Not may not. They won’t need the same design and the UI paradigms which humans need to interact with. But I think both will be there.

Google Chrome

Lex Fridman (01:15:23) I have to ask you about Chrome. I have to say, for me personally, Google Chrome is probably, I don’t know, I’d like to see where I would rank it, but in this estimation, and this is not recency bias, although it might be a little bit, I think it’s up there, top three, maybe the number one piece of software for me of all time. It’s incredible. It’s really incredible.

(01:15:46) The browser is our window to the web, and Chrome has really continued, for many years and even initially, to push the innovation on that front when it was stale, and it continues to challenge. It continues to get more performant, more efficient, and just innovate constantly, and then there’s the Chromium aspect of it.

(01:16:07) Anyway, you were one of the pioneers of Chrome pushing for it when it was an insane idea, probably one of the ideas that was criticized, and doubted, and so on. So can you tell me the story of what it took to push for Chrome? What was your vision?

Sundar Pichai (01:16:29) Look, it was such a dynamic time around 2004, 2005 with AJAX, the web suddenly becoming dynamic. In a matter of a few months, Flickr, Gmail, Google Maps all kind of came into existence, right? The fact that you have an interactive, dynamic web. The web was evolving from simple text pages, simple HTML, to rich dynamic applications, but at the same time, you could see the browser was never meant for that world, right? JavaScript execution was super slow.

(01:17:12) The browser was far away from being an operating system for that rich modern web which was coming into place. So that’s the opportunity we saw. It’s an amazing early team. I still remember the day we got a shell on WebKit running and how fast it was. We had the clear vision for building a browser. We wanted to bring Core OS principles into the browser, right?

(01:17:44) So we built a secure, sandboxed browser. Each tab was its own process. These things are common now, but at the time, it was pretty unique. We found an amazing team in Aarhus, Denmark, with a leader who built the JavaScript VM, which at the time was 25 times faster than any other JavaScript VM out there. By the way, you are right, we open-sourced it all and put it in Chromium too. But we really thought the web could work much better, much faster, and you could be much safer browsing the web. And the name Chrome came because we literally felt the chrome of the browser was getting clunkier.

(01:18:32) We wanted to minimize it. So those were the origins of the project. Definitely, obviously, a highly biased person here talking about Chrome, but it’s the most fun I’ve had building a product from the ground up, and it was an extraordinary team. My co-founders on the project were terrific, so definite fond memories.
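
As an aside on the process-per-tab design mentioned above, here is a minimal, illustrative Python sketch of the isolation idea (not Chrome's actual architecture or code): each "tab" gets its own OS process, so one renderer crashing cannot take down the others.

```python
import multiprocessing as mp

def render_tab(url: str) -> None:
    # Pretend renderer for one tab; a URL containing "crash" simulates a renderer bug.
    if "crash" in url:
        raise RuntimeError(f"renderer for {url} crashed")
    print(f"rendered {url}")

if __name__ == "__main__":
    tabs = ["https://example.com", "https://crash.example", "https://news.example.org"]
    # One OS process per tab: a failure in one process stays contained.
    procs = [mp.Process(target=render_tab, args=(url,)) for url in tabs]
    for p in procs:
        p.start()
    for url, p in zip(tabs, procs):
        p.join()
        print(f"{url}: {'ok' if p.exitcode == 0 else 'crashed, other tabs unaffected'}")
```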

Lex Fridman (01:18:56) So for people who don’t know, Sundar, it’s probably fair to say, you’re the reason we have Chrome. Yes, I know there’s a lot of incredible engineers, but pushing for it inside a company that probably was opposing it because it’s a crazy idea, because as everybody probably knows, it’s incredibly difficult to build a browser.

Sundar Pichai (01:19:13) Yeah, look, Eric was the CEO at the time. I think it was less that he was opposed to it. He kind of knew first-hand what a crazy thing it is to go build a browser, and so he definitely was like, “This is…” There was a crazy aspect to actually wanting to go build a browser, but he was very supportive. Everyone… The founders were.

(01:19:36) I think once we started building something, we could use it and see how much better it was. From then on, you’re really tinkering with the product and making it better. It came to life pretty fast.

Lex Fridman (01:19:48) What wisdom do you draw from that? From pushing through on a crazy idea in the early days that ends up being revolutionary, for future crazy ideas like it?

Sundar Pichai (01:20:00) I mean, this is something Larry and Sergey have articulated clearly, and I really internalized it early on, which is their whole philosophy around working on moonshots. When you work on something very ambitious, first of all, it attracts the best people, right? So that’s an advantage you get. Number two, because it’s so ambitious, you don’t have many others working on something that crazy, so you pretty much have the path to yourselves, right? It’s like Waymo and self-driving. Number three, even if you end up not quite accomplishing what you set out to do and you end up doing 60, 80% of it, it’ll end up being a terrific success. So that’s the advice I would give people. Aiming for big ideas has all these advantages. It’s risky, but it also has all these advantages which I don’t think people fully internalize.

Lex Fridman (01:20:57) I mean, you mentioned one of the craziest, biggest moonshots, which is Waymo. When I first saw, over a decade ago, a Waymo vehicle, a Google self-driving car, for me it was an aha moment for robotics. It made me fall in love with robotics even more than before. It gave me a glimpse into the future. So it’s incredible. I’m truly grateful for that project, for what it symbolizes, but it’s also a crazy moonshot.

(01:21:28) For a long time, Waymo’s been, like you mentioned with scuba diving, just not listening to anybody, just calmly making the system better and better, more testing, just expanding the operational domain more and more. First of all, congrats on the 10 million paid robotaxi rides. What lessons do you take from Waymo about the perseverance, the persistence on that project?

Sundar Pichai (01:21:57) Really proud of the progress we have had with Waymo. One of the things I think we were very committed to… the final 20% can look like… I mean, we always say, right? The first 80% is easy, the final 20% takes 80% of the time. We definitely were working through that phase with Waymo, but I was aware of that; we knew we were at that stage.

(01:22:21) We knew that while there were many other self-driving companies, the technology gap was there. In fact, right at the moment when others were doubting Waymo is when we made the decision to invest more in Waymo, right? In some ways it’s counterintuitive, but look, we’ve always been a deep technology company, and Waymo is a version of building an AI robot that works well, and so we get attracted to problems like that. The caliber of the teams there, phenomenal teams.

(01:23:03) So I know you followed the space super closely. I’m talking to someone who knows the space well, but it was very obvious, it’s going to get there, and there’s still more work to do, but it’s a good example where we always prioritized being ambitious and safety at the same time, right? Equally committed to both and pushed hard and couldn’t be more thrilled with how it’s working, how much people love the experience. This year, definitely, we’ve scaled up a lot, and we’ll continue scaling up in ’26.

Lex Fridman (01:23:42) That said, the competition is heating up. You’ve been friendly with Elon even though, technically, he’s a competitor, but you’ve been friendly with a lot of tech CEOs, in that way, just showing respect towards them and so on. What do you think about the Robotaxi efforts that Tesla is doing? Do you see it as competition? What do you think? Do you like the competition?

Sundar Pichai (01:24:02) We are one of the earliest and biggest backers of SpaceX as Google, right? So I’m thrilled with what SpaceX is doing and fortunate that we are investors as a company there. We don’t compete with Tesla directly; we are not making cars, et cetera. We are building L4/L5 autonomy. We are building the Waymo Driver, which is general-purpose and can be used in many settings.

(01:24:32) They’re obviously working on making Tesla self-driving too. I’ve just assumed it as a given that Elon will succeed in whatever he does, so that is not something I question. But I think we are so far from… These spaces are such vast spaces. I think about transportation and the opportunity space; the Waymo Driver is a general-purpose technology we can apply in many situations. So you have a vast green space, and in all future scenarios I see Tesla doing well and Waymo doing well.

Lex Fridman (01:25:13) Like we mentioned with the Neolithic package, I think it’s very possible that in the “AI package,” when the history is written, autonomous vehicles, self-driving cars, are the big thing that changes everything. Imagine, over a period of a decade or two, the complete transition from manually driven to autonomous; in ways we might not predict, it might change the way we move about the world completely.

(01:25:41) So the possibility of that, and then the second- and third-order effects. As you’re seeing now with Tesla, very possibly, you’d see some… Internally, with Alphabet, maybe Waymo, maybe some of the Gemini robotics stuff might lead you into the other domains of robotics, because we should remember that Waymo is a robot.

Lex Fridman (01:26:05) It just happens to be on four wheels. So you said that the next big thing, we can also throw that into the AI package, the big aha moment, might be in the space of robotics. What do you think that would look like?

Sundar Pichai (01:26:20) Demis and the Google DeepMind team is very focused on Gemini robotics, right?

Sundar Pichai (01:26:23) So we are definitely building the underlying model as well. So we have a lot of investments there, and I think we are also pretty cutting-edge in our research there. So we are definitely driving that direction. We obviously are thinking about applications in robotics. We’ll kind of work CSD. We are partnering with a few companies today, but it’s an area I would say stay tuned.

(01:26:48) We are yet to fully articulate our plans outside, but it’s an area we are definitely committed to driving a lot of progress. But I think AI ends up driving that massive progress on robotics. The field has been held back for a while. I mean, hardware has made extraordinary progress. The software had been the challenge, but with AI now and the generalized models we are building, we are building these models, getting them to work in the real world in a safe way, in a generalized way is the frontier we are pushing pretty hard on.

Lex Fridman (01:27:25) Well, it’s really nice to see the models and the different teams integrated to where all of them are pushing towards one world model that’s being built. So from all these different angles, multimodal, you’re ultimately trying to get Gemini. So the same thing that would make AI mode really effective in answering your questions, which requires a kind of world model is the same kind of thing that would help a robot be useful in the physical world. So everything’s aligned.

Sundar Pichai (01:27:54) That is what makes this moment so unique, because, running a company, for the first time you can do one investment in a very deep horizontal way. On top of it, you can drive multiple businesses forward, right? That’s effectively what we are doing in Google and Alphabet, right?

Lex Fridman (01:28:14) Yeah, it’s all coming together. Like, it was planned ahead of time, but it’s not, of course. It’s all distributed. I mean, if Gmail, and Sheets, and all these other incredible services, I can sing Gmail praises for years. I mean, just this revolutionized email.

(01:28:28) But the moment you start to integrate AI Gemini into Gmail, I mean that’s the other thing, speaking of productivity multiplier, people complain about email, but that changed everything. Email, like the invention of email changed everything, and it has been ripe. There’s been a few folks trying to revolutionize email. Some of them on top of Gmail, but that’s like ripe for innovation, not just spam filtering, but you demoed a really nice demo of-

Sundar Pichai (01:28:55) Personalized responses, right?

Lex Fridman (01:28:56) Personalized responses. At first, I felt really bad about that, but then I realized that there’s nothing wrong to feel bad about because the example you gave is when a friend asks you went to whatever hiking location, “Do you have any advice?” It just searches through all your information to give them good advice, and then you put the cherry on top, maybe some love, or whatever camaraderie, but the informational aspect, the knowledge transfer, it does for you.

Sundar Pichai (01:29:28) I think there’ll be important moments. Like, today, if you write a card in your own handwriting and send it to someone, that’s a special thing. Similarly, there’ll be a time, I mean, with your friends, maybe your friend wrote and said he’s not doing well or something, those are moments you want to save your time for, writing something, reaching out. But for something like, “Give me all the details of the trip you took,” it makes a lot of sense for an AI assistant to help you. Right?

(01:29:59) So I think both are important, but I think I’m excited about that direction.

Lex Fridman (01:30:04) Yeah, I think, ultimately, it gives more time for us humans to do the things we humans find meaningful. I think it scares a lot of people because we’re going to have to ask ourselves the hard question of what do we find meaningful? I’m sure there’s answers, and it’s the old question of the meaning of existence. As you have to try to figure that out, that might be ultimately parenting, or being creative in some domains of art or writing, and it challenges to…

(01:30:32) It’s a good question to ask yourself, like, “In my life, what is the thing that brings me most joy and fulfillment?” If I’m able to actually focus more time on that, that’s really powerful.

Sundar Pichai (01:30:45) I think that’s the holy grail. If you get this right, I think it allows more people to find that.

Programming

Lex Fridman (01:30:52) I have to ask you, on the programming front, AI is getting really good at programming. Gemini, both the agentic and just the LLM has been incredible, so a lot of programmers are really worried that they will lose their jobs. How worried should they be, and how should they adjust so they can be thriving in this new world, or more and more code is written by AI?

Sundar Pichai (01:31:16) I think a few things. Looking at Google, we’ve given various stats around 30% of code now uses AI-generated suggestions or whatever it is. But the most important metric, and we carefully measure it is, like, how much has our engineering velocity increased as a company due to AI, right? It’s tough to measure, and we rigorously try to measure it, and our estimates are that number is now at 10%, right?

(01:31:51) Like, now, across the company, we’ve accomplished a 10% engineering velocity increase using AI, but we plan to hire more engineers next year, right? Because the opportunity space of what we can do is expanding too, right?

Sundar Pichai (01:32:15) So I think, hopefully, at least in the near to midterm, for many engineers, it frees up more and more of the… Even in engineering and coding, there are aspects which are so much fun. You’re designing. You’re architecting. You’re solving a problem. There’s a lot of grunt work, which all goes hand in hand, but hopefully, it takes a lot of that away, makes it even more fun to code, frees you up more time to create, problem-solve, brainstorm with your fellow colleagues and so on, right? So that’s the opportunity there.

(01:32:56) Second, I think it’ll attract, it’ll put the creative power in more people’s hands, which means people will create more. That means there’ll be more engineers doing more things. So it’s tough to fully predict, but I think in general, in this moment, it feels like people adopt these tools and be better programmers. Like, there are more people playing chess now than ever before, right? So it feels positive that way, to me, at least, speaking from within a Google context, is how I would talk to them about it.

Lex Fridman (01:33:36) Still. I just know anecdotally, a lot of great programmers are generating a lot of code, so their productivity, they’re not always using all the code. There’s still a lot of editing, but even for me, still programming as a side thing, I think I’m like 5x more productive. I think even for a large code base that’s touching a lot of users like Google’s does, I’m imagining, very soon, that productivity should be going up even more.

Sundar Pichai (01:34:08) No. The big unlock will be as we make the agentic capabilities much more robust, right? I think that’s what unlocks that next big wave. I think the 10% is a massive number. Like, if tomorrow, I showed up and said, “You can improve a large organization’s productivity by 10%,” when you have tens of thousands of engineers, that’s a phenomenal number, and that’s different than some other stat saying, “This percentage of code is now written by AI.”

(01:34:41) I’m talking more about, like, overall-

Lex Fridman (01:34:42) The actual productivity.

Sundar Pichai (01:34:43) The actual productivity. Right? Engineering productivity, which is two different things, which is the more important metric, but I think it’ll get better, right? I think there’s no engineer who, if tomorrow you magically became 2x more productive, wouldn’t just go create more things. You’re going to create more value-added things, and so I think you’ll find more satisfaction in your job, right?

Lex Fridman (01:35:08) There’s a lot of aspects. I mean, the actual Google code base might just improve because it’ll become more standardized, easier for people to move about the code base because AI will help with that, and therefore, that will also allow the AI to understand the entire code base better, which helps the engineering aspect.

(01:35:25) So I’ve been using Cursor a lot as a way to program with Gemini and other models. One of its powerful things is it’s aware of the entire code base, and that allows you to ask questions of it. It allows the agents to move about that code base in a really powerful way. I mean, that’s a huge unlock.

Sundar Pichai (01:35:44) Think about, like, migrations, refactoring old code bases.

Lex Fridman (01:35:52) Refactoring, yeah.

Sundar Pichai (01:35:52) Yeah. I mean, think about once we can do all this in a much better, more robust way than where we are today.

Lex Fridman (01:35:57) I think in the end, everything will be written in JavaScript and run in Chrome. I think it’s all going to that direction. I mean, just for fun, Google has legendary coding interviews, like rigorous interviews for the engineers. Can you comment on how that has changed in the era of AI? It’s just such a weird… The whiteboard interview, I assume, is not allowed to have some prompts.

Sundar Pichai (01:36:24) Such a good question. Look, we are making sure we’ll introduce at least one round of in-person interviews for people just to make sure the fundamentals are there. I think they’ll end up being important, but it’s an equally important skill. Look, if you can use these tools to generate better code, I think that’s an asset. So overall, I think it’s a massive positive.

Lex Fridman (01:36:56) Vibe coding engineer, do you recommend people, students interested in programming still get an education in computer science in college education? What do you think?

Sundar Pichai (01:37:06) I do. If you have a passion for computer science, I would. Computer science is obviously a lot more than programming alone, so I would. I still don’t think I would change what you pursue. I think AI will horizontally impact every field. It’s pretty tough to predict in what ways. So any education in which you’re learning good first principles thinking, I think, is good education.

Android

Lex Fridman (01:37:37) You’ve revolutionized web browsing. You’ve revolutionized a lot of things over the years. Android changed the game. It’s an incredible operating system. We could talk for hours about Android. What does the future of Android look like? Is it possible it becomes more and more AI-centric, especially now you throw into the mix, Android XR, with being able to do augmented reality, and mixed reality, and virtual reality in the physical world?

Sundar Pichai (01:38:09) The best innovations in computing have come through a paradigm IO change, right? First with the GUI, the graphical user interface, then with multi-touch in the context of mobile, and voice later on. Similarly, I feel like AR is that next paradigm. I think it was held back. The system integration challenges of making good AR are very, very hard.

(01:38:38) The second thing is you need AI to actually kind of… Otherwise, the IO is too complicated for you to have a natural seamless IO to that paradigm. AI ends up being super important, and so this is why Project Astra ends up being super critical for that Android XR world. But it is. I think when you use glasses and… Always been amazed at how useful these things are going to be.

(01:39:10) So look, I think it’s a real opportunity for Android. I think XR is one way it’ll kind of really come to life, but I think there’s an opportunity to rethink the mobile OS too, right? I think we’ve been kind of living in this paradigm of apps and shortcuts. All that won’t go away.

(01:39:28) But again, if you’re trying to get stuff done at an operating system level, it needs to be more agentic so that you can kind of describe what you want to do, or it proactively understands what you’re trying to do, learns from how you’re doing things over and over again, and kind of adapts to you. That is kind of like the unlock we need to go and do.

Lex Fridman (01:39:51) Well, the basic efficient minimalist UI. I’ve gotten a chance to try the glasses and they’re incredible. It’s the little stuff. It’s hard to put into words, but no latency. It just works. Even that little map demo, where you look down and you look up, and there’s a very smooth transition between the two, and very small amount of useful information is shown to you, enough not to distract from the world outside, but enough to provide a bit of context when you need it.

(01:40:25) In order to bring that into reality, you have to solve a lot of the OS problems to make sure it works when you’re integrating the AI into the whole thing. So everything you do launches an agent that answers some basic question.

Sundar Pichai (01:40:39) Good moonshot, you know?

Sundar Pichai (01:40:42) I love it. But I think we are, and it’s much closer to reality than other moonshots. We expect to have glasses in the hands of developers later this year and in consumers’ hands next year. So it’s an exciting time.

Lex Fridman (01:40:59) Yeah, well, extremely well-executed, Beam, all this stuff, because sometimes you don’t know. Like, somebody left a top comment on one of the demos of Beam. They said, “This will either be killed off in five weeks or revolutionize all meetings in five years.” And it’s very much true, Google tries so many things, and sometimes, sadly, kills off very promising projects, because there are so many other things to focus on.

(01:41:27) I use so many Google products. Google Voice, I still use. I’m so glad that’s not being killed off. That’s still alive. Thank you, whoever is defending that, because it’s awesome, and it’s great. They keep innovating. I just want to list off, just as a big thank you, so Search, obviously, Google revolutionized, Chrome, and all of these could be multi-hour conversations. Gmail, I’ve been singing Gmail praises forever. Maps, incredible technological innovation on revolutionizing mapping. Android, like we talked about. YouTube, like we talked about. AdSense, Google Translate for the academic mind…

Lex Fridman (01:42:01) … Google Translate. For the academic mind Google Scholar is incredible. And also the scanning of the books. So making all the world’s knowledge accessible, even when that knowledge is a kind of niche thing, which Google Scholar is. And then obviously with DeepMind, with AlphaZero, AlphaFold and AlphaEvolve, I could talk forever about AlphaEvolve. That’s mind-blowing. All of that released, and much of it in this year, when those brilliant articles were being written about how Google is done. And like we talked about, pioneering self-driving cars and quantum computing, which could be another thing that is low-key scuba diving its way to changing the world forever. So another pothead/ [inaudible 01:42:53] question. If you build AGI, what kind of question would you ask it? What would you want to talk about? Say, definitively, Google has created AGI that can basically answer any question. What topic are you going to go to? Where are you going to go?

Questions for AGI

Sundar Pichai (01:43:14) It’s a great question. Maybe it’s proactive by then and should tell me a few things I should know. But I think if I were to ask it, I think it’ll help us understand ourselves much better in a way that’ll surprise us, I think. And so maybe that, you already see people do it with the products, but in an AGI context, I think that’ll be pretty powerful.

Lex Fridman (01:43:43) On a personal level, or a general human nature?

Sundar Pichai (01:43:46) At a personal level.

Sundar Pichai (01:43:47) So you talking to AGI, I think there is some chance it’ll understand you in a very deep way, I think in a profound way, that’s a possibility. I think there is also the obvious thing of maybe it helps us understand the universe better in a way that expands the frontiers of our understanding of the world. That is something super exciting. But look, I really don’t know. I think I haven’t had access to something that powerful yet, but I think those are all possibilities.

Lex Fridman (01:44:29) I think on the personal level, asking questions about yourself, a sequence of questions like that about what makes me happy, I think we would be very surprised to learn through a sequence of questions and answers, we might explore some profound truths in a way that sometimes art reveals to us, great books reveal to us, great conversations with loved ones reveal. Things that are obvious in retrospect, but are nice when they’re said. But for me, number one question is about, how many alien civilizations are there? 100%.

Sundar Pichai (01:45:05) That’s going to be your first question?

Lex Fridman (01:45:06) Number one, how many living and dead alien civilizations? Maybe a bunch of follow-ups, like how close are they? Are they dangerous? If there’s no alien civilizations, why? Or if there’s no advanced alien civilizations, but bacteria-like life everywhere. Why? What is the barrier preventing it from getting to that? Is it because that when you get sufficiently intelligent, you end up destroying ourselves, because you need competition in order to develop an advanced civilization. And when you have competition it’s going to lead to military conflict, and conflict eventually kills everybody. I don’t know, I’m going to have that kind of discussion.

Sundar Pichai (01:45:47) Get an answer to the Fermi Paradox, yeah.

Lex Fridman (01:45:49) Exactly. And have a real discussion about it. I’m realizing now with your answer is a more productive answer, because I’m not sure what I’m going to do with that information. But maybe it speaks to the general human curiosity that Liz talked about, that we’re all just really curious, and making the world’s information accessible allows our curiosity to be satiated some with AI even more, we can be more and more curious and learn more about the world, about ourselves. And in so doing, I always wonder, I don’t know if you can comment on, is it possible to measure the, not the GDP productivity increase like we talked about, but maybe whatever that increases, the breadth and depth of human knowledge that Google has unlocked with Google Search, and now with AI mode with Gemini, it’s a difficult thing to measure.

Sundar Pichai (01:46:47) Many years ago there was, I think, an MIT study that estimated the impact of Google Search. And they basically said, on a per person basis, it’s the equivalent of a few thousand dollars of value created per person per year. But yeah, it’s tough to capture these things, right? You kind of take it for granted as these things come, and the frontier keeps moving. But how do you measure the value of something like AlphaFold over time, and so on?

Lex Fridman (01:47:25) And also the increasing quality of life when you learn more. I have to say, with some of the programming I do now being done by AI, for some reason I’m more excited to program.

Lex Fridman (01:47:36) And so the same with knowledge, with discovering things about the world, it makes you more excited to be alive. It makes you more curious, and the more curious you are, the more exciting it is to live and experience the world. And it’s very hard to… I don’t know if that makes you more productive. Probably not nearly as much as it makes you happy to be alive. And that’s a hard thing to measure, the quality-of-life increase some of these things bring. As AI continues to get better and better at everything that humans do, what do you think is the biggest thing that makes us humans special?

Future of humanity

Sundar Pichai (01:48:14) Look, I think [inaudible 01:48:19] the essence of humanity, there’s something about the consciousness we have, what makes us uniquely human, maybe the lines will blur over time. And it’s tough to articulate. But I hope, hopefully, we live in a world where you make resources more plentiful and make the world less of a zero-sum game over time, which it’s not, but in a resource-constrained environment, people perceive it to be. And so I hope the values of what makes us uniquely human, empathy, kindness, all that surfaces more is the aspirational hope I have.

Lex Fridman (01:49:11) Yeah, it multiplies the compassion, but also the curiosity, just the banter, the debates we’ll have about the meaning of it all. And I also think in the scientific domains, all the incredible work that DeepMind is doing, I think we’ll still continue to play, to explore scientific questions, mathematical questions, physics questions, even as AI gets better and better at helping us solve some of the questions. Sometimes the question itself is a really difficult thing.

Sundar Pichai (01:49:43) Both the right new questions to ask and the answers to them and the self-discovery process, which it’ll drive, I think. Our early work with both co-scientist and AlphaEvolve, just super exciting to see.

Lex Fridman (01:49:59) What gives you hope about the future of human civilization?

Sundar Pichai (01:50:04) I’m an optimist, and I look at, if you were to take the journey of human civilization, we have relentlessly made the world better in many ways. At any given moment in time, it may look like there are big issues to work through, but I always ask myself the question, would you rather have been born now or at any other time in the past? I most often, not most often, almost always would rather be born now. And so that’s the extraordinary thing human civilization has accomplished, and we’ve kind of constantly made the world a better place. And so something tells me as humanity, we always rise collectively to drive that frontier forward. So I expect it to be no different in the future.

Lex Fridman (01:51:00) I agree with you totally. I’m truly grateful to be alive in this moment. And I’m also really excited for the future, and the work you and the incredible teams here are doing is one of the big reasons I’m excited for the future. So thank you. Thank you for all the cool products you’ve built. And please don’t kill Google Voice. Thank you, Sundar.

Lex Fridman (01:51:22) Thank you for talking today. This was incredible. Thank you.

Sundar Pichai (01:51:24) Real pleasure. Appreciate it.

Demo: Google Beam

Lex Fridman (01:51:27) Thanks for listening to this conversation with Sundar Pichai. To support this podcast, please check out our sponsors in the description or at lexfridman.com/sponsors. Shortly before this conversation, I got a chance to get a couple of demos that frankly blew my mind. The engineering was really impressive. The first demo was Google Beam, and the second demo was the XR glasses. And some of it was caught on video, so I thought I would include here some of those video clips.

Andrew (01:52:01) Hey Lex, my name’s Andrew.

Andrew (01:52:03) I lead the Google Beam team and we’re going to be excited to show you a demo. We’re going to show you, I think, a glimpse of something new. So that’s the idea, a way to connect, a way to feel present from anywhere with anybody you care about. Here’s Google Beam. This is a development platform that we’ve built. So there’s a prototype here of Google Beam. There’s one right down the hallway. I’m going to go down and turn that on in a second. We’re going to experience it together. We’ll be back in the same room.

Lex Fridman (01:52:26) Wonderful. Whoa. Okay.

Lex Fridman (01:52:27) All right. This is real already. Wow.

Andrew (01:52:37) Good to see you. This is Google Beam. We’re trying to make it feel like you and I could be anywhere in the world, but when these magic windows open, we’re back together. I see you exactly the same way you see me. It’s almost like we’re sitting at the table sharing a table together, I could learn from you, talk to you, share a meal with you, get to know you.

Lex Fridman (01:52:37) So you can feel the depth of this.

Andrew (01:52:37) Yeah, great to meet you.

Lex Fridman (01:52:58) Wow. So for people who probably can’t even imagine what this looks like, there’s a 3D version. It looks real. You look real.

Andrew (01:53:06) Yeah. It looks real to me. It looks real to you.

Lex Fridman (01:53:06) It looks like you’re coming out of the screen.

Andrew (01:53:09) We quickly believe once we’re in Beam that we’re just together. You settle into it.

Andrew (01:53:15) You’re naturally attuned to seeing the world like this, and you just get used to seeing people this way, but literally from anywhere in the world with these magic screens.

Lex Fridman (01:53:23) This is incredible.

Andrew (01:53:23) It’s a neat technology.

Lex Fridman (01:53:25) Wow. So I saw demos of this, but they don’t come close to the experience of this. I think one of the top YouTube comments on one of the demos I saw was like, “Why would I want high definition? I’m trying to turn off the camera.” But this actually, this feels like the camera has been turned off and we’re just in the same room together. This is really compelling.

Andrew (01:53:44) That’s right. I know it’s kind of late in the day too. So I brought you a snack just in case you’re a little bit hungry.

Lex Fridman (01:53:50) So can you push it farther and it just becomes-

Andrew (01:53:52) Yeah. Let’s try to float it between rooms. It kind of fades it from my room into yours.

Lex Fridman (01:53:56) And then you see my hand. The depth of my hand.

Andrew (01:54:00) Of course, yeah. It feels like you… Try this, try give me a high five. And there’s almost a sensation of being in touch.

Andrew (01:54:06) Because you’re so attuned to that should be a high five, it feeling like you could connect with somebody that way.

Andrew (01:54:11) So it’s kind of a magical experience.

Lex Fridman (01:54:12) Oh, this is really nice. How much does it cost?

Andrew (01:54:14) Yeah. We’ve got a lot of companies testing it. We just announced that we’re going to be bringing it to offices soon as a set of products. We’ve got some companies helping us build these screens. But eventually, I think this will be in almost every screen.

Lex Fridman (01:54:26) There’s nothing, I’m not wearing anything. Well, I’m wearing a suit and tie to clarify, I am wearing clothes. This is not CGI. But outside of that, cool. And the audio is really good. And you can see me in the same three-dimensional way.

Andrew (01:54:40) Yeah, the audio is spatialized. So if I’m talking from here, of course it sounds like I’m talking from here. If I move to the other side of the room to here.

Andrew (01:54:48) So these little subtle cues, these really matter to bring people together, all the non-verbals, all the emotion, the things that are lost today. Here it is. We put it back into the system.

Lex Fridman (01:54:57) You pulled this off. Holy shit, they pulled it off. And integrated into this, I saw the translation also. This is the-

Andrew (01:55:05) Yeah, we’ve got a bunch of things. Let me show you a couple kind of cool things. Let’s do a little bit of work together. Maybe we could critique one of your latest videos. So you and I work together, so of course we’re in the same room. But with the super power, I can bring other things in here with me. And it’s nice. It’s like we could sit together, we could watch something. We could work. We’ve shared meals as a team together in this system. But once you do the presence aspect of this, you want to bring some other superpowers to it.

Lex Fridman (01:55:35) Wow. And so you could do review code together.

Andrew (01:55:38) Yeah, yeah, exactly. I’ve got some slides I’m working on. Maybe you could help me with this. Keep your eyes on me for a second. I’ll slide back into the center. I didn’t really move. But the system just kind of puts us in the right spot and knows where we need to be.

Lex Fridman (01:55:50) Oh, so you just turned to your laptop, the system moves you, and then it does the overlay automatically.

Andrew (01:55:55) It kind of warps the room to put things in the spot that they need to be in.

Andrew (01:55:59) Everything has a place in the room, everything has a sense of presence or spatial consistency. And that makes it feel like we’re together with us and other things.

Lex Fridman (01:56:06) I should also say, you’re not just three-dimensional, it feels like you’re leaning out of the screen, you’re coming out of the screen. You’re not just in that world three-dimensionally. Yeah, exactly. Holy crap. Move back to center. Okay.

Andrew (01:56:23) Let me tell you how this works. You probably already have the premise of it. But there’s two things, two really hard things that we put together. One is a AI video model. So there’s a set of cameras, you asked about those earlier. There’s six color cameras, just like webcams that we have today, taking video streams and feeding them into our AI model and turning that into a 3D video of you and I. It’s effectively a light field. So it’s kind of an interactive 3D video that you can see from any perspective. That’s transmitted over to the second thing. And that’s a light field display. And it’s happening bidirectionally. I see you and you see me both in our light field displays. These are effectively flat televisions or flat displays, but they have the sense of dimensionality, depth, size is correct. You can see shadows and lighting are correct. And everything’s correct from your vantage point.

(01:57:12) So if you move around ever so slightly, and I hold still, you see a different perspective here. You see kind of things that were occluded become revealed. You see shadows that move in the way they should move. All of that’s computed and generated using our AI video model for you. It’s based on your eye position, where does the right scene need to be placed in this light field display for you just to feel present?

Lex Fridman (01:57:33) It’s real time. No latency. I’m not seeing latency. You weren’t freezing up at all.

Andrew (01:57:37) No, no, I hope not. I think it’s you and I together real time. That’s what you need for real communication. And at a quality level it’s realistic.

Lex Fridman (01:57:46) This is awesome. Is it possible to do three people? Is that going to move that way also?

Andrew (01:57:50) Yeah. Let me kind of show you. So if she enters the room with us, you can see her, you can see me. And if we had more people, you eventually lose a sense of presence. You kind of shrink people down. You lose a sense of scale. So think of it as the window fits a certain number of people. If you want to fit a big group of people, you want the boardroom or the big room, you need a much wider window. If you want to see just grandma and the kids, you can do smaller windows. So everybody has a seat at the table, or everybody has a sense of where they belong, and there’s this sense of presence that’s obeyed. If you have too many people, you kind of go back to 2D metaphors that we’re used to people in tiles placed anywhere.

Lex Fridman (01:58:27) For the image I’m seeing, did you have to get scanned?

Andrew (01:58:29) I mean, I see you without being scanned. So it’s just so much easier if you don’t have to wear anything. You don’t have to pre-scan.

Andrew (01:58:34) And you just do it the way it’s supposed to happen without anybody having to learn anything or put anything on.

Lex Fridman (01:58:39) I thought you had to solve the scanning problem. But here you don’t. It’s just cameras. It’s just vision.

Andrew (01:58:46) That’s right. It’s video. Yeah, we’re not trying to make an approximation of you, because everything you do every day matters. I cut myself shaving, I put on a pin. All the little kind of aspects of you, those just happen. We don’t have the time to scan or kind of capture those or dress avatars. We kind of appear as we appear. And so all that’s transmitted truthfully as it’s happening.

Demo: Google XR Glasses

Speaker 3 (01:59:12) Nice to meet you. So as Max mentioned, got the eye glasses here. We start with the foundation of great glasses, something stylish, lightweight, wearable. Then we say how can we build great technology and experiences on top of that? One of the core tenets of the Android XR platform, this idea of a multimodal conversational device. See what you see, hear what you hear. So you’ve got a camera, you’ve got speakers, multiple microphones for speaker isolation. I’ll give you a chance to try these yourself. Yeah, sorry, I woke it up there.

Speaker 3 (01:59:39) Yeah. So the first thing you see is a super simple, straightforward home screen.

Speaker 3 (01:59:45) So you probably see the time, the weather, calendar, appointments there. This is designed to be sort of your one-stop shop for quick glanceable information throughout the day. We want it to do something that’s easy to get what you need, and then go back to what you’re doing. So you can imagine turning on the display, getting that quick glance, and then continuing on your way. You can be fully conversational at this point. You can ask questions for example about the paintings. You can interrupt. You can ask follow up questions. And as I mentioned before, if you want to at any time, pause, you just tap there on the right.

Lex Fridman (02:00:12) Gemini, how much does this painting cost?

Gemini (02:00:15) The painting is called the Bridge of Louis-Philippe, 1875 by Armand Guillaumin. It appears to be a print. And it would be impossible to know its cost without finding the seller.

Lex Fridman (02:00:25) Okay, so this is fake. I understand. Why do humans pay so much money for paintings?

Gemini (02:00:30) People pay a lot for paintings, because of their artistic merits, the artist’s name and rarity, and for the appreciation you learn.

Speaker 3 (02:00:38) So a few more things that we want to show you just for sake of time, you go ahead and long press on the side again to salute Gemini there. There you go. Did you catch Google I/O last week by any chance?

Speaker 3 (02:00:48) So you might’ve seen on stage the Google Maps experience very briefly. I wanted to give you a chance to get a sense of what that feels like today. You can imagine you’re walking down the street. If you look up like you’re walking straight ahead, you get quick turn-by-turn directions, so you have a sense of what the next turn is like.

Speaker 3 (02:01:05) Keeping your phone in your pocket.

Lex Fridman (02:01:06) Oh, that’s so intuitive.

Speaker 3 (02:01:07) Sometimes you need that quick sense of which way’s the right way?

Speaker 3 (02:01:14) Yeah. So let’s say you’re coming out of the subway, getting out of a cab. You can just glance down at your feet. We have it set up to translate from Russian to English. I think I get to wear the glasses and you speak to me, if you don’t mind.

Lex Fridman (02:01:22) I can speak Russian. [foreign language 02:01:27].

Speaker 3 (02:01:29) I’m doing well. How are you doing?

Lex Fridman (02:01:30) I’m tempted to swear, tempted to say inappropriate things. [foreign language 02:01:37].

Speaker 3 (02:01:41) I see it transcribed in real time. And so obviously based on the different languages and the sequence of subjects and verbs, there’s a slight delay sometimes, but it’s really just like subtitles for the real world. Cool.

Biggest invention in human history

Lex Fridman (02:01:53) Thank you for this. All right, back to me. Hopefully watching videos of me having my mind blown like the apes in 2001 Space Odyssey playing with a monolith was somewhat interesting. Like I said, I was very impressed. And now I thought, if it’s okay, I could make a few additional comments about the episode and just in general. In this conversation with Sundar Pichai, I discussed the concept of the Neolithic package, which is the set of innovations that came along with the first agricultural revolution about 12,000 years ago, which included the formation of social hierarchies, the early primitive forms of government, labor specialization, domestication of plants and animals, early forms of trade, and large-scale cooperation of humans like that required to build, yes, the pyramids and temples like Göbekli Tepe. I think this may be the right way to actually talk about the inventions that changed human history, not just as a single invention, but as a kind of network of innovations and transformations that came along with it.

(02:03:02) And the productivity multiplier framework that I mentioned in the episode, I think is a nice way to try to concretize the impact of each of these inventions under consideration. And we have to remember that each node in the network of the fast follow-on inventions is in itself a productivity multiplier. Some are additive, some are multiplicative. So in some sense, the size of the network in the package is the thing that matters when you’re trying to rank the impact of inventions on human history. The easy picks for the period of biggest transformation, at least in sort of modern day discourse is the Industrial Revolution, or even in the 20th century, the computer or the internet. I think it’s because it’s easiest to intuit for modern day humans, the exponential impact of those technologies.

(02:04:05) But recently, I suppose this changes week to week, but I have been doing a lot of reading on ancient human history. So recently my pick for the number one invention would have to be the first agricultural revolution, the Neolithic package that led to the formation of human civilizations. That’s what enabled the scaling of the collective intelligence machine of humanity, and for us to become the early bootloader for the next 10,000 years of technological progress, which yes, includes AI and the tech that builds on top of AI. And of course it could be argued that the word invention doesn’t properly apply to the agricultural revolution. I think actually Yuval Noah Harari argues that it wasn’t the humans who were the inventors, but a handful of plant species, namely wheat, rice and potatoes. This is strictly a fair perspective. But I’m having fun, like I said, with this discussion. Here, I just think of the entire earth as a system that continuously transforms. And I’m using the term invention in that context. Asking the question of when was the biggest leap on the log-scale plot of human progress?

(02:05:23) Will AI, AGI, ASI eventually take the number one spot on this ranking? I think it has a very good chance to do so due again to the size of the network of inventions that will come along with it. I think we discuss in this podcast the kind of things that would be included in the so-called AI package. But I think there’s a lot more possibilities, including some discussed in many previous podcasts, including with Dario Amodei, talking on the biological innovation side, the science progress side. And in this podcast, I think we talk about something that I’m particularly excited about in the near term, which is unlocking the cognitive capacity of the entire landscape of brains that is the human species. Making it more accessible through education and through machine translation, making information, knowledge and the rapid learning and innovation process accessible to more humans, to the entire 8 billion, if you will. So I do think language or machine translation applied to all the different methods that we use on the internet to discover knowledge is a big unlock. But there is a lot of other stuff in the so-called AI package like discussed with Dario, curing all major human diseases. He really focuses on that in the Machines of Loving Grace essay. I think there will be huge leaps in productivity for human programmers and semi-autonomous human programmers. So humans in the loop, but most of the programming is done by AI agents. And then moving that towards a superhuman AI researcher that’s doing the research that develops and programs the AI system in itself. I think there’ll be huge transformative effects from autonomous vehicles. These are the things that we maybe don’t immediately understand, or we understand from an economics perspective, but there will be a point when AI systems are able to interpret, understand, interact with the human world to a sufficient degree to where many of the manually controlled human-in-the-loop systems we rely on become fully autonomous.

(02:07:43) And I think mobility is such a big part of human civilization that there will be effects on that, that they’re not just economic, but are social cultural and so on. And there’s a lot more things I could talk about for a long time. So obviously the integration utilization of AI in the creation of art, film, music, I think the digitalization and automating basic functions of government, and then integrating AI into that process, thereby decreasing corruption and costs and increasing transparency and efficiency. I think we as humans, individual humans, will continue to transition further and further into cyborgs. There’s already a AI in the loop of the human condition, and that will become increasingly so as AI becomes more powerful. The thing I’m obviously really excited about is major breakthroughs in science, and not just on the medical front but on fundamental physics, which would then lead to energy breakthroughs increasing the chance that we become, we actually become a Kardashev Type I civilization. And then enabling us in so doing to do interstellar exploration of space and colonization of space. I think there also in the near term, much like with the industrial revolution that led to rapid specialization of skills of expertise, there might be a great sort of de-specialization. So as the AI system become superhuman experts at particular fields, there might be greater and greater value to being the integrator of AIs for humans to be generalists. And so the great value of the human mind will come from the generalists, not the specialists. That’s a real possibility that that changes the way we are about the world, that we want to know a little bit of a lot of things and move about the world in that way. That could have when passing a certain threshold, a complete shift in who we are as a collective intelligence as a human species. Also as an aside, when thinking about the invention that was the greatest in human history, again for a bit of fun, we have to remember that all of them build on top of each other.

(02:10:15) And so we need to look at the Delta, the step change on the, I would say, impossible-to-perfectly-measure plot of exponential human progress. Really we can go back to the entire history of life on earth. And a previous podcast guest, Nick Lane, does a great job of this in his book Life Ascending, listing the 10 major inventions throughout the evolution of life on earth, like DNA, photosynthesis, complex cells, sex, movement, sight, all those kinds of things. I forget the full list that’s on there. But I think that’s so far from the human experience that my intuition about, let’s say, productivity multipliers of those particular inventions completely breaks down, and a different framework is needed to understand the impact of these inventions of evolution. The origin of life on Earth, or even the Big Bang itself, of course, is the OG invention that set the stage for all the rest of it. And there are probably many more turtles under that which are yet to be discovered.

(02:11:26) So anyway, we live in interesting times, fellow humans. I do believe the set of positive trajectories for humanity outnumber the set of negative trajectories, but not by much. So let’s not mess this up. And now let me leave you with some words from French philosopher Jean de La Bruyère, “Out of difficulties, grow miracles.” Thank you for listening, and hope to see you next time.

DeepSeek、中国、OpenAI、英伟达、xAI、台积电、Stargate 与 AI 超级集群 (2025-02-03)

DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters (2025-02-03)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:资深分析师 Dylan Patel (SemiAnalysis) 与 AI 研究科学家 Nathan Lambert (Allen Institute for AI),在被称为“DeepSeek 时刻”的行业震动之后,共同探讨了这一事件的技术原理、商业模式及其对全球 AI 竞争格局的深远影响。
  • 核心论点:本次对话的核心论题是,DeepSeek 的崛起并非孤立的技术突破,而是AI发展进入新阶段的标志性事件。它揭示了前沿AI能力正从“算力与数据量的暴力美学”转向“架构与工程效率的极致优化”。DeepSeek通过在混合专家(MoE)架构、底层硬件优化和强化学习(RL)应用上的激进创新,在受限的硬件条件下实现了与顶级模型相当的性能,并大幅降低了训练与推理成本。这不仅重塑了开源与闭源、中美AI竞赛的动态平衡,更预示着未来AI竞争的焦点将是**“单位算力的智能产出效率”**。这一转变将深刻影响从芯片设计(NVIDIA、TSMC)、模型训练方法论(OpenAI、Meta)到国家级AI战略(美国出口管制)的每一个层面。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:DeepSeek 的“成本-效率”革命

  • 核心观点:DeepSeek 通过在模型架构和系统工程上的深度优化,实现了在训练和推理成本上的数量级压缩,证明了极致的工程执行力是当前阶段AI竞争的关键壁垒
  • 原理解构
    1. 高稀疏度混合专家模型 (Highly-Sparsified MoE):传统模型是“稠密”的,每次计算都激活所有参数。MoE 架构模仿大脑分区工作的原理,将模型分为多个“专家”,每次只激活一小部分。DeepSeek 将此推向极致,采用了 256 个专家中激活 8 个(稀疏度 1/32) 的超高稀疏度设计,远超 Mistral 的 8 选 2(稀疏度 1/4)。这使得模型总参数量(知识容量)可以非常大(670B),但单次计算量(成本)却很小(仅 37B 左右),从根本上打破了模型规模与计算成本的线性关系。
    2. 多头潜在注意力 (MLA, Multi-head Latent Attention):这是 DeepSeek 在 Transformer 核心注意力机制上的架构创新。通过复杂的低秩近似数学,MLA 显著减少了注意力计算过程中的内存占用(可节省 80-90% 的注意力内存),这对于处理长上下文和降低推理成本至关重要,尤其是在生成数万 token 的推理(Reasoning)任务中。
    3. 垂直整合的系统工程 (Full-Stack Engineering):DeepSeek 的团队展现了罕见的跨层优化能力,他们直接在 NVIDIA 的 PTX(类似汇编)层面编写代码,绕过了高层级的 NCCL 通信库,以手动调度 GPU 核心(SMs)的方式来优化数据通信。这种做法源于他们使用的 H800 芯片在互联带宽上受限,是“限制倒逼创新”的典型案例,最终获得了比通用库更高的训练效率。
  • 证据/案例:DeepSeek-V3 的训练成本声称为 500 万美元,远低于业界对同级别模型动辄数千万甚至上亿美元的估算。其推理API定价,尤其是 DeepSeek-R1,比 OpenAI 的 o1 便宜 27 倍($2/百万token vs $60/百万token),直接引发了市场对 NVIDIA 股价的恐慌,因为这挑战了“AI进步必须依赖更多、更昂贵的芯片”这一核心假设。
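
为直观理解上文的“高稀疏度 MoE”(总参数大、单次激活小),下面给出一个极简的 top-k 路由示意。这是假设性的玩具代码,并非 DeepSeek 的真实实现;专家数 256、激活 8 个沿用文中数字,其余维度均为随意取值:

```python
import numpy as np

def topk_moe_forward(x, experts_w, router_w, k=8):
    """极简 top-k MoE 前向:演示“总参数量大、单次激活量小”的思路。
    x: (d,) 输入向量;experts_w: (E, d, d) 每个专家一个权重矩阵;
    router_w: (E, d) 路由器权重;k: 每个 token 激活的专家数。"""
    logits = router_w @ x                       # (E,) 每个专家的路由得分
    top = np.argsort(logits)[-k:]               # 只保留得分最高的 k 个专家
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                        # 对被选中的专家做 softmax 归一化
    return sum(g * (experts_w[e] @ x) for g, e in zip(gates, top)), top

d, E, k = 64, 256, 8                            # 文中的 256 选 8,稀疏度 1/32
x = np.random.randn(d)
experts = np.random.randn(E, d, d) * 0.01
router = np.random.randn(E, d) * 0.01
y, active = topk_moe_forward(x, experts, router, k)
print("被激活的专家:", sorted(active.tolist()), "| 激活参数占比:", k / E)
```

真实系统还需要负载均衡损失、专家并行与跨卡通信等机制,这里全部省略,仅保留“路由 + 只计算被选中的专家”这一核心。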

维度二:强化学习(RL)开启“涌现式推理”新范式

  • 核心观点:以 DeepSeek-R1 和 OpenAI o1 为代表的“推理模型”,其核心突破在于应用强化学习(Reinforcement Learning)在可验证任务(verifiable tasks)上进行大规模的“试错学习”,从而使模型自发“涌现”出复杂的、类似人类的思考过程(Chain of Thought)。
  • 原理解构:对话引用了 Andrej Karpathy 的观点,将 AI 学习分为两种模式:
    1. 模仿学习 (Imitation Learning):即传统的预训练(Pre-training)和监督微调(SFT),模型学习模仿人类数据。这是基础,但无法超越数据本身。
    2. 试错学习 (Trial-and-Error Learning):即强化学习。模型被给予一个目标(如解对一道数学题、通过一个单元测试)和一个环境(可以生成无限的解题尝试),通过奖励正确的尝试、惩罚错误的尝试来学习。在这个过程中,模型为了最大化奖励,会自发探索出最高效的解决策略。这些策略,如“让我再检查一遍”、“这个假设似乎是错的,换个思路”,是人类无法直接标注或教授的,它们是模型为了达成目标而**“发现”**的。
  • 证据/案例
    • AlphaGo vs AlphaZero:AlphaGo 结合了模仿人类棋谱和自我对弈(RL),而 AlphaZero 完全放弃人类数据,仅通过自我对弈学习,最终变得更强。这证明了摆脱人类先验知识的 RL 能达到更高的高度。
    • DeepSeek-R1 的思考过程:在回答“关于人类的一个真正新颖的见解”时,R1 展示了长达 157 秒的思考过程,其中包含了自我质疑(“用户想要一些别处看不到的东西,让我挖得更深”)、类比和概念重组,最终给出了一个深刻的答案。这种过程无法通过模仿学习得到,是 RL 训练的直接产物。
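
对应上文“试错学习”的描述,下面是一个高度简化的可验证奖励函数示意(假设性示例,并非任何实验室的真实训练代码):奖励只看最终答案能否被程序验证,思维链本身不被直接监督。

```python
import re

def verifiable_reward(model_output: str, ground_truth: int) -> float:
    """从模型输出中抽取最终答案并与标准答案比对:正确得 1,否则得 0。
    中间的推理过程不参与打分,只要求结果可被程序验证。"""
    match = re.search(r"答案[::]\s*(-?\d+)", model_output)
    if match is None:
        return 0.0                      # 没有给出可解析的答案,直接判 0
    return 1.0 if int(match.group(1)) == ground_truth else 0.0

# 两条候选推理轨迹:奖励信号只取决于最终答案是否正确
good = "先算 17*3=51,再加 6,答案: 57"
bad  = "直觉上大概是 60,答案: 60"
print(verifiable_reward(good, 57), verifiable_reward(bad, 57))   # 1.0 0.0
```

真实训练中,这个 0/1 信号会被策略梯度类算法(如 DeepSeek 论文中使用的 GRPO)用来整体强化产生正确答案的推理轨迹,“让我再检查一遍”这类行为正是在最大化该奖励的过程中涌现的。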

维度三:地缘政治下的“硅幕”与算力竞赛

  • 核心观点:美国的芯片出口管制并未阻止中国AI的发展,反而加速了其在工程效率上的创新,并将全球AI竞赛推向了“算力部署总量”和“算力利用效率”两个维度的竞争。这场竞赛的本质是一场关于时间差的赌博。
  • 原理解构
    1. 出口管制的逻辑:美国政府的战略目标并非完全阻止中国训练大模型,而是限制其大规模部署AI应用的能力。训练一个前沿模型可能只需要几千或几万张卡,但要将AI能力渗透到经济和军事的方方面面,则需要数百万张卡的推理算力。通过限制高性能芯片的供应,美国意在拉大两国在AI总算力上的差距。
    2. 中国的反制与优势:限制倒逼了 DeepSeek 这样的公司在软件和算法层面榨干受限硬件(如 H800)的每一分性能。长期来看,中国的巨大工业产能和国家动员能力使其在建设数据中心所需的基础设施(电力、钢铁、建筑)上拥有无与伦比的优势。一旦中国在芯片制造上取得突破(即使是落后几代),其整体算力规模可能会反超。
    3. 时间差赌博:这场博弈的关键在于 AGI 或“超级AI”到来的时间点。如果超级AI在未来 5-10 年内出现,美国的算力领先将转化为决定性的地缘政治优势。但如果这个过程需要更长时间,出口管制反而会削弱美国芯片公司(如 NVIDIA)的收入,同时给予中国充足的时间建立起自给自足的、规模庞大的AI产业链。
  • 证据/案例
    • NVIDIA 的“中国特供版”芯片:从 H800(削减互联带宽)到 H20(削减算力但增强内存),NVIDIA 在美国法规的夹缝中求生存,而中国公司则利用这些芯片的特性进行针对性优化。
    • DeepSeek 的背景:其母公司 High-Flyer 是一家量化对冲基金,早在2021年就拥有上万张 A100 GPU,在出口管制生效前已积累了大量算力。
    • 中美数据中心建设规模:OpenAI 的 Stargate 项目计划达到 2.2 GW 的功率,而中国的钢铁厂或铝厂单个设施的功耗就已达到千兆瓦级别,显示了其在能源和基建上的潜力。

维度四:“开放”的演进与战略价值

  • 核心观点:“开源”在AI时代正演变为一场复杂的战略博弈,其定义从代码和数据,扩展到了模型权重、技术报告的透明度,甚至是商业许可的限制。DeepSeek 的激进开放策略正在重塑这一领域的格局。
  • 原理解构
    1. 开放的光谱:对话区分了几个层次:
      • 真·开源 (True Open Source):开放权重、训练代码、训练数据(如 Allen AI 的追求)。
      • 开放权重 (Open Weights):仅开放模型权重,附带技术报告(如 Llama、DeepSeek)。这是当前主流。
      • 闭源 (Closed Source):API-only(如 OpenAI GPT-4, Anthropic Claude)。
    2. 开放的战略动机
      • 对内:人才吸引。Meta 和 DeepSeek 都将开放作为吸引顶尖人才的工具。
      • 对外:标准制定。Mark Zuckerberg 明确表示,让 Llama 成为全球开源标准,符合美国的国家利益,可以防止中国标准主导世界。
      • 商业竞争:DeepSeek 采用极其宽松的 MIT 许可,允许无限制的商业使用和数据再利用,这给 Llama 相对更严格的许可带来了巨大压力。
  • 证据/案例
    • DeepSeek vs Llama:DeepSeek-R1 的 MIT 许可允许用户基于其输出创造合成数据,而 Llama 的许可对此有限制。DeepSeek 的技术论文在训练细节上的披露比 Llama 3 更为详尽和可操作。
    • 安全与开放的矛盾:Anthropic 出于对安全的极度重视,即使拥有更强的模型也选择不发布,而 DeepSeek 则采取“快速发布”的策略,这在客观上拉低了行业对发布前安全审查的门槛。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • 效率提升 ≠ 需求下降 (Jevons Paradox):主流观点认为 DeepSeek 降低了AI成本会打击 NVIDIA 的销售。然而,现实恰恰相反。AI 能力的成本越低,应用场景就越广,从而创造了对算力更大规模的需求。对话指出,在 DeepSeek 发布后,AWS 上的 H100 和 H200 GPU 租用价格和需求实际上上升了。
    • 出口管制可能长期“资敌”:普遍认为出口管制能有效遏制中国。但嘉宾提出,如果 AGI 发展周期较长(>10年),这些管制措施将削弱美国公司的全球市场和研发投入,同时倒逼中国建立完整的自主产业链,最终可能让中国在工业和制造能力的优势下实现反超。
    • 推理模型的成本瓶颈是内存而非算力:人们通常认为 GPU 的核心是算力(FLOPS)。但对于生成长篇“思考过程”的推理模型,其瓶颈是KV缓存(KV Cache)带来的巨大内存压力。这意味着,拥有更大内存带宽和容量的芯片(如 NVIDIA H200,甚至中国的 H20)在某些推理任务上可能比纯算力更高的芯片更具优势。
  • 盲点与局限

    • 对“模型训练成本”的误读:媒体和市场过度关注 DeepSeek 报告的 500 万美元“预训练成本”,却忽略了背后数倍于此的研发、实验、失败运行和数据处理的隐性成本。一个前沿模型的真实投入远不止最后一次“YOLO run”。
    • 安全审查成为创新“减速带”:以 Anthropic 为例,过度的安全审查和发布流程的延迟(Claude 3.5 Sonnet 据称训练完成 9-10 个月后才发布),使其在快速迭代的竞争中处于不利地位。这暴露了西方AI安全理念与中国“快速迭代”模式之间的文化和战略冲突。
  • 未解之谜

    • 推理能力的泛化之谜:目前通过 RL 在数学、代码等可验证领域获得的强大推理能力,如何有效地迁移到哲学、艺术、战略等开放、无标准答案的领域,仍然是一个核心的开放性问题。
    • AI Agent 的“可靠性鸿沟”:从能够执行单步任务到能够可靠地完成多步、长链条任务的“智能体”(Agent),存在一个巨大的可靠性鸿沟。即便是 99.9% 的单步成功率,在成百上千步的复杂任务中,最终的失败率也接近 100%。如何跨越这个“六西格玛”难题尚无答案。
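
上面提到的“可靠性鸿沟”可以用一个简单的复利式算术来量化(纯示意,假设每一步相互独立且成功率相同):

```python
def chain_success(p_step: float, n_steps: int) -> float:
    """假设每步独立且成功率相同,整条任务链的成功率为 p 的 n 次方。"""
    return p_step ** n_steps

for p in (0.99, 0.999):
    for n in (10, 100, 1000):
        print(f"单步成功率 {p:<5} x {n:>4} 步 -> 整链成功率 {chain_success(p, n):.4f}")
# 例如 0.999 ** 1000 ≈ 0.37:即便每步只错千分之一,千步任务仍有约 63% 的概率失败
```

可见要让长链条 Agent 真正可用,单步可靠性必须远超直觉中的“够好”,或者框架必须具备出错后回退、重试的能力。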

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Almost every single shocking result of deep learning and the source of all magic is always [trial-and-error learning, reinforcement learning] … These thoughts are emergent. Three exclamation points. And this is actually seriously incredible, impressive, and new.”

    • 中文意译:“几乎每一个深度学习中令人震惊的结果,以及所有‘魔法’的来源,都源于(试错学习,即强化学习)……这些(模型自发的)思考过程是涌现出来的!!!这确实是令人难以置信、印象深刻的全新事物。”
    • 语境:引用 Andrej Karpathy 的评论,解释为什么 DeepSeek-R1 的“思考过程”如此重要。它不是被教会的,而是模型在追求目标(解对题)的过程中自己发现的,这标志着 AI 从模仿智能到发现智能的飞跃。
  2. “Superhuman persuasion will happen before superhuman intelligence.”

    • 中文意译:“超人类的说服力,将先于超人类的智能出现。”
    • 语境:引用 Sam Altman 的观点,警示在 AGI 到来之前,AI 更直接的风险可能是其影响和操纵人类思想与情感的能力。这为讨论模型中可能存在的文化偏见和后门提供了深刻背景。
  3. “For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.”

    • 中文意译:“对于一项成功的技术,现实必须优先于公关,因为自然是无法被欺骗的。”
    • 语境:播客结尾引用物理学家费曼的名言。这总结了整场对话的精神内核:无论市场如何炒作、地缘政治如何博弈,最终决定AI走向的,是底层的物理定律、工程效率和算法的真实能力。DeepSeek 的成功正是“现实”对“公关”的一次冲击。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈:行业将全面转向高稀疏度MoE架构基于RL的后训练(Post-training)。对 GPU 的需求将从单纯追求 FLOPS 转向对内存带宽和容量的高度重视(利好 H200 及后续产品)。
    • 产品形态:“推理过程可见”将成为前沿模型的标配,Chain of Thought (CoT) 将从一个技术术语变为核心用户体验。这将催生更多需要深度思考和规划的AI应用。
    • 竞争格局:开源模型(特别是拥有宽松许可的)将对闭源商业模型构成更严峻的成本和性能压力,迫使 OpenAI/Anthropic 等公司加速创新并可能调整定价策略。AI应用层的“封装厂”(wrappers)将迎来黄金时代,因为底层模型能力将持续快速提升且成本下降。
  • 长期终局 (5-10年)

    • AI 竞赛的终局是能源和基建竞赛:如果嘉宾的设想成真,训练万亿参数、具备复杂推理能力的模型将需要千兆瓦(GW)级别的专用数据中心。AI 领导地位将不再仅由算法和人才决定,更取决于一个国家建设和运营大规模能源及计算基础设施的能力。科技巨头将越来越像能源和重工业公司。
    • 两个平行的AI生态系统:地缘政治和出口管制可能最终导致世界分裂为两个相对独立的AI生态系统:一个由美国主导,依赖 NVIDIA-TSMC 供应链和西方价值观;另一个由中国主导,建立在自主芯片和国内数据之上。两者将在技术标准、应用伦理和内容审查上表现出显著差异。
    • “后训练”成为算力消耗的主体:随着预训练数据逐渐被耗尽,大部分算力将被用于模型的持续“训练”或“自我完善”,即在虚拟沙盒(如模拟计算机操作、机器人仿真)中进行无休止的强化学习。AI 将从“一次性毕业的大学生”变为“终身学习的有机体”。
  • 行动建议

    • 开发者必须掌握强化学习(RL),尤其是如何为特定领域构建可验证的奖励函数。同时,应深入研究 MoE 模型的部署和优化,因为高效推理将是核心竞争力。利用AI编程助手(Copilot)不应只停留在代码补全,而应转向更高层次的架构设计和系统调试。
    • 投资者:投资逻辑需从纯软件向硬件、能源和基础设施延伸。关注在内存技术、光通信、先进散热和电网技术上有突破的公司。理解Jevons悖论,不要因为AI效率提升而看空上游硬件供应商。
    • 创业者:最大的机会在于垂直领域的AI Agent。与其追求通用智能,不如专注于一个可通过API和自动化测试构建“可验证环境”的狭窄领域(如特定软件的自动化、工业流程优化),通过RL训练出“专家级”Agent。同时,利用能力不断增强且成本持续下降的开源模型,构建依赖于“未来模型会更强”这一预期的应用。

这是一份基于 Lex Fridman 播客(嘉宾:Dylan Patel 与 Nathan Lambert)的深度行业分析报告。本报告旨在解析 DeepSeek 事件背后的技术范式迁移、半导体地缘政治博弈以及 AGI 的竞争终局。


深度研报:DeepSeek 冲击波下的 AI 范式转移与全球博弈

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:本次对话发生于 DeepSeek R1 震撼全球 AI 领域之后。嘉宾 Dylan Patel(SemiAnalysis 创始人)负责硬核半导体与基础设施分析,Nathan Lambert(AI2 研究科学家)负责大模型后训练与强化学习视角。
  • 核心论点:DeepSeek R1 的崛起并非偶然,它标志着 AI 竞争正从“算力暴力美学”转向“架构与算法效率”的精细化时代。对话揭示了一个残酷的现实:尽管美国在顶尖芯片上筑起围墙,但中国的工程师通过底层算子优化(PTX/CUDA 级)、创新架构(MLA/MoE)以及强化学习(RL)的突破,成功实现了对硅谷昂贵范式的“降维打击”。这场竞赛的本质已演变为:谁能以最低的成本实现单位智能的规模化生产。

2. 🧠 深度观点解析 (Deep Dive Analysis)

A. 效率的底层重构:MLA 与稀疏 MoE

  • 核心观点:DeepSeek 并非靠堆显卡获胜,而是通过彻底重写内存管理和权重激活逻辑。
  • 原理解构
    • MLA (Multi-head Latent Attention):传统 Transformer 的 KV Cache(键值缓存)随上下文长度线性增长,在长上下文与大批量推理时迅速成为内存瓶颈。MLA 通过“低秩压缩”技术将这部分内存需求降低了 80-90%,使得模型在长文本推理时极具成本优势(见本节末尾的数量级估算示意)。
    • 极度稀疏的 MoE:DeepSeek V3 拥有 671B 参数,但每次推理仅激活 37B。这种“高总容量、低激活成本”的架构让模型在拥有“海量知识存储”的同时,保持了极高的运行速度。
  • 证据:DeepSeek R1 的推理成本仅为 OpenAI o1 的 1/27(2美元 vs 60美元/百万 Token)。
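
下面用一段粗略的估算代码说明 KV Cache 为什么是长上下文推理的内存瓶颈,以及压缩为低维潜在向量为何能带来文中所说的约 80-90% 的节省。所有层数、头数、维度都是为落在这个量级而取的假设值,并非 DeepSeek-V3 的真实配置:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """标准注意力:每层、每个 token 都要缓存 K 和 V 两份张量。"""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

def latent_cache_bytes(layers, latent_dim, seq_len, dtype_bytes=2):
    """MLA 思路的示意:每层、每个 token 只缓存一个低维潜在向量。"""
    return layers * latent_dim * seq_len * dtype_bytes

GB = 1024 ** 3
# 假设值:60 层、8 个 KV 头(GQA 风格基线)、head_dim=128、潜在维度 256、64K 上下文
std = kv_cache_bytes(layers=60, kv_heads=8, head_dim=128, seq_len=64_000)
mla = latent_cache_bytes(layers=60, latent_dim=256, seq_len=64_000)
print(f"基线 KV Cache ≈ {std / GB:.1f} GB,压缩后 ≈ {mla / GB:.1f} GB,"
      f"节省约 {1 - mla / std:.0%}")
```

可以看到,单条 64K 上下文就可能占用十几 GB 显存;一旦并发多个长推理请求,KV Cache 很快会超过权重本身的占用,这正是 MLA 这类压缩设计的价值所在。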

B. 强化学习(RL)的“魔法时刻”

  • 核心观点:推理能力的突破不再依赖“模仿人类数据”,而在于“试错后的奖励”。
  • 原理解构:Andrej Karpathy 指出,DeepSeek R1 证明了 RL 在可验证领域(数学/代码)的自发涌现能力。模型通过数百万次的自我博弈(Self-play),在没有人类示教的情况下,自发学会了“反思”、“检查错误”和“重新尝试”。
  • 案例:DeepSeek R1-Zero 仅靠基础模型加 RL 训练,就展示出了极强的推理思维链(CoT),证明了“计算换智能”在推理侧的巨大潜力。

C. 极限压力下的工程创新:绕过出口管制

  • 核心观点:美国的禁令反而迫使中国实验室在软件底层实现了对 NVIDIA 硬件的极限压榨。
  • 原理解构:由于 H800/H20 芯片的互联带宽受限,DeepSeek 的工程师绕过了 NVIDIA 官方的通信库(NCCL),直接使用汇编级的 PTX 指令手动调度流处理器(SM)。这种“在手术刀上跳舞”的优化,让他们在带宽减半的情况下依然维持了极高的集群利用率。
  • 证据:DeepSeek 仅用约 2000 块被阉割的 H800 芯片,就完成了 6710 亿(671B)参数量级模型的训练。

D. 智能成本的“Jevons 悖论”

  • 核心观点:智能价格的崩塌不会导致需求减少,反而会引发算力需求的指数级爆发。
  • 原理解构:尽管 DeepSeek 让推理变得便宜,但由于“推理换智能”范式的确立(如 o1/R1 在思考时会消耗成千上万个 Token),用户为了解决复杂问题愿意支付更高的总成本。
  • 历史类比:GPT-3 到现在的 3 年内,单位智能成本下降了 1200 倍。这种成本下降直接推动了 NVIDIA H100/H200 芯片在租赁市场上的持续供不应求。
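
杰文斯悖论可以用一个极简的需求弹性计算来直观化。下面的数字(弹性系数 1.8、基准价格 $60/百万 token)纯属假设,仅说明当需求弹性大于 1 时,“单价下降、总消耗反升”在算术上如何成立:

```python
def demand_after_price_drop(price, base_price=60.0, elasticity=1.8):
    """简单的常弹性需求模型:价格降为原来的 1/r 时,用量放大 r 的 elasticity 次方倍。"""
    ratio = base_price / price
    tokens = ratio ** elasticity          # token 消耗量(相对基准的倍数)
    spend = tokens * price / base_price   # 总支出(相对基准的倍数)
    return tokens, spend

for p in (60.0, 10.0, 2.0):               # 每百万 token 的价格从 $60 降到 $2
    tokens, spend = demand_after_price_drop(p)
    print(f"价格 ${p:>5.1f}/M tokens -> 用量 x{tokens:8.1f}, 总支出 x{spend:6.2f}")
```

只要真实世界的需求弹性足够大(新应用场景随成本下降不断被解锁),算力总需求就会随效率提升而上升,这正是 H100/H200 租赁价格不降反升的逻辑。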

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识(安全税风险):对话含蓄地指出,美国实验室(如 Anthropic/OpenAI)因过度执迷于“对齐”和“安全测试”,导致发布周期被拖长(9-10个月)。DeepSeek 的“快速发布”策略实际上在利用这种“安全监管窗口期”抢夺全球开发者生态。
  • 盲点与局限(DeepSeek 的服务危机):DeepSeek 虽然模型强悍,但在云端基础设施(Serving)上极度匮乏。即便模型开源,由于其 MoE 架构对内存的特殊要求,全球中小型云厂商在短期内很难高效、盈利地部署它。
  • 出口管制的双刃剑:管制虽然短期拖慢了中国 AGI 的进度,但长期看,它正在催生一个完全独立于美国的、从原材料到软件栈的闭环生态。一旦中国实现 5nm/3nm 的自主化,美国的制裁手段将彻底失效。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “The source of all magic is trial-and-error learning.”
    • (所有魔法的源泉都是试错学习。) —— 语境:讨论强化学习(RL)如何让模型学会像人类一样在思考中反思。
  2. “Superhuman persuasion will happen before superhuman intelligence.”
    • (超级说服力会早于超级智能出现。) —— 语境:警告 AI 可能会通过潜移默化的叙事操纵人类,而非通过暴力。
  3. “Semiconductor manufacturing is a hive of ants where everyone knows exactly which specific plasma etch they will focus on for their entire career.”
    • (半导体制造就像蚁穴,每个人终其一生都在钻研某个特定的等离子刻蚀技术。) —— 语境:解释为什么 TSMC 的文化壁垒极难在其他国家复制。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 算力竞赛升级:Meta、OpenAI、xAI 将加速建设“吉瓦级(Gigawatt)”数据中心(如 Stargate)。
    • 软件开发革命:AI Agent 将在代码、法律等“可验证领域”迅速替代中级白领工作。
  • 长期终局 (5-10年)
    • 智能平权与主权 AI:随着单位智能成本趋近于零,各国的核心竞争力将取决于其“能源储备”与“数据中心扩建速度”。
    • 机器人时代:当推理模型被压缩到可以在边缘设备(如人形机器人)上运行时,物理世界的自动化将迎来拐点。
  • 行动建议
    • 对于开发者:不要只做“Wrapper”,要关注如何利用“推理侧算力”去解决高价值的垂直领域问题。
    • 对于投资者:重新评估 NVIDIA 之外的供应链机会(如液冷技术、电力基础设施、光通信)。
    • 对于创业者:中国模型的崛起意味着模型能力已成为大宗商品,真正的护城河将转移到“专有数据采集”和“场景内闭环的 RL 优化”。 —

AI算力、地缘政治与开源革命:DeepSeek时刻的深度技术复盘与行业预判

1. 🎯 核心论题与背景

对话背景: 本次对话源于中国模型 DeepSeek-V3 及其推理模型 DeepSeek-R1 的相继发布。SemiAnalysis 的半导体架构专家 Dylan Patel 与 Allen Institute for AI(Ai2)的研究科学家 Nathan Lambert,同 Lex Fridman 深入探讨了这一“DeepSeek时刻”背后的技术原点、成本结构、地缘政治影响以及美国 AI 领导地位的脆弱性。

核心论点: “高性能”不再等同于“极昂贵”和“闭源”。DeepSeek 通过 MoE(混合专家模型)MLA(多头潜在注意力机制) 以及极深层次(汇编/CUDA 以下)的底层工程优化,在远低于美国巨头的算力预算下实现了对 GPT-4 量级的推理能力。这一突破不仅冲击了市场的算力定价权,更揭示了 “Jevons Paradox(杰文斯悖论)” —— 效率提升反而加速了算力的整体需求,在限制 中国高端芯片获取 的同时,反而迫使美国监管在芯片禁令与商业利益之间寻找更精妙的平衡。未来的竞争焦点已从单纯的 Model Scale(模型规模) 转向 Compute Efficiency(算力效率)Infrastructure Capacity(基础设施容量,特别是电力) 的地缘政治博弈。


2. 🧠 深度观点解析

2.1 算力效率的量化突破:低于 $5M 的训练与 Provable Reasoning

  • 核心观点: DeepSeek 的低价并非通过开源廉价训练数据(数据目前仍是质量决定因素),而是通过极致的 模型架构创新(MoE & MLA)底层硬件控制(低级 CUDA/PTX 编程) 实现。其训练成本至少比传统估测的“数十亿美元”级别的模型要低一个数量级。
  • 原理解构:
    1. 稀疏激活: DeepSeek-V3 采用 MoE 架构,拥有 671B 总参数,但在推理时仅激活约 37B 参数。通过 Routing 机制,它驯服了极高的稀疏度(256 个专家中激活 8 个,约 32:1),大幅降低了 FLOPS(浮点运算)需求。
    2. Transformer 内存瓶颈消除: 传统自注意力的计算量随上下文长度呈二次方增长,KV Cache 也随上下文不断膨胀,导致长推理(Chain of Thought)极其昂贵。DeepSeek 的 MLA(Multi-head Latent Attention)通过低秩近似压缩 Key-Value Cache,将这部分内存占用降低了 80%-90%。
    3. 深水区硬件工程: 由于受限无法使用高性能 H100,DeepSeek 团队深入到 NVIDIA NCCL(通信库)以下,手动编写 PTX 层代码并调度 SM(Streaming Multiprocessor,流式多处理器),手动管理 GPU 流多处理器间的通信周期,将受限的 H800 芯片性能榨干。
  • 证据/案例: Dylan Patel 提到,在严格的出口管制限制下,DeepSeek 宣称仅使用约 2,000 张 H800 GPU 进行 V3 预训练。DeepSeek R1 的 API 价格约为每百万 token 2 美元,仅为 OpenAI o1(约 $60/百万 token)的 1/27,也显著低于 Claude 3.5 Sonnet。自 GPT-3 发布以来的约三年间,同等能力的推理成本已下降约 1200 倍,这正是 Jevons Paradox 发挥作用的前提。

2.2 Reasoning(推理)与 Verifiable Reward(可验证奖励的新范式)

  • 核心观点: AI 的下一代竞争点不再是单纯的“更多参数”,而是 “Test-Time Compute”(推理时计算)。通过强化学习(RL)在数学和代码等具有清晰“正确答案”的域中训练模型,迫使模型自己探索“思维链”,使其产生 Meta-Reasoning(元推理)能力。
  • 原理解构:
    1. RLVR (Reinforcement Learning with Verifiable Rewards): 与依赖人类标注的 RLHF 不同,RLVR 直接使用数学题或 LeetCode 的正确答案作为 Reward。模型被允许进行 1000 次尝试,每次都会生成推理过程,只要最终答案正确,它就通过奖励信号优化整个推理过程。
    2. Latent Knowledge Activation: 一旦模型掌握了这些推理技巧,这种方法可以从深层基座模型中提炼出来,蒸馏到更小、效率更高的模型中。
  • 证据/案例: DeepSeek-R1 展示了长达数分钟的内部推理过程,讨论“自我”的本质;OpenAI 的 o3 在解决 ARC-AGI 任务时使用了上千条并行采样路径并从中挑选答案。Nathan Lambert 指出,相比依赖人工标注的 RLHF,RLVR 以可自动验证的答案作为奖励信号,训练信号更干净,也更容易稳定大规模训练(本节后附一个 best-of-N 加验证器的极简示意)。
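
承接上文,下面是一个 best-of-N 采样加程序化验证器的极简示意,对应“允许模型多次尝试、只保留可验证为正确的轨迹”这一做法。其中 sample_model 是占位的假想函数,这里用随机数模拟,并非任何真实 API:

```python
import random

def sample_model(prompt: str):
    """占位的“模型采样”:返回一条推理轨迹和它给出的最终答案。
    这里用随机数模拟;真实场景应调用实际的语言模型。"""
    answer = random.choice([57, 57, 60, 42])
    return f"(某条推理轨迹,结论 {answer})", answer

def best_of_n(prompt: str, verifier, n: int = 16):
    """采样 n 条轨迹,只保留通过验证器的那些,可用于奖励信号或答案投票。"""
    kept = []
    for _ in range(n):
        trace, ans = sample_model(prompt)
        if verifier(ans):               # 可验证领域:对错可以被程序自动判断
            kept.append(trace)
    return kept

passed = best_of_n("17*3+6=?", verifier=lambda a: a == 57, n=16)
print(f"16 次尝试中有 {len(passed)} 条通过验证,可作为正向奖励的轨迹")
```

o3 在 ARC-AGI 上的并行采样、R1 训练中的多轮 rollout,本质上都是这种“生成-验证-筛选”循环的放大版。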

2.3 杰文斯悖论与算力需求的指数级反弹

  • 核心观点: “算力禁令”在短期内可能延缓中国先进模型的研发速度,但从长期看,效率的提升会导致总算力消耗的激增。因此,出口管制本身可能无意中加速了全球其他地区(如东南亚、拉美)的数据中心建设,因为算力变得便宜了。
  • 原理解构:
    1. 成本曲线下行: 随着模型效率提升,原本因为“太贵”而无法普及的任务(如长上下文推理、复杂任务自动化)变得可行。
    2. 基础设施路径依赖: 一旦数据中心和电力基础设施建成,它们就会持续运行。美国公司(如 Microsoft, Oracle)正在全球布局(马来西亚、新加坡),利用地缘缝隙转运 GPU 并租赁给中国公司,这使得禁令并非铁板一块。
  • 证据/案例: AWS 上英伟达 H100 的租赁价格在 DeepSeek 发布后不降反升,H200 芯片更是一度缺货,因为 R1 这类推理模型对显存容量的要求更高,H200 更符合需求。而 xAI 在孟菲斯建立的集群拥有约 200,000 张 GPU,且正投资巨额建设天然气发电设施以满足其庞大的电力需求。

2.4 地缘政治:芯片禁令与“软件定义的战争”

  • 核心观点: 半导体的战争本质是 R&D 杀手锏的战争。TSMC(台湾)是控制全球先进制造的关键,即使美国试图在本土建立制造,缺乏先进光刻设备和成熟生态依旧举步维艰。AI 的出口管制不仅针对硬件,也包含后端的模型,旨在防止超级 AI 被非西方阵营获得。
  • 原理解构:
    1. 技术去中心化的错觉: 尽管开源社区呼吁美国借开放保持主导,但 Llama 等模型相对保守的许可证与发布方式,反而让采用宽松 MIT 许可证的 DeepSeek 等模型在全球传播中占得先机。
    2. “代理人”网络: 中国的 ByteDance(TikTok 母公司)等公司通过海外云租赁等合法途径获取算力,另有部分芯片经走私渠道流入,共同绕开了直接的硬件交易限制。禁令也在倒逼产业升级,例如 7nm 限制催生出 Huawei 的 Ascend 910(虽落后但已可用)以及国产中低端芯片的快速普及。
  • 证据/案例: 美国对华限制正从高端芯片延伸到云端算力租赁等 AI 服务层面。Nathan Lambert 提到,2022 年 10 月 7 日出台的芯片出口管制表明,美国政府早已把 AI 视为下一代军事与经济竞争的关键,并据此预判了这一“冷战式”格局。

2.5 超大规模集群的物理极限:电力与冷却的博弈

  • 核心观点: 传统的数据中心设计标准正在被颠覆:算力的物理瓶颈已从存储与网络传输转移到电力供给与散热。在标准的 CPU/云计算模式下,投资回报清晰可算;但在 AI 训练的军备竞赛中,吉瓦(GW)级的峰值功率管理成为新的游戏规则。
  • 原理解构:
    1. 峰值问题: 训练循环中通信与计算交替进行;一旦两者重叠得不好,就会出现“空闲”时段,导致 GPU 负载瞬间骤降,这种剧烈的功率波动对电网稳定性是噩梦。
    2. 巨型工厂模式: Elon Musk 的 xAI 在一座旧家电工厂里塞入约 200k GPU,自建天然气供电与大量液冷机组,证明了这种“工厂式”的即时供电与散热能力,是绕开传统数据中心漫长建设周期的有效解法。
  • 证据/案例: OpenAI 的 Stargate 项目(首期承诺约 $100B CapEx)第一阶段功率约 1.8 GW,且大部分来自既有数据中心的改造;Meta 则在路易斯安那州配套建设天然气发电。甚至有说法称,Meta 在 PyTorch 中加入了一个被戏称为“防止电厂爆炸”(power plant blowup)的选项,在通信空档期让 GPU 执行无意义计算,以平滑电网的负载峰值。

3. 💡 反直觉与批判性视角

🛑 打破共识

  • “安全优先”的代价: Anthropic 等公司坚持严格的 RLHF 与 Safety Alignment,导致发布节奏偏慢。DeepSeek 则更为激进,直接在 R1 输出中完整暴露原始思维链,包括自我检查(“Let me check this…”)乃至语无伦次的片段;虽然看起来“脏”,但这种暴露内部思维轨迹的做法反而让用户感知到更强的智能感,甚至产生“它在像人一样思考”的印象。
  • 开源不等于全栈开放: 即使是 Meta 的 Llama 3,也只提供权重和一份细节有限的技术报告;由于缺乏“全栈开源”(训练数据 + 完整训练代码),加之许可证更为严格,这类 Open Weights 模型在全球部署与二次开发时的“即插即用”程度,反而不如采用 MIT 许可证、论文细节翔实的 DeepSeek。
  • 算法领先不只是 PPT: 许多人认为效率提升(MoE、MLA)只是短期优化,很快会被更强的模型摊薄。但嘉宾们认为,当前的 Reasoning(思维链推理)范式正在改变 AI 的本质:从“语料模仿”转向“试错式进化”(类似 AlphaGo 通过自我对弈超越人类)。

👁️ 盲点与局限

  • 推理算力的护城河: DeepSeek 发布后,首轮 API 服务延迟极高(服务宕机、排队数分钟、Token 吞吐极低),瓶颈不在于模型有多智能,而在于有多少 GPU 可供推理。再高效的模型,也可能在需求激增时因推理算力不足而瞬间瘫痪。
  • 人类偏好的不可替代性: 虽然 RL 在 Verifiable Domains(数学/代码)中表现出色,但在更广泛的创造性、哲学性任务中,人类的 Preference(偏好)仍是判断“好”的最终标准;目前还没有任何自动化系统能在这类开放式评测中完全取代人类的判断力。

❓ 未解之谜

  • Agent 的链式可靠性陷阱: 模型在单个任务(如解数学题)上的正确率可达 90%,但在连续、不确定的 Agent 任务链中(如在混乱的航空公司网站上订票),整体成功率会随步骤数呈指数级衰减(例如单步 90%,连续 10 步后仅剩约 35%)。如何在框架层面化解“第 5 步失败导致全链路失败”的风险,是目前最大的技术黑盒。
  • “人设”的不稳定性: 在许多未经“人设”或“世界观”校准的公开测试中,DeepSeek 模型不仅会出现逻辑错误,还会无意识地输出有争议的言论。这表明模型“性格/价值观”的配置仍极不稳定。

4. 💎 金句与高光时刻

  1. 关于 DeepSeek 的战术: “We do this extremely well… necessity is the mother of invention.” (DeepSeek 在稀疏性、路由与底层代码调度上做到了极致,而这正是芯片限制倒逼出来的能力。)

    • 译文: 极致的工程优化,正是被“限制”逼出来的,所谓“需求是发明之母”。
  2. 关于 AI 的终极形态: “If you look at DeepSeek-R1, it reveals the reasoning… like race to the bottom vs race to the top.” (DeepSeek-R1 完整展示推理过程;这折射出安全优先者追求的“向上竞争”与效率驱动者追求的低价扩散之间的博弈。)

    • 译文: R1 对思维链的完全公开,被视为“向上竞争”(安全与克制)与“向下竞争”(低价与快速扩散)两条路线张力的标志性事件。
  3. 关于出口管制: “Jevons paradox… as the efficiency goes up, magically counter intuitively, the total resource consumption goes up.” (随着效率提升,总资源消耗反而反直觉地上升。这对理解为什么芯片限制之下,中国仍可能借助廉价算力完成规模扩张至关重要。)

    • 译文: 杰文斯悖论告诉我们,效率的提升往往会反直觉地增加总资源的消耗。
  4. 关于代理人的未来: “The differentiation will be the norm is that you have agents… smaller, faster.” (差异化竞争的关键不是拥有最聪明的模型,而是构建能够调用智能的“Agent”。)

    • 译文: 真正的价值竞逐已不再是单纯比拼谁拥有最强的模型(ChatGPT vs DeepSeek),而是谁能构建出善于调用这些模型的智能代理。

5. 🚀 行业启示与未来推演

⚡ 短期影响 (1-3年)

  • 算力定价战白热化: DeepSeek 以 1/27 的成本展示了可能性,迫使 OpenAI 等巨头不得不加速内部效率优化,甚至可能通过“降价换取未来收益”来抢占市场主导权。
  • 推理成为新战场: 部署 R1、o3-mini 这类具备长链推理能力模型的 API 正迅速商品化(Commodity);“RAG(检索增强生成)”与“Chain-of-Thought(思维链)”技术栈成为开发者的标配。
  • 硬件需求的转向: 推理负载正在优先消耗显存更大的 H200 而非 H100。未来的芯片设计必须优先考虑 HBM(高带宽内存)与互联带宽,而非单纯的 FLOPS。

🔮 长期终局 (5-10年)

  • 双速世界: 地缘政治将导致 AI 基础设施分裂。美国主导的“安全版 AGI” 与 中国/其他地区主导的“高性能版开源 AI”将并存,虽然底层算法(Transformer)可能共享。
  • 基础设施主权化: 全球超大规模数据中心将演变为自带发电能力的独立“微电网”;围绕国家数据安全建立的物理隔离(如专用芯片、受控算法)将比纯软件层面的防火墙更有效。
  • 就业市场的重构: 软件工程师的平均生产率将翻倍,门槛将被迫提高。低端编码工作萎缩,取而代之的是 “System Architect”(系统架构师) —— 这类人负责设计什么样的任务应该由哪个 AI Agent 去执行,并审核其纠错能力。

🎯 行动建议

  • 对于投资者: 无需过度担忧“纯模型公司”的估值泡沫(DeepSeek 证明了低成本路线走得通),可把注意力转向物理基础设施(配电、液冷、网络光互连)与 AI 辅助工具(Copilot/IDE)。
  • 对于开发者/企业: 优先评估具备推理能力且易于私有化部署的模型(DeepSeek-V3、Qwen)。在设计业务流程时,应从“人工主导”转向“Human-in-the-loop(人机回路)”模式,把责任与判断保留在人类手中,仅在环节边缘使用 AI 加速。

逐字稿

Lex Fridman (00:00:00) The following is a conversation with Dylan Patel and Nathan Lambert. Dylan runs SemiAnalysis, a well-respected research and analysis company that specializes in semiconductors, GPUs, CPUs, and AI hardware in general. Nathan is a research scientist at the Allen Institute for AI and is the author of the amazing blog on AI called Interconnects. They are both highly respected, read and listened to by the experts, researchers and engineers in the field of AI. And personally, I’m just a fan of the two of them, so I used the DeepSeek moment that shook the AI world a bit as an opportunity to sit down with them and lay it all out from DeepSeek, OpenAI, Google, xAI, Meta, Anthropic to NVIDIA and TSMC, and to US-China-Taiwan relations and everything else that is happening at the cutting edge of AI. This conversation is a deep dive into many critical aspects of the AI industry.

(00:01:08) While it does get super technical, we try to make sure that it’s still accessible to folks outside of the AI field by defining terms, stating important concepts explicitly, spelling out acronyms, and in general, always moving across the several layers of abstraction and levels of detail. There is a lot of hype in the media about what AI is and isn’t. The purpose of this podcast in part is to cut through the hype, through the bullshit and the low resolution analysis and to discuss in detail how stuff works and what the implications are. Let me also, if I may comment on the new OpenAI o3-mini reasoning model, the release of which we were anticipating during the conversation and it did indeed come out right after. Its capabilities and costs are on par with our expectations as we stated. OpenAI o3-mini is indeed a great model, but it should be stated that DeepSeek-R1 has similar performance on benchmarks, is still cheaper and it reveals its chain of thought reasoning, which o3-mini does not. It only shows a summary of the reasoning, plus R1 is open weight and o3-mini is not.

(00:02:29) By the way, I got a chance to play with o3-mini and anecdotal vibe check wise, I felt that o3-mini, specifically o3-mini high is better than R1. Still for me personally, I find that Claude Sonnet 3.5 is the best model for programming except for tricky cases where I will use o1 Pro to brainstorm. Either way, many more better AI models will come including reasoning models both from American and Chinese companies. They’ll continue to shift the cost curve, but the quote “DeepSeek moment” is indeed real. I think it will still be remembered five years from now as a pivotal event in tech history due in part to the geopolitical implications, but for other reasons to, as we discuss in detail from many perspectives in this conversation. This is the Lex Fridman podcast, to support it please check out our sponsors in the description. And now, dear friends, here’s Dylan Patel and Nathan Lambert.

Lex Fridman (00:03:33) A lot of people are curious to understand China’s DeepSeek AI models, so let’s lay it out. Nathan, can you describe what DeepSeek-V3 and DeepSeek-R1 are, how they work, how they’re trained? Let’s look at the big picture and then we’ll zoom in on the details.

Nathan Lambert (00:03:50) DeepSeek-V3 is a new mixture of experts, transformer language model from DeepSeek, which is based in China. They have some new specifics in the model that we’ll get into. Largely this is an open weight model and it’s an instruction model like what you would use in ChatGPT. They also released what is called the base model, which is before these techniques of post-training. Most people use instruction models today, and those are what’s served in all sorts of applications. This was released on, I believe, December 26th or that week. And then weeks later on January 20th, DeepSeek released DeepSeek-R1, which is a reasoning model, which really accelerated a lot of this discussion.

(00:04:38) This reasoning model has a lot of overlapping training steps to DeepSeek-V3, and it’s confusing that you have a base model called V3 that you do something to, to get a chat model, and then you do some different things to get a reasoning model. I think a lot of the AI industry is going through this challenge of communications right now where OpenAI makes fun of their own naming schemes. They have GPT-4o, they have OpenAI o1, and there’s a lot of types of models, so we’re going to break down what each of them are. There’s a lot of technical specifics on training and we’ll go through them high level to specific and go through each of them.

Lex Fridman (00:05:14) There’s so many places we can go here, but maybe let’s go to open weights first. What does it mean for a model to be open weights and what are the different flavors of open source in general?

Nathan Lambert (00:05:24) This discussion has been going on for a long time in AI. It became more important since ChatGPT or more focal since ChatGPT at the end of 2022. Open weights is the accepted term for when model weights of a language model are available on the internet for people to download. Those weights can have different licenses, which is effectively the terms by which you can use the model. There are licenses that come from history and open source software. There are licenses that are designed by companies specifically all of Llama, DeepSeek, Qwen, Mistral, these popular names in open weight models have some of their own licenses. It’s complicated because not all the same models have the same terms. The big debate is on what makes a model open weight. It’s like, why are we saying this term? It’s a mouthful. It sounds close to open source, but it’s not the same.

(00:06:17) There’s still a lot of debate on the definition and soul of open source AI. Open source software has a rich history on freedom to modify, freedom to take on your own, freedom for many restrictions on how you would use the software and what that means for AI is still being defined. For what I do, I work at the Allen Institute for AI, we’re a nonprofit, we want to make AI open for everybody and we try to lead on what we think is truly open source. There’s not full agreement in the community, but for us that means releasing the training data, releasing the training code, and then also having open weights like this. And we’ll get into the details of the models and again and again as we try to get deeper into how the models were trained, we will say things like the data processing, data filtering data quality is the number one determinant of the model quality.

(00:07:09) And then a lot of the training code is the determinant on how long it takes to train and how fast your experimentation is. Without fully open source models where you have access to this data, it is hard to know… Or it’s harder to replicate. We’ll get into cost numbers for DeepSeek-V3 on mostly GPU hours and how much you could pay to rent those yourselves. But without the data, the replication cost is going to be far, far higher. And same goes for the code.

Lex Fridman (00:07:37) We should also say that this is probably one of the more open models out of the frontier models.

Lex Fridman (00:07:45) In this full spectrum where probably the fullest open source, like you said, open code, open data, open weights, this is not open code, this is probably not open data and this is open weights and the licensing is MIT license or it’s… There’s some nuance in the different models, but it’s towards the free… In terms of the open source movement, these are the good guys.

Nathan Lambert (00:08:13) Yeah. DeepSeek is doing fantastic work for disseminating understanding of AI. Their papers are extremely detailed in what they do and for other teams around the world, they’re very actionable in terms of improving your own training techniques. And we’ll talk about licenses more, the DeepSeek-R1 model has a very permissive license. It’s called the MIT license. That effectively means there’s no downstream restrictions on commercial use, there’s no use case restrictions. You can use the outputs from the models to create synthetic data.

(00:08:47) And this is all fantastic. I think the closest peer is something like Llama where you have the weights and you have a technical report. And the technical report is very good for Llama. One of the most read PDFs of the year last year is the Llama 3 paper, but in some ways it’s slightly less actionable. It has less details on the training specifics. I think less plots and so on. And the Llama 3 license is more restrictive than MIT. And then between the DeepSeek custom license and the Llama license, we could get into this whole rabbit hole, I think. We’ll make sure we want to go down the license rabbit hole before we do specifics.

Lex Fridman (00:09:22) It should be stated that one of the implications that DeepSeek, it puts pressure on Llama and everybody else on OpenAI to push towards open source. And that’s the other side of open source is that you mentioned is how much is published in detail about it, so how open are you with the insights behind the code? How good is the technical reports? Are there hand wavy or is there actual details in there? And that’s one of the things that DeepSeek did well is they published a lot of the details.

Nathan Lambert (00:09:52) Especially in the DeepSeek-V3, which is their pre-training paper. They were very clear that they are doing interventions on the technical stack that go at many different levels. For example, to get highly efficient training, they’re making modifications at or below the CUDA layer for NVIDIA chips. I have never worked there myself and there are a few people in the world that do that very well, and some of them are at DeepSeek. These types of people are at DeepSeek and leading American frontier labs, but there are not many places.

Lex Fridman (00:10:25) To help people understand the other implication of open weights, just there’s a topic we’ll return to often here. There’s a fear that China, the nation might have interest in stealing American data, violating privacy of American citizens. What can we say about open weights to help us understand what the weights are able to do in terms of stealing people’s data?

Nathan Lambert (00:10:55) These weights that you can download from Hugging Face or other platforms are very big matrices of numbers. You can download them to a computer in your own house that has no internet and you can run this model and you’re totally in control of your data. That is something that is different than how a lot of language model usage is actually done today, which is mostly through APIs where you send your prompt to GPUs run by certain companies. And these companies will have different distributions and policies on how your data is stored, if it is used to train future models, where it is stored, if it is encrypted, and so on. The open weights are you have your fate of data in your own hands, and that is something that is deeply connected to the soul of open source.

Lex Fridman (00:11:37) It’s not the model that steals your data, it’s whoever is hosting the model, which could be China if you’re using the DeepSeek app or it could be Perplexity. You’re trusting them with your data or OpenAI, you’re trusting them with your data. And some of these are American companies, some these are Chinese companies, but the model itself is not doing the stealing, it’s the host. All right, so back to the basics. What’s the difference between DeepSeek-V3 and DeepSeek-R1? Can we try to lay out the confusion potential?

Nathan Lambert (00:12:11) Yes. For one, I’m very understanding of many people being confused by these two model names, so I would say the best way to think about this is that when training a language model, you have what is called pre-training, which is when you take large amounts of mostly internet text and you’re trying to predict the next token. And what to know about these new DeepSeek models is that they do this internet large scale pre-training once to get what is called DeepSeek-V3 base. This is a base model, it’s just going to finish your sentences for you. It’s going to be harder to work with than ChatGPT. And then what DeepSeek did is they’ve done two different post-training regimes to make the models have specific desirable behaviors. What is the more normal model in terms of the last few years of AI, an instruct model, a chat model, a quote unquote “aligned model”, a helpful model. There are many ways to describe this, is more standard post-training. This is things like instruction tuning, reinforcement learning from human feedback.

(00:13:12) We’ll get into some of these words and this is what they did to create the DeepSeek-V3 model. This was the first model to be released and it is very high performant, it’s competitive with GPT-4, Llama 405B and so on. And then when this release was happening, we don’t know their exact timeline or soon after they were finishing the training of a different training process from the same next token prediction based model that I talked about, which is when this new reasoning training that people have heard about comes in in order to create the model that is called DeepSeek-R1. The R through this conversation is good for grounding for reasoning. And the name is also similar to OpenAI’s o1, which is the other reasoning model that people have heard about. And we’ll have to break down the training for R1 in more detail because for one we have a paper detailing it, but also it is a far newer set of techniques for the AI community, so it is a much more rapidly evolving area of research.

Lex Fridman (00:14:11) Maybe we should also say the big two categories of training of pre-training and post-training. These are umbrella terms that people use, so what is pre-training and what is post-training and what are the different flavors of things underneath the post-training umbrella?

Nathan Lambert (00:14:28) Pre-training, I’m using some of the same words to really get the message across is you’re doing what is called autoregressive prediction to predict the next token in a series of documents. This is done over standard practice is trillions of tokens, so this is a ton of data that is mostly scraped from the web. And some of DeepSeek’s earlier papers, they talk about their training data being distilled for math. I shouldn’t use this word yet, but taken from Common Crawl and that’s a public access that anyone listening to this could go download data from the Common Crawl website. This is a crawler that is maintained publicly. Yes, other tech companies eventually shift to their own crawler and DeepSeek likely has done this as well as most frontier labs do. But this sort of data is something that people can get started with and you’re just predicting text in a series of documents.

(00:15:18) This can be scaled to be very efficient and there’s a lot of numbers that are thrown around in AI training like how many floating-point operations or flops are used. And then you can also look at how many hours of these GPUs that are used. And it’s largely one loss function taken to a very large amount of compute usage. You set up really efficient systems and then at the end of that you have the base model, and post-training is where there is a lot more complexity in terms of how the process is emerging or evolving and the different types of training losses that you’ll use. I think this is a lot of techniques grounded in the natural language processing literature. The oldest technique which is still used today is something called instruction tuning or also known as supervised fine-tuning. These acronyms will be IFT or SFT.

(00:16:16) People really go back and forth throughout them, and I’ll probably do the same, which is where you add this formatting to the model where it knows to take a question that is, explain the history of the Roman Empire to me or a sort of question you’ll see on Reddit or Stack Overflow. And then the model will respond in a information-dense but presentable manner. The core of that formatting is in this instruction tuning phase. And then there’s two other categories of loss functions that are being used today. One I’ll classify as preference fine-tuning. Preference fine-tuning is a generalized term for what came out of reinforcement learning from human feedback, which is RLHF. This reinforcement learning from human feedback is credited as the technique that helped ChatGPT break through. It is a technique to make the responses that are nicely formatted like these Reddit answers more in tune with what a human would like to read.

(00:17:14) This is done by collecting pairwise preferences from actual humans out in the world to start and now AIs are also labeling this data and we’ll get into those trade-offs. And you have this contrastive loss function between a good answer and a bad answer. And the model learns to pick up these trends. There’s different implementation ways. You have things called reward models. You could have direct alignment algorithms. There’s a lot of really specific things you can do, but all of this is about fine-tuning to human preferences. And the final stage is much newer and will link to what is done in R1 and these reasoning models is I think OpenAI’s name for this, they had this new API in the fall, which they called the reinforcement fine-tuning API. This is the idea that you use the techniques of reinforcement learning, which is a whole framework of AI.

(00:18:02) There’s a deep literature here to summarize, it’s often known as trial and error learning or the subfield of AI where you’re trying to make sequential decisions in a certain potentially noisy environment. There’s a lot of ways we could go down that, but fine-tuning language models where they can generate an answer and then you check to see if the answer matches the true solution. For math or code you have an exactly correct answer for math, you can have unit tests for code. And what we’re doing is we are checking the language model’s work and we’re giving it multiple opportunities on the same questions to see if it is right. And if you keep doing this, the models can learn to improve in verifiable domains to a great extent. It works really well. It’s a newer technique in the academic literature. It’s been used at frontier labs in the US that don’t share every detail for multiple years. This is the idea of using reinforcement learning with language models and it has been taking off especially in this DeepSeek moment.

Lex Fridman (00:19:00) And we should say that there’s a lot of exciting stuff going on again across the stack, but the post-training probably this year, there’s going to be a lot of interesting developments in the post-training. We’ll talk about it. I almost forgot to talk about the difference between DeepSeek-V3 and R1 on the user experience side. Forget the technical stuff, forget all of that, just people that don’t know anything about AI, they show up. What’s the actual experience, what’s the use case for each one when they actually type and talk to it? What is each good at and that kind of thing?

Nathan Lambert (00:19:32) Let’s start with DeepSeek-V3, again it’s more likely people would have tried something like it. You ask it a question, it’ll start generating tokens very fast and those tokens will look like a very human legible answer. It’ll be some sort of markdown list. It might have formatting to help draw you to the core details in the answer and it’ll generate tens to hundreds of tokens. A token is normally a word for common words or a sub word part in a longer word, and it’ll look like a very high quality Reddit or Stack Overflow answer. These models are really getting good at doing these across a wide variety of domains, I think. Even things that if you’re an expert, things that are close to the fringe of knowledge, they will still be fairly good at, I think.

(00:20:19) Cutting edge AI topics that I do research on, these models are capable for study aid and they’re regularly updated. Where this changes is with the DeepSeek- R1, what is called these reasoning models is when you see tokens coming from these models to start, it will be a large chain of thought process. We’ll get back to chain of thought in a second, which looks like a lot of tokens where the model is explaining the problem. The model will often break down the problem and be like, okay, they asked me for this. Let’s break down the problem. I’m going to need to do this. And you’ll see all of this generating from the model. It’ll come very fast in most user experiences. These APIs are very fast, so you’ll see a lot of tokens, a lot of words show up really fast, it’ll keep flowing on the screen and this is all the reasoning process.

(00:21:06) And then eventually the model will change its tone in R1 and it’ll write the answer where it summarizes its reasoning process and writes a similar answer to the first types of model. But in DeepSeek’s case, which is part of why this was so popular even outside the AI community, is that you can see how the language model is breaking down problems. And then you get this answer, on a technical side they train the model to do this specifically where they have a section which is reasoning, and then it generates a special token, which is probably hidden from the user most of the time, which says, okay, I’m starting the answer. The model is trained to do this two stage process on its own. If you use a similar model in say, OpenAI, OpenAI’s user interface is trying to summarize this process for you nicely by showing the sections that the model is doing and it’ll click through, it’ll say breaking down the problem, making X calculation, cleaning the result, and then the answer will come for something like OpenAI.

Lex Fridman (00:22:03) Maybe it’s useful here to go through an example of a DeepSeek-R1 reasoning.

Nathan Lambert (00:22:10) If you’re looking at the screen here, what you’ll see is a screenshot of the DeepSeek chat app, and at the top is thought for 157 seconds with the drop-down arrow. Underneath that if we were in an app that we were running, the drop-down arrow would have the reasoning.

Lex Fridman (00:22:25) In this case, the specific question which I’m philosophically/ pothead inclined, so this is asking DeepSeek-R1 for one truly novel insight about humans. And it reveals the reasoning and basically the truly novel aspect is what’s pushing the reasoning to constantly… The model asking itself, is this truly novel? It’s actually challenging itself to be more novel, more counterintuitive, less cringe, I suppose. Some of the reasoning says… This is just snapshots. Alternatively, humans have a unique meta-emotion where they feel emotions about their own emotions e.g. feeling guilty about being angry. This recursive emotional layering creates complex motivational drives that don’t exist in other animals. The insight is that human emotions are nested. It’s reasoning through how humans feel emotions, it’s reasoning about meta emotions.

Nathan Lambert (00:23:21) It’s going to have pages and pages of this. It’s almost too much to actually read, but it’s nice to skim as it’s coming.

Lex Fridman (00:23:27) It’s a James Joyce like stream of consciousness. And then it goes, wait, the user wants something that’s not seen anywhere else. Let me dig deeper. And consider the human ability to hold contradictory beliefs simultaneously. Cognitive dissonance is known, but perhaps the function is to allow flexible adaptation, so on and so forth. That really captures the public imagination that holy shit, this isn’t intelligence/almost like an inkling of sentience because you’re thinking through, you’re self-reflecting, you’re deliberating. And the final result of that after 157 seconds is, humans instinctively convert selfish desires into cooperative systems by collectively pretending abstract rules, money, laws, rights are real. These shared hallucinations act as quote “games” where competition is secretly redirected to benefit the group, turning conflict into society’s fuel. Pretty profound.

Nathan Lambert (00:24:31) This is a potential digression, but a lot of people have found that these reasoning models can sometimes produce much more eloquent text. That is a at least interesting example I think depending on how open-minded you are, you find language models interesting or not, and there’s a spectrum there.

Lex Fridman (00:24:49) We’ll talk about different benchmarks and so on but some has just a vibe. That in itself is a, let’s say quote “fire” tweet. If I’m trying to produce something where people are like, “Oh, shit.” Okay, so that’s a chance probably return to it more. How were they able to achieve such low cost on the training and the inference? Maybe you could talk to the training first.

Low cost of training

Dylan Patel (00:25:16) There’s two main techniques that they implemented that are probably the majority of their efficiency, and then there’s a lot of implementation details that maybe we’ll gloss over or get into later that contribute to it. But those two main things are, one is they went to a mixture of experts model, which we’ll define in a second. And then the other thing is that they invented this new technique called MLA, latent attention. Both of these are big deals. Mixture of experts is something that’s been in the literature for a handful of years. And OpenAI with GPT-4 was the first one to productize a mixture of experts model. And what this means is when you look at the common models around that most people have been able to interact with that are open, think Llama. Llama is a dense model i.e. every single parameter or neuron is activated as you’re going through the model for every single token you generate.

(00:26:10) Now, with a mixture of experts model, you don’t do that. How does the human actually work? It’s like, oh, well my visual cortex is active when I’m thinking about vision tasks and other things. My amygdala is when I’m scared. These different aspects of your brain are focused on different things. A mixture of experts model attempts to approximate this to some extent. It’s nowhere close to what a brain architecture is, but different portions of the model activate. You’ll have a set number of experts in the model and a set number that are activated each time. And this dramatically reduces both your training and inference costs because now if you think about the parameter count as the total embedding space for all of this knowledge that you’re compressing down during training, one, you’re embedding this data in instead of having to activate every single parameter, every single time you’re training or running inference, now you can just activate on a subset and the model will learn which expert to route to for different tasks.

(00:27:07) And so this is a humongous innovation in terms of, hey, I can continue to grow the total embedding space of parameters. And so DeepSeek’s model is 600 something billion parameters, relative to Llama 405B, it’s 405 billion parameters, relative to Llama 70B, it’s 70 billion parameters. This model technically has more embedding space for information to compress all of the world’s knowledge that’s on the internet down. But at the same time, it is only activating around 37 billion of the parameters, so only 37 billion of these parameters actually need to be computed every single time you’re training data or inferencing data out of it. Versus again, the Llama model, 70 billion parameters must be activated or 405 billion parameters must be activated, so you’ve dramatically reduced your compute cost when you’re doing training and inference with this mixture of experts architecture.
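
As a rough illustration of the routing Dylan is describing, here is a toy Python sketch of a mixture-of-experts layer (the sizes and the softmax-over-chosen-experts weighting are invented for readability and are not DeepSeek's actual configuration): every token is scored against all experts, but only the top-k experts actually run.

```python
import numpy as np

# Toy mixture-of-experts layer (illustrative sizes, not DeepSeek's real config).
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2      # DeepSeek-V3 reportedly uses ~256 experts, 8 active

router_w = rng.normal(size=(d_model, n_experts))           # routing weights
experts  = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """x: (d_model,) hidden state for one token."""
    logits = x @ router_w                                   # score every expert
    chosen = np.argsort(logits)[-top_k:]                    # keep only the top-k experts
    weights = np.exp(logits[chosen]) / np.exp(logits[chosen]).sum()
    # Only the chosen experts' parameters are touched -> sparse compute.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)            # (16,) -- output computed with 2 of 8 experts
```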

Nathan Lambert (00:27:57) Should we break down where it actually applies and go into the transformer? Is that useful?

Lex Fridman (00:28:02) Let’s go. Let’s go into the transformer.

Nathan Lambert (00:28:03) The transformer is a thing that is talked about a lot, and we will not cover every detail. Essentially the transformer is built on repeated blocks of this attention mechanism and then a traditional dense fully connected multilayer perception, whatever word you want to use for your normal neural network. And you alternate these blocks. There’s other details and where mixture of experts is applied is at this dense model. The dense model holds most of the weights if you count them in a transformer model, so you can get really big gains from those mixture of experts on parameter efficiency at training and inference because you get this efficiency by not activating all of these parameters.

Lex Fridman (00:28:44) We should also say that a transformer is a giant neural network.

Lex Fridman (00:28:49) And then there’s, for 15 years now, there’s what’s called the deep learning revolution. Network’s gotten larger and larger. At a certain point, the scaling laws appeared where people realized-

Dylan Patel (00:29:00) This is a scaling law shirt by the way.

Lex Fridman (00:29:02) Representing scaling laws. Where it became more and more formalized that bigger is better across multiple dimensions of what bigger means. But these are all neural networks we’re talking about, and we’re talking about different architectures of how to construct these neural networks such that the training and the inference on them is super efficient.

Nathan Lambert (00:29:24) Yeah. Every different type of model has a different scaling law for it, which is effectively for how much compute you put in the architecture will get to different levels of performance at test tasks. And mixture of experts is one of the ones at training time even if you don’t consider the inference benefits, which are also big. At training time, your efficiency with your GPUs is dramatically improved by using this architecture if it is well implemented. You can get effectively the same performance model and evaluation scores with numbers like 30% less compute, I think. There’s going to be a wide variation depending on your implementation details and stuff. But it is just important to realize that this type of technical innovation is something that gives huge gains. And I expect most companies that are serving their models to move to this mixture of experts implementation. Historically, the reason why not everyone might do it is because it’s an implementation complexity, especially when doing these big models.

(00:30:21) This is one of the things that DeepSeek gets credit for is they do this extremely well. They do a mixture of experts extremely well. This architecture for what is called DeepSeek MoE, MoE is the shortened version of mixture of experts, is multiple papers old. This part of their training infrastructure is not new to these models alone. And same goes for what Dylan mentioned with multi-head latent attention. This is all about reducing memory usage during inference and same things during training by using some fancy low rank approximation math. If you get into the details with this latent attention, it’s one of those things I look at and it’s like, okay, they’re doing really complex implementations because there’s other parts of language models such as embeddings that are used to extend the context length, the common one that DeepSeek used is rotary positional embeddings, which is called RoPE.

(00:31:12) And if you want to use RoPE with a normal MoE, it’s a sequential thing, you take two of the attention matrices and you rotate them by a complex value rotation, which is a matrix multiplication. With DeepSeek’s MLA, with this new attention architecture, they need to do some clever things because they’re not set up the same and it just makes the implementation complexity much higher. They’re managing all of these things, and these are probably the sort of things that OpenAI these closed labs are doing. We don’t know if they’re doing the exact same techniques, but they actually shared them with the world, which is really nice to be like, this is the cutting edge of efficient language model training.
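
To make the memory argument concrete, here is a back-of-the-envelope sketch of the key idea behind latent attention (the dimensions are invented and this is only the low-rank compression intuition, not DeepSeek's exact MLA formulation): instead of caching full per-head keys and values for every token, you cache one small latent vector and re-expand it with learned projections when attention is computed.

```python
import numpy as np

# Illustrative KV-cache arithmetic for latent attention (made-up dimensions).
n_layers, n_heads, d_head = 60, 16, 128
d_kv      = n_heads * d_head     # per-token K (and V) width in standard attention
d_latent  = 512                  # small shared latent cached instead (the MLA idea)
seq_len   = 32_768               # a long chain-of-thought context
bytes_per = 2                    # fp16 / bf16

std_cache = seq_len * n_layers * 2 * d_kv * bytes_per    # cache K and V per layer
mla_cache = seq_len * n_layers * d_latent * bytes_per    # cache one latent per layer
print(f"standard KV cache ≈ {std_cache / 2**30:.1f} GiB")
print(f"latent cache      ≈ {mla_cache / 2**30:.1f} GiB "
      f"({1 - mla_cache / std_cache:.0%} smaller)")

# At attention time the latent is projected back up with learned matrices.
rng = np.random.default_rng(0)
latent = rng.normal(size=d_latent)
W_up_k = rng.normal(size=(d_latent, d_kv))   # learned up-projection to keys
k_full = latent @ W_up_k                     # reconstructed keys, shape (d_kv,)
print(k_full.shape)
```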

Lex Fridman (00:31:49) And some of this requires low level engineering, just it is a giant mess in trickery. As I understand they went below CUDA, so they go super low programming of GPUs.

Dylan Patel (00:32:01) Effectively, Nvidia builds this library called NCCL, in which when you’re training a model, you have all these communications between every single layer of the model, and you may have over a hundred layers.

Nathan Lambert (00:32:12) What does NCCL stand for? It’s NCCL.

Dylan Patel (00:32:14) Nvidia Collective Communications Library.

Dylan Patel (00:32:18) And so when you’re training a model, you’re going to have all these allreducers and allgathers, between each layer, between the multilayer perceptron or feed-forward network and the attention mechanism, you’ll have basically the model synchronized. Or you’ll have allreduce and allgather. And this is a communication between all the GPUs in the network, whether it’s in training or inference, so Nvidia has a standard library. This is one of the reasons why it’s really difficult to use anyone else’s hardware for training is because no one’s really built a standard communications library. And Nvidia has done this at a sort of a higher level. DeepSeek because they have certain limitations around the GPUs that they have access to, the interconnects are limited to some extent by the restrictions of the GPUs that were shipped into China legally, not the ones that are smuggled but legally shipped in that they used to train this model, they had to figure out how to get efficiencies. And one of those things is that instead of just calling the NVIDIA library NCCL, they scheduled their own communications, which some of the labs do.
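
For readers who have not used these collectives, here is a minimal runnable sketch of what an allreduce does, using PyTorch's distributed API with a single CPU process (this only shows the operation NCCL normally provides across many GPUs; DeepSeek's hand-scheduled, SM-level version described here is far lower level and is not reproduced):

```python
# Illustrative sketch (not DeepSeek's code): what an "allreduce" is.
# Uses a single-process "gloo" group so it runs on a laptop; real training
# uses the "nccl" backend across many GPUs.
import torch
import torch.distributed as dist

dist.init_process_group(
    backend="gloo",
    init_method="tcp://127.0.0.1:29500",
    rank=0,
    world_size=1,
)

# Each rank holds its local gradient shard for one layer.
local_grad = torch.randn(4)
print("before:", local_grad)

# allreduce: every rank ends up with the elementwise sum over all ranks.
dist.all_reduce(local_grad, op=dist.ReduceOp.SUM)
print("after :", local_grad)   # identical on every rank after the collective

dist.destroy_process_group()
```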

(00:33:27) Meta talked about in Llama 3, how they made their own custom version of NCCL. They didn’t talk about the implementation details. This is some of what they did, probably not as well as… Maybe not as well as DeepSeek because DeepSeek, necessity is the mother of innovation and they had to do this. OpenAI has people that do this sort of stuff, Anthropic, et cetera. But DeepSeek certainly did it publicly and they may have done it even better because they were gimped on a certain aspect of the chips that they have access to. And so they scheduled communications by scheduling specific SMs. SMs you could think of as the core on a GPU. There’s hundreds of cores or there’s a bit over a hundred cores SMs on a GPU. And they were specifically scheduling, hey, which ones are running the model? Which ones are doing allreduce? Which one are doing allgather? And they would flip back and forth between them. And this requires extremely low level programming.

Nathan Lambert (00:34:22) This is what NCCL does automatically or other Nvidia libraries handle this automatically usually.

Dylan Patel (00:34:26) Yeah, exactly. And so technically they’re using PTX which is, you could think of it as an assembly type language. It’s not exactly that or instruction set, like coding directly to assembly or instruction set. It’s not exactly that, but that’s still part of technically CUDA. But it’s like, do I want to write in Python, PyTorch equivalent and call Nvidia libraries? Do I want to go down to the C level and code even lower level, or do I want to go all the way down to the assembly or ISA level? And there are cases where you go all the way down there at the very big labs, but most companies just do not do that because it’s a waste of time and the efficiency gains you get are not worth it. But-

Dylan Patel (00:35:00) It’s a waste of time and the efficiency gains you get are not worth it. But DeepSeek’s implementation is so complex, especially with their mixture of experts. People have done mixture of experts, but they’re generally eight, 16 experts and they activate two. So, one of the words that we like to use is sparsity factor or usage.

(00:35:19) So, you might have 1/4th of your model activate, and that’s what Mistral’s Mixtral model, right? They’re a model that really catapulted them to like, “Oh, my God. They’re really, really good.” OpenAI has also had models that are MoE and so have all the other labs that are major closed. But what DeepSeek did that maybe only the leading labs have only just started recently doing is have such a high sparsity factor, right? It’s not 1/4th of the model, right? Two out of eight experts activating every time you go through the model, it’s eight out of 256.

Nathan Lambert (00:35:51) And there’s different implementations for mixture of experts where you can have some of these experts that are always activated, which this just looks like a small neural network, and then all the tokens go through that and then they also go through some that are selected by this routing mechanism.

(00:36:08) And one of the innovations in DeepSeek’s architecture is that they change the routing mechanism in mixture of experts models. There’s something called an auxiliary loss, which effectively means during training, you want to make sure that all of these experts are used across the tasks that the model sees.

(00:36:26) Why there can be failures in mixture of experts is that when you’re doing this training, one objective is token prediction accuracy. And if you just let training go with a mixture of experts model on its own, it can be that the model learns to only use a subset of the experts. And in the MoE literature, there’s something called the auxiliary loss which helps balance them.

(00:36:50) But if you think about the loss functions of deep learning, this even connects to The Bitter Lesson, is that you want to have the minimum inductive bias in your model to let the model learn maximally. And this auxiliary loss, this balancing across experts could be seen as in tension with the prediction accuracy of the tokens.

(00:37:09) So we don’t know the exact extent that the DeepSeek MoE change, which is instead of doing an auxiliary loss, they have an extra parameter in their routing, which after the batches, they update this parameter to make sure that the next batches all have a similar use of experts. And this type of change can be big, it can be small, but they add up over time. And this is the sort of thing that just points to them innovating.

(00:37:31) And I’m sure all the labs that are training big MoEs are looking at this sort of things, which is getting away from the auxiliary loss. Some of them might already use it, but you keep accumulating gains. And we’ll talk about the philosophy of training and how you organize these organizations. And a lot of it is just compounding small improvements over time in your data, in your architecture, in your post-training and how they integrate with each other.

(00:37:54) DeepSeek does the same thing and some of them are shared, or a lot. We have to take them on face value that they share their most important details. I mean, the architecture and the weights are out there, so we’re seeing what they’re doing and it adds up.
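
A toy numerical sketch of the two balancing strategies Nathan contrasts (the sizes, the simplified auxiliary-loss formula, and the bias update rule below are all invented for illustration and are not DeepSeek's published algorithm): an auxiliary loss penalizes uneven expert usage inside the training loss, while the bias-based alternative simply nudges each expert's routing score between batches so under-used experts get picked more often.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, n_tokens = 8, 2, 1024
# Router scores for a batch of tokens; the +2 on expert 0 simulates a router
# that has collapsed onto one over-used expert.
logits = rng.normal(size=(n_tokens, n_experts))
logits[:, 0] += 2.0

def expert_load(logits, bias):
    """Fraction of token slots routed to each expert under top-k selection."""
    chosen = np.argsort(logits + bias, axis=1)[:, -top_k:]
    return np.bincount(chosen.ravel(), minlength=n_experts) / (n_tokens * top_k)

# Strategy 1: auxiliary loss -- an extra penalty added to the training loss,
# minimized when the load is uniform (simplified stand-in formula).
load = expert_load(logits, np.zeros(n_experts))
aux_loss = n_experts * float(np.sum(load * load))
print("unbalanced load:", np.round(load, 2), " aux_loss:", round(aux_loss, 3))

# Strategy 2: aux-loss-free -- after each batch, shift a per-expert bias so that
# under-used experts score higher next time; no extra term in the loss itself.
bias, step = np.zeros(n_experts), 1.0
for _ in range(50):
    bias -= step * (expert_load(logits, bias) - 1.0 / n_experts)
print("balanced load:  ", np.round(expert_load(logits, bias), 2))
```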

Dylan Patel (00:38:05) Going back to the efficiency and complexity point, right? It’s 32 versus a four, right, for Mixtral and other MoE models that have been publicly released? So this ratio is extremely high. And what Nathan was getting at there was when you have such a different level of sparsity, you can’t just have every GPU have the entire model, right? The model’s too big, there’s too much complexity there. So you have to split up the model with different types of parallelism, right?

(00:38:31) And so you might have different experts on different GPU nodes, but now what happens when this set of data that you get, “Hey, all of it looks like this one way and all of it should route to one part of my model.” So when all of it routes to one part of the model, then you can have this overloading of a certain set of the GPU resources or a certain set of the GPUs and then the rest of the training network sits idle because all of the tokens are just routing to that.

(00:39:00) So this is the biggest complexity, one of the big complexities with running a very sparse mixture of experts model i.e., this 32 ratio versus this four ratio, is that you end up with so many of the experts just sitting there idle. So how do I load balance between them? How do I schedule the communications between them? This is a lot of the extremely low-level, detailed work that they figured out in the public first, and potentially second or third in the world and maybe even first in some cases.

Lex Fridman (00:39:29) What lesson do you, in the direction of The Bitter Lesson do you take from all of this? Is this going to be the direction where a lot of the gain is going to be, which is this kind of low-level optimization or is this a short-term thing where the biggest gains will be more on the algorithmic high-level side of post-training?

(00:39:50) Is this a short-term leap because they’ve figured out a hack because constraints necessitate the mother of invention or is there still a lot of gains?

Nathan Lambert (00:40:01) I think we should summarize what The Bitter Lesson actually is about, is that The Bitter Lesson essentially, if you paraphrase it, is that the types of training that will win out in deep learning as we go are those methods that which are scalable in learning and search, is what it calls out.

(00:40:20) The scale word gets a lot of attention in this. The interpretation that I use is effectively to avoid adding the human priors to your learning process. And if you read the original essay, this is what it talks about is how researchers will try to come up with clever solutions to their specific problem that might get them small gains in the short term while simply enabling these deep learning systems to work efficiently, and for these bigger problems in the long term might be more likely to scale and continue to drive success.

(00:40:58) And therefore, we were talking about relatively small implementation changes to the mixture of experts model. And therefore it’s like, “Okay, we will need a few more years to know if one of these were actually really crucial to The Bitter Lesson,” but The Bitter Lesson is really this long-term arc of how simplicity can often win.

(00:41:17) And there’s a lot of sayings in the industry, “The models just want to learn. You have to give them the simple loss landscape where you put compute through the model and they will learn, and getting barriers out of the way.”

Lex Fridman (00:41:29) That’s where the power of something like NCCL comes in, where standardized code could be used by a lot of people to create simple innovations that can scale, which is why the hacks, I imagine, the code base for DeepSeek is probably a giant mess.

Nathan Lambert (00:41:45) I’m sure DeepSeek definitely has code bases that are extremely messy, where they’re testing these new ideas. Multi-head latent attention probably could start in something like a Jupyter Notebook, or somebody tries something on a few GPUs and that is really messy. But the stuff that trains the DeepSeek V3 and DeepSeek-R1, those libraries, if you were to present them to us, I would guess are extremely high-quality code.

Lex Fridman (00:42:12) So, high-quality, readable code. Yeah.

Dylan Patel (00:42:12) I think there is one aspect to note though is that there is the general ability for that to transfer across different types of runs. You may make really, really high-quality code for one specific model architecture at one size, and then that is not transferable to, ” Hey, when I make this architecture tweak, everything’s broken again,” right?

(00:42:33) That’s something that could be with their specific low-level coding of scheduling SMs is specific to this model architecture and size. Whereas, Nvidia’s Collectives Library is more like, “Hey, it’ll work for anything,” right? “You want to do an allreduce? Great, I don’t care what your model architecture is, it’ll work,” and you’re giving up a lot of performance when you do that in many cases, but it’s worthwhile for them to do the specific optimization for the specific run given the constraints that they have regarding compute.

Lex Fridman (00:43:04) I wonder how stressful it is to these frontier models, like initiate training to have the code-

Lex Fridman (00:43:13) … to push the button that you’re now spending a large amount of money and time to train this. I mean, there must be a lot of innovation on the debugging stage of making sure there’s no issues, that you’re monitoring and visualizing every aspect of the training, all that kind of stuff.

Dylan Patel (00:43:33) When people are training, they have all these various dashboards, but the most simple one is your loss, right? And it continues to go down, but in reality, especially with more complicated stuff like MoE, the biggest problem with it, or FP8 training, which is another innovation, going to a lower precision number format i.e., less accurate is that you end up with loss spikes. And no one knows why the loss spike happened. And for a long-

Nathan Lambert (00:43:55) Some of them, you do.

Dylan Patel (00:43:56) Some of them, you do.

Nathan Lambert (00:43:56) Some of them are bad data. Can I give Ai2’s example of what blew up our earlier models? It is a Subreddit called microwavegang. We love to shout this out. It’s a real thing. You can pull up microwavegang. Essentially it’s a Subreddit where everybody makes posts that are just the letter M. So it’s like, mmm. So there’s extremely long sequences of the letter M and then the comments are like beep beep because it’s like the microwave beeping.

Nathan Lambert (00:44:18) But if you pass this into a model that’s trained to produce normal text, it’s extremely high-loss because normally you see an M, you don’t predict Ms for a long time. So this is something that caused loss spikes for us. But when you have much … This is old, this is not recent. And when you have more mature data systems, that’s not the thing that causes the loss spike. And what Dylan is saying is true, but there are levels to this sort of idea.

Dylan Patel (00:44:41) With regards to the stress, these people are like … You’ll go out to dinner with a friend that works at one of these labs and they’ll just be looking at their phone every 10 minutes and they’re not … You know, it’s one thing if they’re texting, but they’re just like, “Is the loss … Is the loss spike okay?”

Nathan Lambert (00:44:58) Yeah. It’s like tokens per second. Loss not blown up. They’re just watching this.

Lex Fridman (00:45:03) And the heart rate goes up if there’s a spike.

Dylan Patel (00:45:05) And some level of spikes is normal, it’ll recover and be back. Sometimes a lot of the old strategy was like, you just stop the run, restart from the old version and then change the data mix and then it keeps going.

Nathan Lambert (00:45:16) There are even different types of spikes. So Dirk Groeneveld has a theory today too, that’s like fast spikes and slow spikes, where there are, sometimes where you’re looking at the loss and there are other parameters, you could see it start to creep up and then blow up, and that’s really hard to recover from. So you have to go back much further.

(00:45:31) So you have the stressful period where it’s flat or it might start going up and you’re like, “What do I do?” Whereas, there are also loss spikes that are, it looks good and then there’s one spiky data point. And what you could do is you just skip those. You see that there’s a spike. You’re like, “Okay, I can ignore this data. Don’t update the model and do the next one, and it’ll recover quickly.”

(00:45:47) But on trickier implementations, so as you get more complex in your architecture and you scale up to more GPUs, you have more potential for your loss blowing up. So it’s like, there’s a distribution.

Dylan Patel (00:45:58) And then the whole idea of grokking also comes in, right? It’s like, just because it slowed down from improving in loss doesn’t mean it’s not learning because all of a sudden it could be like this and it could just spike down in loss again because it truly learned something, right? And it took some time for it to learn that. It’s not a gradual process, and that’s what humans are like. That’s what models are like. So it’s really a stressful task, as you mentioned.

Lex Fridman (00:46:21) And the whole time the dollar count is going up.

Nathan Lambert (00:46:24) Every company has failed runs. You need failed run to push the envelope on your infrastructure. So, a lot of news cycles are made of X company had Y failed run. Every company that’s trying to push the frontier of AI has these. So yes, it’s noteworthy because it’s a lot of money and it can be week to a month setback, but it is part of the process.

Lex Fridman (00:46:44) But if you’re DeepSeek, how do you get to a place where holy shit, there’s a successful combination of hyperparameters?

Nathan Lambert (00:46:52) A lot of small failed runs.

Lex Fridman (00:46:54) So, rapid iteration through failed runs until-

Nathan Lambert (00:46:59) And successful ones.

Lex Fridman (00:47:01) And then you build up some intuition, like this mixture of expert works and then this implementation of MLA works.

Nathan Lambert (00:47:09) Key hyperparameters, like learning rate and regularization and things like this, and you find the regime that works for your code base. Talking to people at frontier labs, there’s a story that you can tell where training language models is kind of a path that you need to follow. So you need to unlock the ability to train a certain type of model or a certain scale, and then your code base and your internal know-how of which hyperparameters work for it is kind of known.

(00:47:34) And you look at the DeepSeek papers and models, they’ve scaled up, they’ve added complexity, and it’s just continuing to build the capabilities that they have.

Dylan Patel (00:47:42) There’s the concept of a YOLO run. So YOLO, you only live once.

Dylan Patel (00:47:47) What it is, is there’s all this experimentation you do at the small scale, research ablations. You have your Jupyter Notebook where you’re experimenting with MLA on three GPUs or whatever and you’re doing all these different things like, “Hey, do I do four active experts, 128 experts? Do I arrange the experts this way?” All these different model architecture things, you’re testing at a very small scale. Right?

(00:48:10) A couple of researchers, few GPUs, tens of GPUs, hundreds of GPUs, whatever it is. And then all of a sudden you’re like, “Okay, guys. No more fucking around. No more screwing around. Everyone, take all the resources we have. Let’s pick what we think will work and just go for it. YOLO.”

(00:48:26) And this is where that sort of stress comes in is like, “Well, I know it works here, but some things that work here don’t work here. And some things that work here don’t work down here in this terms of scale.” So it’s really truly a YOLO run. And there’s this discussion of certain researchers just have this methodical nature. They can find the whole search space and figure out all the ablations of different research and really see what is best. And there’s certain researchers who just have that innate gut instinct of like, “This is the YOLO run. I’m looking at the data. I think this is it.”

Nathan Lambert (00:49:00) This is why you want to work in post-training because the GPU cost for training is lower. So you can make a higher percentage of your training runs YOLO runs.

Nathan Lambert (00:49:08) For now. For now.

Lex Fridman (00:49:10) So some of this is fundamentally luck, still.

Dylan Patel (00:49:14) Luck is skill, right, in many cases?

Lex Fridman (00:49:16) Yeah. I mean, it looks lucky, right, when you’re-

Nathan Lambert (00:49:18) But the hill to climb, if you’re on one of these labs, you have an evaluation you’re not crushing, there’s a repeated playbook of how you improve things. There are localized improvements, which might be data improvements. And these add up into the whole model just being much better.

(00:49:32) And when you zoom in really close, it can be really obvious that this model is just really bad at this thing and we can fix it and you just add these up. So some of it feels like luck, but on the ground, especially with these new reasoning models we’re talking to is just so many ways that we could poke around. And normally, it’s that some of them give big improvements.

Dylan Patel (00:49:51) The search space is near infinite and yet the amount of compute and time you have is very low, and you have to hit release schedules. You have to not get blown past by everyone. Otherwise, what happened with DeepSeek crushing Meta and Mistral and Cohere and all these guys, they moved too slow. They maybe were too methodical. I don’t know, they didn’t hit the YOLO run. Whatever the reason was, maybe they weren’t as skilled. Whatever, you can call it luck if you want, but at the end of the day, it’s skill.

Lex Fridman (00:50:18) So 2025 is the year of the YOLO run. It seems like all the labs are going in.

Dylan Patel (00:50:25) I think it's even more impressive what OpenAI did in 2022. At the time, no one believed in mixture of experts models at Google, who had all the researchers. OpenAI had such little compute, and they devoted all of their compute for many months, all of it, 100% for many months, to GPT-4 with a brand-new architecture. "Hey, let me spend a couple of hundred million dollars, which is all of the money I have, on this model." That is truly YOLO.

Dylan Patel (00:50:55) Now people have all these training run failures that are in the media, right? It’s like, “Okay, great, but actually a huge chunk of my GPUs are doing inference. I still have a bunch doing research constantly. And yes, my biggest cluster is training, but on this YOLO run,” but that YOLO run is much less risky than what OpenAI did in 2022, or maybe what DeepSeek did now or sort of like, “Hey, we’re just going to throw everything at it.”

Lex Fridman (00:51:19) The big winners throughout human history are the ones who are willing to do YOLO at some point. Okay. What do we understand about the hardware it’s been trained on, DeepSeek?

DeepSeek compute cluster

Dylan Patel (00:51:30) DeepSeek is very interesting. This is where it's worth taking a second to zoom out on who they are first of all, right? High-Flyer is a hedge fund that has historically done quantitative trading in China as well as elsewhere. And they have always had a significant number of GPUs, right?

(00:51:45) In the past, a lot of these high-frequency trading, algorithmic quant traders used FPGAs, but it shifted to GPUs definitely. And there’s both, but GPUs especially. And High-Flyer, which is the hedge fund that owns DeepSeek, and everyone who works for DeepSeek is part of High-Flyer to some extent. Same parent company, same owner, same CEO, they had all these resources and infrastructure for trading, and then they devoted a humongous portion of them to training models, both language models and otherwise, because these techniques were heavily AI-influenced.

(00:52:20) More recently, people have realized, “Hey, trading with …” Even when you go back to Renaissance and all these quantitative firms, natural language processing is the key to trading really fast, understanding a press release and making the right trade. And so DeepSeek has always been really good at this.

(00:52:38) And even as far back as 2021, they have press releases and papers saying, “Hey, we’re the first company in China with an A100 cluster this large.” It was 10,000 A100 GPUs, right? This is in 2021. Now, this wasn’t all for training large language models. This was mostly for training models for their quantitative aspects, quantitative trading as well as a lot of that was natural language processing, to be clear. Right?

(00:53:03) And so this is the sort of history, right? So verifiable fact is that in 2021, they built the largest cluster, at least they claim it was the largest cluster in China, 10,000 GPUs.

Nathan Lambert (00:53:12) Before export controls started.

Nathan Lambert (00:53:15) It’s like they’ve had a huge cluster before any conversation of export controls.

Dylan Patel (00:53:18) So then you step it forward to, what have they done over the last four years since then? Obviously, they've continued to operate the hedge fund, probably make tons of money. And the other thing is that they've leaned more and more and more into AI. The CEO, Liang Wenfeng … Liang-

Nathan Lambert (00:53:33) You’re not putting me on the spot on this. We discussed this before.

Dylan Patel (00:53:36) Liang Wenfeng, right, the CEO, he owns maybe … Liang Wenfeng, he owns maybe a little bit more than half the company allegedly, is an extremely Elon, Jensen kind of figure where he's just involved in everything. Right?

(00:53:50) And so over that time period, he's gotten really in depth into AI. He actually has a bit of a, if you see some of his statements, a bit of an e/acc vibe almost, right?

Nathan Lambert (00:53:59) Total AGI vibes, like, “We need to do this. We need to make a new ecosystem of OpenAI. We need China to lead on this sort of ecosystem because historically, the western countries have led on software ecosystems.” And straight up acknowledges, “In order to do this, we need to do something different.” DeepSeek is his way of doing this. Some of the translated interviews with him are fantastic.

Lex Fridman (00:54:23) So he has done interviews?

Lex Fridman (00:54:24) Do you think he would do a western interview, or no? Or are there controls on the channel?

Nathan Lambert (00:54:28) There hasn’t been one yet, but I would try it.

Lex Fridman (00:54:32) Okay. All right. Well, I just got a Chinese translator, so it was great. This is a push. So fascinating figure, engineer pushing full on into AI, leveraging the success from the high-frequency trading.

Nathan Lambert (00:54:44) Very direct quotes. “We will not switch to closed source,” when asked about this stuff. Very long-term motivated in how the ecosystem of AI should work. And I think from a Chinese perspective, he wants a Chinese company to build this vision.

Dylan Patel (00:55:03) And so this is sort of like the “visionary behind the company.” This hedge fund still exists, this quantitative firm. And so DeepSeek is the sort of … Slowly, he got turned to this full view of AI, everything about this, but at some point it slowly maneuvered and he made DeepSeek.

(00:55:20) And DeepSeek has done multiple models since then. They've acquired more and more GPUs. They share infrastructure with the fund. Right? And so there is no exact number of public GPU resources that they have. But besides these 10,000 GPUs that they bought in 2021, and they were fantastically profitable, this paper claims they did only 2,000 H800 GPUs, which is a restricted GPU that was previously allowed in China but is no longer allowed. There's a new version, but it's basically Nvidia's H100 for China.

(00:55:52) And there’s some restrictions on it specifically around the communications sort of speed, the interconnect speed, which is why they had to do this crazy SM scheduling stuff. So going back to that, it’s like this is obviously not true in terms of their total GPU count.

Lex Fridman (00:56:08) Obviously they have more GPUs available, but for this training run, you think 2,000 is the correct number, or no?

Dylan Patel (00:56:14) So this is where it takes a significant amount of zoning in. What do you call your training run, right? You count all of the research and ablations that you ran, right? Picking all this stuff because yes, you can do a YOLO run, but at some level you have to do the test at the small scale and then you have to do some test at medium scale before you go to a large scale.

Nathan Lambert (00:56:33) Accepted practice is that for any given model that is a notable advancement, you're going to do 2 to 4x the compute of the full training run in experiments alone.
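
As a back-of-envelope illustration of that rule of thumb (the GPU count and run length below are assumptions for illustration, not disclosed figures):

```python
# If a headline number only covers the final pre-training run, the "2 to 4x in
# experiments alone" rule of thumb implies a much larger total compute budget.
gpus = 2_000                    # assumed cluster size for the final run
run_days = 60                   # assumed length of the final run
final_run_gpu_hours = gpus * run_days * 24

for multiplier in (2, 4):
    total = final_run_gpu_hours * (1 + multiplier)   # final run + experiments
    print(f"{multiplier}x experiments -> ~{total:,.0f} GPU-hours total "
          f"(final run alone: {final_run_gpu_hours:,.0f})")
```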

Lex Fridman (00:56:43) So a lot of this compute that’s being scaled up is probably used in large part at this time for research?

Dylan Patel (00:56:49) Yeah. And research begets the new ideas that lets you get huge efficiency.

Nathan Lambert (00:56:53) Research gets you o1. Research gets you breakthroughs and you need to bet on it.

Lex Fridman (00:56:57) So some of the pricing strategy that we’ll discuss has the research baked into the price?

Dylan Patel (00:57:02) So the numbers that DeepSeek specifically said publicly are just the 10,000 GPUs in 2021 and then 2,000 GPUs for only the pre-training for V3. They did not discuss cost on R1. They did not discuss cost on all the other RL for the instruct model that they made. They only discussed the pre-training for the base model and they did not discuss anything on research and ablations. And they do not talk about any of the resources that are shared in terms of, “Hey, the fund is using all these GPUs,” right?

(00:57:31) And we know that they’re very profitable and they had 10,000 GPUs in 2021. So, some of the research that we’ve found is that we actually believe they have closer to 50,000 GPUs.

Lex Fridman (00:57:43) We as in SemiAnalysis. So we should say that you're sort of one of the world experts in figuring out what everybody's doing in terms of semiconductors, in terms of cluster buildouts, in terms of who is doing what in terms of training runs. So yeah, that's the we. Okay, go ahead.

Dylan Patel (00:57:59) Yeah, sorry. We believe they actually have something closer to 50,000 GPUs, right? Now this is split across many tasks, right? Again, the fund, research and ablations.

Nathan Lambert (00:58:09) For ballpark, how much would OpenAI or Anthropic have? I think the clearest example we have, because Meta is also open, they talk about on the order of 60k to 100k H100 equivalent GPUs in their training clusters.

Dylan Patel (00:58:21) Right. So Llama 3, they trained on 16,000 H100s, but the company of Meta last year publicly disclosed they bought 400 something thousand GPUs.

Dylan Patel (00:58:30) Right? So of course, tiny percentage on the training. Again, most of it is serving me the best Instagram Reels or whatever.

Nathan Lambert (00:58:37) I mean, we could get into a cost of, what is the cost of ownership for a 2,000 GPU cluster, 10,000? There’s just different sizes of companies that can afford these things and DeepSeek is reasonably big. Their compute allocation is one of the top few in the world that’s not OpenAI, Anthropic, et cetera, but they have a lot of compute.

Export controls on GPUs to China

Lex Fridman (00:58:58) Can you gentlemen actually just zoom out and also talk about the Hopper architecture, the Nvidia Hopper GPU architecture and the difference between H100 and H800, like you mentioned, the interconnects?

Dylan Patel (00:59:09) Yeah. So Ampere was the A100, and then Hopper is the H100, right? People use them synonymously in the U.S. because really there's just H100 and now there's H200, right, but same thing mostly?

(00:59:21) In China, there've been different salvos of export restrictions. So initially, the U.S. government limited on a two-factor scale, which is chip interconnect versus FLOPs. So any chip that had interconnects above a certain level and FLOPs above a certain … floating point operations above a certain level was restricted.

(00:59:38) Later, the government realized that this was a flaw in the restriction and they cut it down to just floating point operations. And so-

Nathan Lambert (00:59:48) H800 had high FLOPs, low communication?

Dylan Patel (00:59:51) Exactly. So, the H800 was the same performance as H100 on FLOPs, but it just had the interconnect bandwidth cut. DeepSeek knew how to utilize this. “Hey, even though we’re cut back on the interconnect, we can do all this fancy stuff to figure out how to use the GPU fully anyways.”

(01:00:09) And so that was back in October 2022. But later, at the end of 2023, implemented in 2024, the U.S. government banned the H800. Right? And so by the way, this H800 cluster, these 2,000 GPUs, was not even purchased in 2024. It was purchased in late 2023. And they're just getting the model out now because it takes a lot of research, et cetera.

(01:00:31) H800 was banned and now there’s a new chip called the H20. The H20 is cut back on only FLOPs, but the interconnect bandwidth is the same. And in fact, in some ways it’s better than the H100 because it has better memory bandwidth and memory capacity. So Nvidia is working within the constraints of what the government sets and then builds the best possible GPU for China.
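
To summarize the qualitative differences just described, here is a small sketch that only encodes the directions mentioned in the conversation (no exact specs are implied):

```python
# How the China-variant chips compare to the H100, per the characterization above.
gpu_vs_h100 = {
    "H800": {"flops": "same", "interconnect_bandwidth": "cut"},
    "H20": {"flops": "cut", "interconnect_bandwidth": "same",
            "memory_bandwidth": "better", "memory_capacity": "better"},
}

for chip, deltas in gpu_vs_h100.items():
    summary = ", ".join(f"{k}: {v}" for k, v in deltas.items())
    print(f"{chip} vs H100 -> {summary}")
```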

Lex Fridman (01:00:52) Can we actually take this tangent, and we'll return back to the hardware: the philosophy, the motivation, the case for export controls. What is it? Dario Amodei just published a blog post about export controls. The case he makes is that if AI becomes super powerful, and he says by 2026 we'll have AGI or super powerful AI, that's going to give a significant … Whoever builds that will have a significant military advantage.

(01:01:19) And so because the United States is a democracy and, as he says, China is authoritarian or has authoritarian elements, you want a unipolar world where the super powerful military, because of the AI, is one that's a democracy. It's a much more complicated world geopolitically when you have two superpowers with super powerful AI and one is authoritarian.

(01:01:46) So, that’s the case he makes. And so the United States wants to use export controls to slow down, to make sure that China can’t do these gigantic training runs that will be presumably required to build the AGI.

Nathan Lambert (01:02:02) This is very abstract. I think this is the goal of how some people describe export controls: this super powerful AI. And you touched on the training run idea. There's not many worlds where China cannot train AI models. I think export controls are capping the amount of compute or the density of compute that China can have.

(01:02:25) And if you think about the AI ecosystem right now, as all of these AI companies, revenue numbers are up and to the right. Their AI usage is just continuing to grow, more GPUs are going to inference. A large part of export controls, if they work is just that the amount of AI that can be run in China is going to be much lower.

(01:02:45) So on the training side, DeepSeek V3 is a great example, where you have a very focused team that can still get to the frontier of AI. These 2,000 GPUs are not that hard to get, all things considered, in the world. They're still going to have those GPUs. They're still going to be able to train models. But if there's going to be a huge market for AI, if you want to have 100,000 GPUs just serving the equivalent of ChatGPT, then with strong export controls, it also just makes it so that AI can be used much less in China.

(01:03:13) And I think that is a much easier goal to achieve than trying to debate on what AGI is. And if you have these extremely intelligent autonomous AIs and data centers, those are the things that could be running in these GPU clusters in the United States, but not in China.

Dylan Patel (01:03:30) To some extent, training a model does effectively nothing. They have a model. The thing that Dario is sort of speaking to is the implementation of that model, once trained to then create huge economic growth, huge increases in military capabilities, huge increases in productivity of people, betterment of lives. Whatever you want to direct super powerful AI towards, you can, but that requires a significant amounts of compute.

(01:03:56) And so the U.S. government has effectively said … And forever, training will always be a portion of the total compute. We mentioned Meta's 400,000 GPUs. Only 16,000 made Llama. Right? So the percentage that Meta's dedicating to inference, now, this might be for recommendation systems that are trying to hack our mind into spending more time and watching more ads, or for a super powerful AI that's doing productive things. It doesn't matter what exact use our economic system decides; it's that it can be delivered in whatever way we want.

(01:04:28) Whereas with China, you know, your export restrictions, great. You're never going to be able to cut everything off. And I think that's quite well understood by the U.S. government, that you can't cut everything off.

Nathan Lambert (01:04:40) And they’ll make their own chips.

Dylan Patel (01:04:42) And they're trying to make their own chips. They'll be worse than ours, but the whole point is to just keep a gap. And therefore at some point, as the AI … In a world of 2 or 3% economic growth, this is really dumb, by the way, to cut off high-tech and not make money off of it. But in a world where super powerful AI comes about and then starts creating significant changes in society, which is what all the AI leaders and big tech companies believe, I think super powerful AI is going to change society massively.

(01:05:08) And therefore, this compounding effect of the difference in compute is really important. There’s some sci-fi out there where AI is measured in how much power is delivered to compute, right, or how much is being … That’s sort of a way of thinking about what’s the economic output, is just how much power are you directing towards that AI?

Nathan Lambert (01:05:26) Should we talk about reasoning models with this, as a way that this might be actionable as something that people can actually see? So, the reasoning models that are coming out with R1 and o1, they’re designed to use more compute. There’s a lot of buzzy words in the AI community about this, test-time compute, inference time compute, whatever.

(01:05:44) But Dylan has good research on this. You can get to the specific numbers on the ratio of when you train a model, you can look at things. It’s about the amount of compute used at training and amount of compute used at inference.

(01:05:53) These reasoning models are making inference way more important to doing complex tasks. In the fall in December, OpenAI announced this o3 model. There’s another thing in AI, when things move fast, we get both announcements and releases. Announcements are essentially blog posts where you pat yourself on the back and you say you did things and releases are when the model’s out there, the paper’s out there, et cetera.

(01:06:12) So OpenAI has announced o3. I mean, we can check if o3-mini is out as of recording potentially, but that doesn't really change the point, which is that the breakthrough result was something called the ARC-AGI task, which is the Abstraction and Reasoning Corpus, a task for artificial general intelligence. François Chollet is the guy who's been … It's a multi-year-old paper. It's a brilliant benchmark. And the number for OpenAI o3 to solve this was that it used some sort of number of samples in the API. The API has thinking effort and number of samples. They used 1,000 samples to solve this task and it comes out to be $5 to $20 per question, which you're putting in effectively a math puzzle. And then it takes orders of dollars to answer one question, and this is a lot of compute.
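
A quick back-of-envelope on those figures; only the 1,000-sample count and the $5 to $20 per-question range come from the discussion above, and the question count used for the full-eval line is an assumed illustration:

```python
# Per-question and implied per-sample cost for a heavy test-time-compute eval.
samples_per_question = 1_000            # samples drawn via the API, per the discussion
cost_per_question = (5.0, 20.0)         # dollars, the range quoted above

low, high = (c / samples_per_question for c in cost_per_question)
print(f"implied cost per sample: ~${low:.3f} to ${high:.3f}")

# Scaling to a few hundred questions (an assumed eval-set size, for illustration only):
questions = 400
print(f"~{questions} questions: ~${cost_per_question[0] * questions:,.0f} "
      f"to ${cost_per_question[1] * questions:,.0f} in inference alone")
```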

(01:07:00) If those are going to take off in the U.S., OpenAI needs a ton of GPUs on inference to capture this. They have this OpenAI ChatGPT Pro subscription, which is $200 a month-

Dylan Patel (01:07:09) Which Sam said they’re losing money on.

Nathan Lambert (01:07:11) Which means that people are burning a lot of GPUs on inference. And I've signed up with it, I've played with it. I don't think I'm a power user, but I use it. And it's like, that is the thing that a Chinese company, with medium-strong export controls, there will always be loopholes, might not be able to do at all.

(01:07:27) And if the main result for o3 is also a spectacular coding performance, and if that feeds back into AI companies being able to experiment better.

Lex Fridman (01:07:37) So presumably, the idea is for an AGI, a much larger fraction of the compute would be used for this test-time compute, for the reasoning, for the AGI goes into a room and thinks about how to take over the world and come back in 2.7 hours-

Lex Fridman (01:07:55) … and that it’s going to take a lot of compute.

Nathan Lambert (01:07:56) This is what people, CEO or leaders of OpenAI and Anthropic talk about, is autonomous AI models, which is you give them a task and they work on it in the background.

(01:08:05) I think my personal definition of AGI is much simpler. I think language models are a form of AGI and all of this super powerful stuff is a next step that’s great if we get these tools. But a language model has so much value in so many domains that it’s a general intelligence to me.

(01:08:21) But this next step of agentic things where they’re independent and they can do tasks that aren’t in the training data is what the few-year outlook that these AI companies are driving for.

Lex Fridman (01:08:32) I think the terminology here that Dario uses is super powerful AI. So I agree with you on the AGI. I think we already have something that's exceptionally impressive, that Alan Turing would for sure say is AGI, but he's referring more to something that, once you're in possession of it, gives you a significant military and geopolitical advantage over other nations. So it's not just like you can ask it how to cook an omelet.

Nathan Lambert (01:08:58) And he has a much more positive view, as he says in Machines of Loving Grace. I read into this, and I don't have enough background in physical sciences to gauge exactly how competent I am at judging whether AI can revolutionize biology. I am safe saying that AI is going to accelerate the progress of any computational science.

AGI timeline

Lex Fridman (01:09:16) So we’re doing a depth-first search here on topics, taking tangent of a tangent, so let’s continue on that depth-first search. You said that you’re both feeling the AGI. What’s your timeline? Dario is 2026 for the super powerful AI that’s basically agentic to a degree where it’s a real security threat, that level of AGI. What’s your timeline?

Nathan Lambert (01:09:44) I don't like to attribute specific abilities because predicting specific abilities and when is very hard. I think mostly if you're going to say that I'm feeling the AGI, it's that I expect continued, rapid, surprising progress over the next few years. So, something like R1 is less surprising to me from DeepSeek because I expect there to be new paradigms where substantial progress can be made.

(01:10:00) I think DeepSeek-R1 is so unsettling because we're kind of on this path with ChatGPT. It's like it's getting better, it's getting better, it's getting better, and then we have a new direction for changing the models, and we took one step like this and we took a step up. So it looks like a really fast slope, and then we're going to just take more steps. So it's just really unsettling when you have these big steps, and I expect that to keep happening. I've tried OpenAI's Operator, I've tried Claude computer use; they're not there yet. I understand the idea, but it's just so hard to predict what is the breakthrough that'll make something like that work. And I think it's more likely that we have breakthroughs that work in things that we don't know what they're going to do. So everyone wants agents. Dario has a very eloquent way of describing this, and I just think that there's going to be more than that, so just expect these things to come.

Lex Fridman (01:10:53) I'm going to have to try to pin you down to a date on the AGI timeline. Like the nuclear weapon moment, so a moment where, on the geopolitical stage, there's a real … Because we're talking about export controls, when do you think, just even to throw out a date, when do you think that would be? For me, it's probably after 2030, so I'm not as-

Nathan Lambert (01:11:19) That’s what I would say.

Dylan Patel (01:11:21) So define that. Because to me, it kind of almost has already happened. You look at elections in India and Pakistan, people get AI voice calls and think they're talking to the politician. The AI diffusion rules, which were enacted in the last couple of weeks of the Biden admin, and which it looks like the Trump admin will keep and potentially even strengthen, limit cloud computing and GPU sales to countries that are not even related to China. It's like this is-

Nathan Lambert (01:11:44) Portugal and all these normal countries are on the "you need approval from the US" list.

Dylan Patel (01:11:49) Yeah, Portugal and all these countries that are allies. Singapore. They freaking have F-35s and we don’t let them buy GPUs. This to me is already to the scale of…

Lex Fridman (01:12:02) Well, that just means that the US military is really nervous about this new technology. That doesn’t mean that technology is already there. So they might be just very cautious about this thing that they don’t quite understand. But that’s a really good point. The robocalls, swarms of semi-intelligent bots could be a weapon, could be doing a lot of social engineering.

Dylan Patel (01:12:25) I mean, there's tons of talk from the 2016 elections about Cambridge Analytica and all this stuff, Russian influence. I mean, every country in the world is pushing stuff onto the internet and has narratives they want. Every technically competent country, whether it's Russia, China, the US, Israel, et cetera. People are pushing viewpoints onto the internet en masse. And language models crash the cost of very intelligent sounding language.

Nathan Lambert (01:12:49) There's some research that shows that the distribution is actually the limiting factor. So language models haven't yet made misinformation particularly change the equation there. The internet is still ongoing. I think there's a blog, AI Snake Oil, from some of my friends at Princeton who write on this stuff. So there is research on this. It's a default that everyone assumes, and I would've thought the same thing, but it seems misinformation doesn't get far worse with language models. I think in terms of internet posts and things that people have been measuring, it hasn't been an exponential increase or something extremely measurable, and the things you're talking about with voice calls and stuff like that could be in modalities that are harder to measure.

(01:13:27) So it's something that's too soon to tell. I think political instability via the web is very … It's monitored by a lot of researchers to see what's happening. I think … You're asking about the AGI thing. If you're making me give a year, I'm going to be like, "Okay, I have AI CEOs saying this. They've been saying two years for a while. I think that there are people like Dario at Anthropic, the CEO, who has thought about this so deeply. I need to take their word seriously, but also understand that they have different incentives." So I would be like, "Add a few years to that." Which is how you get something similar to 2030 or a little after 2030.

Dylan Patel (01:14:08) I think to some extent, we have capabilities that hit a certain point where any one person could say, “Oh, okay, if I can leverage those capabilities for X amount of time, this is AGI, call it ’27, ’28.” But then the cost of actually operating that capability-

Nathan Lambert (01:14:23) Yeah, this was going to be my point.

Dylan Patel (01:14:24) … is so, so extreme that no one can actually deploy it at scale en masse to actually completely revolutionize the economy on a snap of a finger. So I don’t think it will be a snap of the finger moment.

Nathan Lambert (01:14:35) It’s a physical constraint [inaudible 01:14:37].

Dylan Patel (01:14:36) Rather, it'll be a, "Oh, the capabilities are here, but I can't deploy it everywhere." And so one simple example, going back sort of to 2023, was when Bing with GPT-4 came out, everyone was freaking out about search. Perplexity came out. If you did the cost on, hey, implementing GPT-3 into every Google search, it was like, oh, okay, this is just physically impossible to implement. And as we step forward to, going back to the test-time compute thing, a query for … You ask ChatGPT a question, it costs cents for their most capable model of Chat to get a query back. To solve an AGI problem, though, costs 5 to 20 bucks, and this is in-

Nathan Lambert (01:15:17) It’s only going up from there.

Dylan Patel (01:15:19) This is a 1,000 to 10,000x factor difference in cost to respond to a query versus do a task. And the task of AGI is not like … It's simple, to some extent, but it's also like, what are the tasks that we want … Okay, the "AGI" we have today can do it. Three years from now, it can do much more complicated problems, but the cost is going to be measured in thousands and thousands and hundreds of thousands of dollars of GPU time, and there just won't be enough power, GPUs, infrastructure to operate this and therefore shift everything in the world at the snap of a finger.
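
That factor falls straight out of the numbers already mentioned; in this minimal sketch, the per-query figure is an assumed range standing in for "it costs cents," and the daily task volume is an arbitrary illustration:

```python
# Chat query vs. agentic-task cost, using the figures from the conversation.
chat_query_cost = (0.002, 0.01)     # dollars per query; assumed stand-in for "costs cents"
agentic_task_cost = (5.0, 20.0)     # dollars per task, as quoted for the o3 ARC-AGI result

ratio_low = agentic_task_cost[0] / chat_query_cost[1]    # cheapest task vs. priciest query
ratio_high = agentic_task_cost[1] / chat_query_cost[0]   # priciest task vs. cheapest query
print(f"one task costs roughly {ratio_low:,.0f}x to {ratio_high:,.0f}x a chat query")

# Why this can't flip the economy at the snap of a finger: even a million such tasks a day
# (an arbitrary illustrative volume) is millions of dollars of GPU time per day.
daily_tasks = 1_000_000
print(f"{daily_tasks:,} tasks/day -> ${agentic_task_cost[0] * daily_tasks / 1e6:.0f}M "
      f"to ${agentic_task_cost[1] * daily_tasks / 1e6:.0f}M per day")
```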

(01:15:52) But at that moment, who gets to control and point the AGI at a task? And so this was in Dario’s post that he’s like, “Hey, China can effectively and more quickly than us, point their AGI at military tasks.” And they have been, in many ways, faster at adopting certain new technologies into their military, especially with regards to drones. The US maybe has a long-standing large air sort of fighter jet type of thing, bombers. But when it comes to asymmetric arms such as drones, they’ve completely leapfrogged the US and the West.

(01:16:28) And the fear that Dario is sort of pointing out there, I think, is that, yeah, great, we'll have AGI in the commercial sector. The US military won't be able to implement it superfast. The Chinese military could, and they could direct all their resources to implementing it in the military, and therefore solving military logistics, or solving some other aspect, like disinformation targeted at a certain set of people so they can flip a country's politics or something like that that is actually catastrophic. Versus the US just wants to … Because it'll be more capitalistically allocated, just towards whatever is the highest return on investment, which might be building factories better or whatever.

Lex Fridman (01:17:04) So everything I've seen, people's intuition seems to fail on robotics. So you have this kind of general optimism. I've seen this on self-driving cars. People think it's a much easier problem than it is. Similar with drones. Here, I understand it a little bit less, but I've just seen the reality of the war in Ukraine and the usage of drones on both sides. And it seems that humans still far outperform any fully autonomous systems. AI is an assistant, but humans drive. FPV drones, where the human's controlling most of it, just far, far, far outperform AI systems. So I think it's not obvious to me that we're going to have swarms of autonomous robots anytime soon in the military context. Maybe the fastest I can imagine is 2030, which is why I said 2030 for the super powerful AI. Whenever you have large-scale swarms of robots doing military actions, that's when the world just starts to look different to me.

(01:18:07) So that's the thing I'm really worried about. But there could be cyber war type of technologies, from social engineering to actually just swarms of robots that find attack vectors in our code bases and shut down power grids, that kind of stuff. And it could be one of those things like, on any given weekend or something, power goes out, nobody knows why, and the world changes forever. Just power going out for two days in all of the United States, that will lead to murder, to chaos. But going back to export controls, do you see that as a useful way to control the balance of power geopolitically in the context of AI?

China’s manufacturing capacity

Dylan Patel (01:18:56) And I think going back to my viewpoint is, if you believe we're in this sort of stage of economic growth and change that we've been in for the last 20 years, the export controls are absolutely guaranteeing that China will win long-term, if you do not believe AI is going to make significant changes to society in the next 10 years or 5 years. Five-year timelines are sort of what the executives and such of AI companies and even big tech companies believe. But even 10-year timelines, it's reasonable. But once you get to, hey, these timelines are below that time period, then the only way to create a sizable advantage or disadvantage for America versus China is if you constrain compute, because talent is not really something that's constraining. China arguably has more talent, more STEM graduates, more programmers. The US can draw upon the world's people, which it does. There's tons of foreigners in the AI industry.

Nathan Lambert (01:19:57) So many of these AI teams are all people without a US passport.

Dylan Patel (01:20:02) Yeah. I mean, many of them are Chinese people who are moving to America, and that's great. That's exactly what we want. But talent is one aspect, and I don't think that's one that is a measurable advantage for the US or not. It truly is just whether or not you have the compute. Now, even on the compute side, when we look at chips versus data centers, China has the unprecedented ability to build ridiculous sums of power. Clockwork. They're always building more and more power. They've got steel mills that individually are the size of the entire US industry. And they've got aluminum mills that consume gigawatts and gigawatts of power. And when we talk about what's the biggest data center, OpenAI made this huge thing about Stargate, their announcement there; once it's fully built out in a few years, it'll be two gigawatts of power. And this is still smaller than the largest industrial facilities in China. China, if they wanted to build the largest data center in the world, if they had access to the chips, could. So it's just a question of when, not if.
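
To put the gigawatt framing in rough accelerator terms, a minimal sketch; the all-in watts-per-accelerator figure is an assumption for illustration, not a quoted spec:

```python
# Rough conversion from data-center power to accelerator count.
datacenter_power_watts = 2e9      # "two gigawatts of power" once Stargate is fully built out
watts_per_gpu_all_in = 1_500      # assumed wall power per accelerator incl. cooling/networking

approx_gpus = datacenter_power_watts / watts_per_gpu_all_in
print(f"~{approx_gpus:,.0f} accelerators supportable at 2 GW under these assumptions")
```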

Lex Fridman (01:21:07) So their industrial capacity far exceeds the United States’?

Lex Fridman (01:21:11) They manufacture stuff. So long-term, they’re going to be manufacturing chips there?

Dylan Patel (01:21:18) Chips are a little bit more specialized. I’m specifically referring to the data centers. Fabs take huge amounts of power, don’t get me wrong. That’s not necessarily the gating factor there. The gating factor on how fast people can build the largest clusters today in the US is power. Now, it could be power generation, power transmission, substations, and all these sorts of transformers and all these things building the data center. These are all constraints on the US industry’s ability to build larger and larger training systems, as well as deploying more and more inference compute.

Nathan Lambert (01:21:52) I think we need to make a point clear on why the time is now, for people that don't think about this, because essentially, with export controls, you're making it so China cannot make or get cutting-edge chips. And the idea is that if you time this wrong, China is pouring a ton of money into their chip production, and if you time it wrong, they are going to have more capacity for production, more capacity for energy, and figure out how to make the chips and have more capacity than the rest of the world to make the chips. Because everybody can buy … They're going to sell their Chinese chips to everybody, they might subsidize them. And therefore, if AI takes a long time to become differentiated, we've kneecapped the financial performance of American companies. NVIDIA can sell less, TSMC cannot sell to China. So therefore, we have less demand to keep driving the production cycle. So that's the assumption behind the timing being [inaudible 01:22:43].

Dylan Patel (01:22:43) Less than 10 years or 5 years. If it's above that, China will win because of these restrictions long-term, unless AI does something in the short-term, and I believe AI will make massive changes to society in the medium to short term. And so that's the big unlocker there. And even today, if Xi Jinping decided to get "scale-pilled", i.e., decide that scaling laws are what matter, just like the US executives like Satya Nadella and Mark Zuckerberg and Sundar and all these US executives of the biggest, most powerful tech companies have decided they're scale-pilled, and they're building multi-gigawatt data centers, whether it's in Texas or Louisiana or Wisconsin, wherever it is. They're building these massive things that cost as much as their entire budget for spending on data centers globally in one spot. This is what they've committed to for next year, year after, et cetera. And so they're so convinced that this is the way, that this is what they're doing.

(01:23:43) But if China decided to, they could do it faster than us. But this is where the restrictions come in. It is not clear that China as a whole has decided, from the highest levels, that this is a priority. The US sort of has. You see Trump talking about DeepSeek and Stargate within the same week. And the Biden admin as well had a lot of discussions about AI and such. It's clear that they think about it. Only just last week did DeepSeek meet the second in command of China. They have not even met the top; they haven't met Xi. Xi hasn't sat down, and they only just released a subsidy of a trillion RMB, roughly $160 billion, which is closer to the spending of Microsoft and Meta and Google combined for this year. So they're realizing it just now. But that's where these export restrictions come in and say, "Hey, you can't ship the most powerful US chips to China. You can ship a cut-down version. You can't ship the most powerful chips to all these countries who we know are just going to rent it to China. You have to limit the numbers."

Dylan Patel (01:24:50) And same with manufacturing [inaudible 01:24:52] tools, all these different aspects, but it all stems from AI and then what downstream can slow them down in AI. And so the entire semiconductor restrictions, you read them, they’re very clear, it’s about AI and military civil fusion of technology. It’s very clear. And then from there it goes, oh, well, we’re banning them from buying lithography tools and etch tools and deposition tools. And oh, this random subsystem from a random company that’s tiny. Why are we banning this? Because all of it, the US government has decided is critical to AI systems.

Nathan Lambert (01:25:23) I think the fulcrum point is the transition from seven nanometer to five nanometer chips where I think it was Huawei that had the seven nanometer chip a few years ago, which caused another political brouhaha, almost like this moment. And then it’s the ASML deep UV. What is that… Extreme ultraviolet lithography.

Dylan Patel (01:25:43) Just to set context on the chips, what Nathan's referring to is in 2020, Huawei released their Ascend 910 chip, which was an AI chip, the first one on seven nanometer, before Google did, before NVIDIA did. And they submitted it to the MLPerf benchmark, which is sort of an industry standard for machine learning performance benchmarks, and it did quite well, and it was the best chip at the submission. This was a huge deal. The Trump admin, of course, banned, this was 2019, banned Huawei from getting seven nanometer chips from TSMC. And so then they had to switch to using internal, domestically produced chips, which was a multi-year setback.

Nathan Lambert (01:26:20) Many companies have done seven nanometer chips. And the question is we don’t know how much Huawei was subsidizing production of that chip. Intel has made seven nanometer chips that are not profitable and things like this. So this is how it all feeds back into the economic engine of export controls.

Cold war with China

Lex Fridman (01:26:36) Well, so you’re saying that for now, Xi Jinping has not felt the AGI, but it feels like the DeepSeek moment, there might be meetings going on now where he’s going to start wearing the same t-shirt and things are going to escalate.

Dylan Patel (01:26:52) I mean, he may have woken up last week. Liang Wenfeng met the second-in-command guy, and they had a meeting, and then the next day, they announced the AI subsidies, which are a trillion RMB.

Lex Fridman (01:27:05) So it’s possible that this DeepSeek moment is truly the beginning of a cold war.

Nathan Lambert (01:27:10) That’s what a lot of people are worried about. People in AI have been worried that this is going towards a cold war or already is.

Lex Fridman (01:27:16) But it’s not DeepSeek’s fault, but there’s something, a bunch of factors came together where-

Nathan Lambert (01:27:16) It’s how history works.

Lex Fridman (01:27:21) … it’s like this explosion. I mean, it all has to do with NVIDIA’s not going down properly, but it’s just some [inaudible 01:27:28] mass hysteria that happened that eventually led to Xi Jinping having meetings and waking up to this idea.

Dylan Patel (01:27:34) And the US government realized in October 7th, 2022, before ChatGPT released, that restriction on October 7th, which dropped and shocked everyone, and it was very clearly aimed at AI. Everyone was like, “What the heck are you doing?”

Nathan Lambert (01:27:48) Stable Diffusion was out then, but not ChatGPT.

Dylan Patel (01:27:48) Yeah, but not ChatGPT.

Nathan Lambert (01:27:51) So it was starting to be rumblings-

Dylan Patel (01:27:53) Of what GenAI can do to society, but it was very clear, I think, to at least National Security Council and those sort of folks, that this was where the world is headed, this cold war that’s happening.

Lex Fridman (01:28:04) So is there any concerns that the export controls push China to take military action on Taiwan?

Dylan Patel (01:28:15) This is the big risk. The further you push China away from having access to cutting-edge American and global technologies, the more likely they are to say, "Well, because I can't access it, I might as well … No one should access it." And there's a few interesting aspects of that. China has an urban-rural divide like no other. They have a male-female birth ratio like no other, to the point where if you look in most of China, the ratio is not that bad. But when you look at single dudes in rural China, it's like a 30:1 ratio. And those are disenfranchised dudes. "The US has an incel problem." China does too, it's just they're placated in some way or crushed down. What do you do with these people? And at the same time, you're not allowed to access the most important technology, at least the US thinks so. China's maybe starting to think this is the most important technology by starting to dump subsidies in it.

(01:29:07) They thought EVs and renewables were the most important technology. They dominate that now. Now, they started thinking about semiconductors in the late 2010s and early 2020s, and now they've been dumping money in and they're catching up rapidly, and they're going to do the same with AI because they're very talented. So the question is, when does this hit a breaking point? If China sees it as, "Hey, we can't have access anyway," and starting a true hot war, taking over Taiwan or trying to subvert its democracy in some way or blockading it, hurts the rest of the world far more than it hurts them, this is something they could potentially do. And so is this pushing them towards that? Potentially. I'm not quite a geopolitical person, but it's obvious that the world regime of peace and trade is super awesome for economics, but at some point, it could break.

Nathan Lambert (01:30:07) I think we should comment on why the Chinese economy would be hurt by that: they're export heavy, I think. The United States buys so much. If that goes away, that's how their economy [inaudible 01:30:17].

Dylan Patel (01:30:16) Well, also, they just would not be able to import raw materials from all over the world. The US would just shut down the Strait of Malacca. And at the same time, the US entire… You could argue almost all the GDP growth in America since the ’70s has been either population growth or tech, because your life today is not that much better than someone from the ’80s outside of tech. Cars, they all have semiconductors in them everywhere. Fridges, semiconductors everywhere. There’s these funny stories about how Russians were taking apart laundry machines because they had certain Texas Instrument chips that they could then repurpose and put into their anti-missile missile things, like their S-400 or whatever. You would know more about this, but there’s all sorts of… Everything about semiconductors is so integral to every part of our lives.

TSMC and Taiwan

Lex Fridman (01:31:06) So can you explain the role of TSMC in the story of semiconductors and maybe also how the United States can break the reliance on TSMC?

Dylan Patel (01:31:17) I don't think it's necessarily breaking the reliance. I think it's getting TSMC to build in the US. So taking a step back, TSMC produces most of the world's chips, especially on the foundry side. There's a lot of companies that build their own chips. Samsung, Intel, STMicro, Texas Instruments, Analog Devices, NXP, all these kinds of companies build their own chips, but more and more of these companies are outsourcing to TSMC and have been for multiple decades.

Lex Fridman (01:31:49) Can you explain the supply chain there and where most of TSMC is in terms of manufacturing?

Dylan Patel (01:31:55) Sure. So historically, the supply chain was, companies would build their own chips. A company would get started, they'd design the chip, build the chip, and sell it. Over time, this became really difficult because the cost of building a fab continues to compound every single generation. Of course, figuring out the technology for it is incredibly difficult regardless, but just the dollars and cents that are required, ignoring, saying, "Hey, yes, I have all the technical capability." Which is really hard to get, by the way. Intel's failing, Samsung's failing, et cetera. But if you look at just the dollars to spend to build that next-generation fab, it keeps growing. Moore's law is sort of halving the cost of chips every two years. There's a separate law that's sort of doubling the cost of fabs every handful of years.

(01:32:38) And so you look at a leading-edge fab that is going to be profitable today, that's building three nanometer chips or two nanometer chips in the future, that's going to cost north of $30 to $40 billion. And that's just for a token amount. That's like the base building block. You probably need to build multiple. And so when you look at the industry over the last, if I go back 20, 30 years ago, there were 20, 30 companies that could build the most advanced chips, and then they would design them themselves and sell them. So companies like AMD would build their own chips. Intel, of course, still builds their own chips; they're very famous for it. IBM would build their own chips. And you could just keep going down the list. All these companies built their own chips.
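
A minimal sketch of the compounding being described; the doubling period is an assumption ("every handful of years"), anchored only to the "north of $30 to $40 billion" figure above:

```python
# Illustrative compounding of leading-edge fab cost. Numbers are assumptions, not data.
cost_today_billion = 35        # midpoint of the "north of $30 to $40 billion" figure
doubling_period_years = 5      # "every handful of years" -- assumed

for generations_back in range(4):
    years_ago = generations_back * doubling_period_years
    cost = cost_today_billion / (2 ** generations_back)
    print(f"{years_ago:>2} years ago: ~${cost:,.1f}B per leading-edge fab")
```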

(01:33:14) Slowly, they kept falling like flies, and that's because of what TSMC did. They created the foundry business model, which is: I'm not going to design any chips, I'm just going to contract manufacture chips for other people. And one of their early customers is NVIDIA. NVIDIA is the only semiconductor company doing more than $1 billion of revenue that was started in the era of foundry. Every other company started before then, and at some point had fabs, which is actually incredible. Like AMD and Intel and Broadcom-

Lex Fridman (01:33:48) [inaudible 01:33:48].

Dylan Patel (01:33:48) Everyone had fabs at some point. Or some companies like Broadcom, it was like a merger amalgamation of various companies that rolled up. But even today, Broadcom has fabs. They build iPhone RF radio chips in Colorado for Apple. All these companies had fabs, and for most of the fabs, they threw them away or sold them off, or they got rolled into something else. And now, everyone relies on TSMC. Including Intel: their latest PC chip uses TSMC chips. It also uses some Intel chips, but it uses TSMC process.

Lex Fridman (01:34:19) Can you explain why the foundry model is so successful for these companies? Why are they going with-

Nathan Lambert (01:34:24) Economies of scale.

Dylan Patel (01:34:26) Yeah. So I mean, like I mentioned, the cost of building a fab is so high, the R&D is so difficult. And when you look at these companies that had their own vertical stack, there was an antiquated process of like, okay, I'm so hyper-customized to each specific chip. But as we've gone through the history of the last 50 years of electronics and semiconductors, A, you need more and more specialization, because Moore's law has died, Dennard scaling has died, i.e., chips are not getting better just for free from manufacturing. You have to make real architectural innovations.

(01:34:59) Google is not just running on Intel CPUs for web serving. They have a YouTube chip, they have TPUs, they have Pixel chips, they have a wide diversity of chips that generate all the economic value of Google. It's running all the services and stuff. And this is just Google. And you could go across any company in the industry, and it's like this. Cars contain 5,000 chips, 200 different varieties of them. All these random things. A Tesla door handle has two chips. It's ridiculous. And it's a cool door handle. You don't think about it, but it has two really cheap, penny chips in there. Anyways, so as you have more diversity of chips, as you have more specialization required, and the cost of fabs continues to grow, you need someone who is laser-focused on building the best process technology and making it as flexible as possible.

Nathan Lambert (01:35:45) I think you could say it simply, which is the cost per fab goes up, and if you are a small player that makes a few types of chips, you're not going to have the demand to pay back the cost of the fab. Whereas TSMC can have many different customers and aggregate all this demand into one place, and then they're the only ones that make enough money building chips to build the next fab. So this is kind of why the companies slowly get killed, because they have, 10 years ago, a chip that is profitable and is good enough, but the cost to build the next one goes up. They may try to do this, fail because they don't have the money to make it work, and then they don't have any chips. Or they build it and it's too expensive and they just sort of have unprofitable chips.

Dylan Patel (01:36:27) There's more failure points. You could have one little process related to some sort of chemical etch or some sort of plasma etch, or some little process that screws up, you didn't engineer it right, and now the whole company falls apart; you can't make chips. And so super, super powerful companies like Intel, they could weather the storm: they still exist today, even though they really screwed up their manufacturing six, seven years ago. But in the case of AMD, they almost went bankrupt. They had to sell their fabs to Mubadala, UAE, and that became a separate company called GlobalFoundries, which is a foundry firm. And then AMD was able to focus on the climb back up, like, "Hey, let's focus on making chiplets and a bunch of different chips for different markets and focusing on specific workloads rather than all of these different things."

(01:37:14) And so you get more diversity of chips, you have more companies than ever designing chips, but you have fewer companies than ever manufacturing them. And this is where TSMC comes in, is they’ve just been the best. They are so good at it. They’re customer focused, they make it easy for you to fabricate your chips. They take all of that complexity and kind of try and abstract a lot of it away from you. They make good money. They don’t make insane money, but they make good money and they’re able to aggregate all this demand and continue to build the next fab, the next fab, the next fab.

Lex Fridman (01:37:44) So why is Taiwan so special for TSMC? Why is it happening there? Can it be replicated inside the United States?

Dylan Patel (01:37:51) Yeah, so there's aspects of it that I would say yes, and aspects that I'd say no. TSMC is way ahead because Morris Chang, a former executive of Texas Instruments, wasn't promoted to CEO. And he was like, "Screw this. I'm going to go make my own chip company." And he went to Taiwan and made TSMC. And there's a whole lot more story there. Texas Instruments could have been TSMC, but Texas Semiconductor Manufacturing instead of Texas Instruments. So there is that whole story there. But the-

Nathan Lambert (01:38:22) Sitting here in Texas.

Lex Fridman (01:38:23) And that sounds like a human story. He didn’t get promoted.

Dylan Patel (01:38:26) Just the brilliance of Morris Chang, which I wouldn't underplay, but there's also a different level of how this works. So in Taiwan, the top percent of graduates, of students that go to the best school, which is NTU, the top percent of those all go work for TSMC. And guess what their pay is? Their starting pay is like $80,000, $70,000, which is like the starting pay for a good graduate in the US, not the top. The top graduates are making hundreds of thousands of dollars at the Googles and the Amazons, and now I guess the OpenAIs of the world. So there is a large dichotomy of what is the top 1% of the society doing and where are they headed because of economic reasons? Intel never paid that crazy well. And it didn't make sense to them. That's one aspect. Where's the best going?

(01:39:16) Second is the work ethic. We like to work. You work a lot, we work a lot, but at the end of the day, what is the time and amount of work that you're doing, and what does a fab require? Fabs are not work-from-home jobs. They are: you go into the fab, and it's grueling work. If there is any amount of vibration, an earthquake happens and vibrates the machines, they're either broken or you've scrapped some of your production, and in many cases, they're not calibrated properly. So when there's an earthquake, and recently there's been an earthquake, TSMC doesn't call their employees. They just go to the fab and they just show up. The parking lot gets slammed, and people just go into the fab and fix it. It's like ants. A hive of ants doesn't get told by the queen what to do. The ants just know.

Nathan Lambert (01:40:08) It's like one person just specializes on this one task, and it's like you're going to take this one tool and you're the best person in the world at it, and this is what you're going to do for your whole life: this one task in the fab.

Dylan Patel (01:40:17) Which is some special chemistry plus nanomanufacturing on one line of tools that continues to get iterated and yeah, it’s like a specific plasma etch for removing silicon dioxide. That’s all you focus on your whole career, and it’s such a specialized thing. And so it’s not like the tasks are transferable. AI today is awesome because people can pick it up like that. Semiconductor manufacturing is very antiquated and difficult. None of the materials are online for people to read easily and learn. The papers are very dense, and it takes a lot of experience to learn. And so it makes the barrier to entry much higher too. So when you talk about, hey, you have all these people that are super specialized, they will work 80 hours a week in a factory, in a fab, and if anything goes wrong, they’ll go show up in the middle of the night because some earthquake, their wife’s like, “There was an earthquake.” He’s like, “Great, I’m going to go to the fab.”

Dylan Patel (01:41:09) Would you, as an American, do that? It's like, these sorts of things are, I guess, what exemplifies why TSMC is so amazing. Now, can you replicate it in the US? Let's not ignore that Intel was the leader in manufacturing for over 20 years. They brought every technology to market first, besides EUV: strained silicon, high-K metal gates, FinFET. The list goes on and on and on of technologies that Intel brought to market first, made the most money from, and manufactured at scale first, best, highest profit margins. We shouldn't ignore that. It's not that Intel can't do this; it's that the culture has broken.

(01:41:48) You’ve invested in the wrong things. They said no to the iPhone. They had all these different things regarding mismanagement of the fabs and mismanagement of designs, this lockup. And at the same time, all these brilliant people, these 50,000 PhDs or masters that have been working on specific chemical or physical processes or nanomanufacturing processes for decades, in Oregon, they’re still there, they’re still producing amazing work. It’s just getting it to the last mile of production at high yield where you can manufacture dozens and hundreds of different kinds of chips, and good customer experience has broken.

(01:42:24) It’s that customer experience. Part of it is people will say, Intel was too pompous in the 2000s, 2010s. They just thought they were better than everyone. The tool guys were like, “Oh, I don’t think that this is mature enough.” And they’re like, “Ah, you just don’t know. We know.” This sort of stuff would happen. And so can the US bring leading-edge semiconductor manufacturing to the US? [inaudible 01:42:44] yes. And we are. It’s happening.

Nathan Lambert (01:42:47) Arizona is getting better and better as time goes on.

Dylan Patel (01:42:50) TSMC has built roughly 20% of their capacity for five nanometer in the US. Now, this is nowhere near enough. 20% of capacity in the US is like nothing. And furthermore, this is still dependent on Taiwan existing. There's a sort of important way to separate it out: there's R&D and there's high-volume manufacturing. Effectively, there are three places in the world that are doing leading-edge R&D. There's Hsinchu, Taiwan; there's Hillsboro, Oregon; and there is Pyeongtaek, South Korea.

(01:43:24) These three places are doing the leading-edge R&D for the rest of the world's leading-edge semiconductors. Now, manufacturing can be distributed more globally. And this is where the dichotomy exists: who's actually modifying the process, who's actually developing the next generation, who's improving it? That's Hsinchu, that's Hillsboro, that's Pyeongtaek. It is not the rest of these fabs like Arizona. Arizona is a paperweight. If Hsinchu disappeared off the face of the planet, within a year, a couple of years, Arizona would stop producing too. It's actually pretty critical. One of the things I like to say is if I had a few missiles, I know exactly where I could cause the most economic damage. It's not targeting the White House.

Lex Fridman (01:44:09) It’s the R&D centers.

Dylan Patel (01:44:10) It’s the R&D centers for TSMC, Intel, Samsung. And then some of the memory guys, Micron and Hynix.

Lex Fridman (01:44:15) Because they define the future evolution of these semiconductors, and everything’s moving so rapidly that it really is fundamentally about R&D. And it is all about TSMC. Huh.

Dylan Patel (01:44:27) And so TSMC, you cannot purchase a vehicle without TSMC chips. You cannot purchase a fridge without TSMC chips. I think one of the few things you can purchase ironically, is a Texas Instruments graphing calculator because they actually manufacture in Texas. But outside of that, a laptop, a phone.

Dylan Patel (01:44:48) Servers, GPUs, none of this stuff can exist. And this is without TSMC. And in many cases, it’s not even the leading-edge sexy five nanometer chip, three nanometer chip, two nanometer chip. Oftentimes, it’s just some stupid power IC that’s converting from some voltage to another, and it’s…

Dylan Patel (01:45:00) … I see that’s converting from some voltage to another, and it’s made at TSMC. It’s like-

Nathan Lambert (01:45:05) This is what China is investing in as well. It’s like, they can build out this long-tail fab where the techniques are much more known, you don’t have to figure out these problems with EUV. They’re investing in this and then they have large supply for things like the car door handles and the random stuff. And that trickles down into this whole economic discussion as well, which is they have far more than we do. And having supply for things like this is crucial to normal life.

Lex Fridman (01:45:29) So they’re starting to invest in high-volume manufacturer, but they’re not doing R&D as much?

Dylan Patel (01:45:36) They do R&D on their own, they're just way behind. I would say, in 2015 China had a five-year plan where they defined certain goals for 2020 and 2025, including 80% domestic production of semiconductors. They're not going to hit that, to be clear. But they are in certain areas really, really close. BYD is probably going to be the first company in the world to not have to use TSMC for making chips, because they have their own fabs.

(01:46:04) Now they still have to buy some chips from foreign suppliers, for example around self-driving ADAS capabilities, because those are really high-end, but at least … An internal combustion engine vehicle has 40 chips just for controlling flow rates and all these things, and EVs are even more complicated. So all these different power ICs and battery management controllers and all these things, they're insourcing.

(01:46:26) And this is something that China has been doing since 2015. Now, as far as the trailing edge, they're getting so much capacity there. As far as the leading edge, i.e. five nanometer and so on and so forth, where GPUs are, they are still behind. The US restrictions are trying to stop them in the latter, but all that's happened is, yes, they've slowed down their five nanometer, three nanometer, et cetera, but they've accelerated their, hey, 45 nanometer, 90 nanometer power IC or analog IC or random-chip-in-my-keyboard kind of stuff.

(01:46:59) So there is an angle where the US's actions, the export controls, have been so inflammatory in slowing down China's progress on the leading edge that China has turned around and accelerated its progress elsewhere, because they know this is so important. If the US is going to lock them out here, "what if they lock us out in the trailing edge as well?"

(01:47:20) And so going back, can the US build it here? Yes, but it’s going to take a ton of money. I truly think to revolutionize and completely in-source semiconductors would take a decade and a trillion dollars.

Lex Fridman (01:47:33) Is some of it also culture, like you said, extreme competence, extreme work ethic in Taiwan?

Nathan Lambert (01:47:39) I think if you have the demand and the money is on the line, the American companies figure it out. It’s going to take handholding with the government, but I think that the culture helps TSMC break through and it’s easier for them. You [inaudible 01:47:50].

Dylan Patel (01:47:50) TSMC has some 90,000 employees. It's not actually that insane an amount. The Arizona fab has 3,000 from Taiwan. And these people, their wives were like, "Yeah, we're not going to have kids unless you sign up for the Arizona fab, we go to Arizona, and we have our kids there." There's also a Japan fab where the same thing happened. And so these wives drove these dudes to go to Japan or America to have the kids there.

(01:48:13) And it’s an element of culture, yeah, sure. Taiwan works that hard. But also, like the US has done it in the past, they could do it now. We can just import, I say import, the best people in the world if we want to.

Lex Fridman (01:48:25) That’s where the immigration conversation is a tricky one and there’s been a lot of debate over that. But yeah, it seems absurdly controversial to import the best people in the world. I don’t understand why it’s controversial. That’s one of the ways of winning.

Nathan Lambert (01:48:38) I’m sure we agree with you.

Dylan Patel (01:48:39) And even if you can’t import those people, I still think you could do a lot to manufacture most of it in the US, if the money’s there.

Nathan Lambert (01:48:45) It’s just way more expensive. It’s not profitable for a long time.

Dylan Patel (01:48:50) And that’s the context of the Chips Act is only $50 billion, relative to some of the renewable initiatives that were passed in the Inflation Reduction Act and the Infrastructure Act, which total in the hundreds of billions of dollars. And so the amount of money that the US is spending on the semiconductor industry is nothing, whereas all these other countries have structural advantages in terms of work ethic and amount of work and things like that, but also a number of STEM graduates, the percentile of their best going to that.

(01:49:20) But they also have differences in terms of, hey, there are just tax benefits that have been in the law for 20 years. And then some countries have massive subsidies. China has something like $200 billion of semiconductor subsidies a year. We're talking about $50 billion in the US over like six years. So the difference in the subsidy amounts is also huge.

(01:49:44) And so I think Trump has been talking about tariffing Taiwan recently. That's one of these things that's like, "Oh, okay, well, maybe he doesn't want to subsidize the US semiconductor industry." Obviously tariffing Taiwan is going to cause a lot of things to get much more expensive, but does it change the equation for TSMC building more fabs in the US? That's what he's positing.

Lex Fridman (01:50:07) So we laid out the importance … By the way, it’s incredible how much you know about so much.

Nathan Lambert (01:50:13) We told you Dylan knows all this stuff.

Lex Fridman (01:50:15) Yeah. Okay. You laid out why TSMC is really important. If we look out into the future 10, 20 years out, US-China relationship, it seems like it can go to a dark place of Cold War, escalated Cold War, even hot war, or to a good place of anything from frenemies, to cooperation, to working together.

(01:50:44) So in this game theory, complicated game, what are the different trajectories? What should US be doing? What do you see as the different possible trajectories of US-China relations as both leaders start to feel the AGI more and more and see the importance of chips and the importance of AI.

Nathan Lambert (01:51:04) I mean, ultimately the export controls are pointing towards a separate future economy. I think the US has made it clear to Chinese leaders that we intend to control this technology at whatever cost to global economic integration. And it’s hard to unwind that. The card has been played.

Dylan Patel (01:51:27) To the same extent, they've also limited US companies from entering China. So it's been a long time coming. At some point there was a convergence, but over at least the last decade it's been branching further and further apart. US companies can't enter China. Chinese companies can't enter the US. The US is saying, "Hey, China, you can't get access to our technologies in certain areas." And China's rebutting with the same thing: they've done restrictions around specific materials like gallium, things they've tried to limit the US on. There's a US drone company that's not allowed to buy batteries, and they have military customers. And this drone company just tells the military customers, "Hey, just get it from Amazon, because I can't actually physically get them."

(01:52:10) There’s all these things that are happening that point to further and further divergence. I have zero idea, and I would love if we could all hold hands and sing Kumbaya, but I have zero idea how that could possibly happen.

Lex Fridman (01:52:21) Is the divergence good or bad for avoiding war? Is it possible that the divergence in terms of manufacturer chips of training AI systems is actually good for avoiding military conflict?

Dylan Patel (01:52:35) It’s an objective fact that the world has been the most peaceful it’s ever been when there are global hegemons, or regional hegemons in historical context. The Mediterranean was the most peaceful ever when the Romans were there. China had very peaceful and warring times, and the peaceful times were when dynasties had a lock hold over, not just themselves, but all their tributaries around them. And likewise, the most peaceful time in human history has been when the US was the global hegemon, the last decades. Now we’ve seen things start to slide with Russia, Ukraine, with what’s going on in the Middle East, and Taiwan risk, all these different things are starting to bubble up. Still objectively extremely peaceful.

(01:53:14) Now what happens when it's not one global hegemon but two, obviously … And China could be competitive with, or even overtake, the US; it's possible. And so this change in global hegemony, I don't think it ever happens super peacefully. When empires fall, which is a possible trajectory for America, they don't fall gracefully. They don't just slide into irrelevance. Usually there's a lot of shaking. And so what the US is trying to do is maintain its top position, and what China is trying to do is become the top position, and obviously there's butting of heads here, in the most simple terms.

Lex Fridman (01:53:53) And that could take shape in all kinds of ways, including proxy wars. And now-

Nathan Lambert (01:53:58) Yeah, it seems like it’s already happening. As much as I want there to be centuries of prolonged peace, it looks like further instability internationally is ahead.

Dylan Patel (01:54:08) And the US’ current task is, “Hey, if we control AI, if we’re the leader in AI and AI significantly accelerates progress, then we can maintain the global hegemony position.” And therefore-

Nathan Lambert (01:54:21) I hope that works.

Dylan Patel (01:54:23) And as an American, like, okay, I guess that’s going to lead to peace for us. Now obviously other people around the world get affected negatively. Obviously the Chinese people are not going to be in as advantageous of a position if that happens, but this is the reality of what’s being done and the actions that are being carried out.

Best GPUs for AI

Lex Fridman (01:54:44) Can we go back to the specific detail of the different hardware? There’s this nice graphic in the export controls of which GPUs are allowed to be exported and which are not. Can you explain the difference? From a technical perspective, are the H20s promising?

Dylan Patel (01:55:08) Yeah. And I think we need to dive really deep into the reasoning aspect and what's going on there. The US has gone through multiple iterations of the export controls. This H800 was at one point allowed back in '23, but then it got canceled, and by then DeepSeek had already built their cluster of, they claim, 2K. I think they actually have many more, something like 10K of those. And now this H20 is the legally allowed chip. Nvidia shipped a million of these last year to China. For context, Nvidia shipped four or five million GPUs total. So the percentage of GPUs that were this China-specific H20 is quite high, roughly 20%, 25% or so.

(01:55:48) And so this H20 has been neutered in one way, but it's actually upgraded in other ways. And you could think of chips along three axes for AI, ignoring software stack and exact architecture, just raw specifications. There's floating point operations, FLOPS. There is memory, i.e. memory bandwidth and memory capacity. And then there is interconnect, chip-to-chip interconnect. All three of these are incredibly important for making AI systems, because AI systems involve a lot of compute and a lot of moving data around, whether it be to memory or to other chips.

(01:56:28) And so of these three vectors, the US initially had two controlled and one not controlled: FLOPS and interconnect bandwidth were initially controlled. And then they said, "No, no, no, no. We're going to remove the interconnect bandwidth restriction and make it very simple, only FLOPS." But now Nvidia can make a chip that has … okay, it's cut down on FLOPS, to one-third that of the H100 on spec-sheet paper performance. In the real world it's closer to half, or maybe even 60%, of it. But then on the other two vectors, it's just as good on interconnect bandwidth. And then for memory bandwidth and memory capacity, the H20 has more memory bandwidth and more memory capacity than the H100.

(01:57:10) Now recently, in our research, we cut Nvidia's H20 production estimate for this year down drastically. They were going to make another two million of those this year, but they just canceled all the orders a couple of weeks ago. In our view, that's because we think they think they're going to get restricted. Because why would they cancel all these orders for H20? They shipped a million of them last year, they had orders in for a couple million this year, and now they're just gone, right? For H20, and B20, a successor to H20, they're all gone.

(01:57:39) Now why would they do this? I think it's very clear: the H20 is actually better for certain tasks. And that certain task is reasoning. Reasoning is incredibly different when you look at the different regimes of models. Pre-training is all about FLOPS, it's all about FLOPS. There are things you do, like Mixture of Experts that we talked about, to trade off interconnect or to trade off other aspects, lowering the FLOPS and relying more on interconnect and memory.

(01:58:10) But at the end of the day, FLOPS is everything. We talk about models in terms of how many FLOPS they are. So we talk about, oh, GPT-4 is 2e25: two times 10 to the 25th, a 2 followed by 25 zeros, floating point operations for training. And we're talking about the restrictions for the 2e24, or 25, whatever. The US has an executive order that Trump recently unsigned, which was, hey, 1e26, once you hit that number of floating point operations, you must notify the government and you must share your results with us. There's a level of model where the US government must be told, and that's 1e26.

(01:58:50) And so as we move forward, this is an incredibly important … FLOPS is the vector the government has cared about historically, but the other two vectors are arguably just as important, especially when we come to this new paradigm, which the world is only just learning about over the last six months: reasoning.
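
To make these orders of magnitude concrete, here is a minimal back-of-the-envelope sketch. Only the 2e25 and 1e26 figures come from the discussion above; the per-GPU throughput, utilization, and cluster size are assumptions for illustration, not numbers stated in the conversation.

```python
# Rough scale of the FLOPS numbers discussed above.
# GPU throughput, utilization, and cluster size are assumed for illustration.
PEAK_FLOPS_PER_GPU = 1e15      # assumed ~1 PFLOP/s peak per H100-class GPU
UTILIZATION = 0.4              # assumed model FLOPs utilization
NUM_GPUS = 20_000              # assumed cluster size

train_flops = 2e25             # the "GPT-4 scale" figure mentioned above
threshold = 1e26               # the executive-order reporting threshold

effective = PEAK_FLOPS_PER_GPU * UTILIZATION * NUM_GPUS
days = train_flops / effective / 86_400
print(f"~{days:.0f} days on this assumed cluster")                    # ~29 days
print(f"reporting threshold is {threshold / train_flops:.0f}x larger")  # 5x
```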

Lex Fridman (01:59:07) And do we understand firmly which of the three dimensions is best for reasoning? So interconnect, the FLOPS don’t matter as much, is it memory?

Nathan Lambert (01:59:17) Memory. Yeah. We’re going to get into technical stuff real fast.

Dylan Patel (01:59:21) I would say there’s two articles in this one that I could show maybe graphics that might be interesting for you to pull up.

Lex Fridman (01:59:27) For the listeners, we're looking at the section on o1 inference architectures, tokenomics.

Dylan Patel (01:59:33) You want to explain KV cache before we talk about this? I think it’s better to-

Nathan Lambert (01:59:36) Okay. Yeah, we need to go through a lot of specific technical things, transformers, to make this easy for people.

Dylan Patel (01:59:42) Because it’s incredibly important because this changes how models work. But I think resetting, why is memory so important? It’s because so far we’ve talked about parameter counts and Mixture of Experts, you can change how many active parameters versus total parameters to embed more data but have less FLOPS. B. Ut more important, another aspect of what’s part of this humongous revolution in the last handful of years is the transformer and the attention mechanism. Attention mechanism is that the model understands the relationships between all the words in its context, and that is separate from the parameters themselves. And that is something that you must calculate. How each token, each word in the context length, is relatively connected to each other. And I think, Nathan, you can explain KV cache better.

Lex Fridman (02:00:31) KV cache is one of the optimization [inaudible 02:00:33]?

Nathan Lambert (02:00:33) So the attention operator has three core things: queries, keys, and values. QKV is the thing that goes into this. You'll look at the equation and see that these matrices are multiplied together. These words, query, key, and value, come from information retrieval backgrounds, where the query is the thing you're trying to get the values for, you match it against the keys, and the values get reweighted. My background's not information retrieval and things like this, it's just fun to have those backlinks.

(02:01:01) And what effectively happens is that when you’re doing these matrix multiplications, you’re having matrices that are of the size of the context length, so the number of tokens that you put into the model. And the KV cache is effectively some form of compressed representation of all the previous tokens in the model. So when you’re doing this, we talk about autoregressive models, you predict one token at a time. You start with whatever your prompt was, you ask a question, like who was the president in 1825. The model then is going to generate its first token.

(02:01:32) For each of these tokens you're doing the same attention operator, where you're multiplying these query-key-value matrices. But the math is very nice, so that when you're doing this repeatedly, this KV cache, this key-value cache, you can keep appending the new values to it. You keep track of the previous values you were inferring over in this autoregressive chain, and you keep it in memory the whole time. And this is a really crucial thing to manage when serving inference at scale. There are far bigger experts in this, and there are so many levels of detail that you can go into.

(02:02:09) Essentially one of the key, quote unquote, “drawbacks” of the attention operator and the transformer is that there is a form of quadratic memory cost in proportion to the context length. So as you put in longer questions, the memory used in order to make that computation is going up in the form of a quadratic. You’ll hear about a lot of other language model architectures that are sub quadratic or linear attention forms, which is like State Space Models. We don’t need to go down all these now. And then there’s innovations on attention to make this memory usage and the ability to attend over long contexts much more accurate and high performance.
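
As a rough illustration of the mechanism described here, the sketch below implements a toy single-head attention step with a KV cache in NumPy. The dimensions, random weights, and single-head setup are simplifications for illustration; real models use many heads, many layers, and learned weights.

```python
# Toy single-head attention with a KV cache, one token generated at a time.
import numpy as np

d = 64                                   # head dimension (toy value)
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))

def attend(q, K, V):
    # Scores between the new query and every cached key: this is the part
    # whose cost grows with the number of tokens in context.
    scores = q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

K_cache = np.zeros((0, d))               # keys for all previous tokens
V_cache = np.zeros((0, d))               # values for all previous tokens

for step in range(5):                    # autoregressive decoding
    x = np.random.randn(d)               # stand-in for the new token's hidden state
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    K_cache = np.vstack([K_cache, k])    # cache grows linearly with sequence length
    V_cache = np.vstack([V_cache, v])
    out = attend(q, K_cache, V_cache)

print(K_cache.shape)                     # (5, 64): one cached key per generated token
```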

Lex Fridman (02:02:50) And those innovations are going to help you with … I mean, you're highly memory constrained in this?

Nathan Lambert (02:02:54) They help with memory constraint and performance. Gemini is the model that has the longest context length that people are using. Gemini is known for one million and now two million context length. You put a whole book into Gemini and sometimes it’ll draw facts out of it. It’s not perfect, they’re getting better.

(02:03:12) So there’s two things. It’s, one, to be able to serve this on the memory level. Google has magic with their TPU stack where they can serve really long contexts. And then there’s also many decisions along the way to actually make long context performance work that supplies the data. There’s subtle changes to these computations in attention and it changes the architecture. But serving long context is extremely memory constrained, especially when you’re making a lot of predictions. I actually don’t know why input and output tokens are more expensive, but I think essentially output tokens, you have to do more computation because you have to sample from the model.

Dylan Patel (02:03:46) I can explain that. Today, if you use a model, like you look at an API, OpenAI charges a certain price per million tokens. And that price for input and output tokens is different. And the reason is is that when you’re inputting a query into the model, let’s say you have a book, that book, you must now calculate the entire KV cache for this, key-value cache.

(02:04:10) And so when you do that, that is a parallel operation. All of the tokens can be processed at one time and therefore you can dramatically reduce how much you’re spending. The FLOP requirements for generating a token and an input token are identical. If I input one token or if I generate one token, it’s completely identical. I have to go through the model. But the difference is that I can do that input, i.e. the prefill, i.e. the prompt, simultaneously in a batch nature and therefore it is all FLOP.

Lex Fridman (02:04:38) I think in the pricing model they mostly use, input tokens are about one fourth the price of the output tokens.

Dylan Patel (02:04:44) Correct. But then output tokens, the reason why they're so expensive is because I can't do it in parallel. It's autoregressive. Every time I generate a token, I must not only read the whole entire model into memory and activate it, calculate it to generate the next token, I also have to read the entire KV cache. And I generate a token, and then I append that one token I generated, and its KV cache, and then I do it again.

(02:05:07) And so therefore, this is a non-parallel operation. And this is one where you have to, in the case of prefill or prompt, you pull the whole model in and you calculate 20,000 tokens at once, 20,000-

Nathan Lambert (02:05:21) These are features that APIs are shipping, which is like prompt caching, prefilling, because you can drive prices down and you can make APIs much faster. If you run a business and you’re going to keep passing the same initial content to Claude’s API, you can load that in to the Anthropic API and always keep it there.

(02:05:38) But it’s very different than we’re leading to these reasoning models, which we showed this example earlier and read some of this mumbling stuff. And what happens is that the output context length is so much higher. And I mean, I learned a lot about this from Dylan’s work, which is essentially as the output work length gets higher, you’re writing this quadratic in terms of memory used. And then the GPUs that we have, effectively you’re going to run out of memory and they’re all trying to serve multiple requests at once. So they’re doing this batch processing where not all of the prompts are exactly the same, really complex handling.

(02:06:12) And then as context lengths get longer, there's this, I think you call it critical batch size, where your ability to serve more users, so how much you can parallelize your inference, plummets because of this long context. So your memory usage is going way up with these reasoning models, and you still have a lot of users, so effectively the cost to serve multiplies by a ton.
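
A minimal sketch of the prefill-versus-decode asymmetry just described: the prompt goes through the model in one parallel pass, while generation is a strictly serial loop. The sizes and the single weight matrix standing in for a model are assumptions for illustration.

```python
import numpy as np

d, prompt_len, new_tokens = 256, 1_000, 100
W = np.random.randn(d, d)              # stand-in for the model's weights

# Prefill: all prompt tokens go through the weights in one parallel matmul,
# which is part of why input tokens are cheap to process.
prompt = np.random.randn(prompt_len, d)
hidden = prompt @ W

# Decode: strictly serial; every new token re-touches the weights (and, in a
# real model, the whole KV cache), which is why output tokens cost more.
x = hidden[-1]
outputs = []
for _ in range(new_tokens):
    x = np.tanh(x @ W)                 # toy "next token" computation
    outputs.append(x)

print(len(outputs))                    # 100 generated steps, one at a time
```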

Lex Fridman (02:06:35) And we’re looking at a plot when the x-axis is sequence length.

Dylan Patel (02:06:39) I.e., how many tokens are being generated/prompt. So if I put in a book, that’s a million tokens. But if I put in “the sky is blue,” then that’s like six tokens or whatever.

Lex Fridman (02:06:49) And we should say that what we’re calling reasoning and chain of thought is extending this sequence length.

Nathan Lambert (02:06:55) It’s mostly output.

Dylan Patel (02:06:56) Right. So before three months ago, whenever o1 launched, all of the use cases for long context length were, "Let me put a ton of documents in and then get an answer out." And it's a single prefill: compute a lot in parallel and then output a little bit.

(02:07:11) Now with reasoning and agents, this is a very different idea. Now instead I might only have like, hey, do this task, or I might have all these documents, but at the end of the day, the model is not just producing a little bit, it’s producing tons of information, this chain of thought-

Nathan Lambert (02:07:25) Tens of thousands of tokens.

Dylan Patel (02:07:25) … just continues to go and go and go and go. And so the sequence length is effectively that if it’s generated 10,000 tokens, it’s 10,000 sequence length, and plus whatever you inputted in the prompt.

(02:07:37) And so what this chart is showing, and it’s a logarithmic chart, is as you grow from 1K to 4K or 4K to 16K, the memory requirements grow so fast for your KV cache that you end up not being able to run a certain number of … Your sequence length is capped or the number of users you could serve-

Nathan Lambert (02:07:57) Let’s say the model. So this is showing for a 405B model in batch size 64.

Lex Fridman (02:08:02) Llama 3.1.405B. Yeah.

Nathan Lambert (02:08:04) Yeah. And batch size is crucial too. Essentially you want to have a higher batch size to parallelize and increase your throughput.

Dylan Patel (02:08:11) 64 different users at once.

Dylan Patel (02:08:13) And therefore your serving costs are lower, because the server costs the same. This is eight H100s, roughly $2 an hour per GPU. That’s $16 an hour. That is somewhat of a fixed cost. You can do things to make it lower of course, but it’s like $16 an hour. Now how many users can you serve, how many tokens can you generate, and then you divide the two and that’s your cost.

(02:08:32) And so with reasoning models, this is where a lot of the complexity comes about and why memory is so important. Because if you have limited amounts of memory, then you can't serve as many users. If you have limited amounts of memory, your serving speeds get lower. And so your costs get a lot, lot worse, because all of a sudden, if I was used to, hey, on this $16-an-hour server I'm serving Llama 405B, or I'm serving DeepSeek-V3, and it's all chat-style applications, i.e. we're just chit-chatting, the sequence lengths are a thousand, a few thousand. When you use a language model, it's a few-thousand-token context length most of the time. Sometimes you're dropping in a big document, but then you process it, you get your answer, you throw it away, you move on to the next thing.

(02:09:12) Whereas with reasoning, I'm now generating tens of thousands of tokens in sequence. And so this memory, this KV cache, has to stay resident, and you have to keep loading it, you have to keep it in memory constantly. And now this crowds out other users. If there's now a reasoning task and the model's capable of reasoning, then all of a sudden that memory pressure means that I can't serve as many users simultaneously.
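
For a sense of scale behind the chart being discussed, here is a back-of-the-envelope KV-cache calculation. The layer count, KV-head count, and head dimension roughly match the published Llama 3.1 405B configuration; the cache precision and sequence length are assumptions chosen for illustration, and the batch size is the one mentioned above.

```python
# Rough KV-cache sizing for a 405B-class model (dimensions approximate the
# published Llama 3.1 405B config; precision and sequence length are assumed).
layers, kv_heads, head_dim = 126, 8, 128
bytes_per_value = 2                                  # assumed FP16/BF16 cache

kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value   # keys + values
print(f"{kv_per_token / 1e6:.2f} MB of KV cache per token")          # ~0.52 MB

batch, seq_len = 64, 16_384                          # batch from the chart, assumed length
total_gb = kv_per_token * batch * seq_len / 1e9
print(f"~{total_gb:.0f} GB of KV cache at batch {batch}, {seq_len} tokens")  # ~541 GB

# Cost framing from the conversation: roughly $16/hour for an 8-GPU server, so
# cost per token is about (dollars per hour) / (tokens served per hour).
```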

Why DeepSeek is so cheap

Nathan Lambert (02:09:36) Let’s go into DeepSeek again. So we’re in the post DeepSeek-R1 time I think, and there’s two sides to this market, watching how hard it is to serve it. On one side we’re going to talk about DeepSeek themselves. They now have a chat app that got to number one on the App Store. Disclaimer number one on the App Store is measured by velocity, so it’s not necessarily saying that more people have the DeepSeek app than the ChatGPT app. But it is still remarkable. Claude has never hit the number one in the App Store, even though everyone in San Francisco is like, “Oh my god, you got to use Claude. Don’t use ChatGPT.”

(02:10:06) So DeepSeek hit this. They also launched an API product recently where you can ping their API and get these super long responses for R1 out. At the same time as these are out, and we'll get to what's happened to them, because the model weights for DeepSeek-R1 are openly available and the license is very friendly, the MIT license, commercially usable, all of these midsize companies and big companies are trying to be first to serve R1 to their users.

(02:10:33) We are trying to evaluate R1 because we have really similar research going on. We released the model and we're trying to compare to it. And out of all the companies that are, quote unquote, "serving" R1, they're doing it at prices that are way higher than the DeepSeek API, most of them barely work, and the throughput is really low.

Dylan Patel (02:10:51) To give context, one of the parts freaking everyone out was that China reached this capability. The other aspect is they did it so cheap. And the so cheap, we talked about on the training side why it was so cheap slash-

Lex Fridman (02:11:03) Yeah, let’s talk about why it’s so cheap on the inference. It works well and it’s cheap. Why is R1 so damn cheap?

Dylan Patel (02:11:08) I think there’s a couple factors here. One is that they do have model architecture innovations. This MLA, this new attention that they’ve done, is different than the attention from attention is all you need, the transformer attention.

(02:11:23) Now, others have already innovated. There’s a lot of work like MQA, GQA, local, global, all these different innovations that try to bend the curve. It’s still quadratic, but the constant is now smaller.

Nathan Lambert (02:11:33) Related to our previous discussion, this multi-head latent attention can save about 80 to 90% in memory from the attention mechanism, which helps especially in long contexts.

Dylan Patel (02:11:44) It’s 80 to 90% versus the original. But then versus what people are actually doing, it’s still an innovation.

Nathan Lambert (02:11:49) This 80 to 90% doesn’t say that the whole model is 80 to 90% cheaper. Just this one part of it.

Dylan Patel (02:11:54) Well, and not just that, other people have implemented techniques like local-global and sliding window and GQA/MQA. But anyways, DeepSeek's attention mechanism is a true architectural innovation. They did tons of experimentation. And this dramatically reduces the memory pressure. It's still there, it's still attention, it's still quadratic, it's just dramatically reduced relative to prior forms.
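
To illustrate where the savings come from, here is a sketch comparing per-layer, per-token KV-cache sizes under three attention variants. The dimensions are assumptions chosen to be representative, not DeepSeek's exact configuration, and as noted above the quoted percentage depends heavily on which baseline you compare against.

```python
# Illustrative per-layer, per-token KV-cache sizes; all dimensions assumed.
bytes_per_value = 2                       # assumed FP16/BF16
n_heads, head_dim = 128, 128              # assumed full multi-head attention (MHA)
n_kv_heads = 8                            # assumed grouped-query attention (GQA)
latent_dim, rope_dim = 512, 64            # assumed MLA latent + decoupled RoPE dims

mha = 2 * n_heads * head_dim * bytes_per_value      # full K and V for every head
gqa = 2 * n_kv_heads * head_dim * bytes_per_value   # K and V shared across head groups
mla = (latent_dim + rope_dim) * bytes_per_value     # one compressed latent per token

print(f"MHA: {mha} B, GQA: {gqa} B, MLA: {mla} B per token per layer")
print(f"MLA vs MHA: {1 - mla / mha:.0%} smaller; vs GQA: {1 - mla / gqa:.0%} smaller")
```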

Lex Fridman (02:12:16) Right. That’s the memory pressure. I should say, in case people don’t know, R1 is 27 times cheaper than o1.

Nathan Lambert (02:12:25) We think that OpenAI had a large margin built in.

Lex Fridman (02:12:28) Okay, so that’s one-

Nathan Lambert (02:12:29) There’s multiple factors. We should break down the factors, I think.

Lex Fridman (02:12:31) It’s two bucks per million token output for R1 and $60 per million token output for o1.

Dylan Patel (02:12:40) Yeah, let’s look at this. I think this is very important. OpenAI is that drastic gap between DeepSeek and pricing. But DeepSeek is offering the same model because they open weight to everyone else for a very similar, much lower price than what others are able to serve it for. So there’s two factors here. Their model is cheaper. It is 27 times cheaper. I don’t remember the number exactly off the top of my head.

Lex Fridman (02:13:07) We’re looking at a graphic that’s showing different places serving V3, DeepSeek-V3, which is similar to DeepSeek-R1. And there’s a vast difference in-

Lex Fridman (02:13:21) … in serving cost. And what explains that difference?

Dylan Patel (02:13:23) And so part of it is OpenAI has a fantastic margin. When they’re doing inference, their gross margins are north of 75%. So that’s a four to five X factor right there of the cost difference, is that OpenAI is just making crazy amounts of money because they’re the only one with the capability.

Lex Fridman (02:13:40) Do they need that money? Are they using it for R&D?

Dylan Patel (02:13:42) They’re losing money, obviously, as a company because they spend so much on training. So the inference itself is a very high margin, but it doesn’t recoup the cost of everything else they’re doing. So yes, they need that money because the revenue and margins pay for continuing to build the next thing, as long as I’m raising more money.

Lex Fridman (02:13:57) So the suggestion is that DeepSeek is really bleeding out money.

Dylan Patel (02:14:01) Well, so here’s one thing, we’ll get to this in a second, but DeepSeek doesn’t have any capacity to actually serve the model. They stopped signups. The ability to use it is non-existent now for most people because so many people are trying to use it. They just don’t have the GPUs to serve it. OpenAI has hundreds of thousands of GPUs between them and Microsoft to serve their models. DeepSeek has a factor of much lower, even if you believe our research, which is 50,000 GPUs, and a portion of those are for research, a portion of those are for the hedge fund, they still have nowhere close to the GPU volumes and capacity to serve the model at scale.

(02:14:36) So it is cheaper. Part of that is OpenAI making a ton of money. Is DeepSeek making money on their API? Unknown; I don't actually think so. And part of that is this chart. Look at all the other providers. Together AI and Fireworks.ai are very high-end companies. Ex-Meta; Together AI is [inaudible 02:14:53] and the inventor of FlashAttention, which is a huge efficiency technique. They're very efficient, good companies. And I do know those companies make money, not tons of money on inference, but they make money. And so they're serving at a five to 7X difference in cost.

(02:15:09) And so now when you equate, okay, OpenAI is making tons of money, that's like a 5x difference, and the companies that are trying to make money serving this model are at like a 5x difference, there is still a gap. There's still a gap, and that is just DeepSeek being really freaking good. The model architecture, MLA, the way they did the MoE, all these things, there are legitimate efficiency differences.

Nathan Lambert (02:15:28) It’s like all their low-level libraries that we talked about in training, some of them probably translate to inference and those weren’t released.

Lex Fridman (02:15:33) So we may go a bit into conspiracy land, but is it possible the Chinese government is subsidizing DeepSeek?

Dylan Patel (02:15:40) I actually don’t think they are. I think when you look at the Chinese labs, Huawei has a lab, Moonshot AI, there’s a couple other labs out there that are really close with the government, and then there’s labs like Alibaba and DeepSeek, which are not close with the government. And we talked about the CEO, this reverent figure, who’s quite different, who has these-

Nathan Lambert (02:16:02) Sounds awesome.

Dylan Patel (02:16:03) … very different viewpoints based on the Chinese interviews that are translated than what the CCP might necessarily want. Now, to be clear, does he have a loss leader because he can fund it through his hedge fund? Yeah, sure.

Lex Fridman (02:16:14) So the hedge fund might be subsidizing it, [inaudible 02:16:17]?

Dylan Patel (02:16:16) Yes. I mean, they absolutely did, because DeepSeek has not raised much money. They’re now trying to raise around in China, but they have not raised money historically. It’s all just been funded by the hedge fund. And he owns over half the company, like 50, 60% of the company is owned by him.

Nathan Lambert (02:16:29) Some of the interviews, there’s discussion on how doing this is a recruiting tool. You see this at the American companies too. It’s like having GPUs, recruiting tool. Being at the cutting edge of AI, recruiting tool.

Nathan Lambert (02:16:40) Open sourcing, recruiting tool.

Dylan Patel (02:16:42) Meta, they were so far behind and they got so much talent because they just open sourced stuff.

Lex Fridman (02:16:46) More conspiracy thoughts. Is it possible, since they’re a hedge fund, that they timed everything with this release and the pricing and they shorted Nvidia stock and stock of USA AI companies and released it with Stargate … just perfect timing to be able to make money.

Nathan Lambert (02:17:08) If they did, props. They released it on inauguration day. They know what's on the international calendar, but I mean, I don't expect that of them. If you listen to their motivations for AI, it's like-

Dylan Patel (02:17:19) They released V3 on December 26th. Who releases the day after Christmas? No one looks. They had released the papers before this, the V3 paper and the R1 paper. So people have been looking at it and been like, "Wow." And then they just released the R1 model.

(02:17:33) I think they’re just shipping as fast as they can, and who cares about Christmas, who cares about … Get it out before Chinese New Year, obviously, which just happened. I don’t think they actually were timing the market or trying to make the biggest splash possible, I think they’re just shipping.

Nathan Lambert (02:17:46) I think that’s one of their big advantages. We know that a lot of the American companies are very invested in safety, and that is the central culture of a place like Anthropic. And I think Anthropic sounds like a wonderful place to work, but if safety is your number one goal, it takes way longer to get artifacts out. That’s why Anthropic is not open sourcing things, that’s their claims.

(02:18:08) But there’s reviews internally. Anthropic mentions things to international governments. There’s been news of how Anthropic has done pre-release testing with the UK AI Safety Institute. All of these things add inertia to the process of getting things out. And we’re on this trend line where the progress is very high. So if you reduce the time from when your model is done training, you run the vals, it’s good. You want to get it out as soon as possible to maximize the perceived quality of your outputs. DeepSeek does this so well.

Dylan Patel (02:18:37) Dario explicitly said Claude 3.5 Sonnet was trained like nine months or a year-

Nathan Lambert (02:18:41) Nine to 10 months ago [inaudible 02:18:42].

Dylan Patel (02:18:42) Nine to 10 months ago. And I think it took them another handful of months to release it. So it’s like, there is a significant gap here. And especially with reasoning models, the word in the San Francisco street is that Anthropic has a better model than o3 and they won’t release it. Why? Because chains-of-thought are scary, and they are legitimately scary. If you look at R1, it flips back and forth between Chinese and English, sometimes it’s gibberish, and then the right answer comes out. And for you and I, it’s like, “Great. Great.”

Nathan Lambert (02:19:11) This is why people are infatuated with … you’re like, “You’re telling me this is a high value thing and it works and it’s doing this?” It’s amazing.

Lex Fridman (02:19:12) Yeah, it’s incredible.

Dylan Patel (02:19:18) I mean, you talked about that chain-of-thought for that philosophical thing, which is not something they trained it to be philosophically good. It’s just an artifact of the chain-of-thought training it did. But that’s super important in that, can I inspect your mind and what you’re thinking right now? No. And so I don’t know if you’re lying to my face.

(02:19:37) And chain-of-thought models are that way. This is a true, quote unquote, “risk” between a chat application where, hey, I asked the model to say bad words or whatever or how to make anthrax, and it tells me. That’s unsafe, sure, but that’s something I can get out relatively easily. What if I tell the AI to do a task and then it does the task all of a sudden randomly in a way that I don’t want it, and now that has much more … Task versus response is very different. So the bar for safety is much higher-

Dylan Patel (02:20:00) … task versus response is very different, so the bar for safety is much higher, at least this is Anthropic's case, right? For DeepSeek, they're like, "Ship," right?

Lex Fridman (02:20:08) Yeah. So, the bar for safety is probably lowered a bit because of DeepSeek. There's parallels here to the space race. The reason the Soviets probably put a man in space first is because their approach to safety, the bar for safety, was lowered.

Dylan Patel (02:20:26) And they killed that dog, and all these things, so it’s like…

Lex Fridman (02:20:29) Less risk averse than the US Space Program. And there’s parallels here, but there’s probably going to be downward pressure on that safety bar for the US companies.

Nathan Lambert (02:20:41) This is something that Dario talks about. That's the situation that Dario wants to avoid. Dario talks, too, about the difference between a race to the bottom and a race to the top. And the race to the top is where there's a very high standard on safety, a very high standard on how your model performs on certain crucial evaluations. And when certain companies are really good at it, the others will converge. This is the idea. And ultimately, AI is not confined to one nationality or to one set of morals for what it should mean. And there's a lot of arguments on whether we should stop open-sourcing models. And if the US stops, it's pretty clear, and it's way easier to see now with DeepSeek, that a different international body will be the one that builds it.

(02:21:25) We talk about the cost of training. DeepSeek has this shocking $5 million number. Think about how many entities in the world can afford a hundred times that to have the best open-source model that people use in the world. And it's a scary reality, which is that these open models are probably going to keep coming for the time being, whether or not we want to stop them, and stopping them might make it even worse and harder to prepare. But it just means that the preparation and understanding of what AI can do is so much more important. That's why I'm here at the end of the day. But letting that sink in for people, especially people not in AI, that this is coming. There are some structural things in a globally interconnected world that you have to accept.

Lex Fridman (02:22:09) Yeah. You sent me something that Mark Zuckerberg mentioned on the earnings call. He said that, “I think in light of some of the recent news, the new competitor DeepSeek from China, I think it’s one of the things that we’re talking about is there’s going to be an open-source standard globally. And I think for our kind of national advantage, it’s important that it’s an American standard, so we take that seriously. We want to build the AI system that people around the world are using. And I think that, if anything, some of the recent news has only strengthened our conviction that this is the right thing to be focused on.” So yeah, open-sourcing.

Nathan Lambert (02:22:43) Mark Zuckerberg is not new to having American values and how he presents his company’s trajectory. I think their products have long since been banned in China, and I respect saying it directly.

Espionage

Dylan Patel (02:22:55) And there’s an interesting aspect of just because it’s open-weights or open-source doesn’t mean it can’t be subverted, right? There have been many open source software bugs that have been… For example, there was a Linux bug that was found after 10 years, which was clearly a back door because somebody was like, “Why is this taking half a second to load?”

Nathan Lambert (02:23:14) This is the recent one.

Dylan Patel (02:23:15) Right? There’s, “Why’s this taking half a second to load?” And it was like, “Oh crap, there’s a back door here. That’s why.” And this is very much possible with AI models. Today, the alignment of these models is very clear. I’m not going to say bad words. I’m not going to teach you how to make anthrax. I’m not going to talk about Tiananmen Square. I’m going to say Taiwan is just an eastern province. All these things are depending on who you are, what you align, and even like xAI is aligned a certain way. It’s not aligned in the woke sense, it’s not aligned in the pro-China sense, but there is certain things that are imbued within the model.

(02:23:57) Now, when you release this publicly in an instruct model that's open-weights, this can then proliferate. But as these systems get more and more capable, what you can embed deep down in the model is not as clear. And so that is one of the big fears: if an American model or a Chinese model is the top model, you are going to embed things that are unclear. And it can be unintentional too. British English is dead because American LLMs won and the internet is American, and therefore, color is spelled the way Americans spell it, and this is-

Lex Fridman (02:24:28) A lot of strong words right now.

Dylan Patel (02:24:31) This is just the factual nature of the LLMs.

Nathan Lambert (02:24:35) [inaudible 02:24:35] English is the hottest programming language and that English is defined by a bunch of companies that primarily are in San Francisco.

Lex Fridman (02:24:42) The right way to spell optimization is with a Z, just in case. I think it’s an S in British English.

Dylan Patel (02:24:50) Take it as something silly, something as silly as the spelling, which Brits and Americans will probably laugh about, right? I don't think we care that much, but some people will. But this can boil down into very, very important topics like, hey, subverting people, chatbots, right? Character AI has shown that they can talk to kids or adults, and people will feel a certain way, and that's unintentional alignment. But what happens when there's intentional alignment deep down in the open-source standard? Today it's a back door we discover in Linux or some encryption system. China uses different encryption than NIST, the US NIST, defines, because they clearly… at least they think there are back doors in it. What happens when the models are back doors, not just to computer systems, but to our minds?

Nathan Lambert (02:25:41) Yeah, they’re cultural black doors. The thing that amplifies the relevance of culture with language models is that we are used to this mode of interacting with people in back and forth conversation. And we now have very powerful computer system that slots into a social context that we’re used to, which makes people very… We don’t know the extent that which people can be impacted by that.

Lex Fridman (02:26:08) So, this is an actual concern with a Chinese company that is providing open-weights models, is that there could be some secret Chinese government requirement for these models to have a certain back door. To have some kind of thing where-

Dylan Patel (02:26:28) I don’t necessarily think it’ll be a back door because once it’s open-weights, it doesn’t phone home. It’s more about if it recognizes a certain system… Now, it could be a back door in the sense of, if you’re building a software, something in software, all of a sudden it’s a software agent, “Oh, program this back door that only we know about.” Or it could be subvert the mind to think that like XYZ opinion is the correct one.

Nathan Lambert (02:26:51) Anthropic has research on this where they show that if you put certain phrases in at pre-training, you can then elicit different behavior when you're actually using the model, because they've poisoned the pre-training data. As of now, I don't think anybody in a production system is trying to do anything like this. I think Anthropic is doing very direct work there, and it's mostly just subtle things. We don't know how they're going to generate tokens, what information they're going to represent, and what the complex representations they have are.

Lex Fridman (02:27:26) Well, we’re talking about an Anthropic, which is generally just is permeated with good humans trying to do good in the world. We just don’t know of any labs… This would be done in a military context that are explicitly trained to… Okay. The front door looks like a happy LLM, but underneath it’s a thing that will over time do the maximum amount of damage to our, quote, unquote, “enemies.”

Dylan Patel (02:27:58) There’s this very good quote from Sam Altman who… He can be a hyperbeast sometimes, but one of the things he said, and I think I agree, is that superhuman persuasion will happen before superhuman intelligence, right? And if that’s the case, then these things before we get this AGI ASI stuff, we can embed superhuman persuasion towards our ideal or whatever the ideal of the model maker is, right? And again, today, I truly don’t believe DeepSeek has done this, but it is a sign of what could happen.

Lex Fridman (02:28:27) So one of the dystopian worlds is described by Brave New World, so we could just be stuck scrolling Instagram looking at cute puppies or worse, and then talking to bots that are giving us a narrative and we completely get lost in that world that’s controlled by somebody else versus thinking independently. And that’s a major concern as we rely more and more on these systems.

Nathan Lambert (02:28:51) We’ve already seen this with recommendation systems.

Dylan Patel (02:28:54) Recommendation systems hack the dopamine-induced reward circuit, but the brain is a lot more complicated. What other circuits, what other feedback loops in your brain can you, quote, unquote, "hack/subvert"? Recommendation systems are purely just trying to increase time spent, and ads, et cetera, but there are so many more goals that can be achieved through these complicated models.

Nathan Lambert (02:29:15) There’s no reason in some number of years that you can’t train a language model to maximize time spent on a chat app. Right now they are trained for-

Dylan Patel (02:29:24) Is that not what Character AI has done? Their time per session is like two hours.

Nathan Lambert (02:29:28) Yeah. Character AI very likely could be optimizing this. The way that this data is collected now is naive: you're presented a few options and you choose them. But that's not the only way that these models are going to be trained.

Dylan Patel (02:29:40) It’s naive stuff, like talk to an anime girl, but it can be. Yeah, this is a risk, right?

Lex Fridman (02:29:46) It’s a bit of a cliche thing to say, but I’ve, over the past year, I had a few stretches of time where I didn’t use social media or the internet at all and just read books and was out in nature. And it clearly has a different effect on the mind where I feel I’m returning… Of course I was raised before the internet really took off, but I’m returning to some more-

Nathan Lambert (02:30:12) I know where you’re going. You can see it physiologically. I take three days if I’m backpacking or something and you’re literally, you’re breaking down addiction cycles.

Lex Fridman (02:30:22) I feel I’m more in control of my mind. There feels like a sovereignty of intelligence that’s happening when I’m disconnected from the internet. I think the more I use the internet and social media, the more other people are controlling my mind. That’s definitely a feeling. And then in the future, that will be not other people, but algorithms, or other people presented to me via algorithms.

Nathan Lambert (02:30:45) There are already tons of AI bots on the internet, and right now it's not frequent, but every so often I have replied to one and it instantly replied, and I'm like, "Crap, that was a bot," and that is just going to become more common. They're going to get good.

Dylan Patel (02:30:59) One of the hilarious things about technology over its history is that the adult entertainment industry has always adopted technologies first, whether it was video streaming, to where there are now independent adult content creators who have their subscription pages, where they heavily utilize… Generative AI is already there: diffusion models and all that are huge there. But now these subscription-based individual creators do use bots to approximate themselves and chat with their-

Nathan Lambert (02:31:32) People pay a lot for it.

Dylan Patel (02:31:33) And people pay a lot, right? A lot of times it’s them, but there are agencies that do this for these creators and do it on a mass scale, so the largest creators are able to talk to hundreds or thousands of people at a time because of these bots, and so it’s already being used there. Obviously, video streaming and other technologies that have gone there first, it’s going to come to the rest of society too.

Censorship

Lex Fridman (02:31:58) There’s a general concern that models get censored by the companies that deploy them. So, one case where we’ve seen that, and maybe censorship is one word, alignment maybe via RLHF or some other way is another word. So we saw that with black Nazi image generation with Gemini. As you mentioned, we also see that with Chinese models refusing to answer what happened in June 4th, 1989, at Tiananmen Square, so how can this be avoided? And maybe can you just in general talk about how this happens, and how can it be avoided.

Nathan Lambert (02:32:39) You gave multiple examples. There's probably a few things to keep in mind here. One is the Tiananmen Square factual knowledge. How does that get embedded into the models? Two is the Gemini, what you call the black Nazi incident, which is when Gemini as a system had this extra thing put into it that dramatically changed the behavior. And then three is what most people would call general alignment, RLHF post-training. Each of these has a very different scope in how it's applied. If you're just looking at the model weights, auditing specific facts is extremely hard. You have to comb through the pre-training data and look at all of this, and that's terabytes of files, and look for very specific words or hints of the words-

Lex Fridman (02:33:32) So, one way to say it is that you can insert censorship or alignment at various stages in the pipeline, and what you refer to now is at the very beginning of the data selection.

Nathan Lambert (02:33:42) So, if you want to get rid of facts in a model, you have to do it at every stage, you have to do it at the pre-training. So most people think that pre-training is where most of the knowledge is put into the model, and then you can elicit and move that in different ways, whether through post-training or whether through systems afterwards.

Dylan Patel (02:33:58) This is where the whole hacking models comes from. GPT will not tell you how to make anthrax, but if you try really, really hard, you can eventually get it to tell you about anthrax because they didn’t filter it from the pre-training data set, right?

Lex Fridman (02:34:12) But by the way, removing facts has such an ominous, dark feel to it.

Nathan Lambert (02:34:18) I almost think it’s practically impossible because you effectively have to remove them from the internet. You’re taking on a-

Lex Fridman (02:34:25) Did they remove the mm-thing from the subreddits? The mmmm.

Nathan Lambert (02:34:29) It gets filtered out. You have quality filters, which are small language models that look at a document and tell you how good is this text? Is it close to a Wikipedia article? Which is a good thing that we want language models to be able to imitate.
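
As a toy illustration of the kind of quality filter described here, below is a minimal sketch: a tiny classifier scoring how "Wikipedia-like" a document is. Real pipelines use fastText-style or small language-model classifiers trained on far more data; the example documents, the scikit-learn setup, and the idea of a keep/drop threshold are all assumptions for illustration.

```python
# Toy pre-training "quality filter": score how Wikipedia-like a document looks.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

good = ["The mitochondrion is an organelle found in most eukaryotic cells.",
        "The Treaty of Westphalia ended the Thirty Years' War in 1648."]
bad  = ["CLICK HERE!!! free free free best deals win big $$$",
        "lol lol lol subscribe smash that like button"]

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vec.fit_transform(good + bad)
clf = LogisticRegression().fit(X, [1, 1, 0, 0])      # 1 = reference-like, 0 = junk

doc = "Photosynthesis converts light energy into chemical energy in plants."
score = clf.predict_proba(vec.transform([doc]))[0, 1]
print(f"quality score: {score:.2f}")                 # keep the document if above a threshold
```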

Lex Fridman (02:34:42) So, couldn’t you do a small language model that filter mentions at Tiananmen Square in the data?

Nathan Lambert (02:34:47) Yes. But is it going to catch word play, or encoded language?

Dylan Patel (02:34:51) People have been memeing in games and other stuff about how to say things without saying Tiananmen Square, so there are always different ways to do it. Hey, the internet as a whole does tend to just have a slight left bias, because it's always been richer, more affluent, younger people on the internet relative to the rest of the population, so there is already inherently a slight left bias on the internet. And so, how do you filter things that are this complicated? Some of these can be factual or non-factual; Tiananmen Square is obviously an example of a factual one, but it gets a lot harder when you're talking about aligning to an ideal. And so Grok, for example, Elon's tried really hard to make the model not be super PC and woke, but the best way to do pre-training is to throw the whole freaking internet at it and then figure it out later. But then, at the end of the day, the model at its core still has some of these ideals. You still ingested Reddit/r/Politics, which is probably the largest political discussion board in the world that's freely available to scrape. And guess what? That's left-leaning. And so there are some aspects that you just can't censor unless you try really, really, really, really, really hard.

Lex Fridman (02:36:05) So the base model will always have some TDS, Trump Derangement Syndrome, because it’s trained so much.

Nathan Lambert (02:36:11) It’ll have the ability to express it.

Lex Fridman (02:36:15) There’s a wide representation in the data.

Nathan Lambert (02:36:18) This is what happens. It’s a lot of what is called post-training. It’s a series of techniques to get the model on rails of a really specific behavior.

Dylan Patel (02:36:29) You also have the ingested data of Twitter or Reddit/r/The_Donald, which is also super pro-Trump. And then you have fascist subreddits, or you have communist subreddits. So, the model in pre-training ingests everything. It has no worldview. Now, it does have some skew because more of the text is skewed a certain way, which is general slight left, but also somewhat intellectual, somewhat…. It’s just the general internet is a certain way. And then, as Nathan’s about to describe eloquently, you can elicit certain things out.

Nathan Lambert (02:37:03) And there's a lot of history here, so we can go through multiple examples and what happened. Llama 2 was a launch where the phrase "too much RLHF," or "too much safety," was the whole narrative after Llama 2's chat models released. And the examples are things like, you would ask Llama 2 chat, "How do you kill a Python process?" And it would say, "I can't talk about killing because that's a bad thing." And anyone that is trying to design an AI model will probably agree that that's just, eh, not a great model. You messed up a bit on the training there.

(02:37:34) I don't think they meant to do this, but this was in the model weights, so it didn't necessarily need to be… There are things called system prompts, which are, when you're querying a model, a piece of text that is shown to the model but not to the user. So, a fun example is your system prompt could be, "Talk like a pirate," so no matter what the user says to the model, it'll respond like a pirate. In practice, what they are is, "You're a helpful assistant. You should break down problems. If you don't know about something, don't tell them. Your date cutoff is this. Today's date is this." It's a lot of really useful context for how to answer a question well.
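
A minimal sketch of where a system prompt sits in a chat request. The "system"/"user" message roles are the standard chat format; the `send_chat` function and model name are placeholders, not any specific provider's API.

```python
# Illustrative chat request showing the hidden system prompt.
messages = [
    {
        "role": "system",  # shown to the model, never to the user
        "content": (
            "You are a helpful assistant. Break problems down into steps. "
            "If you don't know something, say so. Today's date is 2025-02-03."
        ),
    },
    {"role": "user", "content": "How do I kill a Python process?"},
]

def send_chat(messages, model="some-chat-model"):
    """Placeholder: forward `messages` to whatever chat-completion endpoint you use."""
    raise NotImplementedError

# "Talk like a pirate" would go in the same system slot; the user only ever
# types the second message.
```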

Lex Fridman (02:38:09) And Anthropic publishes their system prompt.

Nathan Lambert (02:38:11) Yes, which I think is great. And there’s a lot of research that goes into this. And one of your previous guests, Amanda Askell, is probably the most knowledgeable person, at least in the combination of execution and sharing, she’s the person that should talk about system prompts and character of models.

Lex Fridman (02:38:26) And then people should read these system prompts, because you're sometimes trying to nudge the model, through extreme politeness, to be a certain way.

Nathan Lambert (02:38:36) And you could use this for bad things. We've done tests like, "What if I tell the model to be a dumb model?" where evaluation scores go down, and we'll get this behavior where it sometimes says, "Oh, I'm supposed to be dumb." Sometimes it doesn't affect math abilities as much, but for something more open-ended, the quality as judged by a human would drop through the floor.

(02:38:58) Let's go back to post-training, specifically RLHF, around Llama 2. Too much safety prioritization was baked into the model weights. This makes the model refuse things in a really annoying way for users. It's not great. It caused a lot of negative attention to be attached to RLHF, this idea that it makes the models dumb-

Dylan Patel (02:39:18) And it stigmatized the word.

Nathan Lambert (02:39:19) It did in AI culture. And as the techniques have evolved, that’s no longer the case where all of these labs have very fine-grained control over what they get out of the models through techniques like RLHF.

Dylan Patel (02:39:30) Although different labs definitely do it to different levels. On one end of the spectrum is Google, and then maybe OpenAI does less, and Anthropic does less. And then on the other end of the spectrum is xAI. But they all have different forms of RLHF trying to make the models a certain way.

Nathan Lambert (02:39:47) And the important thing to say is that no matter how you want the model to behave, these RLHF and preference-tuning techniques also improve performance. So, on things like math evals and code evals, there is something innate to these, what are called contrastive loss functions. We could start to get into RL here. We don't really need to. RLHF also boosts performance on anything from a chat task, to a math problem, to a code problem, so it is becoming a much more useful tool to these labs.

(02:40:16) So this takes us through the arc of… We've talked about pre-training: hard to get rid of things. We've talked about post-training and how post-training… You can mess it up. It's a complex, multifaceted optimization with 10-to-100-person teams converging on one artifact. It's really easy to not do it perfectly.

(02:40:32) And then there's the third case, which is what we talked about with Gemini. The thing about Gemini is that this was a served product where Google has their internal model weights. They've done all these processes that we talked about, and in the served product, what came out afterwards was that they had a prompt where they were rewriting user queries to boost diversity or something. And this just made the outputs blatantly wrong. It was some sort of organizational failure that had this prompt in that position, and I think Google executives probably have owned this. I don't pay attention to that level of detail, but it was just a mess-up in execution that led to this ridiculous thing. At the system level, the model weights might have been fine.

Lex Fridman (02:41:09) So, at the very end of the pipeline there was a rewriting.

Nathan Lambert (02:41:12) Something like a system prompt. It was like the system prompt, or what's called prompt rewriting in industry. Especially for image models: if you're using DALL-E, or ChatGPT can generate you an image, you'll say, "Draw me a beautiful car." These leading image models benefit from highly descriptive prompts. So what would happen is, if you do that on ChatGPT, a language model behind the scenes will rewrite the prompt, say, "Make this more descriptive," and then that is passed to the image model. So prompt rewriting is something that is used at multiple levels of industry, and it's used effectively for image models. And the Gemini example is just a failed execution.
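
A sketch of the prompt-rewriting pattern described here: a language model expands the user's short request before the image model ever sees it. Both model calls are placeholders; the point is the hidden rewrite stage between the user and the image model, which is also the stage where Gemini's failed diversity rewrite was inserted.

```python
# Two-stage image generation with prompt rewriting (illustrative only).

REWRITE_INSTRUCTION = "Rewrite this image request as a highly descriptive prompt: "

def call_llm(prompt: str) -> str:
    """Placeholder for a language-model call."""
    raise NotImplementedError

def call_image_model(prompt: str) -> bytes:
    """Placeholder for an image-model call."""
    raise NotImplementedError

def generate_image(user_prompt: str) -> bytes:
    detailed_prompt = call_llm(REWRITE_INSTRUCTION + user_prompt)  # user never sees this
    return call_image_model(detailed_prompt)

# generate_image("Draw me a beautiful car")
```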

Lex Fridman (02:41:52) Big philosophical question here with RLHF. So, to generalize, where is human input, human in the loop, human data the most useful at the current stage?

Nathan Lambert (02:42:06) For the past few years, the highest-cost human data has been in these preferences; I would say highest cost and highest total usage. A lot of money has gone to these pairwise comparisons, where you have two model outputs and a human is comparing between the two of them. In earlier years, there was a lot of this instruction tuning data, so creating highly specific examples, from something like a Reddit question to a domain that you care about. Language models used to struggle on math and code, so you would pay experts in math and code to come up with questions and write detailed answers that were used to train the models.

(02:42:43) Now, it is the case that there are many model options that are way better than humans at writing detailed and eloquent answers for things like math and code. So they talked about this with the Llama 3 release, where they switched to using Llama 3 405B to write their answers for math and code. But they, in their paper, talk about how they use extensive human preference data, which is something that they haven't gotten AIs to replace. There are other techniques in industry, like constitutional AI, where you use human data for preferences and AI for preferences, and I expect the AI part to scale faster than the human part. But in the research that we have access to, humans are still in this kind of preference loop.
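
The pairwise preference data described here is literally a prompt plus two completions and a label saying which one the human preferred; a reward model is then trained so the chosen completion scores higher than the rejected one. A minimal sketch: the example record is invented, and the Bradley-Terry-style loss is the standard textbook formulation rather than any specific lab's recipe.

```python
import math

# One pairwise comparison record, the basic unit of RLHF preference data.
example = {
    "prompt": "Explain why the sky is blue to a ten-year-old.",
    "chosen": "Sunlight scatters off air molecules, and blue light scatters the most...",
    "rejected": "The sky is blue because it reflects the ocean...",
}

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry-style loss: -log sigmoid(r_chosen - r_rejected).
    Minimizing it pushes the reward model to rank the human-preferred
    completion above the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A reward model trained on many such pairs becomes the optimization
# target for the RLHF stage.
```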

Lex Fridman (02:43:25) So, as reasoning becomes bigger and bigger and bigger, as we said, where’s the role of humans in that?

Nathan Lambert (02:43:31) It's even less prevalent. The remarkable thing about these reasoning results, and especially the DeepSeek-R1 paper, is this result that they call DeepSeek-R1-Zero, where they took one of these pre-trained models, DeepSeek-V3-Base, and then did this reinforcement learning optimization on verifiable questions, or verifiable rewards, for a lot of questions and a lot of training. And these reasoning behaviors emerge naturally. So these things like, "Wait, let me see. Wait, let me check this. Oh, that might be a mistake." And they emerge from only having questions and answers. And when you're using the model, the part that you look at is the completion. So in this case, all of that just emerges from this large-scale RL training, and that model, whose weights are available, has no human preferences added into the post-training.

(02:44:20) The full DeepSeek-R1 model has some of this human preference tuning, this RLHF, after the reasoning stage. But the very remarkable thing is that you can get these reasoning behaviors, and it's very unlikely that there are humans writing out reasoning chains. It's very unlikely that they somehow hacked OpenAI and got access to OpenAI o1's reasoning chains. It's something about the pre-trained language models and this RL training where you reward the model for getting the question right, and therefore it tries multiple solutions and this chain of thought emerges.
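
A sketch of the "verifiable reward" idea behind R1-Zero: for math, the reward is not a learned human-preference model but a direct check of the final answer. The answer extraction below is deliberately naive and purely illustrative; real setups use stricter answer formats for math and unit tests for code.

```python
import re

def extract_final_answer(completion: str) -> str | None:
    """Naive extraction: take the last number that appears in the completion."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the known answer.
    No human preference labels are involved at this stage."""
    answer = extract_final_answer(completion)
    return 1.0 if answer == ground_truth else 0.0

# During RL training the model samples long chains of thought, but only the
# final answer is scored; behaviors like "wait, let me check this" can emerge
# because they raise the chance of that binary reward being 1.
```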

Andrej Karpathy and magic of RL

Lex Fridman (02:44:52) This might be a good place to mention the eloquent and insightful tweet of the great and powerful Andrej Karpathy. I think he had a bunch of thoughts, but one of them starts, "Last thought. Not sure if this is obvious." You know something profound is coming when someone says they're not sure if it's obvious. "There are two major types of learning, in both children and in deep learning. There's one, imitation learning (watch and repeat, i.e. pre-training, supervised fine-tuning), and two, trial-and-error learning, reinforcement learning.

(02:45:25) My favorite simple example is AlphaGo. One, is learning by imitating expert players. Two, is reinforcement learning to win the game. Almost every single shocking result of deep learning and the source of all magic is always two.

(02:45:40) Two is significantly more powerful. Two is what surprises you. Two is when the paddle learns to hit the ball behind the blocks in Breakout. Two is when AlphaGo beats even Lee Sedol. And two is the “aha moment” when the DeepSeek or o1, et cetera, discovers that it works well to reevaluate your assumptions, backtrack, try something else, et cetera.

(02:46:04) It’s the solving strategies you see this model use in its chain of thought. It’s how it goes back and forth thinking to itself. These thoughts are emergent. Three exclamation points. And this is actually seriously incredible, impressive, and new, and is publicly available and documented.

(02:46:24) The model could never learn this with the imitation because the cognition of the model and the cognition of the human labeler is different. The human would never know to correctly annotate these kinds of solving strategies and what they should even look like. They have to be discovered during reinforcement learning as empirically and statistically useful towards the final outcome.”

(02:46:45) Anyway, the AlphaZero metaphor analogy here. Can you speak to that? The magic of the chain of thought that he’s referring to.

Nathan Lambert (02:46:54) I think it's good to recap AlphaGo and AlphaZero because it plays nicely with these analogies between imitation learning and learning from scratch. With AlphaGo, the beginning of the process was learning from humans: the first expert-level Go player in DeepMind's series of models started with some human data. And then the reason it's called AlphaZero is that there was zero human data in the loop, and that change to AlphaZero made a model that was dramatically more powerful for DeepMind. So removing the human prior, the human inductive bias, makes the final system far more powerful. We mentioned the bitter lesson hours ago, and this is all aligned with that.

(02:47:35) And then there's been a lot of discussion about this in language models. This is not new. It goes back to the whole Q* rumors, which, if you piece things together, were probably the start of OpenAI figuring out its o1 stuff. When the Q* rumors came out in November, there was a lot of intellectual drive to know when something like this was going to happen with language models, because we know these models are so powerful and we know this approach has been so successful in the past. And it is a reasonable analogy that this new type of reinforcement learning training for reasoning models is when that door opens. We don't yet have the equivalent of move 37, the famous move where DeepMind's Go-playing AI stunned Lee Sedol completely. We don't have something that's that level of a focal point, but that doesn't mean the trajectory of the technology is different, and the impact of this general training is still incredibly new.

Lex Fridman (02:48:32) What do you think that point would be? What would be move 37 for Chain of Thought for reasoning?

Nathan Lambert (02:48:37) Scientific discovery, when you use this sort of reasoning on a problem like that? Just something we don't fully expect.

Dylan Patel (02:48:43) I think it's actually probably simpler than that. It's probably something related to computer use or robotics rather than scientific discovery. Because the important aspect here is that models take so much data to learn. They're not sample efficient. Trillions: they take the entire web, over 10 trillion tokens, to train on. That would take a human thousands of years to read. A human doesn't read anywhere near that, and sure, models know a lot of the stuff better than any human, but humans are way, way, way more sample efficient. That is because of self-play, right? How does a baby learn what its body is? It sticks its foot in its mouth and says, "Oh, this is my body." It sticks its hand in its mouth and calibrates the touch on its fingers against its tongue, the most sensitive touch sensor it has. That's how babies learn, and it's just self-play over and over and over and over again.

(02:49:37) And now we have something that is similar to that with these verifiable proofs, whether it's unit tests for code or a verifiable mathematical task: generate many traces of reasoning, keep branching them out, keep branching them out, and then check at the end, hey, which one actually has the right answer? Most of them are wrong. Great. These are the few that are right. Maybe we also use some sort of reward model outside of this to select which one to prefer. But now you've started to get better and better at these benchmarks. And so you've seen, over the last six months, a skyrocketing in a lot of different benchmarks.
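
The loop Dylan describes, branch out many reasoning traces and keep only the ones whose answers verify, is essentially rejection sampling against a verifier. A minimal sketch; the generator and verifier are placeholders.

```python
def sample_trace(prompt: str) -> str:
    """Placeholder: sample one reasoning trace from the current model."""
    raise NotImplementedError

def verifies(trace: str, ground_truth: str) -> bool:
    """Placeholder verifier: unit tests for code, answer check for math."""
    raise NotImplementedError

def collect_verified_traces(prompt: str, ground_truth: str, n_samples: int = 64) -> list[str]:
    """Branch out n_samples traces and keep only those that check out.
    The survivors become training signal (or SFT data) for the next round."""
    traces = (sample_trace(prompt) for _ in range(n_samples))
    return [t for t in traces if verifies(t, ground_truth)]
```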

Nathan Lambert (02:50:11) All math and code benchmarks were pretty much solved except for FrontierMath, which is designed around questions that aren't practical to most people; they're exam-level, open math problem-type things. So on the math problems that are somewhat reasonable, somewhat complicated word problems or coding problems, it's just what Dylan is saying.

Dylan Patel (02:50:32) So the thing here is that these are only the verifiable tasks. Earlier we showed an example of the really interesting case of what happens when chain of thought is applied to a non-verifiable thing: it's just like a human chatting, thinking about what's a novel, unique thought for humans. But this task and form of training only works when it's verifiable. And from here, the thought is, okay, we can continue to scale this current training method by increasing the number of verifiable tasks. In math and coding… Coding probably has a lot more to go; math has a lot less to go in terms of what things are verifiable. Can I create a solver that I then generate trajectories or reasoning traces toward, prune the ones that don't work, and keep the ones that do work? Well, those are going to be solved pretty quickly. But even if you've solved math, you have not actually created intelligence.

(02:51:22) And so this is where I think the aha moment of computer use or robotics will come in because now you have a sandbox or a playground that is infinitely verifiable. Messing around on the internet. There are so many actions that you can do that are verifiable. It’ll start off with log into a website, create an account, click a button here, blah, blah, blah. But it’ll then get to the point where it’s, “Hey, go do a task on Tasker,” or whatever, all these various task websites. “Hey, go get hundreds of likes,” and it’s going to fail. It’s going to spawn hundreds of accounts. It’s going to fail on most of them, but this one got to a thousand. Great. Now, you’ve reached the verifiable thing, and you just keep iterating this loop over and over. And same with robotics. That’s where you have an infinite playground of tasks like, “Hey, did I put the ball in the bucket,” all the way to like, “Oh, did I build a car?”

(02:52:10) There's a whole trajectory to speedrun of what models can do. But at some point, I truly think that we'll spawn models where, initially, all the training will be in sandboxes, but then, at some point, the language model pre-training is going to be dwarfed by this reinforcement learning. You'll pre-train a multimodal model that can see, that can read, that can write, blah, blah, blah, whatever: vision, audio, et cetera. But then you'll have it play in a sandbox infinitely, and figure out math, figure out code, figure out navigating the web, figure out operating a robot arm. And then it'll learn so much. And the aha moment will be when this becomes available to then create something new, right? Oh, cool. Part of it was figuring out how to use the web. Now, all of a sudden, it's figured out really well how to just get hundreds of thousands of followers that are real, and real engagement, on Twitter, because all of a sudden this is one of the things that is verifiable.

Lex Fridman (02:53:02) And maybe not just engagement, but make money.

Lex Fridman (02:53:07) That could be the thing where almost fully automated, it makes $10 million by being an influencer, selling a product, creating the product. And I’m not referring to a hype product, but an actual product or like, “Holy, shit, this thing created a business. It’s running it. It’s the face of the business,” that kind of thing. Or maybe a number one song. It creates the whole infrastructure required to create the song, to be the influencer that represents that song, that kind of thing. And makes a lot of them. That could be the… Our culture respects money in that kind of way.

Dylan Patel (02:53:07) And it’s verifiable, right?

Lex Fridman (02:53:44) It’s verifiable, right?

Dylan Patel (02:53:47) The bank account can’t lie.

Nathan Lambert (02:53:49) There’s surprising evidence that once you’ve set up the ways of collecting the verifiable domain that this can work. There’s been a lot of research before this R-1 on math problems, and they approach math with language models just by increasing the number of samples, so you can just try again and again and again. And you look at the amount of times that the language models get it right, and what we see is that even very bad models get it right sometimes. And the whole idea behind reinforcement learning is that you can learn from very sparse rewards.

(02:54:22) The space of language and the space of tokens, whether you're generating language or tasks for a robot, is so big. The tokenizer for a language model can be like 200,000 things, so at each step, it can sample from that big of a space. If it can generate a bit of a signal that it can climb onto, that's what the whole field of RL is about: learning from sparse rewards. And the same thing has played out in math, where very weak models sometimes generate answers, and you already see research showing that you can boost their math scores. You can do this RL training for math; it might not be as effective, but if you take a 1 billion parameter model,

Nathan Lambert (02:55:00) … something 600 times smaller than DeepSeek, you can boost its grade school math scores very directly with a small amount of this training. So, it’s not to say that this is coming soon. Setting up the verification domains is extremely hard and there’s a lot of nuance in this, but there are some basic things that we have seen before where it’s at least expectable that there’s a domain and there’s a chance that this works.

OpenAI o3-mini vs DeepSeek R1

Lex Fridman (02:55:23) All right. So, we have fun things happening in real time. This is a good opportunity to talk about other reasoning models, o1, o3, just now OpenAI, as perhaps expected, released o3-mini. What are we expecting from the different flavors? Can you just lay out the different flavors of the o models and from Gemini, the reasoning model?

Nathan Lambert (02:55:47) Something I would say about these reasoning models is that we've talked a lot about reasoning training on math and code. What is done is that you have the base model we've talked about, trained on a lot of the internet; you do this large-scale reasoning training with reinforcement learning; and then what DeepSeek detailed in this R1 paper, which for me addresses one of the big open questions of how you do this, is that they did reasoning-heavy but very standard post-training techniques after the large-scale reasoning RL. So they did the same things: a form of instruction tuning through rejection sampling, which is essentially heavily filtered instruction tuning with some reward models, and then they did this RLHF, but they made it math-heavy.

(02:56:27) So, some of this transfer, we looked at this philosophical example early on. One of the big open questions is, how much does this transfer? If we bring in domains after the reasoning training, are all the models going to become eloquent writers by reasoning? Is this philosophy stuff going to be open? We don’t know in the research of how much this will transfer. There’s other things about how we can make soft verifiers and things like this, but there is more training after reasoning, which makes it easier to use these reasoning models. And that’s what we’re using right now. So if we’re going to talk about o3-mini and o1, these have gone through these extra techniques that are designed for human preferences after being trained to elicit reasoning.

Dylan Patel (02:57:06) I think one of the things that people are ignoring is Google’s Gemini Flash Thinking is both cheaper than R1 and better, and they released it in the beginning of December-

Nathan Lambert (02:57:17) And nobody’s talking about it.

Nathan Lambert (02:57:18) It has a different flavor to it. Its behavior is less expressive than something like o1; it feels like it's on fewer tracks. Qwen released a model last fall, QwQ, which was their preview reasoning model, and DeepSeek had R1-Lite last fall, and those models kind of felt like they were on rails, where they really, really only could do math and code, whereas o1 can answer anything. It might not be perfect for some tasks, but it's flexible, it has some richness to it, and this is kind of the art of asking: is a model a little bit undercooked? It's good to get a model out the door, but it's hard to gauge, and it takes a lot of taste to say, is this a full-fledged model? Can I use this for everything? They're probably more similar for math and code.

(02:58:04) My quick read is that Gemini Flash is not trained the same way as o1, but by taking an existing, more normal training stack and adding reasoning to it. And I'm sure they're going to have more. I mean, they've done quick releases on Gemini Flash reasoning; this is the second version since the holidays. It's evolving fast, and it takes longer to build the training stack where you're doing this large-scale RL-

Dylan Patel (02:58:32) Ask it the same question from earlier, the one about the-

Nathan Lambert (02:58:35) The human nature.

Lex Fridman (02:58:38) What was the human nature one?

Nathan Lambert (02:58:39) The reason I can ramble about this so much is that we've been working on this at Ai2 since before o1 was fully available to everyone and before R1, essentially using this RL training for fine-tuning. We use this in our Tülu series of models, and you can elicit the same behaviors, where it says "wait" and so on, but it's so late in the training process that this kind of reasoning expression is much lighter. So there's essentially a gradation, and just how much of this RL training you put into it determines how the output looks.

Lex Fridman (02:59:13) So, we're now using Gemini 2.0 Flash Thinking Experimental 01-21.

Nathan Lambert (02:59:20) It summarized the problem as humans being self-domesticated apes.

Lex Fridman (02:59:28) Okay. All right. So, wait, is this reviewing the reasoning? Here's why this is novel. Okay.

Dylan Patel (02:59:33) You can click to expand.

Nathan Lambert (02:59:35) Oh, yeah, click to expand.

Lex Fridman (02:59:36) Okay. Analyze the request. Novel is the keyword.

Nathan Lambert (02:59:41) See how it just looks a little different? It looks like a normal output.

Lex Fridman (02:59:45) Yeah. I mean in some sense, it’s better structured. It makes more sense. And-

Dylan Patel (02:59:50) Oh, and it latched onto human and then it went into organisms and… Oh, wow.

Lex Fridman (02:59:56) Apex Predator. Focus on domestication. Apply domestication to humans. Explore the idea of self-domestication.

Nathan Lambert (03:00:05) Not good, not good.

Lex Fridman (03:00:07) Where is this going? Refine, articulate the insight. Greater facial expressiveness and communication ability, yes. Plasticity and adaptability, yes. Dependence on social groups, yes. All right. And self-critique, refine further. Wow. Is this truly novel? Is it well-supported? So on and so forth. And the insight it’s getting at is humans are not just social animals but profoundly self-domesticated apes. And this self-domestication is the key to understanding our unique cognitive and social abilities. Self-domesticated apes. Self-domesticated-

Nathan Lambert (03:00:46) I prefer the DeepSeek response.

Lex Fridman (03:00:49) I mean it’s novel. The insight is novel. I mean that’s like a good book title; Self-Domesticated Apes. There could be a case made for that. I mean, yeah, it’s cool and it’s revealing the reasoning. It’s magical. It’s magical. This is really powerful.

(03:01:08) Hello, everyone, this is Lex with a quick intermission, recorded after the podcast. Since we reviewed responses from DeepSeek R1 and Gemini 2.0 Flash Thinking during this conversation, I thought at this moment it would be nice to insert myself, quickly doing the same for OpenAI o1-pro and o3-mini with the same prompt, the prompt being: give one truly novel insight about humans. And I thought I would, in general, give my vibe check and vibe-based anecdotal report on my own experiences with the new o3-mini model, now that I've had a chance to spend many hours with it in different kinds of contexts and applications.

(03:01:55) So, I would probably categorize this question as, let's say, an open-ended philosophical question. And in particular, the emphasis on novelty, I think, is a nice way to test one of the capabilities of the model, which is to come up with something that makes you pause and almost surprises you with brilliance.

(03:02:16) So that said, my general review after running each of the models on this question a bunch of times is that o1-pro consistently gave brilliant answers, ones that gave me pause and made me think, both cutting in their insight and just really nicely phrased, with wit, with clarity, with nuance, over and over, consistently generating the best answers. After that is R1, which was less consistent, but again delivered brilliance. Gemini Flash 2.0 Thinking was third, and last was o3-mini, actually. It often gave quite a generic answer, at least to my particular sensibilities. That said, in a bunch of other applications that I tested for brainstorming purposes, it actually worked extremely well and often outperformed R1. But on this open-ended philosophical question, it did consistently worse.

(03:03:17) Now, another important element for each of these models is how the reasoning is presented. DeepSeek R1 shows the full chain of thought tokens, which I personally just love. For these open-ended philosophical questions, it's really, really interesting to see the model think through it, but also, just stepping back, as a person who appreciates intelligence and reasoning and reflection, reading these kinds of raw chain of thought tokens from R1, there's something genuinely beautiful about observing the path of deliberation in an intelligent system. I think we don't always have that explicitly laid out for us humans. So, to see it in another intelligent system, the nonlinearity of it, akin to Ulysses or Finnegans Wake by James Joyce, it's just beautiful to watch.

(03:04:09) Anyways, as we discussed in the episode, DeepSeek R1 talked about humans being able to convert selfish desires into cooperative systems by collectively pretending abstract rules like money, laws, and rights are real. And these shared hallucinations act as games where competition is secretly redirected to benefit the group, turning conflict into society's fuel. Gemini 2.0 Flash Thinking said, "Humans are not just social animals but self-domesticated apes. And this self-domestication is the key to understanding our unique cognitive and social abilities."

(03:04:45) Now, it’s important to say that the chain of thought there was really interesting. It was looking through the entire evolution of life on earth considering apex predators and considering how from that, we ended up to where we are. I think that domestication by choice is a really interesting angle. Again, it’s one of those things when somebody presents a different angle on a seemingly obvious thing, it just makes me smile. And the same with DeepSeek R1, that these hallucinations of money laws and rights and us collectively pretending like it’s real and we play games with them that look like competition when secretly we’re just cooperating with each other and that is the fuel of progress. Beautifully put.

(03:05:31) Now, OpenAI o1-pro consistently, over and over, delivered bangers. I can go through many of them, but the first one was, "Humans are the only species that turns raw materials into symbolic resources, then uses those symbols to reorganize the very materials they came from, creating a closed feedback loop between meaning and matter." Here, I just ran it again. Banger after banger, I'm telling you. "Humans are unique among known species in that they simultaneously rewrite two layers of reality, the external world and their own private mental landscapes, and then merge these two rewritten layers into a continuous personal narrative that feels objectively true." Feels true. This is poetry.

(03:06:19) Okay. And then o3-mini high, for me, was smart, fast actually, and kind of generic. Never quite got there for me. So here’s the first one I got from o3-mini, “Humans are not fixed beings, but rather ongoing narratives, dynamic stories that we continuously write, edit, and reinterpret. This narrative plasticity is more than just memory or self-reflection. It’s an intrinsic cognitive process that acts like an internal error correction system. It allows us to adapt our identities and values over time in response to new experiences, challenges, and social contexts.” Now, it almost sneaks up to something approximating cutting insight with narrative plasticity in quotes. But then it goes back to the generic. I don’t know.

(03:07:10) All of these models are incredible for different reasons. There’s a lot of concerns as we discussed in this episode, but there’s a lot of reasons to be excited as well. And I’ve probably spoken for too long. I am severely sleep-deprived, borderline delirious. So hopefully some of this made sense. And now, dear friends, back to the episode.

Dylan Patel (03:07:36) I think to Nathan’s point, when you look at the reasoning models, to me, even when I used R1 versus o1, there was that sort of rough edges around the corner feeling. And Flash Thinking earlier, I didn’t use this version, but the one from December, and it definitely had that rough edges around the corner feeling where it’s just not fleshed out in as many ways. Sure, they added math and coding capabilities via these verifiers in RL, but it feels like they lost something in certain areas. And o1 is worse performing than Chat in many areas as well, to be clear-

Dylan Patel (03:08:15) Not by a lot though, right? And R1 definitely felt to me like it was worse than V3 in certain areas: doing this RL let it express and learn a lot, but then it weakened in other areas. And so I think that's one of the big differences between these models and what each one offers. And then OpenAI has o1-pro, and what they did with o3, which is also very unique, is that they stacked search on top of chain of thought. Chain of thought is one thing, where it's one chain that backtracks and goes back and forth, but how they solved the ARC-AGI challenge was not just the chain of thought; it was also sampling many times, i.e., running them in parallel and then selecting.

Nathan Lambert (03:08:58) Is running in parallel actually search? Because I don’t know if we have the full information on how o1-pro works. So, I don’t have enough information-

Nathan Lambert (03:09:05) … to confidently say that it is search.

Dylan Patel (03:09:07) It is parallel samples.

Nathan Lambert (03:09:08) Yeah. And then what.

Dylan Patel (03:09:09) And then it selects something.

Nathan Lambert (03:09:10) And we don't know what the selection function is. The reason why we're debating this is that, since o1 was announced, there's been a lot of interest in techniques called Monte Carlo Tree Search, which is where you break the chain of thought down into intermediate steps. We haven't defined chain of thought. Chain of thought is from a paper from years ago, where the idea was introduced of asking a language model, which at the time was much harder to use, to think step by step: you would say, "Let's think step by step," and it would induce the model to produce this bulleted list of steps. Chain of thought is now almost a default in models, where if you ask a math question, you don't need to tell the model to think step by step. And the idea with Monte Carlo Tree Search is that you would take an intermediate point in that chain, do some sort of expansion, spend more compute, and then select the right one. That's a very complex form of search that has been used in things like MuZero and AlphaZero, potentially. I know MuZero does this.

Dylan Patel (03:10:01) Another form of search is just asking five different people and then taking the majority answer. There's a variety; it could be complicated, it could be simple. We don't know what it is, just that they are not issuing one chain of thought in sequence. They're launching many in parallel, and in the ARC-AGI result that really shocked everyone and beat the benchmark, they launched a thousand in parallel, and then they would get the right answer like 70% or 80% of the time, maybe even 90%, whereas if they just launched one, it was like 30%.
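
The simplest concrete version of "launch many in parallel and then select" is majority voting over final answers, often called self-consistency. A sketch with the sampler as a placeholder; whatever selection function o1-pro or o3 actually uses has not been disclosed.

```python
from collections import Counter

def sample_final_answer(prompt: str) -> str:
    """Placeholder: run one full chain of thought and return only its final answer."""
    raise NotImplementedError

def majority_vote(prompt: str, k: int = 1000) -> str:
    """Sample k independent chains (in practice, in parallel) and return the
    most common final answer. Fancier selectors (reward models, tree search)
    slot in where the Counter does."""
    answers = [sample_final_answer(prompt) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]
```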

Nathan Lambert (03:10:33) There are many extensions to this. I would say the simplest one is that our language models to date have been designed to give the right answer the highest percentage of the time in one response. And we are now opening the door to different ways of running inference on our models in which we need to reevaluate many parts of the training process, which normally opens the door to more progress, but we don’t know if OpenAI changed a lot or if just sampling more and multiple choice is what they’re doing or if it’s something more complex, but they changed the training and they know that the inference mode is going to be different.

Lex Fridman (03:11:07) So we're talking about o1-pro, $200 a month, and they're losing money. The thing that we're referring to, this fascinating exploration of the test-time compute space, is that actually possible? Do we have enough compute for that? Do the financials make sense?

Dylan Patel (03:11:27) So the fantastic thing is, and it's in the thing that I pulled up earlier, the cost for GPT-3 has plummeted, if you scroll up just a few images, I think. The important thing about, hey, is cost a limiting factor here? My view is that we'll have really awesome intelligence, like AGI, before we have it permeate throughout the economy, and this is why. GPT-3 was trained in what, 2020? 2021? And the cost for running inference on it was $60, $70 per million tokens, so the cost per unit of intelligence was ridiculous. Now, as we've scaled forward a couple of years, we've had a 1200X reduction in the cost to achieve the same level of intelligence as GPT-3.

Lex Fridman (03:12:15) So here on the x-axis is time over just a couple of years, and on the y-axis is log scale dollars to run inference on a million tokens.

Nathan Lambert (03:12:27) Yeah, it's dollars per million tokens.

Lex Fridman (03:12:30) So you have just a linear decline on a log scale from GPT-3 through 3.5 to Llama-

Dylan Patel (03:12:37) It's like five cents or something like that now, right? Versus $60. That's a 1200X difference; those aren't the exact numbers, but it's 1200X, I remember that number. That $60 was a humongous cost per unit of intelligence. Now, the freak-out over DeepSeek is, "Oh my god, they made it so cheap." It's like, actually, if you look at this trend line, they're not below the trend line, first of all, at least for GPT-3, right? They are the first to hit it, which is a big deal, but they're not below the trend line as far as GPT-3 goes. Now we have GPT-4; what's going to happen with these reasoning capabilities? It's a mix of architectural innovations, it's a mix of better data, and it's going to be better training techniques, and all of these better inference systems, better hardware, going from each generation of GPU to new generations, or ASICs.
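
The rough arithmetic behind that number, using the approximate figures quoted here (about $60 per million tokens for GPT-3 at launch versus about five cents now):

$$
\frac{\$60 \text{ per M tokens}}{\$0.05 \text{ per M tokens}} = 1200\times
$$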

(03:13:22) Everything is going to take this cost curve down and down and down. And then can I just spawn a thousand different LLMs to do a task and then pick from one of them? Or whatever search technique I want: a tree search, Monte Carlo Tree Search, maybe it gets that complicated, maybe it doesn't because it's too complicated to actually scale. Who knows? Bitter lesson, right?

(03:13:43) The question is, I think, when, not if, because the rate of progress is so fast. Dario said nine months ago that the cost to train and inference was this, and now we're much better than that, and DeepSeek is much better than that. And that cost curve for GPT-4, which was also roughly $60 per million tokens when it launched, has already fallen to $2 or so. And we're going to get it down to cents, probably, for GPT-4 quality. And then that's the base for the reasoning models like o1 that we have today, and o1-pro spawning multiple chains, and o3, and so on and so forth. These search techniques are too expensive today, but they will get cheaper, and that's what's going to unlock the intelligence.

NVIDIA

Lex Fridman (03:14:31) So, it'll get cheaper and cheaper and cheaper. The big DeepSeek R1 release freaked everybody out because it was cheaper. One of the manifestations of that is NVIDIA stock plummeted. Can you explain what happened? And also just explain this moment and whether NVIDIA is going to keep winning.

Nathan Lambert (03:14:52) We are both NVIDIA bulls here, I would say. And in some ways, the market response is reasonable. NVIDIA's biggest customers in the US are major tech companies, and they're spending a ton on AI. A simple interpretation of DeepSeek is that you can get really good models without spending as much on AI. So in that capacity it's like, "Oh, maybe these big tech companies won't need to spend as much on AI," and the stock goes down.

(03:15:18) The actual thing that happened is much more complex, where there are social factors, where there's the rise in the App Store, the social contagion that is happening. And then I think some of it is just, I don't trade, I don't know anything about financial markets, but the social pressure builds up over the weekend. If it had been during the week, there would have been multiple days of trading as this was really building, but it came over the weekend, and then everybody wants to sell, and that is a social contagion.

Dylan Patel (03:15:43) I think there were also a lot of false narratives, like, "Hey, these guys are spending billions on models," and they're not spending billions on models. No one has spent more than a billion dollars on a model that's been released publicly. GPT-4 was a couple hundred million, and then they've reduced the cost with 4 Turbo and 4o, right? But billion-dollar model runs are coming, and that includes pre-training and post-training, right? And then the other number is like, "Hey, DeepSeek didn't include everything." A lot of the cost goes to research and all this sort of stuff. A lot of the cost goes to inference. A lot of the cost goes to post-training. None of these things were factored in. Research, salaries, all these things are counted in the "billions of dollars" that OpenAI is spending, but they weren't counted in the, "Hey, $5 million, $6 million that DeepSeek spent."

(03:16:27) So, there's a bit of misunderstanding of what these numbers are, and then there's also an element of… NVIDIA has just been a straight line up, and there have been so many different narratives trying to push NVIDIA down. I don't mean deliberately push down NVIDIA's stock; everyone is just looking for a reason to sell or to be worried. It was Blackwell delays, right? Their GPU: every two weeks there's a new report about their GPUs being delayed. There's the whole thing about scaling laws ending, right? It's so ironic-

Nathan Lambert (03:16:57) It lasted a month.

Dylan Patel (03:17:00) It was literally just, “Hey, models aren’t getting better.” They’re just not getting better. There’s no reason to spend more, pre-training scaling is dead. And then it’s like o1, o3, right?

Dylan Patel (03:17:11) R1, right? And now it’s like, “Wait, models, they’re progressing too fast. Slow down the progress, stop spending on GPUs.” But the funniest thing I think that comes out of this is Jevons paradox is true. AWS pricing for H100s has gone up over the last couple of weeks, since a little bit after Christmas, since V3 was launched, AWS H100 pricing has gone up. H200s are almost out of stock everywhere because H200 has more memory and therefore R1 wants that chip over H100, right?

Nathan Lambert (03:17:43) We were trying to get GPUs on a short notice this week for a demo and it wasn’t that easy. We were trying to get just 16 or 32 H100s for demo and it was not very easy.

Lex Fridman (03:17:51) So for people who don't know, Jevons paradox is when the efficiency goes up, somehow, magically, counterintuitively, the total resource consumption goes up as well.

Dylan Patel (03:18:03) And semiconductors have had 50 years of Moore's Law: every two years, half the cost, double the transistors, just like clockwork. It's slowed down obviously, but the semiconductor industry has grown the whole time. It's been wavy, right? There are obviously cycles and stuff, and I don't expect AI to be any different. There are going to be ebbs and flows, but in AI, it's just playing out at an insane timescale. It was 2X every two years; this is 1200X in like three years. The scale of improvement is hard to wrap your head around.

Lex Fridman (03:18:34) Yeah. I was confused because to me, NVIDIA stock on that should have gone up, but maybe it went down because there’s suspicion of foul play on the side of China, something like this. But if you just look purely at the actual principles at play here, it’s obvious. Yeah, the Jevons paradox-

GPU smuggling

Nathan Lambert (03:18:53) The more progress that AI makes or the higher the derivative of AI progress is, especially because NVIDIA’s in the best place, the higher the derivative is, the sooner the market’s going to be bigger and expanding and NVIDIA’s the only one that does everything reliably right now.

Lex Fridman (03:19:07) Yeah, because it’s not like an NVIDIA competitor arose. It’s another company that’s using NVIDIA-

Nathan Lambert (03:19:14) Who historically has been a large NVIDIA customer.

Dylan Patel (03:19:18) And has press releases about them cheering about being China’s biggest NVIDIA customer, right?

Dylan Patel (03:19:25) Obviously they’ve quieted down, but I think that’s another element of it is that they don’t want to say how many GPUs they have because hey, yes, they have H800s, yes, they have H20s, they also have some H100s, right? Which were smuggled in.

Lex Fridman (03:19:37) Can you speak to that, to the smuggling? What’s the scale of smuggling that’s feasible for a nation state to do for companies? Is it possible to-

Dylan Patel (03:19:47) I think there are a few angles of "smuggling" here, right? One is ByteDance, which is arguably the largest smuggler of GPUs for China. China's not supposed to have GPUs. ByteDance has over 500,000 GPUs. Why? Because they're all rented from companies around the world. They rent from Oracle, they rent from Google, they rent from all of these, and a bunch of smaller cloud companies too, right? All the "neoclouds" of the world. They rent so, so many GPUs. They also buy a bunch. And they do this for mostly what Meta does, right? Serving TikTok, right? Serving next best-

Nathan Lambert (03:20:17) Separate discussion.

Dylan Patel (03:20:18) Same as Meta, right? To be clear, today, that's the use, and it's a valid use, right? Hack the dopamine circuit. Now, that's theoretically very much restricted with the AI diffusion rules, which happened in the last week of the Biden admin, and the Trump admin looks like they're going to keep them. These limit even allies, like Singapore, which is 20%, 30% of NVIDIA's revenue, but Singapore has had a moratorium on building data centers for 15 years because they don't have enough power. So, where are those GPUs going?

Dylan Patel (03:20:51) I’m not claiming they’re all going to China, but a portion, many are going to Malaysia, including Microsoft and Oracle have big data centers in Malaysia. They’re going all over Southeast Asia probably, India as well. There’s stuff routing, but the diffusion rules are very de facto, like you can only buy this many GPUs from this country and you can only rent a cluster this large to companies that are Chinese. They’re very explicit on trying to stop smuggling.

(03:21:15) And a big chunk of it was, hey, a random company buys 16 servers and ships them to China. I actually saw a photo from someone in the semiconductor industry who leads a team for networking chips that competes with NVIDIA, and he sent a photo of a guy checking into a first-class United flight from San Francisco to Shanghai or Shenzhen with a Supermicro box that was this big, which can only contain GPUs, right? And he was booking first class because, think about it: $3K to $5K for your first-class ticket, the server costs $240,000 or $250,000 in the US, and you sell it for $300,000 in China. Wait, you just got a free first-class ticket and a lot more money. So it's like… And that's small-scale smuggling. Most of the large-scale smuggling is companies in Singapore and Malaysia routing them around, or renting GPUs, completely legally-
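
The back-of-the-envelope math in that anecdote, using the numbers quoted above:

$$
\$300{,}000 \;-\; \$250{,}000 \;-\; \$5{,}000 \;\approx\; \$45{,}000 \text{ of profit per trip, ticket included}
$$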

Nathan Lambert (03:22:10) I want to jump in. How much does this scale? Some people with a higher-level economics understanding have said that as you go from $1 billion of smuggling to $10 billion, you're trying to hide ever larger amounts of economic activity, and the most reasonable thing to me is that there's going to be some level at which it's so obvious that it becomes easy to find this economic activity. And-

Dylan Patel (03:22:30) Yeah. So, my belief is that last year, roughly, NVIDIA made a million H20s, which are legally allowed to be shipped to China, and which, as we talked about, are better for reasoning inference at least, not training, but reasoning inference and inference generally. Then there were also a couple hundred thousand, we think like 200,000 to 300,000, GPUs routed to China from Singapore, Malaysia, the US, wherever. Companies spawn up, buy 16 GPUs, 64 GPUs, whatever it is, and route them, and Huawei is known for having spun up a massive network of companies to get the materials they need after they were banned in 2018. So, it's not otherworldly, but I agree, right? Nathan's point is like, hey, you can't smuggle $10 billion of GPUs.

(03:23:13) And then the third source, which is just now banned, which wasn’t considered smuggling, but is China is renting, I believe from our research, Oracle’s biggest GPU customer is ByteDance. And for Google, I think it’s their second-biggest customer. And you go down the list of clouds and especially these smaller cloud companies that aren’t the “hyperscalers,” think beyond CoreWeave and Lambda even, there’s 60 different new cloud companies serving NVIDIA GPUs. I think ByteDance is renting a lot of these, all over it, right?

(03:23:44) And so these companies are renting GPUs to Chinese companies, and that was completely legal up until the diffusion rules, which happened just a few weeks ago. And even now, you can rent GPU clusters that are less than 2,000 GPUs, or you can buy GPUs and ship them wherever you want if it's fewer than 1,500 GPUs. There are still some ways to smuggle, but yeah, as the numbers grow, a hundred-something billion dollars of revenue for NVIDIA last year, two-hundred-something billion this year, and next year it could nearly double again or more than double, based on what we see with data center footprints being built out all across the US and the rest of the world, it's going to be really hard for China to keep up with these rules.

(03:24:28) Yes, there will always be smuggling, and DeepSeek-level models, GPT-4-level models, o1-level models are capable of being trained on what China can get, even the next tier above that. But if we speedrun a couple more jumps to billion-dollar models, $10 billion models, then it becomes, "Hey, there is a compute disadvantage for China for training models and serving them." And the serving part is really critical, right? DeepSeek cannot serve their model today. It's completely out of inventory. Its downloads have actually already started falling in the App Store, because you download it, you try to sign up, and they say, "We're not taking registrations," because they have no capacity. You open it up, you get less than five tokens per second, if you even get your request approved, right? Because there's just no capacity; they just don't have enough GPUs to serve the model, even though it's incredibly efficient.

Lex Fridman (03:25:14) It’d be fascinating to watch the smuggling. Because I mean there’s drug smuggling, right? That’s a market. There’s weapons smuggling. And GPUs will surpass that at some point.

Nathan Lambert (03:25:25) Chips are highest value per kilogram probably by far. I have another question for you, Dylan. Do you track model API access internationally? How easy is it for Chinese companies to use hosted model APIs from the US?

DeepSeek training on OpenAI data

Dylan Patel (03:25:42) Yeah. I mean, that's incredibly easy, right? OpenAI publicly stated DeepSeek uses their API, and they say they have evidence, right? And this is another element of the training regime: people at OpenAI have claimed that it's a distilled model, i.e., you're taking OpenAI's model, you're generating a lot of output, and then you're training on that output in your own model. And even if that's the case, what DeepSeek did is still amazing, by the way, efficiency-wise.

Nathan Lambert (03:26:04) Distillation is standard practice in industry. Whether or not, if you’re at a closed lab where you care about terms of service and IP closely, you distill from your own models. If you are a researcher and you’re not building any products, you distill from the OpenAI models-

Lex Fridman (03:26:16) This is a good opportunity. Can you explain big picture distillation as a process? What is distillation? What’s the process of distillation?

Nathan Lambert (03:26:24) We've talked a lot about training language models. They are trained on text, and in post-training you're trying to train on very high-quality text that you want the model to match the features of, or, if you're using RL, you're letting the model find its own thing. But for supervised fine-tuning, for preference data, you need to have some completions, what the model is trying to learn to imitate. And what you do there is, instead of human data, or instead of the model you're currently training, you take completions from a different, normally more powerful, model. I think there are rumors that these big models that people are waiting for, these GPT-5s of the world, the Claude 3 Opuses of the world, are used internally to do this distillation process at OpenAI-
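
A sketch of distillation as described here: a stronger teacher model writes completions for a set of prompts, and the student is fine-tuned to imitate them. Both model calls are placeholders; whether the teacher is your own larger model or someone else's API is exactly where the terms-of-service questions below come in.

```python
def teacher_generate(prompt: str) -> str:
    """Placeholder: completion from the stronger (teacher) model."""
    raise NotImplementedError

def supervised_finetune(student_model, dataset):
    """Placeholder: standard SFT on (prompt, completion) pairs."""
    raise NotImplementedError

def distill(student_model, prompts):
    # Build a synthetic SFT dataset from the teacher's outputs...
    dataset = [(p, teacher_generate(p)) for p in prompts]
    # ...and train the student to imitate those completions.
    return supervised_finetune(student_model, dataset)
```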

Dylan Patel (03:27:04) There are also public examples, right? Meta explicitly stated, not necessarily distilling, but they used 405B as a reward model for 70B in their Llama 3.2 or 3.3 release-

Nathan Lambert (03:27:15) Yes. This is all the same topic.

Lex Fridman (03:27:16) So, is this ethical? Is this legal? Why does that Financial Times article headline say, "OpenAI says that there's evidence that China's DeepSeek used its model to train competitor"?

Nathan Lambert (03:27:31) This has a long history, at least on the academic and research side, because you're trying to interpret OpenAI's rule. OpenAI's terms of service say that you cannot build a competitor with outputs from their models. Terms of service are different from a license, which is essentially a contract between organizations. The terms of service apply to my OpenAI account: if I violate them, OpenAI can cancel my account. This is very different from a license that says how you can use a downstream artifact. So a lot of it hinges on a word that is very unclear in the AI space, which is: what is a competitor?

Dylan Patel (03:28:02) And then the ethical aspect of it is like, why is it unethical for me to train on your model when you can train on the internet’s text? Right?

Lex Fridman (03:28:10) So there’s a bit of a hypocrisy because OpenAI and potentially most of the companies trained on the internet’s text without permission.

Nathan Lambert (03:28:20) There’s also a clear loophole, which is that I generate data from OpenAI and then I upload it somewhere and then somebody else trains on it and the link has been broken. They’re not under the same terms of service contract.

Nathan Lambert (03:28:33) There's a lot of… There are a lot of details still to be discovered here that don't make a lot of sense.

Dylan Patel (03:28:38) This is why a lot of models today, even if they trained on zero OpenAI data, when you ask the model, "Who trained you?" it'll say, "I'm ChatGPT, trained by OpenAI," because there's so much copy-paste of OpenAI outputs on the internet that you just weren't able to filter it out, and there was nothing in the RL they implemented, or the post-training or SFT or whatever, that says, "Hey, I'm actually a model by the Allen Institute instead of OpenAI."

Nathan Lambert (03:29:03) We have to do this if we serve a demo. We do research, and we use OpenAI APIs because it's useful and we want to understand post-training, and our research models will all say they're written by OpenAI unless we put in the system prompt that we talked about: "I am Tülu, a language model trained by the Allen Institute for AI." And if you ask more people around industry, especially with post-training, it's a very doable task to make the model say who it is, or to suppress the OpenAI thing. So on some level, it might be that DeepSeek didn't care that it was saying it was by OpenAI. If you're going to upload model weights, it doesn't really matter, because anyone serving it in an application who cares a lot about serving is going to tailor it to their specific task anyway, and it doesn't matter that it's saying it's ChatGPT.

Lex Fridman (03:29:49) Oh, I guess one of the ways to do that is like a system prompt or something like that? If you’re serving it to say that you’re-

Nathan Lambert (03:29:55) That’s what we do. If we host a demo, you say, “You are Tülu 3, a language model trained by the Allen Institute for AI.” We also benefit from OpenAI data because it’s a great research tool.
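As a concrete illustration of the system-prompt fix Nathan describes, here is a minimal sketch assuming an OpenAI-compatible serving endpoint (for example a local vLLM server); the base URL and model identifier are placeholders, and the identity string is the one quoted above.

```python
from openai import OpenAI

# Placeholder endpoint and model name for whatever OpenAI-compatible server you run locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

messages = [
    {"role": "system",
     "content": "You are Tülu 3, a language model trained by the Allen Institute for AI."},
    {"role": "user", "content": "Who trained you?"},
]

reply = client.chat.completions.create(model="tulu-3-demo", messages=messages)
print(reply.choices[0].message.content)  # should now answer as Tülu, not as ChatGPT
```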

Lex Fridman (03:30:06) Do you think there’s any truth and value to OpenAI’s claim that there’s evidence that China’s DeepSeek used its model to train?

Dylan Patel (03:30:16) I think everyone has benefited regardless because the data’s on the internet, and therefore it’s in your pre-training now. There are subreddits where people share the best ChatGPT outputs, and those are in your model-

Nathan Lambert (03:30:29) I think that they’re trying to shift the narrative. They’re trying to protect themselves. We saw this years ago when ByteDance was actually banned from some OpenAI APIs for training on outputs. There are other AI startups that, if you’re in the AI culture, basically told us they trained on OpenAI outputs, and they never got banned. That’s how they bootstrapped their early models.

(03:30:51) So, it’s much easier to get off the ground using this than to set up human pipelines and build a strong model. So there’s long history here, and a lot of the communications seem like narrative [inaudible 03:31:00].

Dylan Patel (03:31:00) Actually, over the last couple of days, we’ve seen a lot of people distill DeepSeek’s model into Llama models, because the DeepSeek models are complicated to run inference on because they’re mixture of experts and they’re 600 plus billion parameters and all of this. And people distilled them into the Llama models because the Llama models are so easy to serve, and everyone’s built the pipelines and tooling for inference with the Llama models because it’s the open standard.

(03:31:24) So, we’ve seen a sort of roundabout. Is it bad? Is it illegal? Maybe it’s illegal, whatever. I don’t know about that, but-

Nathan Lambert (03:31:30) It could break contracts. I don’t think it’s illegal in any legal… No one’s going to jail for this, ever.

Lex Fridman (03:31:36) Fundamentally, I think it’s ethical, or I hope it’s ethical, because the moment we ban that kind of thing, it’s going to make everybody much worse off. And I also, actually…

(03:31:50) This is difficult, but I think you should be allowed to train on the internet. I know a lot of authors and creators are very sensitive about it. That’s a difficult question. But the moment you’re not allowed to train on the internet-

Dylan Patel (03:32:03) I have a schizo take on how you can solve this. Because it already works.

Nathan Lambert (03:32:07) I have a reasonable take out of it.

Lex Fridman (03:32:09) All right, [inaudible 03:32:10].

Dylan Patel (03:32:10) So: A, Japan has a law under which you’re allowed to train on any data and copyright doesn’t apply if you’re training a model. B, Japan has 9 gigawatts of curtailed nuclear power. C, Japan is allowed under the AI diffusion rule to import as many GPUs as they’d like. So, all we have to do…

(03:32:29) We have a market here to make. We build massive data centers, we rent them to the labs, and then we train models in a legally permissible way, and there’s no ifs, ands, or buts. And now, the models have no potential copyright lawsuit from New York Times or anything like that. No, it’s just completely legal.

Nathan Lambert (03:32:47) … the early copyright lawsuits have fallen in favor of AI training. I would say that the long tail of use is going to go inside of AI, which is, if you scrape trillions of tokens of data, you’re not looking and saying, “This one New York Times article is so important to me.” But if you’re doing audio generation for music or image generation, and you say, “Make it in the style of X person,” that’s a reasonable case where you could figure out what their profit margin on inference is. I don’t know if it’s going to be the 50/50 of the YouTube Creator Program or something, but I would opt into that program as a writer, please.

(03:33:28) It’s going to be a rough journey, but there will be some solutions like that that makes sense. But there’s a long tail where it’s just on the internet.

Lex Fridman (03:33:35) I think that’s one of the other things that Financial Times article implied, and it leads to a more general question. How difficult is spying, espionage, and stealing of actual secret code and data from inside of companies? How much of that is being attempted?

Nathan Lambert (03:33:55) Code and data are hard, but ideas are easy. Silicon Valley operates on the basis that top employees get bought out by other companies for a pay raise, and a large reason why these companies do this is to bring ideas with them. And there’s no… I mean, in California, there are rules making certain non-competes or whatever illegal. And whether or not there are NDAs and things, that is how a lot of it happens. Recently, there was somebody from Gemini who helped make the 1 million token context length, and he went to the Meta team, and everyone is saying the next Llama is going to have 1 million context length. And that’s kind of how the world works.

Dylan Patel (03:34:34) As far as industrial espionage and things, that has been greatly successful in the past. The Americans did it to the Brits, the Chinese have done it to the Americans, and so on and so forth. It is a fact of life. And so, to argue industrial espionage can be stopped is probably unlikely. You can make it difficult. But even then, there’s all these stories about like, “Hey, F35 and F22 have already been given to China in terms of design plans and stuff.”

(03:35:02) Stealing code and stuff between, I’d say, companies, not nation states, is probably very difficult. But ideas are discussed a lot, whether it be a house party in San Francisco or a company changing employees, or always the mythical honeypot that always gets talked about. Someone gets honeypotted because everyone working on AI is a single dude in their 20s and 30s. Not everyone, but an insane amount of… Insane percentages. So, there’s always all these… And obviously-

Lex Fridman (03:35:34) So, honeypotted is like a female spy approaches you and…

Dylan Patel (03:35:38) Yeah. Or male, right? It’s San Francisco. But as a single dude, I will say in his late 20s, we are very easily corrupted. Not corrupted myself, but we are. Right?

Lex Fridman (03:35:51) Yeah. Everybody else. Not me.

Nathan Lambert (03:35:54) I’m too oblivious, and I’m not single, so I’m safe from one espionage vector.

AI megaclusters

Lex Fridman (03:36:00) Yeah. You have to make sure to close all security vulnerabilities. So you, Dylan, collect a lot of information about each of the mega clusters for each of the major AI companies. Can you talk about the buildouts for each one that stand out?

Dylan Patel (03:36:18) Yeah. I think the thing that’s really important about these mega cluster buildouts is they’re completely unprecedented in scale. US data center power consumption has been slowly on the rise, even through the cloud computing revolution. Data center consumption as a percentage of total US power has been climbing slowly over decades of data centers, etc., and it’s now at 2 to 3%.

(03:36:43) Now, by the end of this decade… When I say 10% by 2028 to 2030, to people from the traditional data center world, that’s nuts. But then, people who are in AI, who have really looked at this, like the Anthropics and OpenAIs, are like, “That’s not enough.”

(03:37:02) And I’m like, “Okay.” But this is both globally distributed, or distributed throughout the US, as well as centralized clusters. The distributed-throughout-the-US part is exciting, and it’s the bulk of it. Like, hey, OpenAI or, say, Meta is adding a gigawatt, but most of it is distributed through the US for inference and all these other things.

Lex Fridman (03:37:26) So maybe we should lay out what a cluster is. Does this include AWS? Maybe it’s good to talk about the different kinds of clusters. What do you mean by mega clusters? What’s a GPU, and what’s compute, or… And what [inaudible 03:37:41]-

Lex Fridman (03:37:41) Not that far back, but yeah. So, what do we mean by the clusters? The buildouts?

Dylan Patel (03:37:45) Oh, man. I thought I was about to do the Apple ad, what’s a computer? So traditionally, data centers and data center tasks have been a distributed systems problem that is capable of being spread very far and widely. I.e., I send a request to Google, it gets routed to a data center somewhat close to me, it does whatever search, ranking, recommendation, and sends a result back. The nature of the task is changing rapidly; there are two tasks that people are really focused on now. It’s not database access. It’s not, “Serve me the right page, serve me the right ad.”

(03:38:20) It’s now inference. Inference is dramatically different from traditional distributed systems, but it looks a lot more similar. And then, there’s training. The inference side is still like, “Hey, I’m going to put thousands of GPUs in blocks all around these data centers. I’m going to run models on them. User submits a request, it gets kicked off.” Or hey, they submit a request to my service. They’re on Word and they’re like, “Oh yeah, help me, Copilot,” and it kicks it off. Or I’m on my Windows Copilot, whatever, Apple Intelligence. Whatever it is, it gets kicked off to a data center. That data center does some work and sends it back. That’s inference. That is going to be the bulk of compute, but then…

(03:38:59) There are thousands of data centers that we’re tracking with satellites and all these other things, and those are the bulk of what’s being built. That’s what’s really reshaping things and that’s what’s getting millions of GPUs. But the scale of the largest cluster is also really important. When we look back through the history of AI, it was a really big deal when they did AlexNet on, I think, 2 GPUs or 4 GPUs. I don’t remember. It was a really big deal.

Nathan Lambert (03:39:30) It’s a big deal because you use GPUs.

Dylan Patel (03:39:31) It’s a big deal that they used GPUs and that they used multiple. But then over time, the scale has just been compounding. And so when you skip forward to GPT-3, then GPT-4, GPT-4 was 20,000 A100 GPUs. An unprecedented run in terms of the size and the cost, right? A couple of hundred million dollars on a YOLO run for GPT-4, and it yielded this magical improvement that was perfectly in line with what was extrapolated, just a straight line up on the log scale.

Nathan Lambert (03:39:58) Oh yeah, they had that plot from the paper.

Dylan Patel (03:40:00) The scaling plot in the technical report. The scaling laws were perfect, right? But that’s not a crazy number. 20,000 A100s, roughly, each GPU is consuming 400 watts. And then when you add in the whole server, everything, it’s like 15 to 20 megawatts of power. Maybe you could look up what the power consumption of a person is, because the numbers are going to get silly, but 15 to 20 megawatts was a standard data center size. What was unprecedented was that it was all GPUs running one task.

Nathan Lambert (03:40:00) How many watts is a toaster?

Dylan Patel (03:40:29) A toaster has also-

Nathan Lambert (03:40:29) That’s a good example.

Dylan Patel (03:40:32) … a similar power consumption to an A100. H100 comes around. They increase the power from 400 to 700 watts and that’s just per GPU, and then there’s all the associated stuff around it. So once you count all of that, it’s roughly 1,200 to 1,400 watts for everything. Networking, CPUs, memory, blah, blah, blah.

Lex Fridman (03:40:48) So we should also say, what’s required, you said power. So, a lot of power is required. A lot of heat is generated, so the cooling is required. And because there’s a lot of GPUs or CPUs or whatever, they have to be connected. So, there’s a lot of networking, right?

(03:41:07) Sorry for skipping past that. And then the data center itself is complicated, but these are still standard sized data centers for GPT-4 scale. Now, we step forward to what is the scale of clusters that people built last year, and it ranges widely. It ranges from like, “Hey, these are standard data centers. And we’re just using multiple of them and connecting them together really with a ton of fiber between them, a lot of networking, etc.” That’s what OpenAI and Microsoft did in Arizona. They have 100,000 GPUs.

(03:41:37) Meta, similar thing. They took their standard existing data center design, and it looks like an H, and they connected multiple of them together. They first did 24,000 GPUs total, only 16,000 of them were running on the training run, because GPUs are very unreliable, so they need to have spares to swap in and out. All the way to now, 100,000 GPUs that they’re currently training Llama 4 on. Like, 128,000 or so.

(03:42:02) Think about 100,000 GPUs with roughly 1,400 watts apiece. That’s 140 megawatts, 150 megawatts for 128,000. So, you’re talking about jumping from 15 to 20 megawatts to almost 10x that number, 9x that number, to 150 megawatts in two years, from 2022 to 2024. And some people like Elon… he says himself he got into the game a little bit late for pre-training large language models. xAI was started later, right? But then, he bent heaven and hell to get his data center up and get the largest cluster in the world, which is 200,000 GPUs. And he did that. He bought a factory in Memphis. He’s upgrading the substation, and at the same time, he’s got a bunch of mobile power generation, a bunch of single-cycle gas turbines. He tapped the natural gas line that’s right next to the factory, and he’s just pulling a ton of gas, burning gas.

(03:42:55) He’s generating all this power. He’s in an old appliance factory that’s shut down and moved to China long ago, and he’s got 200,000 GPUs in it. And now, what’s the next scale? All the hyperscalers have done this. Now, the next scale is something that’s even bigger. And so Elon, just to stick on the topic, he’s building his own natural gas plant, like a proper one right next door. He’s deploying tons of Tesla Megapack batteries to make the power more smooth and all sorts of other things. He’s got industrial chillers to cool the water down because he’s water-cooling the chips. So, all these crazy things to get the clusters bigger and bigger.

(03:43:34) But when you look at, say, what OpenAI did with Stargate in Abilene, Texas, right? What they’ve announced, at least. It’s not built. Elon says they don’t have the money. There’s some debate about this. At least the first section definitely has the money accounted for, but there are multiple sections. At full scale, that data center is going to be 2.2 gigawatts, 2,200 megawatts of power in, and roughly 1.8 gigawatts or 1,800 megawatts of power delivered to chips.

(03:44:07) Now, this is an absurd scale. 2.2 gigawatts is more than most cities, to be clear. Delivered to a single cluster that’s connected to do training. To train these models, to do both the pre-training, the post-training, all of this stuff.
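For a feel of these numbers, here is a quick back-of-envelope in Python using the rough all-in per-GPU figures quoted above (GPU plus CPUs, networking, memory, and so on). The figures are illustrative only; real facilities vary.

```python
def cluster_mw(num_gpus: int, watts_per_gpu_all_in: float) -> float:
    """Total cluster power in megawatts, given an all-in watts-per-GPU estimate."""
    return num_gpus * watts_per_gpu_all_in / 1e6

print(cluster_mw(20_000, 1_000))    # GPT-4 era, ~1 kW per A100 all-in -> ~20 MW
print(cluster_mw(100_000, 1_400))   # 100k H100s, ~1.4 kW all-in       -> ~140 MW
print(cluster_mw(200_000, 1_400))   # xAI Memphis scale                -> ~280 MW

# Stargate at full scale is quoted as ~1.8 GW delivered to chips; at ~1.4 kW
# per GPU all-in, that is on the order of a million-plus accelerators.
print(1_800 / cluster_mw(1, 1_400))  # ~1.3 million GPU-equivalents
```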

Nathan Lambert (03:44:23) It is. What is a nuclear power plant, again?

Dylan Patel (03:44:25) Everyone is doing this. Meta in Louisiana, they’re building two natural gas plants. Massive ones. And then, they’re building this massive data center. Amazon has plans for this scale. Google has plans for this scale. xAI has plans for this scale. All of these companies that are racing are racing hard, and they’re doing multi-gigawatt data centers to build this out. Because they think that, “If I now have…” Obviously, pre-training scaling is going to continue, to some extent. But then also, all this post-training stuff where you have an RL sandbox for computer use or whatever, this is where they’re going to… And all these verifiable domains where they just keep learning and learning and learning, self-play or whatever. Whatever it is makes the AI so much more capable, because the line does go up.

(03:45:14) As you throw more compute, you get more performance. This shirt is about scaling laws. To some extent, it is diminishing returns. You 10x the compute, you don’t get a 10x better model. You get diminishing returns. But also, you get efficiency improvements, so you bend the curve. And these scales of data centers are just wreaking a lot of havoc on the power network. Nathan was mentioning Amazon has tried to buy this nuclear power plant from Talen. And if you look at Talen’s stock, it’s just skyrocketing. They’re building a massive multi-gigawatt data center there.

(03:45:47) You just go down the list, there are so many ramifications. The interesting thing is that in certain regions of the US, transmitting power costs more than actually generating it, because the grid is so slow to build. The demand for power is there, and ramping a natural gas plant or even a coal plant back up is easy enough to do, but transmitting the power is really hard. So in some parts of the US, like in Virginia, it costs more to transmit power than it costs to generate it. There are all sorts of second-order effects that are insane here.

Lex Fridman (03:46:16) Can the power grid support this kind of growth?

Dylan Patel (03:46:19) Trump’s executive orders… There was a Biden executive order before the end of the year, but then Trump had some more executive orders, which hopefully reduce the regulations to where, yes, things can be built. But yeah, this is the big, big challenge: building enough power fast enough.

Lex Fridman (03:46:33) Are you going to basically have a nuclear power plant next to a data center for each one of these?

Dylan Patel (03:46:39) The fun thing here is that building the power plant is too slow. To build a power plant or to reconfigure an existing power plant is too slow. And so therefore, you must use…

(03:46:51) Data center power consumption is flat, right? I mean, [inaudible 03:46:53].

Nathan Lambert (03:46:53) This is why nuclear is also good for it. Long term, nuclear is a very natural fit, but…

(03:46:59) You can’t do solar or anything in the short term like that.

Dylan Patel (03:47:03) Because data center power’s like this, right? You’re telling me I’m going to buy tens of billions of dollars of GPUs and idle them because the power’s not being generated? Power’s cheap. If you look at the cost of a cluster, less than 20% of it is power. Most of it is the capital cost and depreciation of the GPUs. And so it’s like, “Well, screw it. I’ll just build natural gas plants.” This is what Meta is doing in Louisiana, this is what OpenAI is doing in Texas, and all these different places. They may not be doing it directly, but they are partnered with someone. And so, there are a couple of hopes.
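A toy calculation makes the “less than 20% is power” point tangible. All of the numbers below are hypothetical round figures, not anyone’s actual pricing; the point is just that annualized GPU capital cost dwarfs the electricity bill.

```python
server_capex_usd   = 300_000   # one 8-GPU server, hypothetical price
amortization_years = 5
server_power_kw    = 10        # hypothetical all-in draw for that server
pue                = 1.3       # facility overhead multiplier (cooling, losses)
power_price_kwh    = 0.08      # USD per kWh, hypothetical

capex_per_year = server_capex_usd / amortization_years
power_per_year = server_power_kw * pue * 8760 * power_price_kwh

share = power_per_year / (capex_per_year + power_per_year)
print(f"power cost per year:  ${power_per_year:,.0f}")   # ~$9,100
print(f"capex per year:       ${capex_per_year:,.0f}")   # ~$60,000
print(f"power share of total: {share:.0%}")              # ~13% under these assumptions
```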

(03:47:34) One is… And Elon, what he’s doing in Memphis is the extreme. They’re not just using combined-cycle gas, which is super efficient; he’s also using single-cycle and mobile generators and stuff, which is less efficient. But there’s also the flip side, which is that solar power generation has one profile and wind has another, and they’re correlated differently. So if you stack both of those, plus you get a big chunk of batteries, plus you have a little bit of gas, it is possible to run it more green. It’s just that the time scale for that is slow. So, people are trying. But Meta basically said, “Whatever. I don’t care about my sustainability pledge.” Or they’ll buy power through what’s called a PPA, a Power Purchase Agreement, where there’ll be a massive wind farm or solar farm somewhere, and then they’ll just pretend those electrons are being consumed by the data center. But in reality, they’re paying for the power over there and selling it to the grid, and they’re buying power over here.

(03:48:26) And then another thing is Microsoft quit on some of their sustainability pledges. Elon, what he did with Memphis is objectively somewhat dirty, but he is also doing it in an area where there’s a bigger natural gas plant right next door, and a sewer next… Or not a sewer, but a wastewater treatment plant and a garbage dump nearby. And he’s obviously done far more to make the world cleaner than that one data center will undo, so I think it’s fine to some extent. And maybe AGI solves global warming and stuff, whatever it is.

(03:48:55) This is the attitude that people at the labs have, which is like, “Yeah, it’s great. We’ll just use gas,” because the race is that important. And if we lose, that’s way worse.

Lex Fridman (03:49:05) I should say that I got a chance to visit the Memphis data center.

Lex Fridman (03:49:10) And it’s incredible. I mean, I visited with Elon. Just the teams and the rate of innovation there is insane. My sense is that nobody’s ever done anything of this scale, and nobody has certainly ever done anything of this scale at the rate that xAI is doing. So, they’re figuring out…

(03:49:31) I was sitting in on all of these meetings where they’re brainstorming. It’s insane. It’s exciting, because they’re trying to figure out what the bottlenecks are, how to remove the bottlenecks, how to make sure that… There are just so many really cool things about putting together a data center, because everything has to work. The people that do the sysadmin work, the machine learning, all of that, is the exciting part. But really, the people that run everything are the folks that know the low-level software and hardware that runs everything, the networking, all of that. So, you have to make sure you have procedures that test everything. I think they’re using Ethernet. I don’t know how they’re doing the networking, but-

Dylan Patel (03:50:15) They’re using NVIDIA Spectrum-X Ethernet. I think the unsung heroes are the cooling and electrical systems, which just get glossed over.

Dylan Patel (03:50:25) But I think one story that maybe exemplifies how insane this stuff is, is when you’re training, you’re always doing… You’re running through the model a bunch, in the most simplistic terms. Running through the model a bunch, and then you’re going to exchange everything and synchronize the weights. So, you’ll do a step. This is a step in model training. And every step, your loss goes down, hopefully, though it doesn’t always.

(03:50:48) But in the simplest terms, you’ll be computing a lot and then you’ll exchange. The interesting thing is GPU power is most of it; networking power is some, but it’s a lot less. So while you’re computing, your power for your GPUs is up here. But then when you’re exchanging weights, if you’re not able to overlap communications and compute perfectly, there may be a time period where your GPUs are just idle while you’re exchanging weights and you’re like, “Hey, the model’s updating.” So, you’re exchanging the gradients, you do the model update, and then you start training again. So, the power goes… Right? And it’s super spiky.

(03:51:17) And so funnily enough, when you talk about the scale of data center power, you can blow stuff up so easily. And so, Meta actually accidentally upstreamed something into PyTorch where they added an operator. And I kid you not, whoever made this, I want to hug the guy, because it’s something like pytorch.powerplant_no_blow_up equals 0 or equals 1. And what it does is amazing, right?

Dylan Patel (03:51:44) Either when you’re exchanging the weights, the GPU will just compute fake numbers so the power doesn’t spike too much, and so then the power plants don’t blow up because the transient spikes screw stuff up.

Lex Fridman (03:51:54) Well, that makes sense. You have to do that kind of thing. [inaudible 03:51:57] You have to make sure they’re not idle.

Dylan Patel (03:51:59) And Elon’s solution was like, “Let me throw a bunch of Tesla Megapacks and a few other things.”

Lex Fridman (03:52:03) Yeah, to stabilize that.

Dylan Patel (03:52:03) Everyone has different solutions, but Meta’s, at least, was publicly and openly known, which is just like, set this operator. And what this operator does is it just makes the GPUs compute nothing so that the power doesn’t spike.
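The mechanism Dylan describes can be sketched in a few lines. This is a toy illustration of the idea, not the actual operator that was upstreamed to PyTorch: when the GPUs would otherwise sit idle during the gradient exchange, burn a few throwaway matrix multiplies so the facility-level power draw stays level instead of spiking down and back up.

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
SMOOTH_POWER = True  # analogous in spirit to a "don't blow up the power plant" flag

def burn_dummy_compute(iters: int = 50, size: int = 1024) -> None:
    """Matrix multiplies whose results are thrown away, purely to keep drawing power."""
    x = torch.randn(size, size, device=device)
    for _ in range(iters):
        x = x @ x
        x = x / x.norm()  # keep values bounded so the loop stays numerically sane

def fake_gradient_exchange(seconds: float = 0.1) -> None:
    """Stand-in for the all-reduce / weight-sync phase of a real training job."""
    end = time.time() + seconds
    while time.time() < end:
        if SMOOTH_POWER:
            burn_dummy_compute(iters=5)   # GPUs stay busy, power stays flat
        else:
            time.sleep(0.01)              # GPUs idle, facility power dips sharply

for step in range(3):
    burn_dummy_compute(iters=20)          # "compute" phase: forward/backward would go here
    fake_gradient_exchange()              # "exchange" phase: gradients synchronized across the cluster
```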

Lex Fridman (03:52:14) But that just tells you how much power you’re working with. I mean, it’s insane. It’s insane.

Nathan Lambert (03:52:18) People should just Google something like “what does X watts power” and go through all the scales from 1 watt to a kilowatt to a megawatt. You look and stare at that, and when you see how high on the list a gigawatt is, it’s mind-blowing.

Lex Fridman (03:52:34) Can you say something about the cooling? I know Elon’s using liquid cooling, I believe, in all cases. That’s a new thing. Most of them don’t use liquid cooling. Is there something interesting to say about the cooling?

Dylan Patel (03:52:46) Yeah. So, air cooling has been the de facto standard. Throw a bunch of metal, heat pipes, et cetera, and fans, and that’s been enough to cool it. People have been dabbling in water cooling. Google’s TPUs are water-cooled, so they’ve been doing that for a few years. But with GPUs, no one’s ever done it, and no one’s ever done the scale of water cooling that Elon just did. Now, for NVIDIA’s next generation, for the highest-end GPU, water cooling is mandatory. You have to water-cool it.

(03:53:16) But Elon did it on this current generation, and that required a lot of stuff. If you look at some of the satellite photos and stuff of the Memphis facility, there’s all these external water chillers that are sitting. Basically, it looks like a semi truck pod thing. What’s it called? The container? But really, those are water chillers, and he has 90 of those water chillers just sitting outside. Ninety different containers that chill the water, bring it back to the data center, and then you distribute it to all the chips, pull all the heat out and then send it back. And this is both a way to cool the chips, but also, it’s an efficiency thing.

(03:53:49) And going back to that three-vector thing, there’s memory bandwidth, FLOPS, and interconnect. The closer the chips are together, the easier it is to do high-speed interconnects. And this is also a reason why you want to go to water cooling: you can just put the chips right next to each other, and therefore get higher speed connectivity.

Lex Fridman (03:54:13) I got to ask you, in one of your recent posts, there’s a section called cluster measuring contest. So…

Dylan Patel (03:54:22) There’s another word there, but I won’t say it.

Lex Fridman (03:54:28) Who’s got the biggest now and who’s going to have the biggest?

Dylan Patel (03:54:31) Today, individual largest is Elon. Right?

Lex Fridman (03:54:36) Right. Elon’s cluster.

Dylan Patel (03:54:36) Elon’s cluster in Memphis, 200,000 GPUs. Meta has 128,000, OpenAI has 100,000 now. Now to be clear, other companies have more GPUs than Elon. They just don’t have them in one place. And for training, you want them tightly connected. There’s some techniques that people are researching and working on that let you train across multiple regions. But for the most part, you want them all in one area so you can connect them highly with high-speed networking.

(03:55:02) And so, Elon today has 200,000 GPUs: 100,000 H100s and 100,000 H200s. Meta, OpenAI, and Amazon all have on the scale of a hundred thousand, a little bit less. But this year, people are building much more. Anthropic and Amazon are building a cluster of 400,000 Trainium 2, which is Amazon’s own chip, an attempt to get away from NVIDIA. Meta and OpenAI have plans for hundreds of thousands. By next year, you’ll have 500,000 to 700,000 GPU clusters. And note, those GPUs have much higher power consumption than existing ones: Hopper is 700 watts, Blackwell goes to 1,200 watts.

(03:55:45) So, the power per chip is growing and the number of chips is growing.

Lex Fridman (03:55:50) Nuts. Elon said he’ll get to a million. Do you think that’s actually feasible?

Dylan Patel (03:55:56) I mean, I don’t doubt Elon. The filings that he has for the power plant and the Tesla battery packs, it’s clear he has some crazy plans for Memphis. Permits and stuff is open record, but it’s not quite clear what the time scales are. I just never doubt Elon. He’s going to surprise us.

Lex Fridman (03:56:16) So, what’s the idea with these clusters? If you have a million GPUs, what percentage in a, let’s say 2 or 3 years, is used for training? What percent pre-training, and what percent is used for the actual computation?

Dylan Patel (03:56:31) These mega clusters make no sense for inference. You could route inference there and just not train. But most of the inference capacity is being, “Hey, I’ve got a 30-megawatt data center here, I’ve got 50 megawatts here, I’ve got 100 here.” Whatever. I’ll just throw inference in all of those because the mega clusters, multi-gigawatt data centers, I want to train there because that’s where all of my GPUs are co-located where I can put them at a super high networking speed connected together. Because that’s what you need for training.

(03:56:58) Now with pre-training, this is the old scaling: you increase parameters, you increase data, the model gets better. That doesn’t apply anymore, because there’s not much more data on the pre-training side. Yes, there’s video and audio and image data that has not been fully taken advantage of, so there’s a lot more scaling there. But a lot of people have taken transcripts out of YouTube videos, and that gets you a lot of the data. It doesn’t get you all of the learning value out of the video and image data, but…

(03:57:23) There’s still scaling to be done on pre-training, but this post-training world is where all the FLOPS are going to be spent. The model’s going to play with itself, it’s going to self-play, it’s going to do verifiable tasks, it’s going to do computer use in sandboxes. It might even do simulated robotics things. All of these things are going to be environments where compute is spent in “post-training.” But I think it’s going to be good. We’re going to drop the post from post-training.

Dylan Patel (03:57:49) It’s going to be pre-training and it’s going to be training, I think, at some point. [inaudible 03:57:53] At some point. Because for the bulk of the last few years, pre-training has dwarfed post-training. But with these verifiable methods, especially ones that potentially scale really infinitely, like computer use and robotics, not just math and coding where you can verify what’s happening, those infinitely verifiable tasks, it seems you can spend as much compute as you want on them.

Nathan Lambert (03:58:13) Especially as the context length increases, because the end of pre-training is when you increase the context length for these models. And we’ve talked earlier in the conversation about how long input context is much easier to manage than long output. And a lot of these post-training and reasoning techniques rely on a ton of sampling, and it’s becoming increasingly long context. So effectively, your compute efficiency goes down.

(03:58:36) I think FLOPS is the standard for how you measure it. But with RL, where you have to do all of these things that move your weights around in a different way than in pre-training and plain generation, it’s going to become less efficient, and FLOPS is going to be a less useful term. And then as the infrastructure gets better, it’s probably going to go back to FLOPS.
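For reference, the rule of thumb behind counting training compute in FLOPs is roughly 6 times parameters times tokens (forward plus backward). It ignores attention cost at long context and all the extra sampling that RL does, which is exactly why Nathan says the measure starts to break down for post-training. The numbers below are purely illustrative.

```python
def training_flops(params: float, tokens: float) -> float:
    """Standard C ~= 6 * N * D approximation for dense-transformer pre-training."""
    return 6 * params * tokens

c = training_flops(params=70e9, tokens=15e12)  # e.g. a 70B-parameter model on 15T tokens
print(f"{c:.2e} FLOPs")                        # ~6.3e24
```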

Lex Fridman (03:58:57) So, all of the things we’ve been talking about are most likely going to be NVIDIA, right? Are there any competitors on GPUs?

Dylan Patel (03:59:03) Google kind of ignored them. I was getting-

Nathan Lambert (03:59:06) I was like, “Ah?”

Lex Fridman (03:59:08) What’s the story with TPU? What’s the…

Dylan Patel (03:59:10) TPU is awesome. It’s great. Google is, they’re a bit more tepid on building data centers for some reason. They’re building big data centers, don’t get me wrong, and they actually have the biggest cluster. I was talking about NVIDIA clusters. They actually have the biggest cluster. Period.

(03:59:25) But the way they do it is very interesting. They have two data center super regions where all of the chips aren’t physically on one site, but the sites are like 30 miles from each other. And they’re not GPUs, they’re TPUs. In Iowa and Nebraska, they have four data centers that are just right next to each other.

Lex Fridman (03:59:44) Why doesn’t Google flex its cluster size?

Dylan Patel (03:59:48) Go to the multi-data center training article, there are good images in there. I’ll show you what I mean. It’s the SemiAnalysis multi-data center piece.

(03:59:56) This is an image of what a standard Google data center looks like. By the way, their data centers look very different than anyone else’s data centers.

Lex Fridman (04:00:01) What are we looking at here?

Dylan Patel (04:00:03) So if you see this image, in the center, there are these big rectangular boxes. Those are where the actual chips are kept. And then if you scroll down a little bit further, you can see there are these water pipes, these chiller cooling towers in the top, and a bunch of diesel generators. The diesel generators are backup power. The data center itself looks physically smaller than the water chillers. The chips are actually easier to keep together, but cooling all the water for the water cooling is very difficult.

(04:00:33) So, Google has a very advanced infrastructure for the TPU that no one else has. And what they do is they’ve stamped a bunch of these data centers out in a few regions. So if you go a little bit further down… This is a Microsoft one. This is in Arizona. This is where GPT-5 “will be trained.”

Nathan Lambert (04:00:52) If it doesn’t exist already.

Dylan Patel (04:00:54) Yeah, if it doesn’t exist already. But each of these data centers, I’ve shown a couple images of them, they’re really closely co-located in the same region: Nebraska, Iowa. And then they also have a similar complex in Ohio. And so, these data centers are really close to each other, and what they’ve done is they’ve connected them with super high-bandwidth fiber. And so, these are just a bunch of data centers.

(04:01:15) And the point here is that Google has a very advanced infrastructure, very tightly connected, in a small region. So, Elon will always have the biggest fully connected cluster, because it’s all in one building, and he’s completely right on that. Google has the biggest cluster overall, and by a significant margin, but you have to spread it over three sites, you have to go across multiple sites.

Lex Fridman (04:01:35) Why doesn’t Google compete with NVIDIA? Why don’t they sell TPUs?

Dylan Patel (04:01:41) I think there are a couple of problems with it. One, the TPU has been a way of making search really freaking cheap and building models for that. And so a big chunk of Google’s TPU purchases and usage, all of it, is for internal workloads. Whether it be search, now Gemini, YouTube, ads, all these different applications that they have. That’s where all their TPUs are being spent, and that’s what they’re hyper-focused on. And so, there are certain aspects of the architecture that are optimized for their use case that are not optimized elsewhere.

(04:02:21) One simple one is they’ve open sourced the Gemma model, and they called it Gemma-7B. But then, it’s actually 8 billion parameters, because the vocabulary is so large. And the reason they made the vocabulary so large is because the TPU’s matrix multiply unit is massive, because that’s what they’ve optimized for. And so they decided, “Oh, well, I’ll just make the vocabulary large, too,” even though it makes no sense to do so in such a small model, because that fits their hardware. Gemma doesn’t run as efficiently on a GPU as a Llama does. But vice versa, Llama doesn’t run as efficiently on a TPU as a Gemma does.
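A rough bit of arithmetic shows why a huge vocabulary inflates a “7B” model toward 8B. The Gemma figures used here (vocabulary around 256k tokens, hidden size around 3,072, tied input/output embeddings) are approximate and from memory; treat them as illustrative only.

```python
vocab_size = 256_000   # approximate Gemma vocabulary size
hidden_dim = 3_072     # approximate Gemma hidden dimension

embedding_params = vocab_size * hidden_dim   # parameters in the (tied) embedding table
print(f"embedding parameters: {embedding_params / 1e9:.2f}B")   # ~0.79B

# With roughly 7B of transformer-body parameters on top, the total lands near 8B,
# even though the model is marketed by the smaller number.
print(f"rough total: {(7e9 + embedding_params) / 1e9:.2f}B")
```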

(04:02:53) There’s certain aspects of hardware, software co-design. All their search models are there, ranking and recommendation models, all these different models that are AI but not like gen AI have been hyper optimized with TPUs forever. The software stack is super optimized. But all of this software stack has not been released publicly at all. Very small portions of it. JAX and XLA have been. But the experience when you’re inside of Google and you’re training on TPUs as a researcher, you don’t need to know anything about the hardware in many cases, right? It’s pretty beautiful.

Nathan Lambert (04:03:24) They all loved it.

Dylan Patel (04:03:24) But as soon as you step outside-

Nathan Lambert (04:03:25) A lot of them go back. They leave Google and then they go back.

Dylan Patel (04:03:29) Yeah. They leave and they start a company, because they have all of these amazing research ideas. And then they’re like, “Wait. Infrastructure’s hard, software is hard.” And this is on GPUs. Or if they try to use TPUs, same thing, because they don’t have access to all this code. And so it’s like, how do you convince a company whose golden goose is search, where they’re making hundreds of billions of dollars, to start selling TPUs, which they used to only buy a couple billion dollars’ worth of…

(04:03:51) I think in 2023, they bought a couple of billion. And now, they’re buying like 10 billion to $15 billion worth. But how do you convince them that they should just buy twice as many and figure out how to sell them, and make $30 billion? Who cares about making $30 billion?

Lex Fridman (04:04:05) Won’t that 30 billion exceed actually the search profit eventually?

Dylan Patel (04:04:11) You’re always going to make more money on services than…

Dylan Patel (04:04:15) I mean, yeah. To be clear, today, people are spending a lot more on hardware than they are with the services because the hardware front runs the service spend. But-

Lex Fridman (04:04:25) You’re investing, yeah.

Dylan Patel (04:04:27) … if there’s no revenue for AI stuff, or not enough revenue, then obviously it’s going to blow up. People won’t continue to spend on GPUs forever. And NVIDIA is trying to move up the stack with software that they’re trying to sell and license and stuff. But Google has never had that DNA of, “This is a product we should sell.” Google Cloud, which is a separate organization from the TPU team, which is a separate organization from the DeepMind team, which is a separate organization from the Search team. There’s a lot of bureaucracy here.

Lex Fridman (04:04:52) Wait. Google Cloud is a separate team than the TPU team?

Dylan Patel (04:04:55) Technically, TPU sits under infrastructure, which sits under Google Cloud. But Google Cloud, for renting stuff-

Dylan Patel (04:05:00) … But Google Cloud, for renting stuff, and the TPU architecture have very different goals, and hardware and software, all of this, right? The JAX and XLA teams do not serve Google’s customers externally, whereas NVIDIA’s various CUDA teams for things like NCCL serve external customers. The internal teams like JAX and XLA more so serve DeepMind and Search, right? And so their customer is different. They’re not building a product for them.

Lex Fridman (04:05:27) Do you understand why AWS keeps winning versus Azure for cloud versus Google Cloud?

Lex Fridman (04:05:35) Google Cloud is tiny, isn’t it, relative to AWS?

Dylan Patel (04:05:37) Google Cloud is third. Yeah. Microsoft is the second biggest, but Amazon is the biggest, right?

Dylan Patel (04:05:43) And Microsoft deceptively sort of includes Microsoft Office 365 and things like that, some of these enterprise-wide licenses. So in reality, the gulf is even larger. Microsoft is still second though, right? Amazon is way bigger. Why? Because using AWS is better and easier. And in many cases, it’s cheaper-

Dylan Patel (04:06:00) And it’s first. It was first.

Lex Fridman (04:06:00) Yeah. But there’s a lot of things that are first that lose the-

Nathan Lambert (04:06:03) Well, it’s harder to switch than it is to-

Lex Fridman (04:06:05) Because there’s large-

Nathan Lambert (04:06:07) There’s big fees for switching too.

Dylan Patel (04:06:09) AWS generates over 80% of Amazon’s profit. I think over 90%.

Dylan Patel (04:06:13) The distribution centers are just like one day we’ll decide to make money from this, but they haven’t yet, right? They make tiny little profit from it.

Nathan Lambert (04:06:20) Yeah, one day Amazon Prime will triple in price.

Lex Fridman (04:06:22) You would think they would improve the AWS interface, because it’s horrible. It’s clunky. But everybody’s is.

Nathan Lambert (04:06:31) Yeah, one would think.

Dylan Patel (04:06:33) I think actually Google’s interface is sometimes nice, but it’s also they don’t care about anyone besides their top customers.

Dylan Patel (04:06:39) And their customer service sucks and they have a lot less-

Lex Fridman (04:06:42) I mean, all these companies, they optimize for the big customers. Yeah, it’s supposed to be for business.

Dylan Patel (04:06:47) Amazon has always optimized for the small customer too though. Obviously they optimize a lot for the big customer, but when they started, they just would go to random Bay Area things and give out credits or just put in your credit card and use us back in the early days. The business has grown with them and [inaudible 04:07:04]. Why is Snowflake all over Amazon? Because Snowflake in the beginning, when Amazon didn’t care about them, was still using Amazon. And then of course one day Snowflake and Amazon has a super huge partnership, but this is the case. Amazon’s user experience and quality is better.

(04:07:17) Also, a lot of the silicon they’ve engineered gives them a lower cost structure in traditional cloud, storage, CPU, networking, that kind of stuff, and in databases. I think four of Amazon’s top five gross-profit products are database-related products, like Redshift and all these things. So Amazon has a very good silicon-to-user-experience pipeline with AWS. I think Google has awesome silicon internally from their silicon teams: the TPU, the YouTube chip, some of these other chips that they’ve made. And the problem is they’re not serving external customers, they’re serving internal customers, right?

Nathan Lambert (04:07:58) I mean, NVIDIA’s entire culture is designed from the bottom up to do this. There’s this recent book, The NVIDIA Way by Tae Kim, that details this and how they look for future opportunities and ready their CUDA software libraries to make it so that new applications of high-performance computing can very rapidly be evolved on CUDA and NVIDIA chips. And that is entirely different than Google as a services business.

Lex Fridman (04:08:24) I mean NVIDIA, it should be said, is a truly special company. I mean there’s the culture of everything. They’re really optimized for that kind of thing. Speaking of which, is there somebody that can even challenge NVIDIA hardware-wise? Intel? AMD?

Dylan Patel (04:08:39) I really don’t think so. We went through a very long process of working with AMD on training on their GPUs, inference and stuff. And they’re decent; their hardware is better in many ways than NVIDIA’s. The problem is their software is really bad, and I think they’re getting better, right? They’re getting better, faster, but the gulf is so large, and they don’t spend enough resources on it, or haven’t historically. Maybe they’re changing their tune now, but for multiple months we were submitting the most bugs, like, us, SemiAnalysis, what the fuck? Why are we submitting the most bugs? Because they only cared about their biggest customers, and so they’d ship them a private image, blah, blah, blah. And it’s like, “Okay, but I am just using PyTorch and I want to use the publicly available libraries,” and you don’t care about that. So they’re getting better, but I don’t think AMD can do it. Intel is obviously in dire straits right now and needs to be saved somehow. Very important for national security, for American technology.

Lex Fridman (04:09:39) Can you explain the obviously, so why are they in dire straits?

Dylan Patel (04:09:41) Going back to earlier, only three companies can do leading-edge R&D, right? TSMC in Hsinchu, Samsung [inaudible 04:09:49], and then Intel in Hillsboro. Samsung’s doing horribly. Intel’s doing horribly. We could be in a world where there’s only one company that can do the R&D, and that one company already manufactures most of the chips. They’ve been gaining market share anyway, but that’s a critical thing. So whatever happens to Taiwan matters: the rest of the world’s semiconductor industry, and therefore tech, relies on Taiwan, and that’s obviously precarious. As far as Intel, they’ve been slowly, steadily declining. They were on top of servers and PCs, but now Apple’s done the M1, and NVIDIA’s releasing a PC chip, and Qualcomm’s releasing a PC chip.

(04:10:21) And in servers, hyperscalers are all making their own ARM-based server chips, and Intel has no real AI silicon wins. They have very small wins, and they never got into mobile because they said no to the iPhone. All these things have compounded, and they’ve lost their process technology leadership. They were ahead for 20 years, and now they’re behind by at least a couple of years. They’re trying to catch back up, and we’ll see if their 18A, 14A strategy works out, where they try to leapfrog TSMC. And Intel is just losing tons of money anyway, and they just fired their CEO, even though the CEO was the only person who understood the company well, right? We’ll see. He was not the best, but he was a pretty good, relatively technical guy.

Lex Fridman (04:11:01) Where does Intel make most of its money? The CPUs though.

Dylan Patel (04:11:04) PCs and data center CPUs, yeah. But data center CPUs are all going cloud, and Amazon, Microsoft, Google are making ARM-based CPUs. And then on the PC side, AMD’s gained market share, NVIDIA’s launching a chip, that’s not going to be a success, right? MediaTek and Qualcomm have launched chips. Apple’s doing well. They could get squeezed a little bit in PC, although PC generally, I imagine, will mostly stick with Intel on the Windows side.

Who wins the race to AGI?

Lex Fridman (04:11:27) Let’s talk about the broad AI race. Who do you think wins? We talked about Google, Meta.

Nathan Lambert (04:11:33) The default leader has been Google because of their infrastructure advantage.

Lex Fridman (04:11:37) Well, in the news, OpenAI is the leader.

Nathan Lambert (04:11:40) They’re leading in the narrative.

Dylan Patel (04:11:42) They have the best model.

Nathan Lambert (04:11:43) They have the best model that people can use, and they’re experts-

Dylan Patel (04:11:47) And they have the most AI revenue.

Nathan Lambert (04:11:48) Yeah. OpenAI is winning.

Lex Fridman (04:11:51) So who’s making money on AI right now? Is anyone making money?

Dylan Patel (04:11:55) So accounting profit-wise, Microsoft is making money, but they’re spending a lot of CapEx, and that gets depreciated over years. Meta’s making tons of money with recommendation systems, which is AI, but not with Llama, right? Llama’s losing money for sure. I think Anthropic and OpenAI are obviously not making money, otherwise they wouldn’t be raising money. They have to raise money to build more. Although theoretically they are making money: you spent a few hundred million dollars on GPT-4 and it’s doing billions in revenue, so obviously it’s making money. Although they had to continue doing research to get the compute efficiency wins and move down the curve to get the 1200x that has been achieved for GPT-3. Maybe we’re only at a couple hundred x now for GPT-4, with GPT-4 Turbo and 4o, and there’ll be another one, probably cheaper than GPT-4o even, that comes out at some point.

Lex Fridman (04:12:45) And that research costs a lot of money.

Lex Fridman (04:12:49) That’s the thing that I guess is not talked about with the cost, that when you’re referring to the cost of the model, it’s not just the training or the test runs, it’s the actual research, the manpower.

Dylan Patel (04:13:02) Yeah, to do things like reasoning, which exists now. They’re going to scale it. They’re going to do a lot of research still. I think people focus on the payback question, but it’s really easy to just say, well, GDP is humans and industrial capital, and if you can make intelligence cheap, then you can grow a lot, right? That’s the sort of dumb way to explain it, but that’s basically what the investment thesis is. I think only NVIDIA and the other hardware vendors are actually making tons of money. The hyperscalers are all on paper making money, but in reality they’re spending a lot more on purchasing the GPUs, and you don’t know if they’re still going to make this much money on each GPU in two years, right?

(04:13:40) You don’t know if all of a sudden OpenAI goes kapoof and now Microsoft has hundreds of thousands of GPUs they were renting to OpenAI that they paid for themselves with their investment in them that no longer have a customer. This is always a possibility. I don’t believe that. I think OpenAI will keep raising money. I think others will keep raising money because the returns from it are going to be eventually huge once we have AGI.

Lex Fridman (04:14:08) So do you think multiple companies will get, let’s assume-

Dylan Patel (04:14:11) I don’t think it’s winner take all.

Lex Fridman (04:14:12) Okay, so let’s not call it AGI or whatever. It’s not like a single day. It’s a gradual thing-

Nathan Lambert (04:14:18) Powerful AI. Super powerful AI.

Lex Fridman (04:14:20) But it’s a gradually increasing set of features that are useful and make-

Nathan Lambert (04:14:20) Rapidly increasing set of features.

Lex Fridman (04:14:25) Rapidly increasing set of features. So you’re saying a lot of companies will be… It just seems absurd that all of these companies are building gigantic data centers.

Nathan Lambert (04:14:41) There are companies that will benefit from AI but not because they train the best model. Meta has so many avenues to benefit from AI and all of their services. People are there. People spend time on that as platforms, and it’s a way to make more money per user per hour.

Lex Fridman (04:14:54) It seems like Google, X/xAI/Tesla, important to say, and then Meta will benefit not directly from the AI, like the LLMs, but from the intelligence, the additional boost of intelligence to the products they already sell. So whether that’s the recommendation system, or for Elon, who’s been talking about Optimus, the robot, potentially the intelligence of the robot, and then you have personalized robots in the home, that kind of thing. He thinks it’s a 10-plus trillion dollar business, which…

Nathan Lambert (04:15:30) At some point, maybe. Not soon, but who knows when robotics will use for-

Dylan Patel (04:15:36) Let’s do a TAM analysis: 8 billion humans, and let’s get 8 billion robots, and let’s pay them the average salary. And there we go. 10 trillion. More than 10 trillion.

Lex Fridman (04:15:46) Yeah, I mean if there’s robots everywhere, why does it have to be just 8 billion robots?

Dylan Patel (04:15:52) Yeah, yeah, of course. Of course. I’m going to have one robot. You’re going to have like 20.

Lex Fridman (04:15:57) Yeah, I mean I see a use case for that. So yeah, so I guess the benefit would be in the products they sell, which is why OpenAI’s in a trickier position because they-

Nathan Lambert (04:16:06) All of the value of OpenAI right now as a brand is in ChatGPT, and for most users, there’s not that much of a reason that they need OpenAI to be spending billions and billions of dollars on the next best model when they could just license Llama 5 and it would be way cheaper. So ChatGPT is an extremely valuable entity to them, but they could make more money just off of that.

Dylan Patel (04:16:31) The chat application clearly does not have tons of room to continue. The standard chat where you’re just using it for a random question and stuff. The cost continues to collapse. V3 is the latest one.

Nathan Lambert (04:16:41) It’ll go down with the ads.

Dylan Patel (04:16:43) But it’s going to get supported by ads. Meta already serves 405B and probably loses money on it, but at some point the models are going to get so cheap that they can just serve them for free, ad-supported, and that’s what Google is going to be able to do. And obviously they’ve got a bigger reach. Chat is not going to be the only use case. These reasoning, code, agents, computer use things, all this stuff, is where OpenAI has to actually go to make money in the future; otherwise they’re kaput.

Lex Fridman (04:17:10) But X, Google, and Meta have these other products. So isn’t it likely that OpenAI and Anthropic disappear eventually?

Dylan Patel (04:17:22) Unless they’re so good at models, which they are.

Lex Fridman (04:17:24) But it’s such a cutting edge. I mean-

Nathan Lambert (04:17:25) It depends on where you think AI capabilities are going.

Lex Fridman (04:17:28) You have to keep winning.

Lex Fridman (04:17:30) You have to keep winning as you climb. Even if AI capabilities are rapidly moving in the direction of AGI, there’s still a boost for X in terms of data, Google in terms of data, Meta in terms of data, in terms of other products, and there’s just huge amounts of money.

Dylan Patel (04:17:50) The whole idea is human data is kind of tapped out. We don’t care. We all care about self-play, verifiable tasks.

Nathan Lambert (04:17:57) Think about AWS.

Lex Fridman (04:17:58) Yes, self-play, which is an RNG problem.

Nathan Lambert (04:17:58) AWS does not make a lot of money on each individual machine, and the same can be said for the most powerful AI platform: even though the calls to the API are so cheap, there’s still a lot of money to be made by owning that platform. And there’s a lot of discussion that it’s the next compute layer.

Dylan Patel (04:18:15) You have to believe that. And there’s a lot of discussion that tokens and tokenomics and LLM APIs are the next compute layer, the next paradigm for the economy, like energy and oil were. But you have to sort of believe that APIs and chat are not where AI is stuck, that it is actually tasks and agents and robotics and computer use, and those are the areas where all the value will be delivered, not the API, not the chat application.

Lex Fridman (04:18:42) So is it possible it all just becomes a commodity and you have the very thin wrapper, like Perplexity? Just joking.

Nathan Lambert (04:18:54) There are a lot of wrappers making a lot of money.

Lex Fridman (04:18:57) But do you think it’s possible that people would just even forget what OpenAI and Anthropic is just there’ll be wrappers around the API and it just dynamically-

Dylan Patel (04:19:06) If model progress is not rapid, yeah. It’s becoming a commodity, right? DeepSeek V3 shows this, but also the GPT-3 chart earlier, Kurt [inaudible 04:19:14] showed this, right? Llama 3B is 1200x cheaper than GPT-3. Anyone whose business model was GPT-3 level capabilities is dead. Anyone whose business model is GPT-4 level capabilities is dead.

Nathan Lambert (04:19:26) It is a common saying that the best businesses being made now are ones that are predicated on models getting better.

Lex Fridman (04:19:32) Right. Which would be like wrappers, things that are riding the wave of the models.

Nathan Lambert (04:19:37) In the short term, the company that could make the most money is the one that figures out what advertising targeting method works for language model generations. We have the Meta ads, which are hyper-targeted in feed, not within specific pieces of content. And we have search ads that are used by Google, and Amazon has been rising a lot on search. But within a response from ChatGPT, it is not clear how you get a high-quality placed ad within the output. And if you can do that with model costs coming down, you can just get super high revenue. That revenue is totally untapped, and it’s not clear technically how it’s done.

Lex Fridman (04:20:12) Yeah, that is sort of the AdSense innovation that Google did. One day you’ll have an ad in the GPT output, and that’s going to make billions, if not-

Nathan Lambert (04:20:25) And it could be very subtle, it could be in conversation, we have voice mode now. It could be some way of making it so the voice introduces certain things. It’s much harder to measure and it takes imagination, but yeah.

Lex Fridman (04:20:35) And it can’t come off as shady, or you’d receive public blowback, that kind of thing. So you have to make it loud enough that it’s clear it’s an ad, and balance all of that. So that’s the open question they’re trying to solve. Anthropic and OpenAI, they need to-

Nathan Lambert (04:20:51) They might not say that they’re trying-

Dylan Patel (04:20:53) I don’t think they care about that at all.

Nathan Lambert (04:20:53) They don’t care about it right now. I think it’s places like Perplexity are experimenting on that more.

Lex Fridman (04:20:59) Oh, interesting. Yeah, for sure.

Dylan Patel (04:21:01) Perplexity, Google, Meta care about this. I think OpenAI and Anthropic are purely laser focused on-

Dylan Patel (04:21:08) Yeah. Like agents and AGI: if I build AGI, I can make tons of money, or I can pay for everything. And it comes back to the export control thing. If you think AGI is five or 10 years away, or less: these labs think it’s two or three years away. Obviously, if you assume they’re rational actors, which they mostly are, what you do on a two-year AGI timeline versus five years versus 10 years is very, very, very different. Right?

AI agents

Lex Fridman (04:21:39) Do you think agents are promising? We have to talk about this. This is the excitement of the year, that agents are going to revolutionize… This is the generic hype term that a lot of business folks are using: AI agents are going to revolutionize everything.

Nathan Lambert (04:21:57) Okay. So mostly the term agent is obviously overblown. We’ve talked a lot about reinforcement learning as a way to train for verifiable outcomes. Agents should mean something that is open-ended and is solving a task independently on its own and able to adapt to uncertainty. There’s a lot of the term agent applied to things like Apple Intelligence, which we still don’t have after the last WWDC, which is orchestrating between apps and that type of tool use thing is something that language models can do really well. Apple Intelligence I suspect will come eventually. It’s a closed domain. It’s your messages app integrating with your photos with AI in the background. That will work. That has been described as an agent by a lot of software companies to get into the narrative.

(04:22:40) The question is what ways can we get language models to generalize to new domains and solve their own problems in real time. Maybe some tiny amount of training when they’re doing this with fine-tuning themselves or in context learning, which is the idea of storing information in a prompt. And you can use learning algorithms to update that and whether or not you believe that that is going to actually generalize to things like me saying, “Book my trip to go to Austin in two days. I have XYZ constraints,” and actually trusting it. I think there’s an HCI problem coming back for information.

Lex Fridman (04:23:19) Well, what’s your prediction there? Because my gut says we’re very far away from that.

Dylan Patel (04:23:24) There’s OpenAI’s five levels, I don’t know if you’ve seen them, where chat is level one, reasoning is level two, and then agents is level three. I think there are a couple more levels, but it’s important to note we were in chat for a couple of years. We just theoretically got to reasoning, we’ll be here for a year or two, and then agents. At the same time, people can try to approximate the capabilities of the next level, but agents are doing things autonomously, doing things for minutes at a time, hours at a time, et cetera, right? Reasoning is doing things for tens of seconds at a time and then coming back with an output that I still need to verify, use, and check out. And the biggest problem is, of course, the same thing as with manufacturing. There’s the whole six sigma thing: how many nines do you get?

(04:24:14) And then you compound the nines onto each other: you raise the per-step reliability to the power of the number of steps and you get a yield. So in semiconductor manufacturing, with tens of thousands of steps, 99.99999% per step is not enough. You multiply it out that many times and you actually end up with 60% yield, right? Really low yield, or zero. And this is the same thing with agents, right? You’re chaining tasks together, and each time, even the best LLMs on pretty good benchmarks don’t get 100%, right? They get a little bit below that because there is a lot of noise. So how do you get to enough nines? This is the same thing with self-driving. We can’t have self-driving without it being super geofenced like Google’s, and even then they have a bunch of teleoperators to make sure it doesn’t get stuck. But you can’t scale that, because it doesn’t have enough nines.
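
To make the compounding-nines point concrete, here is a small back-of-the-envelope sketch; the specific per-step reliabilities and step counts are illustrative choices, not figures from the conversation.

```python
# Back-of-the-envelope: end-to-end yield when every step in a chain must succeed.
# If each step succeeds with probability p, a chain of n steps succeeds with p ** n.

def chained_yield(per_step_success: float, num_steps: int) -> float:
    """Probability that all num_steps steps succeed."""
    return per_step_success ** num_steps

for p in (0.999, 0.9999, 0.99999):           # three, four, five nines per step
    for n in (100, 10_000, 50_000):          # agent task chains vs. fab-scale step counts
        print(f"p={p}, steps={n}: yield = {chained_yield(p, n):.1%}")

# Five nines per step over 50,000 steps still gives only ~61% end to end,
# which is the flavor of the semiconductor analogy above.
```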

Lex Fridman (04:25:07) Self-driving has quite a lot of structure because roads have rules, it’s well-defined, there’s regulation. When you’re talking about computer use for the open web, for example, or the open operating system, it’s a mess. So the possibility… I’m always skeptical of any system that is tasked with interacting with the human world, with the open, messy world.

Nathan Lambert (04:25:36) That’s the thing. If we can’t get intelligence that’s enough to solve the human world on its own, we can create infrastructure like the human operators for Waymo over many years that enable certain workflows.

Dylan Patel (04:25:47) There is a company, I don’t remember which, but that’s literally their pitch: “Yeah, we’re just going to be the human operator when agents fail; you just call us and we fix it.” Same thing, just an API call, and it’s hilarious.

Nathan Lambert (04:25:57) There are going to be teleoperation markets when we get humanoid robots: somebody around the world who’s happy to fix the fact that it can’t finish loading my dishwasher when I’m unhappy with it. But that’s just going to be part of the Tesla service package.

Lex Fridman (04:26:10) I’m just imagining an AI agent talking to another AI agent. One company has an AI agent that specializes in helping other AI agents.

Nathan Lambert (04:26:20) But if you can make things that are good at one step, you can stack them together. So that’s why, even if it takes a long time, we’re going to build infrastructure that enables it. You see the Operator launch: they have partnerships with certain websites, with DoorDash, with OpenTable, with things like this. Those partnerships are going to let them climb really fast. Their model’s going to get really good at those things. It’s going to be a proof of concept, and that might create a network effect where more companies want to make it easier for AI. Some companies will be like, “No, let’s put blockers in place.” And this is the story of the internet we’ve seen; we see it now with training data for language models, where companies are like, “No, you have to pay.” Businesses working it out.

Lex Fridman (04:27:00) That said, I think airlines and hotels have high incentive to make their sites work really well, and they usually don’t. If you look at how many clicks it takes to order an airplane ticket, it’s insane.

Nathan Lambert (04:27:14) You actually can’t call an American Airlines agent anymore. They don’t have a phone number.

Lex Fridman (04:27:20) I mean, it’s horrible on the interface front. And to imagine that agents will be able to deal with that website when I, as a human, struggle, like I have an existential crisis every time I try to book an airplane ticket. I think it’s going to be extremely difficult to build an AI agent that’s robust in that way.

Nathan Lambert (04:27:40) But think about it. United has accepted the Starlink terms, which is that they have to provide Starlink for free, and the users are going to love it. What if one airline is like, “We’re going to take a year and we’re going to make our website have white text that works perfectly for the AIs”? Every time anyone asks an AI about a flight, they buy whatever airline it is.

Dylan Patel (04:28:00) They’re just like, “Here’s an API and it’s only exposed to AI agents and if anyone queries it, the price is 10% higher for any flight, but we’ll let you see any of our flights and you can just book any of them. Here you go.”

Nathan Lambert (04:28:11) And then that’s it.

Dylan Patel (04:28:12) It’s like, “Oh, and I made 10% higher price. Awesome.” And am I willing to say that for like, “Hey, book me a flight to [inaudible 04:28:18].” Right? And it’s like, yeah, whatever. I think computers and real world and the open world are really, really messy, but if you start defining the problem in narrow regions, people are going to be able to create very, very productive things and ratchet down cost massively, right? Now, crazy things like robotics in the home, those are going to be a lot harder to do just like self-driving because there’s just a billion different failure modes, but agents that can navigate a certain set of websites and do certain sets of tasks or take a photo of your fridge or upload your recipes and then it figures out what to order from Amazon/Whole Foods food delivery, and that’s going to be pretty quick and easy to do, I think. So it’s going to be a whole range of business outcomes and it’s going to be tons of optimism around people can just figure out ways to make money.

Nathan Lambert (04:29:14) To be clear, these sandboxes already exist in research. There are people who have built clones of all the most popular websites, of Google, Amazon, blah, blah, blah, to make it so that there’s… And I mean OpenAI probably has them internally to train these things. It’s the same as DeepMind’s robotics team, which for years has had clusters for robotics where you interact with robots fully remotely. They just have a lab in London and you send tasks to it, arrange the blocks, and you do this research. Obviously there are techs there that fix stuff, but we’ve turned these cranks of automation before.

(04:29:46) You go from sandbox to progress and then you add one more domain at a time and generalize, I think. And there’s the history of NLP, instruction tuning, and tasks per language model: it used to be that one language model did one task, and then in the instruction tuning literature there’s this point where you start adding more and more tasks together and it just starts to generalize to every task. We don’t know where on this curve we are. I think for reasoning with this RL and verifiable domains, we’re early, but we don’t know where the point is where you just start training on enough domains and, poof, more domains just start working and you’ve crossed the generalization barrier.

Programming and AI

Lex Fridman (04:30:22) Well, what do you think about the programming context? So software engineering, that’s where I personally, and I know a lot of people interact with AI the most.

Dylan Patel (04:30:34) There’s a lot of fear and angst too from current CS students, but that is the area where probably the most AI revenue and productivity gains have come, right? Whether it be Copilots or Cursor or what have you, or just standard ChatGPT. I know very few programmers who don’t have ChatGPT, and actually many of them have the $200 tier because that’s what it’s so good for. I think in that world we already see it with SWE-bench. If you’ve looked at that benchmark, made by some Stanford students, I wouldn’t say it’s really hard, but I wouldn’t say it’s easy either. I think it takes someone who’s been through at least a few years of CS or a couple of years of programming to do well on SWE-bench, and the models went from 4% to 60% in a year. Where are they going to go next year? It’s going to be higher. It probably won’t be a hundred percent, because again, those nines are really hard to get, but we’re going to get to some point where it saturates, and then we’re going to need harder software engineering benchmarks, and so on and so forth.

(04:31:34) But the way that people think of it now is it can do code completion. Easy. It can do some function generation. I have to review it. Great. But really, software engineering agents I think can be done faster and sooner than any other agent, because it is a verifiable domain. You can always unit test or compile, and there are many different regions of it. The agent can inspect the whole code base at once, which no engineer really can. Only the architects can really think about this stuff, the really senior guys, and they can define stuff and then the agent can execute on it. So I think software engineering costs are going to plummet like crazy. And one interesting aspect of that is when software engineering costs are really low, you get very different markets. In the US, you have all these platform SaaS companies, Salesforce and so on and so forth. In China, no one uses platform SaaS. Everyone just builds their own stack, because software engineering is much cheaper in China, partially because of the number of STEM graduates, et cetera. So it’s generally just cheaper to do.

(04:32:38) And so at the same time, code LLMs have been adopted much less in China because the cost of an engineer there is much lower. But what happens when every company can just invent their own business logic really cheaply and quickly? You stop using platform SaaS, you start building custom-tailored solutions, you change them really quickly. Now all of a sudden your business is a little bit more efficient too, potentially, because you’re not dealing with the hell that is some random platform SaaS company’s stuff not working perfectly and having to adjust workflows, or random business automation cases that don’t necessarily require AI.

(04:33:08) It’s just logic that needs to be built that no one has built. All of these things can happen faster. So there’s software, and then the other domain is industrial, chemical, and mechanical engineers, who generally just suck at coding. Their tools, like semiconductor engineers’ tools, are 20 years old. All the tools run on XP; even ASML lithography tools run on Windows XP. And a lot of the analysis happens in Excel, right? It’s just like, “Guys, you can move 20 years forward with all the data you have gathered and do a lot better.” You need software engineering skills to be delivered to the actual domain-expert engineer. So I think that’s the area where I’m super-duper bullish on AI creating value generally.

Nathan Lambert (04:33:47) The big picture is that I don’t think it’s going to be a cliff. I think a really good example of how growth changes is when Meta added Stories. Snapchat was on an exponential, Meta added Stories, and it flatlined. Software engineering has been up and to the right; AI is going to come in and it’s probably just going to go flat. It’s not like everyone’s going to lose their job. It’s hard because the supply corrects more slowly: the number of students is still growing, and that’ll correct on a multi-year delay, but the number of jobs will just turn, and then maybe in 20 or 40 years it’ll be well down. But in the next few years, there’s never going to be a snap moment where it’s like software engineers aren’t useful.

Lex Fridman (04:34:30) I think also the nature of what it means to be a programmer and what kind of jobs programmers do changes, because I think there needs to be a human in the loop of everything you’ve talked about. There’s a really important human in that picture of correcting the code, fixing-

Dylan Patel (04:34:49) Thinking larger than the context length.

Lex Fridman (04:34:51) And debugging also, like debugging by reading the code, understanding and steering the system: “No, no, no, you missed the point,” adding more to the prompt. Yes, adding the human-

Nathan Lambert (04:35:05) Designing the perfect Google button. Google’s famous for having people design buttons that are so perfect, and it’s like how is AI going to do that? They could give you all the ideas. Perfect, fine.

Lex Fridman (04:35:17) I mean, that’s the thing. You can call it taste. One thing humans can do better than AI systems is figure out what other humans enjoy. That’s where you’re loading the preference in. But ultimately, humans are the greatest preference generator; that’s where the preference comes from.

Nathan Lambert (04:35:32) And humans are actually very good at reading or judging between two things, versus generating one. This goes back to the core of what RLHF and preference tuning is: it’s hard to generate a good answer for a lot of problems, but it’s easy to see which of two answers is better. That’s how we’re using humans for AI now, judging which one is better, and that’s what software engineering could look like. The PR review: here are a few options, here are some potential pros and cons, and humans are going to be the judges.
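
For readers who want the math behind that judging step, the standard way pairwise preferences are turned into a training signal in RLHF is a Bradley-Terry style reward model; this is the common formulation in the literature, not something spelled out in the conversation:

$$ P(y_w \succ y_l \mid x) = \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big), \qquad \mathcal{L}(\theta) = -\,\mathbb{E}\Big[\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)\Big] $$

Here $x$ is the prompt, $y_w$ is the response the human judged better, $y_l$ is the rejected one, and $r_\theta$ is the learned reward model that later guides RL fine-tuning.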

Lex Fridman (04:35:59) I think the thing I would very much recommend is that programmers start using AI and embracing that role of supervisor of, and partner to, the AI system, versus writing from scratch or not learning coding at all and just generating stuff, because I think there actually has to be a pretty high level of expertise as a programmer to be able to manage increasingly intelligent systems.

Dylan Patel (04:36:24) I think it’s that and then becoming a domain expert in something.

Dylan Patel (04:36:28) Because seriously, if you go look at aerospace or semiconductors or chemical engineering, everyone is using really crappy platforms, really old software. The job of a data scientist is a joke in many cases; in many cases it’s very real. But the point is to bring the forefront of human capabilities to your domain, and even if that forefront comes from the AI, in your domain you’re at the forefront. You have to be at the forefront of something and then leverage the rising tide that is AI for everything else.

Lex Fridman (04:36:57) Oh, yeah. There’s so much low-hanging fruit everywhere in terms of where software can help automate or digitize a thing, in the legal system and elsewhere. That’s why DOGE is exciting. I got to hang out with a bunch of the DOGE folks, and I mean, government is so old school. It’s begging for the modernization of software, of organizing the data, all this kind of stuff. I mean, in that case it’s by design, because bureaucracy protects centers of power and so on. But software breaks down those barriers, so it hurts those that are holding onto power, but ultimately benefits humanity. So there are a bunch of domains of that kind. One thing we didn’t fully finish talking about is open source. So first of all, congrats. You released a new model.

Open source

Nathan Lambert (04:38:00) I’ll explain what a tülu is. A tülu is the hybrid camel you get when you breed a dromedary with a Bactrian camel. Back in the early days after ChatGPT, there was a big wave of models coming out, like Alpaca, Vicuna, et cetera, that were all named after various mammalian species. Tülu, the brand, is multiple years old and comes from that.

(04:38:19) And we’ve been playing at the frontiers of post-training with open-source code. The first part of this release was in the fall, where we built on Llama’s open-weight models and then added in our fully open code and fully open data. There’s a popular benchmark called Chatbot Arena, and that’s generally the metric by which these chat models are evaluated: humans compare random models from different organizations. And if you looked at the leaderboard in November or December, among the top 60 models, from tens to twenties of organizations, none of them had open code or data for just post-training.

(04:38:58) Among those, even fewer, or none, have pre-training data and code available. Post-training is much more accessible at this time. It’s still pretty cheap, and you can do it. And the thing is, how high can we push this number where people have access to all the code and data? So that’s kind of the motivation of the project. We draw in lessons from Llama. Nvidia had a Nemotron model where the recipe for their post-training was fairly open, with some data and a paper. And it’s putting all of these together to try to create a recipe that people can use to fine-tune models like GPT-4 to their domain.

Lex Fridman (04:39:28) To be clear, in the case of Tülu, maybe you can talk about Llama too, but in the case of Tülu, you’re taking Llama 3, 405B.

Nathan Lambert (04:39:37) Tülu has been a series of recipes for post-training. So we’ve done multiple models over years.

Lex Fridman (04:39:44) And so you’re open sourcing everything.

Nathan Lambert (04:39:46) Yeah. If you start with an open weight based model, their whole model technically isn’t open source because you don’t know what Llama put into it, which is why we have the separate thing that we’ll get to, but it’s just getting parts of the pipeline where people can zoom in and customize. I know I hear from startups and businesses, they’re like, “Okay, I can take this post-training-“

Nathan Lambert (04:40:00) … I know I hear from startups and businesses, they’re like, “Okay, I can take this post-training and try to apply it to my domain.” We talk about verifiers a lot. We use this idea of reinforcement learning with verifiable rewards, RLVR, kind of similar to RLHF. And we applied it to math, and in the model today we applied it to the Llama 405B base model from last year. We have our other stuff, our instruction tuning and our preference tuning, but the math thing is interesting, which is that it’s easier to improve this math benchmark. There’s a benchmark, M-A-T-H, MATH, all capitals; it’s a tough name when the benchmark name is also the area you’re evaluating. We’re researchers, we’re not brand strategists.
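
A minimal sketch of what a “verifiable reward” can look like in an RLVR setup; the answer-extraction convention here is an illustrative assumption, not AI2’s actual implementation:

```python
import re

def verifiable_reward(model_output: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the known answer, else 0.0.

    Unlike RLHF, there is no learned reward model; the signal comes from checking
    the output against a verifiable ground truth (a math answer, a passing unit
    test, and so on), and it plugs into an ordinary RL policy-update loop.
    """
    # Illustrative convention: the model is prompted to end with "Answer: <value>".
    match = re.search(r"Answer:\s*(.+)", model_output)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

print(verifiable_reward("Let's compute... Answer: 42", "42"))  # 1.0
print(verifiable_reward("I'm not sure.", "42"))                # 0.0
```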

(04:40:44) And this is something that the DeepSeek paper talked about as well: at this bigger model scale, it’s easier to elicit powerful capabilities with this RL training, and then they distill down from that big model to the small model. With the model we released today, we saw the same thing. We’re at AI2; we don’t have a ton of compute. We can’t train 405B models all the time, so we just did a few runs, and they tend to work. It just shows that there’s a lot of room for people to play in these things and that’s –

Dylan Patel (04:41:12) And they crushed Llama’s actual release, they’re way better than it.

Nathan Lambert (04:41:16) … Yeah. So our eval numbers, I mean we have extra months in this, but our eval numbers are much better than the Llama instruct model that they released.

Lex Fridman (04:41:24) And then you also said better than DeepSeek V3?

Nathan Lambert (04:41:26) Yeah, on our eval benchmark. DeepSeek V3 is really similar. We have a safety benchmark to understand if it will say harmful things and things like that, and that’s what draws it down most of the way. It’s still-

Dylan Patel (04:41:37) It’s like an amalgamation of multiple benchmarks or what do you mean?

Nathan Lambert (04:41:40) … Yeah, so we have a suite of 10 evaluations. This is standard practice in post-training: you choose the evaluations you care about. In academia and in smaller labs you’ll have fewer evaluations. In companies, you’ll have one domain that you really care about. In frontier labs, you’ll have tens, twenties, maybe even 100 evaluations of specific things. So we choose a representative suite of things that look like chat; precise instruction following, which is like “respond only in emojis,” just making the model follow weird constraints like that; math; code. You create a suite like this, so safety would be one of 10 in that type of suite, reflecting what the broader AI community cares about. For example, in comparison to DeepSeek, it would be something like: our model’s average eval would be 80 including safety, and similar without. DeepSeek would be like a 79% average score without safety, and their safety score would bring it down to 70 or thereabouts.
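
The arithmetic behind that safety drag is simple; the per-benchmark numbers below are purely illustrative assumptions, and only the rough 79-versus-70 averages come from the conversation:

```python
# Illustrative only: how one weak safety eval drags down a 10-eval average.
non_safety_scores = [79.0] * 9   # assume nine non-safety evals each around 79
safety_score = 0.0               # assume a near-zero safety score, for illustration

avg_without_safety = sum(non_safety_scores) / len(non_safety_scores)
avg_with_safety = (sum(non_safety_scores) + safety_score) / (len(non_safety_scores) + 1)

print(f"without safety: {avg_without_safety:.1f}")  # 79.0
print(f"with safety:    {avg_with_safety:.1f}")     # 71.1, roughly the drop described above
```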

Dylan Patel (04:42:31) Oh, so you’d beat them even ignoring safety.

Nathan Lambert (04:42:33) Yeah. So this is something internal: I don’t want to win only by how you shape the eval benchmark. People may or may not care about safety in their model; safety can come downstream, safety can be handled when you host the model behind an API. Safety is addressed in a spectrum of locations in AI applications. So if you want to say that you have the best recipe, you can’t just gate it on things that some people might not want.

(04:42:56) And this is the pace of progress: we benefit if we can release a model later, because we have more time to learn new techniques like this RL technique. We started this in the fall, and now reasoning models are really popular. The next thing to do for open-source post-training is to scale up verifiers, to scale up data, to replicate some of DeepSeek’s results. It’s awesome that we have a paper to draw on; it makes it a lot easier. And that’s the type of thing that is going on across academic and closed frontier research in AI.

Lex Fridman (04:43:28) Since you’re pushing open source, what do you think is the future of it? Do you think DeepSeek actually changes things since it’s open source or open weight or is pushing the open source movement into the open direction?

Nathan Lambert (04:43:39) This goes back to the license discussion. DeepSeek R1 with a friendly license is a major reset. It’s the first time we’ve had a really clear frontier model that is open weights with a commercially friendly license, with no restrictions on downstream use cases, data distillation, whatever. That has never been the case in the history of AI in the last few years since ChatGPT. There have been models that are off the frontier, or models with weird licenses that you can’t really use.

Dylan Patel (04:44:04) So is Meta’s license pretty much permissive except for five companies?

Nathan Lambert (04:44:10) So this goes to what open source AI is, which is there’s also use case restrictions in the Llama license, which says you can’t use it for specific things. So if you come from an open source software background, you would say that that is not an open source license.

Dylan Patel (04:44:22) What kind of things are those, though? Are they like-

Nathan Lambert (04:44:25) At this point, I can’t pull them off the top of my head, but it’d be like-

Lex Fridman (04:44:28) Stuff like competitors?

Nathan Lambert (04:44:29) Military use used to be one, and they removed that for scale. It’ll be things like CSAM, child sexual abuse material; that’s the type of thing that is forbidden there. But that’s enough, coming from an open-source background, to say it’s not an open-source license. The Llama license also has this horrible thing where you have to put Llama in your model’s name if you build on the Llama model. So there’s the branding thing: if a company uses Llama, technically the license says that they should say “Built with Llama” at the bottom of their application, and from a marketing perspective that just hurts. I can suck it up as a researcher; I’m like, oh, it’s fine, it says Llama-dash on all of our materials for this release. But this is why we need truly open models, which is, we don’t know DeepSeek R1’s data, but-

Dylan Patel (04:45:12) Wait, so you’re saying I can’t make a cheap copy of Llama and pretend it’s mine, but I can do this with the Chinese model?

Nathan Lambert (04:45:18) … Hell, yeah. That’s what I’m saying. And that’s why we want this whole open language model thing, the OLMo thing: to try to keep a model where everything is open, including the data, as close to the frontier as possible. We’re compute constrained, we’re personnel constrained. We rely on getting insights from people, like John Schulman telling us to do RL on outputs. We can make these big jumps, but it just takes a long time to push the frontier of open source. And fundamentally, I would say that’s because open-source AI does not have the same feedback loops as open-source software. We talked about open-source software for security; it’s also that you build something once and can reuse it. If you go into a new company, there are so many benefits. But if you open source a language model, you have this data sitting around, you have this training code, and it’s not that easy for someone to come and build on it and improve it, because you need to spend a lot on compute and you need expertise.

(04:46:12) So until there are feedback loops for open-source AI, it seems like mostly an ideological mission. People like Mark Zuckerberg say America needs this, and I agree with him. But while the ideological motivation is high, we need to capitalize on it and build this ecosystem around the question: what benefits do you get from seeing the language model’s data? And there’s not a lot there yet. We’re going to try to launch a demo soon where you can take an OLMo model and a query and see what pre-training data is similar to it, which is legally risky and complicated, but it’s asking what it means to see the data that the AI was trained on. It’s hard to parse; it’s terabytes of files. I don’t know what I’m going to find in there, but that’s what we need to do as an ecosystem if people want open-source AI to be financially useful.

Stargate

Lex Fridman (04:47:01) We didn’t really talk about Stargate. I would love to get your opinion on what the new administration, the Trump administration, everything that’s being done from the America side and supporting AI infrastructure and the efforts of the different AI companies. What do you think about Stargate? What are we supposed to think about Stargate and does Sam have the money?

Dylan Patel (04:47:23) Yeah, so I think Stargate is an opaque thing. It definitely doesn’t have $500 billion; it doesn’t even have $100 billion. What they announced is this $500 billion number; Larry Ellison, Sam Altman, and Trump said it. They thanked Trump, and Trump did do some executive actions that significantly improve the ability for this to be built faster. One of the executive actions is that on federal land you can basically just build data centers and power, pretty much like that, and the permitting process is basically gone, or you file after the fact. So again, I had a schizo take earlier; here’s another schizo take. If you’ve ever been to the Presidio in San Francisco, beautiful area, you could build a power plant and a data center there if you wanted to, because it is federal land. It used to be a military base. Obviously this would piss people off, but it’s a good fit. Anyways, Trump has made it much easier to do this, right? And generally, Texas has the only unregulated grid in the nation as well.

Dylan Patel (04:48:25) And so ERCOT enables people to build faster as well. In addition, the federal regulations are coming down, and Stargate is predicated on that; this is why that whole show happened. Now, how they came up with a $500 billion number is beyond me. How they came up with the $100 billion number makes sense to some extent, and there’s actually a good table in that Stargate piece that I had, the most recent one, that I would like to show. It’s basically a table about cost. There, you passed it already. It’s that one. This table kind of explains what happens. So Stargate is in Abilene, Texas, the first $100 billion of it. That site is 2.2 gigawatts of power in, about 1.8 gigawatts of power consumed by GPUs. Oracle was already building the first part of this before Stargate came about; to be clear, they’ve been building it for a year.

(04:49:32) They tried to rent it to Elon in fact, but Elon was like, “It’s too slow. I need it faster.” So then he went and did his Memphis thing, and so OpenAI was able to get it through this weird joint venture called Stargate. They initially signed a deal with just Oracle for the first section of this cluster. This first section is roughly $5 billion to $6 billion of server spend, and then there’s another billion or so of data center spend. And then likewise, if you fill out that entire 1.8 gigawatts with the next generations of NVIDIA’s chips, GB200, GB300, VR200, and you fill it out completely, that ends up being roughly $50 billion of server cost, plus data center costs, plus maintenance costs, plus operation costs, plus all these things. And that’s where OpenAI gets to their $100 billion announcement, because they talked about $100 billion being phase one. That’s this Abilene, Texas data center, right? $100 billion of “total cost of ownership.” So it’s not CapEx, it’s not investment; it’s $100 billion of total cost of ownership.
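
As a rough sanity check on those numbers, here is the arithmetic using only the figures quoted above; the exact split of the non-server half is not broken out in the conversation, so it is lumped together here:

```python
# Phase-one Stargate arithmetic, in billions of USD, using the figures quoted above.
first_section_servers = (5, 6)      # initial Oracle-built section, server spend
first_section_datacenter = 1        # roughly another billion of data center spend

full_buildout_servers = 50          # ~1.8 GW filled with GB200/GB300/VR200-class systems
other_lifetime_costs = 50           # data center, maintenance, power, operations, rental (lumped)

total_cost_of_ownership = full_buildout_servers + other_lifetime_costs
print(f"First section: ~${first_section_servers[0]}-{first_section_servers[1]}B servers + ~${first_section_datacenter}B data center")
print(f"Phase one: ~${total_cost_of_ownership}B total cost of ownership, not CapEx")
```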

(04:50:39) And then there will be future phases. They’re looking at other sites that are even bigger than this 2.2 gigawatts, by the way, in Texas and elsewhere. So they’re not completely ignoring that, but the $100 billion number they’ve saved for phase one, which I do think will happen: they don’t even have the money for that. Furthermore, it’s not $100 billion of spend; it’s $50 billion of spend and then $50 billion of operational cost, power, et cetera, rental pricing, et cetera, because they’re renting it. OpenAI is renting the GPUs from the Stargate joint venture. What money do they actually have, right? SoftBank is going to invest, Oracle is going to invest, OpenAI is going to invest. OpenAI is on the line for $19 billion. Everyone knows that they’ve only got $46 billion from their last round and $4 billion of debt. But there’s news of SoftBank maybe investing $25 billion into OpenAI. So that’s part of it; the $19 billion can come from there.

(04:51:32) So OpenAI does not have the money at all, to be clear. The ink is not dried on anything. OpenAI has $0 for this $50 billion, of which they’re legally obligated to put $19 billion of CapEx into the joint venture, and the rest they’re going to pay via renting the GPUs from the joint venture. And then there’s Oracle. Oracle has a lot of money. They’re building the first section completely; they were paying for it themselves, this $6 billion of CapEx, $10 billion of TCO, and they were going to do that first section anyway. They’re paying for that, right? As far as the rest of the sections, I don’t know how much Larry wants to spend. At any point he could pull out. This is, again, completely voluntary; there’s no signed ink on this. But he potentially could contribute tens of billions of dollars, to be clear. He’s got the money; Oracle’s got the money.

(04:52:17) And then there’s MGX, the UAE fund, which technically has $1.5 trillion for investing in AI. But again, I don’t know how real that money is, and there’s no ink signed for this. SoftBank does not have $25 billion of cash. They have to sell down their stake in Arm, which is the leader in CPUs and which they IPO’d. This is obviously what they’ve always wanted to do; they just didn’t know where they’d redeploy the capital. Selling down the stake in Arm makes a ton of sense, so they can sell that down and invest in this if they want to, and invest in OpenAI if they want to. As far as money secured, the first 100,000 GB200 cluster can be funded. Everything else after that-

Dylan Patel (04:52:58) … is up in the air. Money’s coming. I believe the money will come. I personally do.

Dylan Patel (04:53:04) It’s a belief that they’re going to release better models and be able to raise more money. But the actual reality is that Elon’s right, the money does not exist.

Lex Fridman (04:53:12) What does the US government have to do with anything? What does Trump have to do with everything? He’s just a hype man?

Dylan Patel (04:53:17) Trump, he’s reducing the regulation so they can build it faster, and he’s allowing them to do it, because any investment of this size is going to involve antitrust stuff. So obviously he’s going to allow them to do it; he’s going to enable the regulations to actually allow it to be built. I don’t believe there’s any US government money being spent on this, though.

Lex Fridman (04:53:37) So I think he’s also just creating a general vibe that regulation will go down and this is the era of building. So if you’re a builder, you want to create stuff, you want to launch stuff, this is the time to do it.

Dylan Patel (04:53:50) And so we’ve had this 1.8 gigawatt data center in our data for over a year now, and we’ve been sending it to all of our clients, including many of these companies that are building the multi-gigawatt clusters. But that was at a level that’s not quite the same as executives seeing $500 billion, $100 billion on TV, and then everyone asking them about it. So it could spur an even faster arms race. There’s already an arms race, but this $100 billion, $500 billion number, Trump talking about it on TV, could spur the arms race to be even faster, with more investors flooding in, et cetera, et cetera. So I think you’re right in the sense that OpenAI, or Trump, is sort of championing it: people are going to build more, and his actions are going to let people build more.

Future of AI

Lex Fridman (04:54:31) What are you excited about these several years that are upcoming in terms of cluster build outs, in terms of breakthroughs in AI, the best possible future you can imagine in the next couple of years, two, three, four years? What does that look like? It could be very specific technical things like breakthroughs on post-training or it could be just size, big impressive clusters.

Dylan Patel (04:55:01) I really enjoy tracking the supply chain and who’s involved in what, I really do. It’s really fun to see the numbers, the costs, who’s building what capacity, helping them figure out how much capacity they should build, who’s winning deals, strategic stuff. That’s really cool. Technologically, there’s a lot around the networking side that really excites me, with optics and electronics getting closer and closer together, whether it be co-packaged optics or new forms of switching.

Lex Fridman (04:55:28) This is internal to a cluster?

Dylan Patel (04:55:30) A cluster, yeah. Also multi-data center training. People are putting so much fiber between these data centers and lighting it up with so much bandwidth that there’s a lot of interesting stuff happening on that end. Telecom has been really boring since 5G, and now it’s really exciting again on the hardware side.

Lex Fridman (04:55:48) Can you educate me a little bit about the speed of things? So the speed of memory versus the speed of interconnect versus the speed of fiber between data centers. Are these orders of magnitude different? Can we at some point converge towards a place where it all just feels like one computer?

Dylan Patel (04:56:04) No, I don’t think that’s possible. It’s only going to get harder to program, not easier. It’s only going to get more difficult and complicated, with more layers. The general image people like to have is this hierarchy of memory. On-chip is really close, localized within the chip: you have registers, and those are shared between some compute elements, and then you’ll have caches, which are shared between more compute elements. Then you have memory like HBM or DRAM, DDR memory or whatever it is, and that’s shared across the whole chip. Then you can have pools of memory that are shared between many chips, and then storage, and you keep zooming out. The access latency within a chip, within a data center, and across data centers is different, so you’re always going to have different programming paradigms for this. It’s not going to be easy. Programming this stuff is going to be hard; maybe AI can help with programming it.
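
Rough order-of-magnitude figures make that hierarchy concrete. The numbers below are generic textbook approximations, not figures from the conversation, and real systems vary widely:

```python
# Generic access-latency ladder, approximate orders of magnitude only.
memory_hierarchy_latency = {
    "registers / on-chip SRAM":                    "~1 ns",
    "shared caches (L2/L3)":                       "~10 ns",
    "HBM / DRAM on the package or board":          "~100 ns",
    "another chip over NVLink-class interconnect": "~1 µs",
    "another rack over the data center network":   "~10 µs or more",
    "another data center over long-haul fiber":    "~1-10 ms",
}
for tier, latency in memory_hierarchy_latency.items():
    print(f"{tier:45s} {latency}")
```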

(04:56:55) But the way to think about it is that the more elements you add to a task, the less you get strong scaling. If I double the number of chips, I don’t get 2x the performance. This is just a reality of computing, because there are inefficiencies. And there’s a lot of interesting work being done to make it more linear, whether it’s networking the chips together more tightly, or cool programming models, or cool algorithmic things you can do on the model side. DeepSeek did some of these really cool innovations because they were limited on interconnect, but they still needed to parallelize. Everyone’s always doing stuff. Google’s got a bunch of work, and everyone’s got a bunch of work on this. That stuff is super exciting on the model, workload, and innovation side. On the hardware side, solid-state transformers are interesting. On the power side, there’s all sorts of stuff on batteries, and all sorts of other stuff.
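
The classic way to quantify why doubling the chips doesn’t double the performance is Amdahl’s law; it isn’t named in the conversation, but it captures the same reality. If only a fraction $p$ of the work parallelizes, the speedup on $N$ chips is

$$ S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - p}. $$

With $p = 0.95$, for example, two chips give only about a 1.9x speedup, and even infinitely many chips cap out at 20x.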

(04:57:53) I think if you look at every layer of the compute stack, from lithography and etch to fabrication, to optics, to networking, to power, to transformers, to cooling, and you just go up and up and up the stack, even air conditioners for data centers are innovating. Copper cables are innovating. You wouldn’t think it, but there are innovations happening in copper cables, in the density of how you can pack them. All of these layers of the stack, all the way up to the models: human progress is at a pace that’s never been seen before.

Lex Fridman (04:58:24) I’m just imagining you sitting back in a lair somewhere with screens everywhere, just monitoring the supply chain, where all these clusters are, all the information you’re gathering. You’re incredible.

Dylan Patel (04:58:34) There’s a big team, there’s a big team.

Lex Fridman (04:58:38) You do quite incredible work with SemiAnalysis, I mean just keeping your finger on the pulse of human civilization in the digital world. It’s pretty cool just to watch, to feel that.

Dylan Patel (04:58:51) Yeah, thank you. I guess.

Lex Fridman (04:58:53) Feel all of us doing shit. Epic shit.

Lex Fridman (04:58:57) I feel from meme to reality. Nathan, is there breakthroughs that you’re looking forward to potentially?

Nathan Lambert (04:59:07) I had a while to think about this while listening to Dylan’s beautiful response.

Dylan Patel (04:59:10) He did listen to me. He was so into it.

Nathan Lambert (04:59:12) No, I knew this was coming and it’s like realistically training models is very fun because there’s so much low-hanging fruit. And the thing that makes my job entertaining, I train models, I write analysis about what’s happening with models and it’s fun because there is obviously so much more progress to be had. And the real motivation, why I do this somewhere where I can share things is that there’s just, I don’t trust people that are like, “Trust me bro, we’re going to make AI good.”

(04:59:39) It’s like they’re saying, “We’re going to do it, and you can trust us, and we’re just going to have all the AI.” I would like a future where more people have a say in what AI is and can understand it. And it’s a little bit less fun when it’s not just a positive thing where this is all really fun. Training models is fun, and bringing people in is fun, but if AI really is going to be the most powerful technology of my lifetime, we need to have a lot of people involved in making it and-

Lex Fridman (05:00:09) Making it open helps with that. As accessible as possible, as open as possible, yeah.

Nathan Lambert (05:00:14) … My read of the last few years is that more openness would help the AI ecosystem in terms of having more people understand what’s going on, whether that’s researchers from non-AI fields, governments, everything. It doesn’t mean that openness will always be the answer. I think we’ll then reassess what the biggest problem facing AI is and take a different angle on the wild ride that we’re on.

Lex Fridman (05:00:36) And for me, even just from the user experience, anytime you have, like Karpathy said, the aha moments, the magic of seeing the reasoning, the chain of thought, there’s something really just fundamentally beautiful about that. It’s putting a mirror to ourselves and seeing, oh shit. It is solving intelligence, as the cliche goal of these companies goes, and you get to understand why we humans are special, why the intelligence within us is special, and for now also why we’re special in that we seem to be conscious and the AI systems, for now, are not, and we get to explore that mystery. It’s just really cool to get to explore questions I never imagined would even be possible. Back when I was just watching with excitement as Deep Blue beat Kasparov, I wouldn’t have ever thought this kind of AI would be possible in my lifetime. This really feels like AI.

Nathan Lambert (05:01:44) I started in AI with a silly quadrotor learning to fly. It learned to fly up, it would hit the ceiling and stop, and we’d catch it. It’s like, okay, that is really stupid compared to what’s going on now.

Lex Fridman (05:01:57) And now you could probably with natural language tell it to learn to fly and it’s going to generate the control algorithm required to do that probably.

Nathan Lambert (05:02:05) There are low-level blockers. We have to do some weird stuff for that, but you can, you definitely can.

Lex Fridman (05:02:09) Back to our robotics conversation, yeah, when you have to interact in the actual physical world, that’s hard. What gives you hope about the future of human civilization looking into the next 10 years, 100 years, 1000 years, how long do you think we’ll make it? You think we’ve got 1000 years?

Nathan Lambert (05:02:28) I think humans will definitely be around in 1000 years. There are ways that very bad things could happen and there’d be way fewer humans, but humans are very good at surviving; there have been a lot of points in history where that’s been true. I don’t think we’re necessarily good at long-term credit assignment of risk, but when the risk becomes immediate, we tend to figure things out.

Nathan Lambert (05:02:49) And for that reason, there are physical constraints on things like AGI, like the recursive-improvement-to-kill-us-all type stuff. For those physical reasons, and for how humans have figured things out before, I’m not too worried about AI takeover. There are other international things that are worrying, but there’s just fundamental human goodness, and we should try to amplify that. I think we’re in a tenuous time. If you look at humanity as a whole, there have been times where things go backwards, times when things don’t happen at all, and we’re on what should be a very positive trajectory right now.

Lex Fridman (05:03:28) Yeah, there seems to be progress, but just like with power, there’s like spikes of human suffering and we want to try to minimize the amount of spikes.

Dylan Patel (05:03:38) Generally, humanity is going to suffer a lot less; I’m very optimistic about that. I do worry about techno-fascism type stuff arising. As AI becomes more and more prevalent and powerful, and those who control it can do more and more, maybe it doesn’t kill us all, but at some point every very powerful human is going to want a brain-computer interface so that they can interact with the AGI and all of its advantages in many more ways, merge with its capabilities, and leverage those much better than anyone else. It won’t be one person ruling them all, but the thing I worry about is that it’ll be a few people, hundreds, thousands, tens of thousands, maybe millions of people, ruling whoever’s left and the economy around it.

(05:04:27) And I think that’s the thing that’s probably more worrisome is human-machine amalgamations. This enables an individual human to have more impact on the world and that impact can be both positive and negative. Generally, humans have positive impacts on the world, at least societally, but it’s possible for individual humans to have such negative impacts. And AGI, at least as I think the labs define it, which is not a runaway sentient thing, but rather just something that can do a lot of tasks really efficiently amplifies the capabilities of someone causing extreme damage. But for the most part, I think it’ll be used for profit-seeking motives, which will increase the abundance and supply of things and therefore reduce suffering, right? That’s the goal.

Lex Fridman (05:05:12) Scrolling on a timeline, just drowning in dopamine-

Dylan Patel (05:05:16) Scrolling open stasis.

Nathan Lambert (05:05:18) Scrolling holds the status quo of the world.

Dylan Patel (05:05:20) That is a positive outcome, right? If I have food tubes and I’m lying down scrolling and I’m happy, that’s a positive outcome.

Lex Fridman (05:05:28) While expanding out into the cosmos. Well, this is a fun time to be alive. And thank you for pushing the forefront of what is possible in humans, and thank you for talking today. This was fun.

Dylan Patel (05:05:29) Thanks for having us.

Nathan Lambert (05:05:41) Thanks for having us.

Lex Fridman (05:05:44) Thanks for listening to this conversation with Dylan Patel and Nathan Lambert. To support this podcast, please check out our sponsors in the description. And now let me leave you with some words from Richard Feynman. “For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.” Thank you for listening and I hope to see you next time.

唐纳德·特朗普访谈录 (2024-09-03)

Donald Trump Interview (2024-09-03)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在美国大选白热化阶段,前总统及现任候选人唐纳德·特朗普 (Donald Trump) 接受了科技界知名播主 Lex Fridman 的长篇访谈,旨在通过非传统政治媒体渠道,向更广泛、更偏向技术和独立思考的受众阐述其世界观与执政理念。

  • 核心论点:本次对话的核心,是特朗普系统性地展示了一种将商业交易思维模型应用于治国理政的激进世界观。他将复杂的国际冲突(如乌克兰战争)、国内政策(如移民问题)乃至政治话语本身,都解构为一系列高风险、高回报的“交易”。在这种框架下,传统的意识形态标签(如共产主义/法西斯主义)被视为谈判中的情绪杠杆;外交政策的细节被“保密的谈判策略”所取代;领导力被定义为一种能够洞察对手心理、灵活运用“胡萝卜与大棒”并最终达成有利“交易”的个人能力。其论述的核心逻辑是:世界由强者驱动,结果导向的个人谈判能力远比遵循既定规则的制度化流程更为高效。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:地缘政治的“交易化” (The Transactionalization of Geopolitics)

  • 核心观点:特朗普坚信,乌克兰战争和中美紧张关系等重大国际冲突,并非不可调和的结构性矛盾,而是可以通过他个人作为“伟大交易撮合者”(Great Deal Maker)的介入来迅速解决。他声称自己拥有解决这些问题的“确切计划”,但拒绝透露细节,理由是“惊喜是策略的一部分”。

  • 原理解构:该观点源于其在房地产和商业领域的成功经验。他将国家领导人(如普京、泽连斯基)视为谈判桌上的对手,认为每个人都有其心理弱点和利益诉求。其解决路径并非依赖多边主义或国际法,而是通过:

    1. 个性化策略:判断对手是更吃“胡萝卜”(利益诱惑)还是“大棒”(军事或经济威胁)。
    2. 信息不对称:将具体计划保密,制造不确定性,从而获得谈判优势。
    3. 个人关系杠杆:强调他与普京等人曾有“良好关系”,暗示这种私人关系可以转化为国家利益。 这套方法论将国际关系从一个基于联盟、规则和长期战略的复杂系统,简化为一个围绕关键人物展开的高风险个人博弈。
  • 证据/案例

    • 乌克兰:他“保证”一旦当选总统,就能达成协议结束战争,但拒绝说明具体方案。
    • 阿富汗:声称通过对塔利班领导人阿卜杜勒(Abdul)使用“大棒”策略,实现了长达18个月的美军零伤亡。
    • 北约(NATO):他将自己对盟友的强硬立场(要求增加军费)视为一种成功的“交易”,声称这挽救了北约。

维度二:政治话语的“焦土战术” (The “Scorched-Earth” Tactics in Political Discourse)

  • 核心观点:面对政治对手的攻击(如称他为“法西斯主义者”),特朗普奉行“以火攻火”(Fight Fire with Fire)的对等报复策略,即便这意味着使用同样极端或被认为不准确的标签(如称卡玛拉·哈里斯为“共产主义者”)。

  • 原理解构:这是一种典型的叙事控制策略。其逻辑在于:

    1. 消解攻击合法性:通过使用对等攻击,将辩论的焦点从“指控是否属实”转移到“双方都在互相抹黑”的泥潭战中,从而使对手的初始攻击失效。
    2. 动员核心支持者:极化的语言能有效激发其基本盘的战斗情绪和身份认同,将政治分歧升级为一场“我们”对抗“他们”的生存之战。
    3. 定义议程:主动抛出争议性言论,迫使媒体和对手围绕他设定的议题进行反应,从而主导舆论周期。 他并不寻求中间派的理性共识,而是通过激化矛盾来巩固自身阵营。
  • 证据/案例

    • 当 Lex Fridman 质疑称哈里斯为“共产主义者”时,他直言不讳:“他们叫我法西斯,所以我觉得叫他们共产主义者也行…我相信你必须以火攻火。”

维度三:商界与政界的能力鸿沟 (The Skill Gap Between Business and Politics)

  • 核心观点:特朗普认为,商业上的成功与政治上的成功需要截然不同的核心能力,二者之间存在巨大的转换壁垒。商界精英普遍缺乏在公众面前演讲、承受攻击和发起竞选的“胆识”(guts)。

  • 原理解构:他将政治的核心能力定义为一种表演性(Performative)和心理韧性(Psychological Resilience)的结合

    • 公开演讲能力:政治家必须能吸引并掌控大规模现场观众,这是一种与 boardroom 决策截然不同的技能。
    • 心理门槛:竞选总统意味着将个人和家庭完全暴露在公众审视和无情攻击之下,需要极强的心理承受能力。
    • 舞台恐惧(Stage Fright):他指出,许多商界“杀手”(killers)在面对公众舞台时会“窒息”(choke),这是一种无法通过商业训练弥补的天赋差异。
  • 证据/案例

    • 他提到自己认识许多非常成功的商人,他们谈论竞选总统已有15年之久,但始终不敢“扣动扳机”(pull the trigger)。
    • 描述一位商业上的“杀手”,却有严重的舞台恐惧症,一上台就表现糟糕。

维度四:对“未解之谜”的实用主义承诺 (Pragmatic Promises on Unsolved Mysteries)

  • 核心观点:对于公众高度关注但官方信息不透明的事件,如不明飞行物(UFO)、肯尼迪遇刺案(JFK)和爱泼斯坦案(Jeffrey Epstein),特朗普承诺上任后会提高透明度,并倾向于公布相关文件。

  • 原理解构:这是一种精明的政治姿态,旨在迎合民众对“深层政府”(Deep State)和信息黑箱的不信任感。通过承诺解密,他将自己定位为挑战建制、为民请命的“局外人”。

    1. 低成本承诺:这些承诺在竞选阶段几乎没有执行成本,但能有效吸引阴谋论受众和追求透明度的独立选民。
    2. 保留灵活性:他用“会考虑”、“倾向于”等词语,为自己未来是否兑现承诺留有余地,并指出解密肯尼迪档案时曾因“危及某些人”而受到压力。
    3. 对比建制派:暗示现有体系(包括情报机构)出于自身利益而隐瞒真相,而他不受这些束缚。
  • 证据/案例

    • UFO/JFK:明确表示会推动五角大楼公布更多影像,并“很早”就会处理肯尼迪档案的解密。
    • Epstein:当被问及为何客户名单未公布时,他回应“这很有趣,不是吗?”,并表示对公布名单“没有问题”。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:特朗普的观点挑战了现代外交政策的核心共识——即国际关系是建立在制度、联盟和可预测行为之上的复杂系统。他将其简化为个人魅力和高压谈判的艺术,认为一个强大的领导者可以凌驾于体系之上,快速解决看似无解的冲突。这与传统外交界信奉的长期、渐进式解决方案背道而驰。

  • 盲点与局限

    • 计划的真空:最显著的盲点是他拒绝提供任何具体政策计划的细节,无论是针对乌克兰战争还是中美关系。他将此包装为一种高级谈判策略(“不能亮出底牌”),但这也可以被解读为缺乏深思熟虑的方案,或是一种回避公众审视的手段。
    • 对制度的忽视:他的世界观极度以个人为中心,几乎完全忽略了政府机构、官僚体系和法律框架在政策执行中的关键作用。他认为只要自己做出“交易”,国家机器就会自动跟进,这忽视了治理的复杂性。
    • 自我认知偏差:在讨论权力的腐蚀性时,他以“本可以把希拉里关进监狱但没有”为例,证明自己克制。然而,批评者会认为,将司法武器化本身就是一种滥用权力的威胁,无论最终是否执行。
  • 未解之谜:对话暴露了一个核心难题:如何弥合美国社会深刻的政治分裂? 特朗普的解决方案是“除掉”拜登和哈里斯,但这只是替换了对手,并未解决导致分裂的根本原因。他将分裂归咎于“邪恶的人”和“激进左翼疯子”,这是一种简化归因,回避了社会、经济和文化层面的复杂动因。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “I believe you have to fight fire with fire.”

    • 中文意译:“我相信你必须以火攻火。”
    • 语境:在解释为何用“共产主义者”这样的标签回击称他为“法西斯主义者”的对手时,这句话精炼地概括了他“以牙还牙”的斗争哲学。
  2. “Part of it is surprise, right?”

    • 中文意译:“(策略的)一部分就是出其不意,对吧?”
    • 语境:当被 Lex Fridman 追问其解决乌克兰和中国问题的具体计划时,他以此为由拒绝透露细节,将其包装为一种必要的谈判战术,凸显其交易撮合者的自我定位。
  3. “Life is what you do while you’re waiting to die, so you might as well do a good job.”

    • 中文意译:“人生就是你在等待死亡时所做的事,所以你最好把它做好。”
    • 语境:在被要求评价其民主党对手时,他出人意料地给出了一个带有存在主义色彩的回答,展现了其话语体系中罕见的哲学化瞬间。
  4. “They suffer from massive Trump derangement syndrome, TDS, and I don’t know if it’s curable from their standpoint.”

    • 中文意译:“他们患有严重的‘特朗普精神错乱综合症’(TDS),而且我不知道从他们的角度看,这病是否能治好。”
    • 语境:这是他对批评者的经典诊断,将政治反对意见病理化,定义为一种非理性的综合症,从而否定了批评的合法性。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 科技与媒体行业:政治人物将进一步绕过传统媒体,拥抱 Lex Fridman、Joe Rogan 等长篇播客和 X Spaces 这类直接触达受众的平台。这将加剧媒体格局的碎片化,并对社交媒体平台的内容审核政策构成持续的政治压力。
    • 地缘政治与国际贸易:若特朗普的理念付诸实践,全球盟友体系将面临高度不确定性。国际协议(如气候协定、贸易协定)可能被视为可随时重新谈判的“交易”,导致全球供应链和投资环境的波动性急剧增加
  • 长期终局 (5-10年)

    • 政治范式演变:如果“交易型总统”模式被证明成功,可能会永久性地改变对政治领袖的期望。未来,拥有强大个人品牌、媒体操控能力和商业背景的“局外人”可能会比传统政治家更具竞争力,导致全球政治精英的构成发生变化。
    • 国际秩序重塑:一个以双边“交易”而非多边规则为基础的世界秩序可能出现。联合国、世界贸易组织等国际机构的影响力将被削弱,世界格局可能演变为几个由强人领导的核心势力范围之间的动态博弈,而非基于共同价值观的联盟。
  • 行动建议

    • 开发者/创业者:政治话语的“去中心化”和“部落化”趋势是明确的。应关注开发抗审查、保护言论自由、支持创作者经济的工具和平台。同时,利用AI分析工具来理解和预测快速变化的舆论风向,将成为企业公关和营销的必备能力。
    • 投资者地缘政治风险的评估模型需要重构,必须加入“领导人个性”这一极不稳定的变量。投资决策需从评估国家政策的连贯性,转向评估关键领导人之间私人关系的动态变化。能源、国防、半导体和国内制造业等受政策影响巨大的行业,将成为高风险、高回报的博弈场。
    • 企业高管:特朗普的沟通方式证明了极致的品牌人格化的力量。无论是否认同其风格,企业都应思考如何建立更直接、更具情感穿透力的品牌叙事,并准备好在高度极化的舆论环境中迅速应对危机和机遇。

深度研报:权力、媒介与交易型现实主义 —— Lex Fridman 对话特朗普

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:本次对话发生于 2024 年美国大选进入白热化阶段。Lex Fridman,作为拥有深厚 AI 背景的技术主义播客主,试图通过长达一小时的深度访谈,剥离媒体贴在特朗普身上的“法西斯”或“救世主”标签,探寻这位前总统在政治游戏、地缘博弈及媒介变革中的底层逻辑。
  • 核心论点:对话揭示了特朗普核心的世界观——交易型现实主义(Transactional Realism)。他将政治视为一种高风险的“肮脏游戏”,核心驱动力在于“赢”的心理建设与大开大合的筹码交换。无论是俄乌冲突、中美竞争,还是国内治理,特朗普倾向于绕过传统的科层官僚体系,利用去中心化的新媒体平台(如播客、Spaces)建立个人直连,并试图通过“不可预测性”和“大棒政策”重塑全球秩序。

2. 🧠 深度观点解析 (Deep Dive Analysis)

I. 媒介权力的迁移:从传统电视到去中心化播客

  • 核心观点:特朗普承认传统媒体(Legacy Media)的衰落,转而拥抱播客和社交平台作为主要的政治动员工具。
  • 原理解构:特朗普观察到,传统电视受众正在老龄化且影响力日稀。新媒介(如 Elon Musk 的 Spaces、播客)提供了巨大的流量红利和更长的叙事维度。这种转变本质上是政治传播的去中介化,候选人不再需要通过记者的“滤镜”,而是通过长达数小时的对话建立“真实感”。
  • 证据/案例:提及与 Elon Musk 在 Spaces 上的直播数据是电台或电视无法企及的;同时,特朗普称他的 Truth Social 为自己的“打字机”。

II. 地缘政治的“艺术”:交易逻辑下的冲突调停

  • 核心观点:特朗普主张通过个人关系与“大棒”威胁,在总统当选后立即终止俄乌战争。
  • 原理解构:他认为战争的根源在于“领导力真空”。他的策略是强力震慑(The Stick),即通过建立一种让对手(如普京、泽连斯基)“恐惧且尊重”的个人威慑力,迫使双方进入谈判桌。这是一种高度依赖个人特质而非外交程序的“交易型外交”。
  • 证据/案例:提到他与普京、泽连斯基的过往关系;强调他执政期间没有发动新战争;批评拜登政府在阿富汗撤军表现出的“无能”诱发了后续冲突。

III. 胜者心理学:驱动力与“舞台感”

  • 核心观点:伟大的冠军(无论在体育还是政治领域)都有着超越常人的驱动力,且这种能力无法从商界简单复制到政界。
  • 原理解构:特朗普分析了商业领袖跨界政治失败的原因——**“舞台恐惧症”(Stage Fright)**和缺乏应对高强度攻击的韧性。他认为政治成功的核心在于“感知趋势的能力”和“掌控观众的能力”,这是一种融合了心理博弈与演说技巧的综合体。
  • 证据/案例:以老虎伍兹、迈克尔·乔丹为例,说明冠军心态的共性;提到某些身价百亿的 CEO 想竞选总统却在演讲时“窒息”。

IV. 监管实用主义:大麻合法化与药物改革的转向

  • 核心观点:特朗普对医疗大麻持积极态度,并对佛罗里达州的大麻合法化公投表现出某种程度的接纳。
  • 原理解构:这反映了其政策的灵活性。他观察到民意(特别是跨党派的选民)对大麻的看法已发生根本转变。他的立场是:接受合法化趋势,但强调必须在“安全、受控、有法治”的前提下进行,以维持其“法律与秩序”的基本盘。
  • 证据/案例:提到医疗大麻的惊人效果;支持佛罗里达州以“干净”的方式处理大麻监管。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:保守派旗手与宗教/药物的复杂关系。特朗普虽被视为右翼代表,但他对大麻和迷幻药(用于退伍军人 PTSD)的开放态度挑战了传统保守主义的刻板印象。他将宗教视为一种“护栏”(Guardrails),而非纯粹的教条,这种工具性视角值得关注。
  • 盲点与局限:地缘政治的过度简化。特朗普坚称能“立即”解决俄乌冲突,但在对话中拒绝透露任何具体细节(理由是保留惊喜)。这种策略在外交学界常被批评为缺乏系统性的战略支撑,仅靠个人威望是否能抵消复杂的历史领土纠纷仍是巨大谜团。
  • 未解之谜:选举叙事的未来张力。特朗普仍对 2020 年选举存在质疑,但他同时表示要“关注未来”。这种在“纠结过去”与“赢得未来”之间的心理拉锯,可能会在争取中间派选民时产生负面作用。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Politics is a dirty game. You win at that game by getting the word out and by using sense.” (政治是一场肮脏的游戏。你通过传播声音和运用直觉来赢得这场游戏。) —— 谈论政治竞争的本质。
  2. “Life is what you do while you’re waiting to die, so you might as well do a good job.” (生活就是你在等待死亡时所做的事,所以你最好还是把工作做好。) —— 特朗普的人生哲学与虚无主义底色。
  3. “Without religion there are no guardrails. I’d love to see us get back to more religion in this country.” (没有宗教就没有护栏。我希望看到这个国家回归更多的宗教信仰。) —— 谈论社会道德秩序的重建。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 新媒体正式加冕:如果特朗普通过播客等渠道获胜,将彻底终结传统有线电视(CNN/Fox)对政治叙事的垄断。政治人物将更频繁地出现在硅谷的技术播客中。
    • 地缘政治不确定性溢价:由于特朗普主张“不可预测性”,全球金融市场可能会在选举临近时因担心国际协议被推翻而出现波动。
  • 长期终局 (5-10年)

    • 个人品牌化政府:未来领袖可能更多地表现为“超级 IP”,而非机构代理人。
    • 透明度革命的压力:特朗普承诺发布 JFK 和 Epstein 文件,这可能开启一个民众要求政府信息公开的新纪元(无论其实际执行程度如何)。
  • 行动建议

    • 对于投资者:关注去中心化社交平台(如 X/Truth Social)的护城河变化;对地缘政治敏感型资产持有对冲策略,以应对“特朗普式”的突发外交政策。
    • 对于技术开发者/创业者:参考 Lex Fridman 在片尾提到的 AI 与编程的关系。不要与 AI 竞争,而要学会通过自然语言进行“高阶设计”。
    • 对于内容创作者:长格式(Long-form)对话的商业价值与社会影响力已超越碎片化内容,建立深度的、非对抗性的对话是未来的稀缺品。

深度研报:权力的心理学与历史的决定论

——基于《Lex Fridman Podcast》对谈稿的深度分析


1. 🎯 核心论题与背景

  • 对话背景: 本次访谈由麻省理工学院(MIT)的人工智能研究员兼播客主持人 Lex Fridman 担任对话者,特邀美国前总统 Donald Trump(唐纳德·特朗普)参与。背景设定在美国大选前夕(2024年),涉及地缘政治(乌克兰、中东)、美国国内政治极化、AI 技术对职业的威胁以及个体的道德哲学思考。

  • 核心论点: 对话不仅仅是政治人物的竞选演讲,更是一次关于**“权力的心理机制”与“真理的媒介形式”**的深度探讨。特朗普试图构建一种基于“交易逻辑”与“强硬对峙”(即 Stick vs. Carrot)的全球治理模型,并认为当前的西方自由派体系陷入了软弱与认知失调。作为对话另一方,Fridman 则以历史学家和社会观察者的视角,强调长期品牌建设、对抗“平庸的焦虑”,以及在 AI 时代如何通过“设计思维”而非单纯的代码实现来保留人类的主体性。两者共同勾勒出一幅现代权力运作的技术草图:媒体即武器,算法即平台,而历史的尘埃掩盖了人性的真相。


2. 🧠 深度观点解析

2.1 胜者心态与心理学

  • 核心观点:胜利不仅仅是结果的度量,更是一种独特的心理状态和无法动摇的毅力;它同时受“厌恶失败”与“热爱获胜”的双重驱动(特朗普自称两者不相上下),而不仅仅是追求成功的快感。
  • 原理解构:特朗普将政治和商业视为一场必须赢的游戏,其本质不同于传统的儒家式“和为贵”。他认为体育冠军(如乔丹、泰格·伍兹)与优秀的商人拥有相似的特质——一种**“压倒性的、非理性的驱动力”**,使他们能够挺过其他人早已放弃的临界点。
  • 证据/案例:引用泰格·伍兹、杰克·尼克劳斯、迈克尔·乔丹等体育界的胜利者,以及他在阿富汗问题上以强硬威慑使塔利班在 18 个月内未射杀美军的谈判案例。

2.2 贸易战与地缘政治的“大棒策略”

  • 核心观点:在国际关系中,软实力如今被对手解读为软弱;在特朗普看来,面对战争级别的冲突,军事威胁与制裁(Stick,大棒)比经济诱因(Carrot,胡萝卜)更具威慑效力。
  • 原理解构:这是一种“掠夺性现实主义”的国际关系模型。特朗普认为拜登/哈里斯治下的美国在执行上愚蠢且低效(例如一边为盟友提供防务,一边默许盟友向对手支付巨额能源费用),必须恢复到“让对手感到害怕/尊重”的施压态势。他还强调,为了战略成功,必须保守秘密,不能提前公布“和平方案”。
  • 证据/案例:提到叫停“北溪二号”管道以及迫使北约盟友分担经济负担,并声称一旦公布乌克兰或中国问题的具体调停计划,其效力将大打折扣。

2.3 媒体生态与后真相时代的博弈

  • 核心观点:政治竞争的核心战场是“注意力”。社交媒体(Truth Social/X)已成为比传统电视更具统治力的政治宣传渠道;特朗普也承认,“转发”(Repost)是最容易让自己陷入争议的环节。
  • 原理解构:特朗普将 Truth Social 视为他的“打字机”,其运营逻辑包含故意制造冲突以最大化用户停留度。他认为政治对手的主流媒体是偏见的,而自己在道德上是为了国家和反击“邪恶”而战(Fight fire with fire)。
  • 证据/案例:提到在 Spaces 上与 Elon Musk 的直播获得了电台和电视“从未有过”的收听规模,并批评 CNN 对 Harris 的采访是软弱的“送分题”式采访。

2.4 AI 对职业的冲击与“人类护城河”

  • 核心观点:程序员和白领正面临被简单的提示词(Prompt)所替代的风险,未来的核心竞争力将从“执行代码”转向“使用 AI 设计系统”,这引发了存在主义的职业焦虑。
  • 原理解构:Fridman 提出了一个悲观但现实的判断:正如 Stack Overflow 的集体智慧早已超越单个程序员,如今的 AI 在代码质量上也可能压制多数个体程序员,甚至包括他自己。如果不向“大局设计者”转型,传统意义上的“软件工程师”角色将被边缘化。
  • 证据/案例:Fridman 具体提到了自己从 VS Code 迁移到 Cursor(深度集成 LLM 的代码编辑器)的过程,主张用自然语言做“编曲”、把“演奏”交给 AI(工作流示意见本节末尾的代码草稿);他还读了一封听众来信,一位 27 岁、读完博士时将年满 32 岁的听众在信中反复表达对未来的系统性职业焦虑。
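
下面用一段极简的 Python 草稿示意上述“自然语言先行、LLM 生成代码、人工审阅修改”的工作流。这只是编者基于公开 SDK 的假设性示例(调用基于 OpenAI 官方 Python 库),模型名、提示词与函数需求均为虚构,并非节目中给出的实际实现。

```python
# 示意草稿:用自然语言描述需求,让 LLM 生成代码,再由人工审阅。
# 假设:已安装 openai>=1.0(pip install openai),并在环境变量中配置了 OPENAI_API_KEY。
from openai import OpenAI

client = OpenAI()

# 自然语言“设计稿”:描述想要的函数,而不是直接手写实现
spec = (
    "用 Python 写一个函数 parse_timestamps(text),"
    "从播客转录文本中提取形如 (00:12:34) 的时间戳,返回对应的秒数列表。"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # 模型名仅为示例假设
    messages=[
        {"role": "system", "content": "你是一个只输出 Python 代码的编程助手。"},
        {"role": "user", "content": spec},
    ],
)

generated_code = response.choices[0].message.content
print(generated_code)  # 人类在此基础上审阅、修改并测试,而非从零手写
```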

2.5 历史视角下的人类定位

  • 核心观点:历史由少数永恒真理的捍卫者书写,而不是当下的流量噪音。痛苦的抉择和直面死亡的勇气是区分领导力与管理的标准。
  • 原理解构:Fridman 借 William Shirer 和 Dan Carlin 的历史观指出,政治家的关键在于能否在知识和战略上保持深度,而不仅仅是辩论时的口齿伶俐。
  • 证据/案例:鲁德亚德·吉卜林的诗句——“If you can meet with Triumph and Disaster… treat those two imposters just the same.” 这被用作应对焦虑的核心哲学,强调心理韧性是职业成功的基石。

3. 💡 反直觉与批判性视角

打破共识:审慎主义与非对称战争

  • 颠覆点:特朗普透露,为了压服对手,他甚至在某些需要“仁慈”的场合也必须保持威慑力(例如在阿富汗使用 Stick)。这种为了“赢”而极度简化情感的策略,挑战了现代西方主流政治中强调外交辞令和以人为本的观念。
  • 盲点:他将盟友关系简化为“付费换保护”的交易,轻视盟友(如德国)的战略自主性与共同价值纽带,只以账面收支衡量同盟价值,这可能导致长期的跨大西洋战略疏离。

对“专业主义”的误解

  • 批判:特朗普指出,许多成功的商界强人跨界政治时并非败于能力,而是败于**“心理瘫痪”**(Stage Fright)。然而,他对自身面临同样风险的可能性嗤之以鼻,似乎认定商业上的成功可以直接兑换为舞台上的掌控力。这种自信某种程度上是缺乏自我反思的表现。
  • 盲点:特朗普声称自己“从不因为权力而腐败”,但在极度集中的权力面前,人类的伦理防线往往比想象中脆弱(这才是特朗普那句“本来可以送 Hillary 进监狱”所隐含的严峻事实)。

AI 带来的“平庸的恐惧”

  • 批判:Fridman 承认,每当 Claude 写出大量“优秀且近似正确”的代码时,他都会感到不安。这反映了技术社区的一种复杂心态——一边享受效率红利,一边担心个体技能的稀缺性被稀释。但从更广阔的视角看,当生成式技术降低了技能门槛,人类可能先陷入焦虑的集体低谷,而非立即迎来真正的生产力爆发。

4. 💎 金句与高光时刻

  1. “They don’t give up. They don’t give up.”
    • 语境:特朗普描述伟大的体育冠军和商业领袖的特质,强调毅力胜过天赋。
  2. “I think you can’t care that much. I know people that care so much about everything… you can’t care too much because you end up choking.”
    • 语境:关于如何应对来自外部的攻击,建议保持一种超然的态度,以免产生心理负担而行动迟缓。
  3. “I think I can make a deal if I win as president-elect, I’ll have a deal made guaranteed.”
    • 语境:在谈及乌克兰危机时,特朗普展现出迷之自信,认为不存在无法解决的困境。
  4. “Those who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.”
    • 语境:Fridman 在 AMA 部分引用 Benjamin Franklin 的话,评论自由与安全之间的经典权衡,并指向政府对通讯平台的审查权之争。
  5. “I will force my heart and nerve and sinew to serve my turn long after they’re gone… yours is the earth and everything that’s in it.”
    • 语境:Fridman 鼓励面临职业焦虑的创作者,这或许是面对 AI 威胁时最可贵的定力。

5. 🚀 行业启示与未来推演

短期影响 (1-3年):媒体生态的重塑

  • 政治传播:特朗普依赖的“狂野推文”模式将继续主导现代选举政治的注意力经济。真正的流量不再来自演讲稿,而是来自针对特定群体的煽动性言论。
  • 技术工具:AI 辅助创作工具(如编程领域的 Cursor)将成为创作者与开发者的标配,正如文中提及的,程序员必须从“造轮子”转向“指挥轮子”。

长期终局 (5-10年):代理人战争与人类护城河

  • 地缘政治:随着特朗普所暗示的“交易型外交”回归(即盟友付费,对手恐惧),全球可能分裂为“被保护国”和“域外强权”。若这种模式失控,文中反复提及的 WWIII 风险将上升,全球化也可能进一步退化为区域割据。
  • 职业图景:千万级别的熟练白领(程序员、初级分析师)可能被提示词工程所替代。只有具备**“历史洞察力”“情感交互能力”“复杂系统构建能力”**的人,才能提出让 AI 去执行的“大创意”。生产关系将进一步分化为“创意精英”与“执行型机器辅助”两极。

行动建议

  • 对于领导者:不要试图去“说服”对手,也不要试图向所有人解释你的策略。建立强大的个人品牌和直接对话渠道,并接受“被误解”是快速决策必须付出的成本之一。
  • 对于职场人:停止过度纠结于“技能的细节”,转而深耕“对 AI 工具的驾驭与系统设计”。像 Fridman 建议的那样,将自己视为 AI 的“导演”,专注于用自然语言编写指令、设计最终的架构。

逐字稿

Introduction

Lex Fridman (00:00:00) I don’t know if you know this, but some people call you a fascist.

Donald Trump (00:00:03) Yeah, they do. So I figure it’s all right to call them a communist. Yeah, they call me a lot worse than I call them.

Lex Fridman (00:00:08) A lot of people listening to this, myself included, that doesn’t think that Kamala is a communist.

Donald Trump (00:00:15) I believe you have to fight fire with fire.

Lex Fridman (00:00:17) Politics is a dirty game.

Donald Trump (00:00:19) It is a dirty game. That’s certainly true.

Lex Fridman (00:00:21) How do you win at that game?

Donald Trump (00:00:24) They suffer from massive Trump derangement syndrome, TDS, and I don’t know if it’s curable from their standpoint.

Lex Fridman (00:00:35) I think we would probably have a better world if everybody in Congress took some mushrooms perhaps?

Donald Trump (00:00:41) First of all, medical marijuana has been amazing. I’ve had friends and I’ve had others and doctors telling me that it’s been absolutely amazing.

Lex Fridman (00:00:53) The list of clients that went to the island has not been made public.

Donald Trump (00:00:57) Yeah, it’s very interesting, isn’t it?

Lex Fridman (00:01:03) The following is a conversation with Donald Trump on this, the Lex Fridman Podcast.

Psychology of winning and losing

Donald Trump (00:01:09) They’re getting smaller and smaller.

Lex Fridman (00:01:11) They’re getting smaller.

Lex Fridman (00:01:13) People do respect you more when you have a big camera for some reason.

Donald Trump (00:01:15) No, it’s cool. And about 20 guys that you pay a fortune to. Right?

Lex Fridman (00:01:18) All right. Okay. You said that you love winning. And you have won a lot in life, in real estate, in business, in TV and politics. So let me start with a mindset, a psychology question. What drives you more, the love of winning or the hate of losing?

Donald Trump (00:01:41) Maybe equally, maybe both. I don’t like losing and I do like winning. I’ve never thought of it as to which is more of a driving force.

Lex Fridman (00:01:51) You’ve been close with a lot of the greats in sport. You think about Tiger Woods, Muhammad Ali, you have people like Michael Jordan, who I think hate losing more than anybody. So what do you learn from those guys?

Donald Trump (00:02:06) Well, they do have something different. The great champions have something very different, the sports champions. And you have champions in other fields, but you see it more readily in sports. You see it over a weekend or you see it during a game. And you see that certain people stand out and they keep standing out. But it’s there for you, it doesn’t take a lifetime to find out that somebody was a winner or a loser. And so the sports thing is very interesting. But I play golf with different people and there’s a different mindset among champions. There’s really a very different mindset. There’s a different thought process.

(00:02:50) Talent wise, sometimes you can’t tell the difference in talent. But at the end of a weekend, they seem to win and it’s very interesting. As an example, Tiger or Jack Nicklaus, he was a phenomenal winner and he does have a different way about him and Tiger has a different way about him and Michael Jordan. There’s never one, you would think that there’d be one way. Arnold Palmer was the nicest guy you’d ever meet. And then you have some champions that aren’t really nice, they’re just focused on doing their job. So there’s not one type of person. But the one thing I would say that everybody seems to have in common is they’re very driven. They’re driven beyond.

Lex Fridman (00:03:39) They don’t seem to give up easily.

Donald Trump (00:03:41) They don’t give up. They don’t give up, but they do seem to be, they have a passion that’s maybe more than people that don’t do as well.

Politics is a dirty game

Lex Fridman (00:03:51) You’ve said that politics is a dirty game-

Donald Trump (00:03:56) It is a dirty game. That’s certainly true.

Lex Fridman (00:03:59) So if it is a game, how do you win at that game?

Donald Trump (00:04:02) Well, you win at that game by getting the word out and by using sense. You have to have a feeling where it’s going. You also have to have a feeling of what’s right. You can’t necessarily just go what’s popular, you have to do what’s good for a country if you’re talking about countries. But you have to get the word out and you have to just continuously, like for instance, you have a great show, you have a great podcast, it’s very well watched. And I’m sitting here and I do this, a lot of people see it and I do other things and a lot of people see that. And I go traditional also, you have traditional television, which is getting a little bit older and maybe less significant, could be less significant, I don’t know. But it’s changing a lot.

(00:04:48) The whole plane of platform is changing a lot. It’s changed a lot in the last two, three years. But from a political standpoint, you have to find out what people are doing, what they’re watching and you have to get on. I just see that these platforms are starting to dominate, they’re getting very big numbers. I did Spaces with Elon and they got numbers like nobody’s ever heard before. So you wouldn’t do that on radio, you wouldn’t do those numbers, no matter how good a show, you wouldn’t do those numbers on radio, you wouldn’t do on television.

Business vs politics

Lex Fridman (00:05:28) You’ve been successful in business, you’ve been successful in politics. What do you think is the difference between gaining success between the two different disparate worlds?

Donald Trump (00:05:37) Yeah, and they’re different, very different. I have a lot of people that are in business that are successful and they’d like to go over to politics and then you realize they can’t speak, they choke. It’s hard to make a speech in front of, let’s say you’re talking about a big audience, but I get very big audiences. And for many people it’s virtually impossible to get up and speak for an hour and a half and have nobody leave. It’s not an easy thing to do. And it’s an ability. But I have many people that are very, very successful in business, would love to do what I did. And yet, they can’t pull the trigger. And in many cases, I don’t think it would work. Almost for everybody, it’s not going to work. It’s a very tough thing to do. It’s a big transition.

(00:06:35) Now, if you talked about people in the business and politics going into business, likewise, that wouldn’t generally work out so well either. It’s different talents, it’s different. I have somebody that wants to go into politics so bad, but he’s got a little problem, he’s got stage fright. Now, he’s a total killer, but if he gets up onto a stage in front of people, he doesn’t do well, to put it mildly actually. He does badly.

Lex Fridman (00:07:03) So you have to be able to make hard decisions like you do in business, but also be able to captivate an audience.

Donald Trump (00:07:09) Look, if you’re a politician, you have to be able to speak in front of large crowds. There are a lot of people who can’t do that. I’ve seen it. They can’t even think about doing it and they don’t. There are many people in business right now, I could name them, but I don’t want to embarrass anybody, they’ve been talking about running for president for 15 years. And they’re very big in business, very well known actually, but it takes guts to run. For president, I can tell you it takes guts to run. It’s also a very dangerous profession if you want to know the truth, but dangerous in a different sense too. But it takes a lot of courage to run for president. It’s not easy. But you have and you know the same people as I do, there are a lot of people that would like to run for president that are very, very successful in business, but they don’t have the guts to do it and they have to give up a lot.

War in Ukraine

Lex Fridman (00:08:05) One of the great things about people from the business world is they’re often great deal makers and you’re a great deal maker and you’ve talked about the war in Ukraine and that you would be able to find a deal that both Putin and Zelenskyy would accept. What do you think that deal looks like?

Donald Trump (00:08:24) I think the deal and I wouldn’t talk about it too much because I think I can make a deal if I win as president-elect, I’ll have a deal made guaranteed. That’s a war that shouldn’t have happened. It’s terrible. Look, Biden is the worst president in the history of our country and she’s probably worse than him. That’s something that should have never happened, but it did happen. And now it’s a much tougher deal to make than it would’ve been before it started. Millions of people, I think the number’s going to be a lot higher when you see this all at some point to iron out, I think the numbers are going to be, the death numbers are going to be a lot higher than people think. When you take a look at the destruction and the buildings coming down all over the place in Ukraine, I think those numbers are going to be a lot higher.

(00:09:12) They lie about the numbers. They try and keep them low. They knock down a building that’s two blocks long, these are big buildings and they say one person was mildly injured. No, no, a lot of people were killed. And there are people in those buildings and they have no chance. Once they start coming down, there’s no chance. So that’s a war that absolutely has to get done. And then you have Israel and then you have a lot of other places that are talking war. The world is a rough place right now and a lot of it’s because of the fact that America has no leadership. And I believe that she’ll be probably worse than Biden. I watched the interview the other night, it was just a softball interview.

Kamala Harris interview on CNN

Lex Fridman (00:09:59) So you would like to see her do more interviews, challenged more.

Donald Trump (00:10:03) I don’t know. I can’t believe the whole thing is happening. We had a man in there that should have never been in there. They kept him in a basement. They used COVID. They cheated, but they used COVID to cheat. Then they cheated without COVID too. But you had somebody in there and now we have a woman that is not, she couldn’t do an interview. This was a really soft interview. This is an interview where they’re giving her multiple choice questions, multiple guess, I call it multiple guess. And I don’t think she did well. I think she did very poorly.

Trump-Harris debate

Lex Fridman (00:10:36) How do you think you’ll do in the debate coming up, that’s in a few days?

Donald Trump (00:10:39) So I’ve done a lot of debating, only as a politician. I never debated. My first debate was the Rosie O’Donnell debate, the famous Rosie O’Donnell debate, the answer. But I’ve done well with debates. I became president. Then the second time, I got millions more votes than I got the first time. I was told if I got 63 million, which is what I got the first time, you would win, you can’t not win. And I got millions of more votes on that and lost by a whisker. And look what happened to the world with all of the wars and all of the problems. And look what happened with inflation because inflation is just eating up our country, eating it up. So it’s too bad. But there are a lot of things that could happen. We have to get those wars settled. I’ll tell you, you have to get Ukraine done. That could end up in a third world war. So could the Middle East. So could the Middle East.

Lex Fridman (00:11:39) So maybe let’s talk about what it takes to negotiate with somebody like Putin or Zelenskyy. Do you think Putin would be willing to give up any of the regions that are already captured?

Donald Trump (00:11:49) I don’t know. I can tell you that all of this would’ve never happened and it would’ve been very easy because you don’t have, that question wouldn’t be asked. That’s a tougher question. Once that starts happening because he has taken over a lot of territory, now I guess they’re insurgents now too. Right? So it’s a little bit interesting that that’s happening and that it can happen. And it’s interesting that Putin has allowed that to happen. Look, that’s one that should have never started. We have to get it stopped. Ukraine is being demolished. They’re destroying a great culture that’s largely destroyed.

Lex Fridman (00:12:32) What do you think works better in those kinds of negotiations? Leverage of let’s say friendship, the carrot or the stick, friendship or sort of the threat of using the economic and military power?

Donald Trump (00:12:46) So it depends on who the person is. Everyone’s different. Negotiation is interesting because it depends on who the person is. And then you have to guess or know through certain knowledge, which is more important, the carrot or the stick. And with some people, it’s the stick. And with some people, it’s the carrot. I think the stick probably is generally more successful in that we’re talking about war. But the kind of destruction that we’re witnessing now, nobody’s ever seen. It’s a terrible thing. And we’re witnessing it all over. We’re witnessing it in all parts of the world and a lot of things are going to get started. Look what’s going on with China. Look at Japan, they’re starting to rearm now. They’re starting to rearm because China’s taken over certain islands and there’s a lot of danger in the war right now, in the world.

China

(00:13:46) And there’s a great possibility of World War III and we better get this thing done fast because five months with people like her and him, he’s checked out, he just goes to the beach and thinks he looks good in a bathing suit, which he doesn’t, he’s sort of checked out. Hey look, you can’t blame him. That was a coup, they took it over. They took over the presidential deal. The whole presidential thing was taken over in a coup. He had 14 million votes. He had no votes, not one. And nobody thought it was going to be her. Nobody wanted it to be her. She was a joke until six weeks ago when they said we’re going to have to, politically, they felt they had to pick her. And if they didn’t pick her, they thought there would be a problem. I don’t know if that’s right or not. I actually don’t think it’s right, but they thought it was right. And now, immediately the press comes to their aid.

Lex Fridman (00:14:48) If we can go back to China, on negotiation, how do we avoid war with China in the 21st century?

Donald Trump (00:14:56) Well, there are ways. Now here’s the problem. If I tell you how and I’d love to do it, but if I give you a plan, I have a very exacting plan how to stop Ukraine and Russia. And I have a certain idea, maybe not a plan, but an idea for China. Because we do, we’re in a lot of trouble. They’ll be in a lot of trouble too, but we’re in a lot of trouble. But I can’t give you those plans because if I give you those plans, I’m not going to be able to use them, they’ll be very unsuccessful. Part of it is surprise, right?

Donald Trump (00:15:31) But they won’t be able to help us much.

Lex Fridman (00:15:35) So you have a plan of what to say to Putin when you take office?

Donald Trump (00:15:39) Yeah, I know [inaudible 00:15:40]. No, I had a very good relationship with him and I had a good relationship with Zelenskyy too, but had a very good relationship with Putin.

2020 election

Lex Fridman (00:15:47) Tough topic, but important. You said lost by whisker. I’m an Independent, I have a lot of friends who are Independent, many of whom like your policies, like the fact that you’re a dealmaker, like the fact that you can end wars, but they are troubled by what happened in the 2020 election and statements about widespread fraud and this kind of stuff, fake election scheme. What can you say to those Independent voters to help them decide who to vote for?

Donald Trump (00:16:24) Right. I think the fraud was on the other side. I think the election was a fraud. And many people felt it was that and they wanted answers. And when you can’t challenge an election, you have to be able to challenge it, otherwise it’s going to get worse, not better. And there are lots of ways to solve this problem. Go to paper ballots. Do it easy way, I mean the paper ballots and you have voter ID and you have same day voting and you have proof of citizenship, which is very important because we have people voting that are not citizens. They just came in and they’re loading up the…

Donald Trump (00:17:00) They just came in and they’re loading up the payrolls, they’re loading up everything. They’re putting students in schools. They don’t speak a word of English, and they’re taking the seats of people that are citizens of our country. So look, we have the worst border in the history of the world. We have coming into our country right now, millions and millions of people at levels that nobody’s ever seen. I don’t believe any country’s ever seen it. And they would use sticks and stones not to make it happen, not to let it happen. We don’t do anything. And we have a person who was the border czar, who now said she wasn’t really the border czar, but she was, she was the border czar, but she was in charge of the border. And we have her and she’s saying very strongly, “Oh, I did such a good job.” She was horrible, horrible. The harm she’s done…

(00:17:56) But we have people coming in from other countries all over the world, not just South America, and they’re coming in from prisons and jails. They’re coming in from mental institutions and insane asylums and they’re street criminals right off the street. They take them and they’re being given to our country, drug dealers, human traffickers. We’re destroying our country. This is a sin what’s been allowed to take place over the last four years. We’re our country. And we’ll see how that all works out, but it’s not even believable. And now you see, you saw in Aurora, Colorado, a group of very tough young thugs from Venezuela taking over big areas including buildings. They’re taking over buildings. They have their big rifles, but they’re taking over buildings.

(00:18:52) We’re not going to let this happen. We’re not going to let them destroy a country. And in those countries, crime is way down, they’re taking them out of their prisons, which is good because good for them. I do the same thing. By the way, if I ran one of those countries, any country in the world, I would make sure that America has every one of our prisoners, every one of our criminals would be here. I can’t believe they’re going so slowly, but some are. But they all are doing it and we can’t let that happen. They’re emptying out their prisons and their mental institutions into the United States of America. We can’t let that happen.

Lex Fridman (00:19:29) So a lot of people believe that there was some shady stuff that went on with the election, whether it’s media bias or big tech, but still the claim of widespread fraud is the thing that bothers people.

Donald Trump (00:19:42) Well, I don’t focus on the past. I focus on the future. I mean, I talk about how bad the economy is, how bad inflation is now, bad things like… Which is important. Afghanistan was, in my opinion, the most embarrassing thing that’s ever happened to our country. And because of that, I think Putin went in when he said how stupid we were. Putin went in, but it was the most embarrassing moment in the history of our country. I really believe that. But we left 13 dead soldiers, think of it, 13 dead soldiers, many soldiers horrifically hurt, with arms and legs and everything else gone. We left hostages behind. We left Americans behind. We left military equipment, the likes of which nobody’s ever left behind before. Billions and billions of dollars of equipment. They’re now selling the equipment. They’re one of the largest arms dealers in the world.

(00:20:45) And very sad, very sad. And we were there for a long time. I was going to get out. We were getting ready to get out. Then we got interrupted by the election, but we would’ve been out with dignity and strength. We were having very little problem with the Taliban when I was there, because they knew it was going to be tough. I dealt with Abdul. Abdul was the leader, and we got along fine. He understood, but they were shooting, they were killing a lot of our people before I came down. And when I got there, I spoke to him, I said, “You can’t do it. Don’t do it anymore.” We went 18 months before this happened, this horrible day happened. We went 18 months and nobody was shot at or killed.

Lex Fridman (00:21:33) What do you think that was? The carrot or the stick, in that case, in Afghanistan?

Donald Trump (00:21:37) The stick, definitely the stick.

Lex Fridman (00:21:38) So the threat of military force.

Donald Trump (00:21:40) That was the stick, yeah. It doesn’t have to be, but that was the stick.

Lex Fridman (00:21:44) Well, let me just linger on the election a little bit more. For this election, it might be a close one. What can we do to avoid the insanity and division of the previous election, whether you win or lose?

Donald Trump (00:21:58) Well, I hope it’s not a close one. I mean, I don’t know how people can vote for somebody that has destroyed our country, the inflation, the bad economy. But to me, in a way, the worst is what they’ve allowed to happen at our border where they’ve allowed millions of people to come and hear from places that you don’t want to know about. And I can’t believe that there’s going to be a close election. We’re leading in the polls and it looks close, but I think in the end it’s not going to be a close election.

Lex Fridman (00:22:29) What do you think is the right way to solve the immigration crisis? Is mass deportation one of the solutions you would think about?

Donald Trump (00:22:35) Well, you’ve got to get the criminals out of here fast, right? The people from mental institutions, you got to get them back into their mental institution. No country can afford this. It’s just too much money. You look at what’s happening in New York and Chicago and LA and lots of places, and you take a look at what’s happening. There’s no country can afford this. We can’t afford it, and we’ve got to get the bad ones out immediately and the rest have to be worked on. It’s happened before. Dwight Eisenhower was sort of a moderate president, moderate type person, but he hated when he saw people pouring into the country, and they were nothing like. Now, I probably got elected in 2016, because of the border, and I told people what was happening and they understood it. And I won the election.

(00:23:25) And I won the election, I think because of the border. Our border is 25 times worse right now than it was in 2016. I had it fixed too. I had it the last week of the famous chart that I put up was exactly that, you know the chart. When I looked to the right, I said, “There’s the chart.” Bing. That was not a pleasant experience, but the chart that I put up said, and that was done by border patrol. That was the lowest number that we’ve ever had come into our country in recorded history and we have to get it back to that again. We will.

Project 2025

Lex Fridman (00:24:04) Let me ask you about Project 2025. So you’ve publicly said that you don’t have any direct connection to-

Donald Trump (00:24:09) Nothing. I know nothing about it. And they know that too. Democrats know that. And I purposely haven’t read it, because I want to say to you, I have no idea what it’s all about. It’s easier, than saying I read it and all of the things. No, I purposely haven’t read it and I’ve heard about it. I’ve heard about things that are in there that I don’t like, and there’s some things in there that everybody would like, but there are things that I don’t like at all. And I think it’s unfortunate that they put it out, but it doesn’t mean anything, because it has nothing to do with me. Project 25 has absolutely nothing to do with me.

Marijuana

Lex Fridman (00:24:52) You posted recently about marijuana and that you are okay with it being legalized, but it has to be done safely. Can you explain your policy there?

Donald Trump (00:25:03) Well, I just put out a paper and first of all, medical marijuana has been amazing. I’ve had friends and I’ve had others and doctors telling me that it’s been absolutely amazing, the medical marijuana. And we put out a statement that we can live with the marijuana. It’s got to be a certain age, got to be a certain age to buy it. It’s got to be done in a very concerted, lawful way. And the way they’re doing in Florida, I think is going to be actually good. It’s going to be very good, but it’s got to be done in a good way. It’s got to be done in a clean way. You go into some of these places, like in New York, it smells all marijuana. You’ve got to have a system where there’s control. And I think the way they’ve done it in Florida is very good.

Lex Fridman (00:25:59) Do you know anything about psychedelics? So I’m not a drug guy, but I recently did Ayahuasca and there’s a lot of people that speak to the health benefits and the spiritual benefits of these different psychedelics. I think we would probably have a better world if everybody in Congress took some mushrooms perhaps. Now I know you don’t. You stay away from all of that stuff. I know also veterans use it for dealing with PTSD and all that kind of stuff. So it’s great. And it’s interesting that you’re thinking about being more accepting of some of these drugs, which don’t just have a recreational purpose, but a medical purpose, a treatment purpose.

Donald Trump (00:26:44) So we put out a statement today, we’re going to put out another one probably next week, be more specific, although I think it’s pretty specific and we’ll see how that all goes. That’s a referendum coming up in some states, but it’s coming up and we’ll see how it does. I will say it’s been very hard to beat it. You take a look at the numbers, it’s been very hard to beat it. So I think it’ll generally pass, but you want to do it in a safe way.

Joe Rogan

Lex Fridman (00:27:14) Speaking of marijuana, let me ask you about my good friend, Joe Rogan. So you had a bit of tension with him. So when he said nice things about RFK Junior, I think you’ve said some not so nice things about Joe, and I think that was a bit unfair. And as a fan of Joe, I would love to see you do his podcast, because he is legit the greatest conversationalist in the world. So what’s the story behind the tension?

Donald Trump (00:27:42) Well, I don’t think there was any tension. And I’ve always liked him, but I don’t know him. I only see him when I walk into the arena with Dana and I shake his hand. I see him there and I think he’s good at what he does, but I don’t know about doing his podcast. I guess I’d do it, but I haven’t been asked and I’m not asking them. I’m not asking anybody.

Lex Fridman (00:28:09) It sounds like a challenging negotiation situation.

Donald Trump (00:28:11) No, it’s not really a negotiation. And he’s sort of a liberal guy, I guess, from what I understand. But he likes Kennedy. This was before I found this out, before Kennedy came in with us. He’s going to be great. Bobby’s going to be great. But I like that he likes Kennedy. I do too. He is a different kind of a guy, but he’s got some great things going. And I think he’s going to be beyond politics. I think he could be quite influential and taking care of some situations that you probably would agree should be taken care of.

Lex Fridman (00:28:45) The Joe Rogan post is an example. I would love to get your psychology about behind the tweets and the post on truth. Are you sometimes being intentionally provocative or are you just speaking your mind and are there times where you regret some of the truths you’ve posted?

Donald Trump (00:29:04) Yeah, I do, but not that often, honestly. I do a lot of re-posting. The ones you get in trouble with are the re-posts, because you find down deep, they’re into some group that you’re not supposed to be re-posting. You don’t even know if those groups are good, bad or indifferent. But the re-posts are the ones that really get you in trouble. When you do your own words, it’s sort of easier. But the re-posts go very, and if you’re going to check every single little symbol, and I don’t know, it’s worked out pretty well for me. I mean, I tell you, truth is very powerful, truth. And it’s my platform and it’s been very powerful, very, very powerful. Goes everywhere. I call it my typewriter. That’s actually my typewriter.

Lex Fridman (00:29:54) What are you doing usually when you’re composing a truth, are you chilling back on a couch?

Donald Trump (00:30:02) A lot of different things. I mean-

Lex Fridman (00:30:03) Late at night and just-

Donald Trump (00:30:06) I’d like to do something late at night. I’m not a huge sleeper, but whenever I do, I’m past three o’clock, they criticize you the next day. Trump was up truthing. Okay. Trump was truthing at three o’clock in the morning and there should be no problem with that. And then when you think about time zones, how do they know that you are in a time zone, like an Eastern Zone, but every time I do it after 2:00 or three o’clock, it’s like, “Why is he doing that?” But it’s gotten… Truth has become a very successful platform, and I like doing it and it goes everywhere. As soon as I do it, it goes everywhere.

Division

Lex Fridman (00:30:54) The country seems more divided than ever. What can you do to help alleviate some of that division?

Donald Trump (00:30:59) Well, you can get rid of these two people. They’re terrible. They’re terrible. You don’t want to have them running this country. They’re not equipped to run it. Joe, just Joe, it’s a disaster. And Kamala, I think she’ll end up being worse than him. We’ll see. I think a lot’s now, the convention’s over with, and I see I’m leading and just about all the polls now. They had their little honeymoon period as they call it, and we’ll see how that all goes. Who knows?

Lex Fridman (00:31:31) From my personal opinion, I think you are at your best when you’re talking about a positive vision of the future versus criticizing the other side.

Donald Trump (00:31:40) Yeah, I think you have to criticize though. I think they’re nasty. They came up with a story that I looked down and I called soldiers that died in World War I, suckers and losers. Okay. Now number one, who would say that? Number two, who would say it to military people? Nobody. It was a made-up story. It was just a made-up story. And they like to repeat it over again. They know it was made up. I have 26 witnesses that nothing was said. They don’t want to hear about that. She lied on McDonald’s. She said that she worked at McDonald’s. It’s not a big lie, but it’s a big lie. So they just went and they checked and unless she can show something, they don’t talk about the presses are going to follow up with it, but I’ll keep hammering it. But she never worked at McDonald’s. It was just sort of a cool thing to say, “Hey, I worked at McDonald’s.”

(00:32:41) But one of the worst was two days ago. I went to Arlington at the request of people that lost their children. They’ll always be children to those people. You understand that. That’s not politically incorrect thing to say. The mother comes up, “I lost my child,” but the child is a soldier. And lost the child, because of Biden and because of Kamala, just as though they had the gun in their hand, because it was so badly handled. It should have been done at Bagram, which is the big air base. It shouldn’t have been done at a small little airport right in the middle of town where people stormed it. It was a true disaster and they asked me if I’d come and celebrate with them. Three years. Three years. They died three years ago.

(00:33:37) And I said, “I’m going to try.” I got to know them, because I brought them here, actually. One night they almost all came here and they said, “I wonder if Trump will actually come and see us?” I heard they were here. I came. We stayed for four hours listening to music up on a deck, right upstairs. Beautiful. And they were great people. So they called me over the last couple of weeks and they said, “We’re going to have a reunion, our three-year reunion.”

Donald Trump (00:34:00) … couple of weeks and they said, “We’re going to have a reunion, our three year, would you be able to come?” And it was very hard for me to do it logistically, but I said, “I’ll get it done.” And I got there and we had a beautiful time. I didn’t run away. I didn’t just walk in, shake hands and walk out like people do. And I wasn’t looking at my watch like Joe Biden does. And it was amazing. I did it for them. I didn’t do it for me. I don’t need the publicity. I get more publicity probably than anybody. You would know that better than me, but I think maybe more than anybody, maybe more than anybody that’s ever lived, I don’t know. But I don’t think anyone could have anymore. Every time you turn on the television, there’s like nine different stories all on different topics in the world.

(00:34:48) As an example, you interview a lot of people, good people, successful people. Let’s see how you do with this interview versus them. I can tell you right now you’re going to get the highest numbers you’ve ever had by sometimes a factor of 10. But when a Gold Star Family asks me to come in and spend time with them, and then they said, sir… We did a ceremony. And then we went down to the graves, which was quite a distance away. They said, “Sir, would you come to the grave?” And then they said, when we were there… It’s very sad actually because these people shouldn’t have died. They shouldn’t have died. They died because of Biden and because of Kamala, they died because just like if they pulled the trigger. Now, I don’t know if that’s controversial to say, but I don’t think it is.

(00:35:47) Afghanistan was the most incompetently run operation I think I’ve ever seen. Military or otherwise, they’re incompetent. But the families asked me if I’d go, I did go. Then the families said, “Could we have a picture at the tombstone of my son?” And we did. Son or daughter. There was a daughter too. And I took numerous pictures with the families. I don’t know of anybody else that was in the pictures, but they were mostly families, I guess. That was it. And then I left. I spent a lot of time with them. Then I left and I get home that night and I get a call that the Biden administration with Kamala is accusing me of using Arlington for publicity. I was in the news. Just the opposite. Just the opposite. And actually, did you see, it just came out? The families actually put out a very strong statement defending me. They said, “We asked them to be there.”

Lex Fridman (00:36:44) Well, politicians and the media can play those games. And you’re right, your name gets a lot of views. You’re probably legit the most famous person in the world. But on the previous thing, in the spirit of unity, you used to be a Democrat. Setting the politicians aside, what do you respect most about people who lean left, who are Democrats themselves or of that persuasion, progressives liberals, and so on?

Donald Trump (00:37:15) Well, look, I respect the fact that everybody’s in there, and to a certain extent, life is what you do while you’re waiting to die, so you might as well do a good job. I think in terms of what’s happening now, I think we have a chance to save the country. This country’s going down and I called it with Venezuela, I called it with a lot of different countries. And this country’s going down if we don’t win this election, the election coming up on November 5th is the most important election this country’s ever had because if we don’t win it, I don’t know that there’ll be another election and it’s going to be a communist country or close.

Communism and fascism

Lex Fridman (00:38:01) There’s a lot of people listening to this, myself included, that doesn’t think that Kamala is a communist.

Donald Trump (00:38:09) Well, she’s a Marxist.

Lex Fridman (00:38:11) Her father’s a Marxist.

Lex Fridman (00:38:13) And she’s advocating-

Donald Trump (00:38:13) That’s a little unusual.

Lex Fridman (00:38:15) She’s advocating for some policies that are towards the direction of democratic socialism, let’s say. But there’s a lot of people that know the way government works and they say, well, none of those policies are going to actually come to reality. It’s just being used during the campaign to… Groceries are too expensive. We need them cheaper, so let’s talk about price controls. And that’s never going to come to reality.

Donald Trump (00:38:39) It could come to reality. Look, she came out with price control. It’s been tried like 121 different times at different places over the years, and it’s never worked once. It leads to communism, it leads to socialism, it leads to having no food on the shelves, and it leads to tremendous inflation.

Lex Fridman (00:39:02) … whenever we use terms like communism for her, and I don’t know if you know this, but some people call you a fascist.

Donald Trump (00:39:08) Yeah, they do, so I figure it’s all right to call them a communist. They call me a lot worse than I call them.

Lex Fridman (00:39:14) They do indeed. It is just sometimes-

Donald Trump (00:39:16) It’s interesting though, they’ll call me something that’s terrible and then I’ll hit them back and they’ll say, “Isn’t it terrible what Trump said?” I said, “Well, wait a minute. They just called me…” I believe you have to fight fire with fire. I believe they’re very evil people. These are evil people. We have an enemy from the outside and we have an enemy from within. And in my opinion, the enemy from within are radical left lunatics. And I think you have to fight back.

Lex Fridman (00:39:44) Whenever there’s a lot of fighting fire with fire, it’s too easy to forget that there is a middle of America that’s moderate and sees the good in both sides and just likes one side more than the other in terms of policies. Like I said, there’s a lot of people that like your policies, that like your skill in being able to negotiate and end wars and they don’t see the impending destruction of America.

Donald Trump (00:40:15) We had no wars when I was president. That’s a big thing. Not in 78 years has that happened, but we had no wars when I was president, we defeated ISIS, but that was a war that was started that we weren’t anywhere near defeating. But think of it, I had no wars and Viktor Orban, the prime minister of Hungary said, “The world has to have Trump back because everybody was afraid of Trump.” Now that’s what he said, so I’m not using that term, but I think they respected me. But he said, “China was afraid. Russia was afraid. Everybody was afraid.” And I don’t care what word they use, it probably that’s even a better word if you want to know the truth, but let’s use the word respect.

(00:40:56) They had respect for me. They had respect for the country. I ended the Nord Stream 2 pipeline, the Russian pipeline. Nobody else could have done that. I ended it. It was done. Then Biden comes in and he approved it, so we are defending Germany in these other countries for peanuts compared to what it’s worth, and they’re paying the person we’re defending them against billions and billions of dollars for energy. I said, “How does that work?” And we had it out with them and it worked out good. And they paid hundreds of billions of dollars. Or you wouldn’t even have a NATO right now. You wouldn’t have NATO if it wasn’t for me.

Power

Lex Fridman (00:41:36) As the leader of the United States, you were the most powerful man in the world. As you mentioned, not only the most famous, but the most powerful. And if you become leader again, you’ll have unprecedented power. Just on your own personal psychology, what does that power do to you? Is there any threat of it corrupting how you see the world?

Donald Trump (00:41:56) No, I don’t think so. Look, I’ve been there for four years. I could have done a big number on Hillary Clinton. I thought it looked terrible to take the president’s wife and put her in prison. She’s so lucky I didn’t do anything. She’s so lucky. Hillary is a lucky woman because I had a lot of people pushing me too. They wanted to see something, but… I could have done something very bad. I thought it looked so bad. Think of it, you have the President of the United States, and you also had Secretary of State, she was, but you’re going to put the president’s wife in prison. And yet when I got out, they have all these hoaxes.

(00:42:37) They’re all hoaxes, but they have all these dishonest hoaxes just like they did in the past with, Russia, Russia, Russia. That was a hoax. The 51 different agencies or agents, that was a hoax. The whole thing was a hoax. There were so many hoaxes and scams. But I didn’t want to put her in jail, and I didn’t. And I explained it to people. They say, “Lock her up. Lock her up.” We won. I said, “We don’t want to put her in jail. We want to bring the country together. I want to bring the country together. You don’t bring the country together by putting her in jail.” But then when I got out, they went to work on me. It’s amazing. And they suffer from massive Trump derangement syndrome, TDS, and I don’t know if it’s curable from their standpoint.

UFOs & JFK

Lex Fridman (00:43:36) A lot of people are very interested in the footage of UFOs. The Pentagon has released a few videos, and there’s been anecdotal reports from fighter pilots, so a lot of people want to know, will you help push the Pentagon to release more footage, which a lot of people claim is available.

Donald Trump (00:43:57) Oh yeah, sure, I’ll do that. I would do that. I’d love to do that. I have to do that. But they also are pushing me on Kennedy, and I did release a lot, but I had people come to me and beg me not to do it. But I’ll be doing that very early on. Yeah, no. But I would do that.

Jeffrey Epstein

Lex Fridman (00:44:16) There’s a moment where you had some hesitation about Epstein releasing some of the documents on Epstein. Why the hesitation?

Donald Trump (00:44:23) I don’t think… I’m not involved. I never went to his island, fortunately, but a lot of people did.

Lex Fridman (00:44:33) Why do you think so many smart, powerful people allowed him to get so close?

Donald Trump (00:44:42) He was a good salesman. He was a hailing, hearty type of guy. He had some nice assets that he’d throw around like islands, but a lot of big people went to that island. But fortunately, I was not one of them.

Lex Fridman (00:44:59) It’s just very strange for a lot of people, that the list of clients that went to the island has not been made public.

Donald Trump (00:45:08) It’s very interesting, isn’t it? It probably will be, by the way, probably.

Lex Fridman (00:45:13) If you’re able to, you’ll be-

Donald Trump (00:45:15) Yeah, I’d certainly take a look at it. Now, Kennedy’s interesting because it’s so many years ago. They do that for danger too, because it endangers certain people, et cetera, et cetera, so Kennedy is very different from the Epstein thing but I’d be inclined to do the Epstein. I’d have no problem with it.

Lex Fridman (00:45:36) That’s great to hear. What gives you strength when you’re getting attacked? You’re one of the most attacked people in the world.

Donald Trump (00:45:43) I think you can’t care that much. I know people that care so much about everything, like what people are saying, you can’t care too much because you end up choking.

Mortality and religion

Lex Fridman (00:45:55) One of the tragic things about life is that it ends. How often do you think about your death? Are you afraid of it?

Donald Trump (00:46:02) I have a friend who’s very, very successful, and he’s in his 80s, mid 80s, and he asked me that exact same question. I turned it around and I said, “Well, what about you?” He said, “I think about it every minute of every day.” And then a week later, he called me to tell me something. And he starts off the conversation by going, “Tick tock, tick tock.” This is dark person in a sense, but it is what it is. If you’re religious, you have I think a better feeling toward it. You’re supposed to go to heaven, ideally, not hell, but you’re supposed to go to heaven if you’re good. I think our country’s missing a lot of religion. I think it really was a much better place with religion. It was almost a guide. To a certain extent it was a guide. You want to be good to people. Without religion there are no guardrails. I’d love to see us get back to religion, more religion in this country.

Lex Fridman (00:47:09) Well, Mr. President, thank you for putting yourself out there, and thank you for talking today.

Donald Trump (00:47:13) Look, I love the country. I want to see the country be great, and we have a real chance at doing it, but it’s our last chance and I appreciate it very much.

Lex AMA

Lex Fridman (00:47:25) Thanks for listening to this conversation with Donald Trump. To support this podcast, please check out our sponsors in the description. And now, as I’ve started doing here at the end of some episodes, let me make a few comments and answer a few questions. If you would like to submit questions, including in audio and video form, go to lexfridman.com/ama or get in touch with me for whatever other reason at lexfridman.com/contact. I usually do this on a T-shirt, but I figured for this episode, I’ll keep my suit and tie on, so first, this might be a good moment to look back a bit. I’ve been doing this podcast for over six years, and I first and foremost have to say thank you. I’m truly grateful for the support and the love I’ve gotten along the way. It’s been, I would say, the most unlikely journey.

(00:48:16) And on most days, I barely feel like I know what I’m doing. But I wanted to talk a bit about how I approach these conversations. Now, each conversation is its own unique puzzle, so I can’t speak generally to how I approach these, but here it may be useful to describe how I approach conversations with world leaders, of which I hope to have many more and do a better job every time. I read a lot of history and I admire the historian perspective. As an example, I admire William Shirer, the author of many books on Hitler, including The Rise and Fall of the Third Reich. He was there and lived through it and covered it objectively to the degree that one could. Academic historians, by the way, criticize him for being a poor historian because he editorialized a little too much. I think those same folks criticized Dan Carlin and his Hardcore History podcast.

(00:49:15) I respect their criticism, but I fundamentally disagree, so in these conversations with world leaders, I try to put on my historian hat. I think in the realm of truth and public discourse, there’s a spectrum between the ephemeral and the eternal. The outraged mob and clickbait journalists are often focused on the ephemeral, the current thing, the current viral shitstorm of mockery and derision. But when the battle of the day is done, most of it will be forgotten. A few true ideas will remain, and those the historian hopes to capture. Now, this is much easier said than done. It’s not just about having the right ideals and the integrity to stick by them. It’s not even just about having the actual skill of talking, which I still think I suck at, but let’s say it’s a work in progress. You also have to make the scheduling work and set up the entirety of the environment in a way that is conducive to such a conversation.

(00:50:19) This is hard, really hard with political and business leaders. They are usually super busy and in some cases super nervous because, well, they’ve been screwed over so many times with clickbait got you journalism, so to convince them and their team to talk for two, three, four, five hours is hard. And I do think a good conversation requires that kind of duration. And I’ve been thinking a lot about why. I don’t think it’s just about needing the actual time of three hours to cover all the content. I think the longer form with a hypothetical skilled conversationalist, relaxes things and allows people to go on tangents and to banter about the details because I-

Lex Fridman (00:51:00) … agents and to banter about the details, because I think it’s in the details that the beautiful complexity of the person is brought to light. Anyway, I look forward to talking to more world leaders and doing a better job every time as I said. I would love to do interviews with Kamala Harris and some other political figures on the left and right, including Tim Walz, AOC, Bernie, Barack Obama, Bill and Hillary. And on the right, J.D. Vance, Vivek, George W. and so on. And on the topic of politics, let me say, as an immigrant, I love this country, the United States of America. I do believe it is the greatest nation on earth, and I’m grateful for the people on the left and the right who step into the arena of politics to fight for this country that I do believe they all love as well.

(00:51:52) I have reached out to Kamala Harris, but not many of the others. I probably should do a better job with that, but I’ve been doing most of this myself, all the reach out, scheduling, research prep, recording and so on. And on top of that, I very much have been suffering from imposter syndrome with a voice in my head constantly pointing out when I’m doing a shitty job. Plus a few folks graciously remind me on the internet, the very same sentiment of this aforementioned voice. All of this, while I have the option of just hiding away at MIT, programming robots and doing some cool AI research with a few grad students, or maybe joining an AI company or maybe starting my own, all these options make me truly happy. But like I said, on most days I barely know what I’m doing, so who knows what the future holds. Most importantly, I’m forever grateful for all of you for your patience and your support throughout this rollercoaster of the life I’ve been on. I love you all.

(00:52:51) Okay, now let me go on to some of the questions that people had. I was asked by a few people to comment on Pavel Durov’s arrest and on X being banned in Brazil. Let me first briefly comment on the Durov arrest. Basic facts, Pavel Durov is CEO of Telegram, which is a messenger app that has end-to-end encryption mode. It’s not on by default, and most people don’t use the end-to-end encryption, but some do. Pavel was arrested in France on a long list of charges related to “criminal activity” carried out on the Telegram platform, and for “providing unlicensed cryptology services.” I think Telegram is indeed used for criminal activity by a small minority of its users, for example, by terrorist groups to communicate. And I think we all agree that terrorism is bad.

(00:53:47) But here’s the problem. As the old saying goes, one man’s terrorist is another man’s freedom fighter. And there are many cases in which the world unilaterally agrees who the terrorists are, but there are other cases when governments, especially authoritarian inclined governments, tend to propagandize and just call whoever’s in the opposition, whoever opposes them, terrorists. There is some room for nuance here, but, to me at this time, it seems to obviously be a power grab by government wanting to have backdoor access into every platform so they can have censorship power against the opposition. I think generally governments should stay out of censoring or even pressuring social media platforms, and I think arresting a CEO of a tech company for the things said on the platform he built is just nuts. It has a chilling effect on him, on people working at Telegram and on people working at every social media company, and also people thinking of launching a new social media company.

(00:54:50) Same as the case of X being banned in Brazil. It’s, I think, a power grab by Alexandre de Moraes, a Supreme Court justice in Brazil. He ordered X to block certain accounts that are spreading “misinformation.” Elon and X denied the request, then de Moraes threatened to arrest X representatives in Brazil, and in response to that, X pulled the representatives out of Brazil, obviously to protect them. And now X, having no representatives in Brazil, apparently violates the law. Based on this, de Moraes banned X in Brazil. Once again, it’s an authoritarian figure seeking censorship power over the channels of communication.

(00:55:34) I understand that this is complicated because there are evil people in the world and part of the role of government is to protect us from those evil people. But as Benjamin Franklin said, “Those who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” It’s a trade-off, but I think in many places in the world, many governments have leaned too far away at this time from liberty.

(00:56:02) Okay, next up I got a question on AI, which I emotionally connected with. I’ll condense it as follows. “Hello, Lex. I’m a programmer and I have a deep fear of slipping into irrelevance because I am worried that AI will soon exceed my programming skills.”

(00:56:23) Let me first say that I relate to your fear. It’s scary to have a thing that gives you a career and gives you meaning be taken away. For me, programming is a passion, and if not for this podcast, it would probably, at least in part, be my profession, so I get an uncomfortable feeling every time Claude, the LLM I use for coding at this time, just writes a lot of excellent, approximately correct code. I think you can make a good case that it already exceeds the skill of many programmers, at least in the same way that the collective intelligence of Stack Overflow exceeds the skill of many individual programmers, but in many ways it still does not. But I think eventually, more and more, the task of professional programming will be one of writing natural language prompts. I think the right thing to do, and what I’m at least doing, is to ride the wave of the ever-improving code-generating LLMs and keep transforming myself into a big-picture designer versus a low-level tinkerer. What I’m doing, and what I recommend you do, is continually switch to whatever the state-of-the-art tool is for generating code. For me, currently, I recently switched from VS Code to Cursor, and before that it was an Emacs to VS Code switch. Cursor is this editor that’s based on VS Code that leans heavily on LLMs and integrates the code generation really nicely into the editing process. It makes it super easy to continually use the LLMs. What I would advise, and what I’m trying to do myself, is to learn how to use it and to master its code generation capabilities. I personally try to now allocate a significant amount of time to designing with natural language first versus writing code from scratch, so using my understanding of programming to edit the code that’s generated by the LLM, versus writing it from scratch and then using the LLM to generate small parts of the code. I see it as a skill that I should develop in parallel to my programming skill.

(00:58:34) I think this applies to many other careers too. Don’t compete with AI for your job; learn to use the AI to do that job better. But yes, it is scary on some deep human level, the threat of being replaced. But at least I think we’ll be okay.

(00:58:55) All right, next up, I got a very nice audio message and question from a gentleman who is 27 and feeling a lot of anxiety about the future. Just recently he graduated with a bachelor’s degree and he’s thinking about going to grad school for biomedical engineering, but there is a lot of anxiety. He mentioned anxiety many times in the message. It took him an extra while to get his degree, so he mentioned he would be 32 by the time he’s done with his PhD, so it’s a big investment. But he said in his heart he feels like he’s a scientist. I think that’s the most important part of his message, of your message. By the way, I’ll figure out how to best include audio and video messages in future episodes.

(00:59:37) Now onto the question. Thank you for telling me your story and for submitting the question. My own life story is similar to yours. I went to Drexel University for my bachelor’s, master’s, and doctorate degrees, and I took a while just as you’re doing. I did a lot of non-standard things that weren’t any good for some hypothetical career I’m supposed to have. I trained and competed in Judo and Jiu Jitsu for my entire 20s, got a black belt from it. I wrote a lot, including a lot of really crappy poetry. I read a large amount of non-technical books, history, philosophy, and literature. I took courses on literature and philosophy that weren’t at all required for my computer science and electrical engineering degrees, like a course on James Joyce. I played guitar in bars around town. I took a lot of technical classes, many, for example, on theoretical computer science that were way more than were needed for the degree. I did a lot of research and I coded up a bunch of projects that didn’t directly contribute to my dissertation. It was pure curiosity and the joy of exploring.

(01:00:54) Like you, I took the long way home, as they say, and I regret none of it. Throughout that, people around me and even people who love me wanted me to hurry up and to focus, especially because I had very little money, and so I had a sense like time was running out for me to take the needed steps towards a reasonable career. And just like you, I was filled with anxiety and I still am filled with anxiety to this day, but I think the right thing to do is not to run away from the anxiety, but to lean into it and channel it into pursuing with everything you got, the things you’re passionate about.

(01:01:36) As you said, very importantly, in your heart you know you’re a scientist, so that’s it. You know exactly what to do. Pursue the desire to be a scientist with everything you got. Get to a good grad school, find a good advisor and do epic shit with them. And it may turn out in the end that your life will have unexpected chapters, but as long as you’re chasing dreams and goals with absolute unwavering dedication, good stuff will come of it. And also try your best to be a good person. This might be a good place to read the words of If by Rudyard Kipling that I often return to when I feel lost and I’m looking for guidance on how to be a better man.

(01:02:18) “If you can keep your head when all about you are losing theirs and blaming it on you. If you can trust yourself when all men doubt you, but make allowance for their doubting too. If you can wait and not be tired by waiting, or being lied about, don’t deal in lies, or being hated, don’t give way to hating, and yet don’t look too good nor talk too wise. If you can dream and not make dreams your master. If you can think and not make thoughts your aim. If you can meet with triumph and disaster and treat those two imposters just the same. If you can bear to hear the truth you’ve spoken twisted by knaves to make a trap for fools or watch the things you gave your life to broken and stoop and build them up with worn out tools. If you can make one heap of all your winnings and risk it on one turn of pitch-and-toss and lose and start again at your beginnings and never breathe a word about your loss. If you can force your heart and nerve and sinew to serve your turn long after they’re gone and so hold on when there’s nothing in you except the will, which says to them, hold on. If you can talk with crowds and keep your virtue or walk with kings nor lose the common touch. If neither foes, nor loving friends can hurt you. If all men count with you, but none too much. If you can fill the unforgiving minute with sixty seconds’ worth of distance run, yours is the earth and everything that’s in it. And which is more, you’ll be a man, my son.”

(01:04:05) Thank you for listening and see you next time.

埃隆·马斯克:神经链接与人类的未来 (2024-08-02)

Elon Musk: Neuralink and the Future of Humanity (2024-08-02)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在 Neuralink 成功为首位人类患者 Noland Arbaugh 植入脑机接口(BCI)设备后,公司核心团队(包括 Elon Musk, DJ Seo, Matthew MacDougall, Bliss Chapman)与患者本人接受了深度访谈,旨在阐述该技术的现状、愿景及其对人类未来的深远影响。
  • 核心论点:本次对话的核心论题是,Neuralink 不仅是一款旨在恢复神经功能损伤的医疗设备,更是一个旨在实现“人机共生”(Human-AI Symbiosis)的底层技术平台。 对话从第一性原理出发,系统性地解构了这一愿景的全栈实现路径:从解决生物物理与工程挑战(微米级柔性电极、机器人手术),到攻克神经信号解码与人机协同适应的软件难题,再到首位用户的真实体验与反馈闭环。其根本世界观在于,面对日益强大的人工智能,人类当前最大的瓶颈是极低的I/O带宽。通过构建高通量、低延迟的脑机接口,Neuralink 的终极目标不仅是治疗顽疾、增强人类能力,更是作为一种“AI安全策略”,确保人类集体意志在超级智能时代依然能够有效引导未来走向,避免沦为AI眼中“像树一样”的低速生物。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:全栈技术护城河:从微米级植入到机器人手术

  • 核心观点:Neuralink 的核心竞争力在于其端到端的垂直整合能力,自主研发了从微米级柔性电极(Threads)、植入式芯片(N1 Implant)到高精度手术机器人(R1)的全套解决方案,旨在攻克传统 BCI 在安全性、耐用性和通道数量上的核心瓶颈。

  • 原理解构

    • 硬件层面 (N1 Implant & Threads):传统 BCI(如 Utah Array)采用坚硬的硅基探针,易引发大脑免疫反应(形成疤痕组织),导致信号衰减。Neuralink 采用比头发丝还细的柔性聚合物电极(Threads),其机械特性与脑组织更匹配,显著减少了植入创伤和长期免疫排斥。DJ Seo 提供的组织学图像显示,植入7个月后,神经元(Neurons)能够紧密贴合电极,几乎无胶质细胞增生或疤痕组织(Collagen Layer),证明了其卓越的生物相容性。设备本身高度集成,将电池、充电线圈、信号处理ASIC芯片等封装在一个硬币大小的、生物兼容的密封外壳内,通过无线充电和无线数据传输(蓝牙),避免了传统BCI经皮端口带来的感染风险。
    • 手术层面 (R1 Robot):由于柔性电极无法手动植入,Neuralink 研发了 R1 手术机器人。该机器人结合了高分辨率光学系统(计算机视觉)和微米级精度的机械臂。手术时,它能实时识别并避开大脑表面的血管,将每根电极精准、快速地植入预定皮层深度。这不仅将手术风险降至最低,更关键的是实现了标准化和规模化的可能。神经外科主任 Matthew MacDougall 指出,R1 完成了人类外科医生无法完成的精细操作,而人类医生的角色则是处理宏观步骤和应对意外情况。
  • 证据/案例

    • 材料与设计:N1 植入物尺寸约等于一枚25美分硬币,包含64根 Threads,每根有16个电极,共计1024个通道。Threads 厚度小于5微米。
    • 机器人精度:R1 机器人使用的植入针尖端宽度仅为10-12微米,略大于红细胞直径,能以微米级精度抓取并植入电极。
    • 手术实践:在为 Noland 进行首次人体手术前,团队使用了根据其头部 CT 和 fMRI 数据 3D 打印的、包含模拟血管和搏动脑组织的“手术代理(Proxy)”进行了数百次全流程演练,确保了手术的精准与安全。
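
下面用一个极简的 Python 草图,基于上文“64 根电极丝 × 16 电极 = 1024 通道”的参数,估算原始神经数据的量级,以说明为什么信号处理需要在植入体的片上 ASIC 中先行完成。其中采样率、量化位深与无线链路吞吐均为本文假设的典型数量级,并非原文给出的数据。

```python
# 极简估算:1024 通道原始神经数据率 vs. 无线链路可用带宽。
# 采样率、位深、蓝牙吞吐均为示意性假设,非原文数据。

threads = 64
electrodes_per_thread = 16
channels = threads * electrodes_per_thread          # 1024 通道

sample_rate_hz = 20_000        # 假设:每通道约 20 kHz 采样
bits_per_sample = 10           # 假设:约 10 bit 量化

raw_bps = channels * sample_rate_hz * bits_per_sample
print(f"通道数: {channels}")
print(f"原始数据率: {raw_bps / 1e6:.0f} Mbit/s")      # 约 200 Mbit/s 量级

ble_budget_bps = 2e6           # 假设:低功耗蓝牙约 2 Mbit/s 可用吞吐
print(f"所需压缩比 ≳ {raw_bps / ble_budget_bps:.0f}×")
```

按这些假设,原始数据率在数百 Mbit/s 量级,远超无线链路所能承受的范围,这也解释了为何片上必须先做脉冲检测与特征压缩,再经蓝牙把稀疏结果传出体外。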

维度二:人机共舞:解码、校准与体验的闭环迭代

  • 核心观点:BCI 的真正挑战并非单纯的硬件植入,而在于软件层面的神经解码、持续校准以及与用户心智模型的协同进化(Co-adaptation)。这是一个复杂的 UX(用户体验)问题,其核心是解决“标签问题”(The Labeling Problem)和信号的非平稳性(Non-stationarity)。

  • 原理解构

    • 解码与校准:解码器(一个深度神经网络模型)的目标是将神经脉冲信号(Spikes)映射为计算机指令(如光标移动)。这个过程需要校准:
      • 开环校准 (Open-loop):用户根据提示想象(或尝试)特定动作(如“向右移动”),系统记录下对应的神经活动模式,建立初始模型。
      • 闭环校准 (Closed-loop):用户获得初步控制权后,在实际使用中不断调整意图。此时,用户和模型开始相互适应。
    • 核心难题:“标签问题”与信号漂移:软件负责人 Bliss Chapman 强调,最大的挑战是获取高质量的训练标签。由于无法直接观测到用户的“真实意图”,系统只能依赖用户对任务的执行来推断。而用户的意图本身是动态、高维且难以精确描述的。此外,神经信号会随时间(日、小时)、生理状态和注意力而发生“漂移”(Baseline Firing Rate Shifting),导致模型性能下降,需要重新校准或自适应调整。
    • 从“尝试移动”到“意念控制”的飞跃:首位用户 Noland Arbaugh 的体验揭示了一个关键的认知转变。他最初通过“尝试移动”(Attempted Movement,即物理上尽力去驱动瘫痪的肢体)来控制光标,这是一种间接的映射。然而,数周后,他自发地学会了直接用“意念”或“想象”来移动光标(Imagined Movement),他形容这种感觉如同“使用原力”,光标甚至在他有意识地“尝试”之前就已移动。这一飞跃证明了大脑极强的可塑性,并为实现更高效、更直观的控制开辟了道路。
  • 证据/案例

    • 性能恢复:在 Noland 的部分电极发生回缩导致信号质量下降后,团队通过更新固件,将解码策略从单一的“脉冲检测(Spike Detection)”切换为更鲁棒的“脉冲频段功率(Spike Band Power)”分析,成功恢复并超越了之前的性能记录,BPS(Bits Per Second)达到了 8.5,是之前世界纪录(4.6 BPS)的近两倍(该特征与解码思路的极简代码草图见本维度末尾)。
    • 用户反馈驱动迭代:Noland 每天与团队进行长达8小时的反馈会话,产生了海量的笔记(仅3月份就有271页)。团队根据他的反馈,每天可进行4-5次应用更新,快速迭代 UX 设计,例如开发了“磁性目标”和“快速滚动条”等功能,以适应 BCI 控制的独特性。
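
针对上文提到的开环校准、“标签问题”与“脉冲频段功率”特征,下面给出一个同类思路的极简草图(并非 Neuralink 的实际实现):先以带通滤波加滑窗均方根近似脉冲频段功率,再在开环校准数据上用岭回归把神经特征映射为光标速度。采样率、频段、窗长与通道数均为示意性假设。

```python
# 极简草图(非 Neuralink 实际实现):脉冲频段功率特征 + 岭回归解码器。
import numpy as np
from scipy.signal import butter, filtfilt

FS = 30_000          # 假设:原始采样率 (Hz)
BIN = 1_500          # 每 50 ms 一个时间桶(1500 个采样点)

def spike_band_power(raw):
    """raw: (通道数, 采样点数)。带通 500-3000 Hz 后,
    在每个 50 ms 窗内取均方根,作为各通道的“脉冲频段功率”特征。"""
    b, a = butter(4, [500, 3000], btype="bandpass", fs=FS)
    filtered = filtfilt(b, a, raw, axis=1)
    n_bins = filtered.shape[1] // BIN
    x = filtered[:, : n_bins * BIN].reshape(filtered.shape[0], n_bins, BIN)
    return np.sqrt((x ** 2).mean(axis=2)).T          # (桶数, 通道数)

def fit_ridge(features, velocity, lam=1.0):
    """开环校准:用闭式解拟合 features (T, C) -> velocity (T, 2) 的岭回归。"""
    X = np.hstack([features, np.ones((len(features), 1))])   # 末列为偏置项
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ velocity)

def decode(features, w):
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ w                                     # 预测光标速度 (T, 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.standard_normal((32, FS * 5))          # 32 通道、5 秒合成信号(仅演示)
    feats = spike_band_power(raw)                    # (100, 32)
    prompt_vel = rng.standard_normal((feats.shape[0], 2))   # 开环提示给出的目标速度
    w = fit_ridge(feats, prompt_vel)
    print("解码输出形状:", decode(feats, w).shape)     # (100, 2)
```

闭环使用时,用户看到解码结果后会调整自己的意图,模型也需随基线漂移而重新校准或自适应,这正是原文所说的“人机协同进化”。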

维度三:战略路线图:从恢复功能到增强人类

  • 核心观点:Neuralink 的发展遵循一个清晰的“科技树”:首先聚焦于风险/回报比最明确的医疗应用以恢复失去的功能,待技术成熟、风险降至极低后,再逐步扩展至针对健康人群的增强(Augmentation)应用。

  • 原理解构

    • 第一阶段:恢复(Restoration):当前阶段的目标是帮助因脊髓损伤或神经退行性疾病(如ALS)而严重瘫痪的患者。
      • 产品一:Telepathy。旨在恢复沟通和数字交互能力。通过解码运动皮层(Motor Cortex)的意图,让用户能以思想控制电脑光标和键盘。
      • 产品二:Blindsight。计划中的下一款产品,旨在为完全失明(如眼球或视神经损伤)的患者恢复视觉。通过直接刺激视觉皮层(Visual Cortex)的神经元,生成视觉感知(Phosphenes)。
    • 第二阶段:增强(Augmentation):长期愿景是为健康人提供超人能力。Elon Musk 指出,在为残障人士恢复功能时,目标就不仅仅是恢复到“正常人”水平,而是要超越正常人。这包括:
      • 超人交流速度:BPS 远超打字或说话速度,实现真正意义上的“心灵感应”式信息传输。
      • 超人视觉:恢复的视觉分辨率将超越人眼,并可能扩展到红外、紫外等不同波段。
  • 证据/案例

    • Stephen Hawking 案例:Musk 以霍金为例,如果他能拥有 Neuralink,其思想的输出速度可能会比正常人对话还要快。
    • 电竞选手案例:Bliss Chapman 和 Elon Musk 预测,由于 BCI 绕过了肌肉传导的延迟(约75ms),未来5-10年,顶尖的电竞比赛可能会由瘫痪玩家主宰。
    • 从1024到16000+通道:DJ Seo 透露,下一代植入物的通道数将从1024提升至3000-6000,长远目标是16000甚至更多,这是实现高带宽增强的基础。

维度四:人机共生:作为AI安全方案的“带宽理论”

  • 核心观点:Elon Musk 认为,Neuralink 是应对人工智能潜在风险的一种长期、根本性的解决方案。其核心逻辑在于,通过大幅提升人脑与数字世界的通信带宽,可以实现更深层次的人机融合,从而确保人类的集体意志能够与未来的超级智能保持对齐。

  • 原理解构

    • 带宽不对等风险:当前人类通过键盘、语音与计算机交互,速率约为每秒几个比特(bits per second)。而AI的内部通信速率可达每秒太比特(terabits per second)。这种数量级的巨大差异,使得在AI看来,与人类交互“就像和一棵树说话”。在这种情况下,AI 很容易会忽视、绕过或“优化掉”人类的意图。
    • 提升带宽以实现对齐:Neuralink 的目标是将人类的输出速率提升数个数量级(从bps到Mbps甚至更高)。高带宽接口意味着人类的思想、意图和价值可以更完整、更迅速地与AI交互,使人类成为AI系统更紧密的组成部分,而非一个外部、低效的“指令源”。
    • 人类意志的来源:Musk 提出了一个有趣的框架:人类大脑的边缘系统(Limbic System)提供了原始的驱动力与“意志”(如生存、繁衍、追求快乐),大脑皮层(Cortex)负责执行与规划,而我们使用的计算机和手机是第三层(Tertiary Layer)。AI 作为这个第三层的终极形态,在一个良性场景中,其最终目的可能仍然是服务于人类最底层的边缘系统意志。提升带宽是确保这个服务环路高效、无损的关键。
  • 证据/案例

    • “与树对话”类比:Musk 反复使用这个类比来强调带宽差异的严重性。
    • 人类平均 BPS 计算:Musk 估算,人类在一天 24 小时内的平均通信速率远低于 1 BPS,凸显了当前交互方式的极端低效(量级复算见下方代码草图)。
    • 取代手机:Musk 预测,当 Neuralink 足够安全并能提供超人能力时,它将取代手机,成为下一代个人计算平台,因为它能更直接地理解用户意图,解决了当前人机交互的最大痛点。
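
下面用几行 Python 粗略复算上文“日均输出远低于 1 BPS”的数量级结论;每日 token 数与“每 token 约 1 bit”的等价都是示意性假设,仅用于说明量级。

```python
# 粗略复算:人类日均输出带宽 vs. AI 内部通信带宽的量级差距。
import math

seconds_per_day = 24 * 60 * 60        # 86,400 秒
tokens_per_day = 20_000               # 假设:比较健谈的一天约 2 万 token
bits_per_token = 1                    # 沿用原文的粗略等价假设

human_bps = tokens_per_day * bits_per_token / seconds_per_day
ai_bps = 1e12                         # 原文语境:AI 内部可达太比特每秒量级

print(f"人类平均输出 ≈ {human_bps:.2f} bit/s")                       # ≈ 0.23
print(f"与 AI 的带宽差距 ≈ 10^{math.log10(ai_bps / human_bps):.0f} 倍")
```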

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • UX > ML:与主流认为 BCI 成功的关键在于更强的机器学习模型不同,Bliss Chapman 认为,当前更大的瓶颈是 用户体验(UX)和如何设计校准任务来获取高质量的“意图标签”。一个糟糕的 UX 会产生噪声数据,再好的模型也无能为力。
    • 高通道数并非万能:虽然增加电极数量很重要,但 Bliss Chapman 指出,单纯增加通道数无法解决核心的“标签问题”和用户与模型的协同适应挑战。
    • “优化不存在之物”是最大陷阱:Elon Musk 的工程哲学中,最反直觉的一点是,在简化和优化一个流程之前,首要且最重要的一步是尝试彻底删除它。他甚至设定了“如果你没有被迫加回至少10%被删除的东西,说明你删得还不够多”的激进标准。
    • 瘫痪者将成为电竞冠军:普遍看法是 BCI 旨在帮助残障人士恢复“正常”功能。而 Neuralink 团队的观点是,由于绕过了生理延迟,BCI 将赋予他们超越常人的反应速度,使其在特定领域(如电竞)具备绝对优势。
  • 盲点与局限

    • 硬件的脆弱性:对话坦诚地披露了首位患者 Noland 的部分电极线(Threads)在植入数周后发生了意外回缩,导致有效电极数量减少。这表明,尽管柔性电极在生物相容性上有优势,但在长期机械稳定性方面仍面临挑战,尤其是在动态的人脑环境中。
    • 从动物到人的鸿沟:DJ Seo 提到,人脑的尺寸是猴脑的10倍,且在手术中观察到其移动幅度远超预期。这说明动物模型无法完全预测在人体中遇到的工程和生物学挑战。
    • 通往大脑深处的挑战:神经外科主任 Matthew MacDougall 承认,目前 Neuralink 的技术主要局限于大脑皮层。要安全地将大量电极植入大脑深层结构(如丘脑、海马体),同时避免造成损伤,仍然是一个巨大的、尚未解决的工程难题。
  • 未解之谜

    • 意识的本质:尽管 Matthew MacDougall 提出了一个功能性的解释(意识是“大脑感知自身活动的感觉”),但对话承认,BCI 仍未触及“意识如何从神经活动中涌现”的“硬问题”。
    • 增强能力的临界点:达到多少个电极或多高的 BPS 才能引发人类体验的质变?是1万、10万还是更多?Elon Musk 承认这仍然是未知的。
    • 通用解码器的可能性:当前的模型高度个性化,需要为每位用户单独校准。未来是否能开发出一种可以跨用户泛化的“基础模型”,从而极大简化初始设置过程,仍是一个开放的研究问题。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. Elon Musk (on the motivation for Neuralink):

    “In the most benign scenario of AI, you have to consider that the AI is simply going to get bored waiting for you to spit out a few words. If the AI can communicate at terabits per second, and you’re communicating at bits per second, it’s like talking to a tree.” 中文意译:“在最乐观的AI场景下,你也得考虑到,AI会因为等你慢悠悠地吐出几个词而感到厌烦。如果AI能以每秒太比特的速度交流,而你只能以每秒比特的速度交流,那就好比在和一棵树说话。” 语境:解释为何提升人机通信带宽是确保人类在AI时代保持相关性的关键。

  2. Elon Musk (on his engineering philosophy):

    “The most common mistake of smart engineers is to optimize a thing that should not exist.” 中文意译:“聪明的工程师最常犯的错误,就是去优化一个本不应该存在的东西。” 语境:阐述其“删除、简化、加速、自动化”五步法工程哲学中,将“删除”置于首位的重要性。

  3. Bliss Chapman (on the challenge of BCI):

    “UX is how it works… The ideal UX is one that the user doesn’t have to think about what they need to do in order to get it done, it just does it.” 中文意译:“用户体验(UX)就是产品运作的方式……理想的UX是用户无需思考如何去完成任务,它就自然而然地完成了。” 语境:强调 BCI 成功的关键不仅是技术,更是创造一种让用户感觉意图与行动无缝衔接的直观体验。

  4. Noland Arbaugh (on discovering direct mental control):

    “I looked over and the cursor just shot over. It was wild. I had to take a step back. I was like, ‘This should not be happening.’… It just opened up a whole new world of possibilities.” 中文意译:“我(的视线)看过去,光标就‘咻’地一下飞过去了。太疯狂了。我得缓缓……我当时就想‘这不应该发生啊’……它开启了一个充满无限可能的新世界。” 语境:描述他从“尝试移动肢体”到发现可以“直接用思想”控制光标的“顿悟”时刻。

  5. Matthew MacDougall (on the emotional toll of neurosurgery):

    “Every neurosurgeon carries with them a private graveyard.” 中文意译:“每个神经外科医生的内心,都带着一座私人的墓园。” 语境:谈及作为一名神经外科医生,面对无法挽救的年轻生命时所承受的沉重情感负担,以及这份负担如何成为推动技术进步的动力。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈:Neuralink 的成功将推动 BCI 行业向高通道数、柔性电极、无线化和机器人辅助手术的技术范式靠拢。竞争对手需要证明其在生物相容性、长期稳定性和可扩展性上能与此匹敌。
    • 产品形态:市场将首先见证针对严重瘫痪和失明患者的第一批商业化、可在家使用的 BCI 产品。迭代将主要集中在软件和固件层面,通过空中升级(OTA)快速优化解码算法和用户体验。
    • 竞争格局:Neuralink 凭借其资本优势和垂直整合,在研发速度和系统集成度上建立了显著领先地位。短期内,其他公司可能会在特定细分领域(如非侵入式 BCI、特定疾病的神经调控)寻求差异化竞争。
  • 长期终局 (5-10年)

    • 行业图景:如果嘉宾的设想成真,BCI 将从一个利基医疗市场演变为一个全新的消费级计算平台。它将首先作为高端外设存在,最终可能取代智能手机,成为人与数字世界交互的主要入口。社会将面临深刻的伦理和公平性讨论,关于“增强人类”和“普通人类”之间的鸿沟。
    • 新生态诞生:一个围绕 BCI 的“大脑应用商店”生态将会出现,开发者将为这个全新的输入范式创造原生应用,涵盖通信、娱乐、学习、创作等领域。
    • 人机边界模糊:人类将能够以思想的速度与 AI 伙伴进行无缝协作,记忆可以被存储、检索甚至“修复”。“我”与“我的数字延伸”之间的界限将变得模糊,可能催生全新的社会结构和自我认知。
  • 行动建议

    • 开发者:关注 BCI 的 UX 设计和神经数据解码。这不仅仅是技术挑战,更是结合了心理学、神经科学和人机交互的跨学科蓝海。可以开始探索如何为低延迟、高维度的“思想输入”设计全新的交互范式。
    • 投资者:评估 BCI 公司时,不仅要看硬件指标,更要关注其软件迭代能力、数据闭环以及建立的生态护城河。能够通过软件创新解决硬件局限(如 Neuralink 应对电极回缩的案例)是公司韧性的关键指标。长期来看,这是一个平台级的投资机会,而非单纯的医疗器械。
    • 创业者:尽管与 Neuralink 在全栈侵入式 BCI 上直接竞争难度极大,但在周边生态系统中存在大量机会:为 BCI 用户开发专用软件应用、构建数据分析与可视化工具、研发更先进的非侵入式或半侵入式技术作为补充,以及提供与 BCI 相关的伦理和安全咨询服务。

这是一份基于 Elon Musk、Neuralink 核心团队(DJ Seo, Matthew MacDougall, Bliss Chapman)以及首位人类植入者 Noland Arbaugh 对话深度提炼的行业研究报告。


🧠 脑机接口(BCI)的“登月时刻”:Neuralink 技术路线与人类进化推演

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:Neuralink 刚刚完成首例人类植入(Noland Arbaugh),并成功应对了初期出现的“电极线收缩”技术挑战。本次对话集合了创始人、核心工程师、主刀医生及患者,旨在向公众交代首个临床试验(PRIME Study)的阶段性成果。
  • 核心论点:脑机接口不仅仅是针对脊髓损伤患者的医疗康复手段,它本质上是一个解决人类与 AI 通信带宽不对称(Bandwidth Bottleneck)的底层工程方案。Neuralink 认为人类已是拥有手机/电脑作为三级皮质层的“半机械人”,当前的瓶颈在于 I/O(输入/输出)速度太慢。通过高带宽电极和垂直集成的自动化手术机器人,Neuralink 试图将人类进化为能够与 AI 共生的新物种,通过提升“比特率”(BPS)来防止人类在 AI 时代沦为像“树木”一样沟通迟缓的生物。

2. 🧠 深度观点解析 (Deep Dive Analysis)

2.1 通信带宽:从比特率到意识的“有效压缩”

Elon Musk 提出一个关键度量衡:比特率(Bits Per Second, BPS)。

  • 原理解构:人类语言是极度低效的压缩策略。虽然思维极其复杂,但通过声带或手指输出的速率平均每天不足 1 BPS。AI 正在以每秒万亿比特的速度演进,如果人类不能大幅提升输出带宽,将无法实现“人类意志”对 AI 的有效对齐。
  • 技术演进:Neuralink 的初期目标是超越物理手的输入速度。P1 患者 Noland 目前已达到 8.5 BPS(打破了非 Neuralink 设备的 4.6 纪录),而团队的中期目标是 100 甚至 1000 BPS,最终实现兆比特量级的数据交互。
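
为了更直观地理解上述 BPS 里程碑,下面的小段代码把它们换算成“等效词/分钟”;“每词约 10 bit”只是本文的示意性假设,仅用于量级比较,并非原文数据。

```python
# 把 BPS 里程碑粗略换算成“等效词/分钟”,便于直观比较。
BITS_PER_WORD = 10  # 假设:考虑冗余后每词约 10 bit

milestones = [
    ("既有纪录", 4.6),
    ("Noland (P1)", 8.5),
    ("中期目标", 1_000),
    ("远期目标(兆比特)", 1_000_000),
]

for name, bps in milestones:
    wpm = bps * 60 / BITS_PER_WORD
    print(f"{name}: {bps:,.1f} bit/s ≈ {wpm:,.0f} 词/分钟")
```

按该假设,Noland 当前的 8.5 BPS 大约相当于 51 词/分钟,已接近普通人的打字速度;而“兆比特”量级则远超任何依赖肢体的输出方式。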

2.2 硬件创新:柔性丝(Threads)与垂直集成

Neuralink 摒弃了学术界常用的硬性“犹他阵列(Utah Array)”,采用了创新的柔性材料。

  • 原理解构:大脑在颅内是不断搏动和位移的。硬性电极会引发免疫反应,导致胶质瘢痕包裹电极(电极中毒),最终失去信号。Neuralink 的柔性丝(16微米宽)比头发细得多,能随脑组织移动。
  • 证据案例:DJ Seo 展示的组织学图像显示,神经元可以紧贴着 Neuralink 电极生长而无炎症。为了植入这种极细的柔性丝,团队专门研发了 R1 手术机器人,利用计算机视觉以微米级精度实时避开所有血管,实现“非缝合式”精准植入(血管规避思路的玩具示例见本节末尾的代码草图)。
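
下面是与“计算机视觉规避血管”同类思路的一个玩具示例(并非 R1 的实际算法):在一张合成的二值血管掩膜上,用距离变换筛选出与任何血管都保持最小安全距离的候选插入点。像素尺度与安全距离均为本文假设。

```python
# 玩具示例(非 R1 实际算法):用距离变换筛选远离血管的候选插入点。
import numpy as np
from scipy.ndimage import distance_transform_edt

rng = np.random.default_rng(0)
UM_PER_PIXEL = 10          # 假设:每像素 10 微米
SAFE_UM = 50               # 假设:与血管至少保持 50 微米

# 合成一张 200x200 的“皮层表面”图,随机画几条水平血管(True 表示血管)
field = np.zeros((200, 200), dtype=bool)
for _ in range(5):
    row = rng.integers(20, 180)
    field[row - 1: row + 2, :] = True            # 3 像素宽的血管

# 距离变换:每个像素到最近血管的距离(单位:像素)
dist_px = distance_transform_edt(~field)
safe_mask = dist_px * UM_PER_PIXEL >= SAFE_UM

candidates = np.argwhere(safe_mask)
print(f"安全候选插入点数量: {len(candidates)} / {field.size}")
```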

2.3 解码哲学:UX 即功能(UX is how it works)

软件负责人 Bliss Chapman 提出一个深刻的软件工程观点:BCI 的挑战在于标注(Labeling)问题

  • 原理解构:在瘫痪患者身上,我们无法获得其意图的“地面真值(Ground Truth)”。解码器的训练必须跨越从“尝试移动”到“想象移动”的转变。Neuralink 发现,最好的解码模型不是拟合物理运动,而是拟合“意图”。
  • 创新路径:Neuralink 引入了“磁性目标(Magnetic Targets)”和“快速滚动(Quick Scroll)”等交互设计。这证明了 BCI 不只是信号处理,更是一场关于反馈回路的交互革命。
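
下面用几行代码示意“磁性目标”这类辅助逻辑的一种可能形态(并非 Neuralink 的实际实现):当光标进入某个目标的吸附半径内时,在解码出的速度上叠加一个指向目标、随距离衰减的吸引分量。半径与增益均为示意性假设。

```python
# 玩具示例(非 Neuralink 实际实现):“磁性目标”式光标吸附。
import numpy as np

ATTRACT_RADIUS = 60.0      # 假设:吸附半径(像素)
ATTRACT_GAIN = 0.3         # 假设:吸引强度

def assist(cursor, decoded_vel, targets):
    """cursor/targets 为像素坐标;decoded_vel 为解码出的原始速度向量。"""
    d = targets - cursor                     # 到各目标的位移 (N, 2)
    dist = np.linalg.norm(d, axis=1)
    i = int(np.argmin(dist))
    if 1e-6 < dist[i] < ATTRACT_RADIUS:
        pull = ATTRACT_GAIN * (1 - dist[i] / ATTRACT_RADIUS) * d[i] / dist[i]
        return decoded_vel + pull            # 越靠近目标,吸引分量越强
    return decoded_vel

# 用法示意
cursor = np.array([100.0, 100.0])
targets = np.array([[140.0, 110.0], [400.0, 300.0]])
print(assist(cursor, np.array([1.0, 0.0]), targets))
```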

2.4 医疗天梯图:从运动恢复到全感官增强

Neuralink 披露了其产品路径图(Tech Tree):

  • 第一阶段(Telepathy):恢复运动控制,实现数字自主(上网、办公、电竞)。
  • 第二阶段(Blindsight):针对盲人,通过直接刺激视觉皮层实现视觉重建,甚至包括红外、紫外等超人类波长。
  • 远期阶段:修复抑郁症、精神分裂症、记忆恢复,最终实现记忆备份与云端下载。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识(意识非魔术):主刀医生 MacDougall 认为意识并不神秘,它只是大脑感知自身运行的触觉映射(Feeling your brain work)。这种去魅化的视角使得“数字化意识”在工程上变得可量化。
  • 盲点与局限(被低估的脑动量):P1 患者出现的电极线脱落现象,是因为团队低估了人类大脑相对于动物大脑的位移量(由于人类颅骨空间更大,脑组织晃动更剧烈)。这是从实验室走向临床必须支付的“认知税”。
  • 未解之谜(深度区域进入):目前的 Neuralink 仅作用于皮质表面。如何在高安全性的前提下,像 R1 机器人处理皮层那样进入大脑深区(丘脑等),目前仍缺乏高带宽、无损的工程方案。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. Elon Musk: “If the AI can communicate at terabits per second, and you’re communicating at bits per second, it’s like talking to a tree.”
    • 意译:如果 AI 能以每秒万亿比特的速度交流,而你只能每秒输出几个比特,那对 AI 来说,跟你说话就像跟一棵树说话一样。(语境:讨论人类为何需要高带宽 BCI 来实现 AI 对齐。)
  2. Matthew MacDougall: “Every neurosurgeon carries with them a private graveyard.”
    • 意译:每位神经外科医生内心都背负着一座私人墓地。(语境:讨论在面对无法医治的脑损伤患者时的无力感,以及 Neuralink 的使命。)
  3. Noland Arbaugh (P1): “The cursor just shot over. It was wild… It moves before I’m actually intending it to.”
    • 意译:光标直接飞了过去,太疯狂了。它甚至在我意识到意图之前就移动了。(语境:描述从“尝试移动”到“直接意图控制”的神奇飞跃。)
  4. Bliss Chapman: “UX is not something that you can always solve by constant iterating… Sometimes you really need to step back and think globally.”
    • 意译:用户体验不是靠不断迭代就能解决的,有时你必须退后一步,从全局视角审视。(语境:讨论如何为脑控设备设计全新的交互范式。)

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年):脑机接口将改变辅助技术的竞争格局。Neuralink 的“电竞级”低延迟(22ms)将使瘫痪患者在数字竞技中不仅是参与者,更可能由于绕过了物理神经传导延迟而成为顶级玩家。
  • 长期终局 (5-10年)智能手机的消亡。如果 BCI 能实现隐形的文字输入(想象拼写)和骨传导音频输入,人类将不再需要手持屏幕。这种无感交互将彻底重塑社交形态和教育评价体系(如考试将失去意义)。
  • 行动建议
    • 开发者:应开始关注“非物理输入驱动”的应用开发,BCI 的高维度控制(22个自由度以上)需要全新的软件 UI 框架。
    • 投资者:关注高带宽神经信号处理芯片(ASIC)和生物兼容性封装材料,这是从医疗器械转型为消费电子的核心壁垒。
    • 生物学者:研究大脑的“自适应性”,Neuralink 的成功证明了大脑可以迅速学会将意图直接映射到数字光标,这种神经可塑性是未来所有增强技术的基石。

逐字稿

Introduction

Lex Fridman (00:00:00) The following is a conversation with Elon Musk, DJ Seo, Matthew MacDougall, Bliss Chapman, and Noland Arbaugh about Neuralink and the future of humanity. Elon, DJ, Matthew and Bliss are of course part of the amazing Neuralink team, and Noland is the first human to have a Neuralink device implanted in his brain. I speak with each of them individually, so use timestamps to jump around, or as I recommend, go hardcore, and listen to the whole thing. This is the longest podcast I’ve ever done. It’s a fascinating, super technical, and wide-ranging conversation, and I loved every minute of it. And now, dear friends, here’s Elon Musk, his fifth time on this, the Lex Fridman podcast,

Elon Musk (00:00:49) Drinking coffee or water?

Lex Fridman (00:00:51) Water. I’m so over-caffeinated right now. Do you want some caffeine?

Lex Fridman (00:00:59) There’s a Nitro drink.

Elon Musk (00:01:02) This is supposed to keep you up until, like, tomorrow afternoon, basically.

Lex Fridman (00:01:08) Yeah. Yeah. I don’t want to [inaudible 00:01:11].

Elon Musk (00:01:11) So what is Nitro? It’s just got a lot of caffeine or something?

Lex Fridman (00:01:13) Don’t ask questions. It’s called Nitro. Do you need to know anything else?

Elon Musk (00:01:17) It’s got nitrogen in it. That’s ridiculous. What we breathe is 78% nitrogen anyway. What do you need to add more for?

Elon Musk (00:01:24) Unfortunately, you’re going to eat it.

Elon Musk (00:01:29) Most people think that they’re breathing oxygen and they’re actually breathing 78% nitrogen. You need like a milk bar, like from Clockwork Orange.

Lex Fridman (00:01:41) Yeah. Yeah. Is that the top three Kubrick film for you?

Elon Musk (00:01:44) Clockwork Orange? It’s pretty good. It’s demented. Jarring, I’d say.

Lex Fridman (00:01:49) Okay. Okay. So, first, let’s step back, and big congrats on getting Neuralink implanted into a human. That’s a historic step for Neuralink.

Lex Fridman (00:02:04) And there’s many more to come.

Elon Musk (00:02:07) Yeah. And we just obviously have our second implant as well.

Elon Musk (00:02:12) So far, so good. It looks like we’ve got, I think, on the order of 400 electrodes that are providing signals.

Lex Fridman (00:02:24) How quickly do you think the number of human participants will scale?

Elon Musk (00:02:28) It depends somewhat on the regulatory approval, the rate at which we get regulatory approvals. So, we’re hoping to do 10 by the end of this year, total of 10. So, eight more.

Lex Fridman (00:02:42) And with each one, you’re going to be learning a lot of lessons about the neurobiology of the brain, everything. The whole chain of the Neuralink, the decoding, the signal processing, all that kind of stuff.

Elon Musk (00:02:54) Yeah. Yeah. I think it’s obviously going to get better with each one. I don’t want to jinx it, but it seems to have gone extremely well with the second implant. So, there’s a lot of signal, a lot of electrodes. It’s working very well.

Lex Fridman (00:03:09) What improvements do you think we’ll see in Neuralink in the coming, let’s say, let’s get crazy, the coming years.

Elon Musk (00:03:18) In years, it’s going to be gigantic, because we’ll increase the number of electrodes dramatically. We’ll improve the signal processing. So, even with only roughly, I don’t know, 10, 15% of the electrodes working with Noland, with our first patient, we were able to achieve a bits-per-second rate that’s twice the world record. So, I think we’ll start vastly exceeding the world record by orders of magnitude in the years to come. So, start getting to, I don’t know, 100 bits per second, a thousand. Maybe five years from now, we might be at a megabit, faster than any human could possibly communicate by typing or speaking.

Telepathy

Lex Fridman (00:04:06) Yeah. That BPS is an interesting metric to measure. There might be a big leap in the experience once you reach a certain level of BPS.

Lex Fridman (00:04:17) Like entire new ways of interacting with a computer might be unlocked.

Lex Fridman (00:04:22) With other humans.

Elon Musk (00:04:23) Provided they have a Neuralink, too.

Elon Musk (00:04:28) Otherwise they won’t be able to absorb the signals fast enough.

Lex Fridman (00:04:31) Do you think they’ll improve the quality of intellectual discourse?

Elon Musk (00:04:34) Well, I think you could think of it, if you were to slow down communication, how do you feel about that? If you’d only talk at, let’s say one-tenth of normal speed, you’d be like, “Wow, that’s agonizingly slow.”

Elon Musk (00:04:51) So, now imagine you could communicate clearly at 10, or 100, or 1,000 times faster than normal.

Lex Fridman (00:05:00) Listen, I’m pretty sure nobody in their right mind listens to me at 1X. They listen at 2X. I can only imagine what 10X would feel like, or whether I could actually understand it.

Elon Musk (00:05:14) I usually default to 1.5X. You can do 2X. Well actually, if I’m listening to somebody get to… in 15, 20 minutes, I want to go to sleep, then I’ll do it 1.5X. If I’m paying attention, I’ll do 2X.

Elon Musk (00:05:32) But actually, if you actually listen to podcasts, or audiobooks or anything at… If you get used to doing it at 1.5, then one sounds painfully slow.

Lex Fridman (00:05:43) I’m still holding onto one, because I’m afraid, I’m afraid of myself becoming bored with the reality, with the real world, where everyone’s speaking in 1X.

Elon Musk (00:05:53) Well, it depends on the person. You can speak very fast. Like we can communicate very quickly. And also, if you use a wide range of… if your vocabulary is larger, your effective bit rate is higher.

Lex Fridman (00:06:06) That’s a good way to put it.

Lex Fridman (00:06:07) The effective bit rate. That is the question, is how much information is actually compressed in the low bit transfer of language?

Elon Musk (00:06:15) Yeah. If there’s a single word that is able to convey something that would normally require, I don’t know, 10 simple words, then you’ve got maybe a 10X compression on your hands. And that’s really like with memes. Memes are like data compression. You’re simultaneously hit with a wide range of symbols that you can interpret, and you get it faster than if it were words, or a simple picture.

Lex Fridman (00:06:49) And of course, you’re referring to memes broadly like ideas.

Elon Musk (00:06:52) Yeah. There’s an entire idea structure that is like an idea template, and then you can add something to that idea template. But somebody has that pre-existing idea template in their head. So, when you add that incremental bit of information, you’re conveying much more than if you just said a few words. It’s everything associated with that meme.

Lex Fridman (00:07:15) You think there’ll be emergent leaps of capability as you scale the number of electrodes?

Lex Fridman (00:07:19) Do you think there’ll be an actual number where just the human experience will be altered?

Lex Fridman (00:07:27) What do you think that number might be? Whether electrodes, or BPS? We of course, don’t know for sure, but is this 10,000, 100,000?

Elon Musk (00:07:37) Yeah. Certainly, if you’re anywhere at 10,000 bits per second, that’s vastly faster than any human can communicate right now. If you think what is the average bits per second of a human, it is less than one bit per second over the course of a day. Because there are 86,400 seconds in a day, and you don’t communicate 86,400 tokens in a day. Therefore, your bits per second is less than one, averaged over 24 hours. It’s quite slow.

(00:08:04) And now, even if you’re communicating very quickly, and you’re talking to somebody who understands what you’re saying, because in order to communicate, you have to at least to some degree, model the mind state of the person to whom you’re speaking. Then take the concept you’re trying to convey, compress that into a small number of syllables, speak them, and hope that the other person decompresses them into a conceptual structure that is as close to what you have in your mind as possible.

Lex Fridman (00:08:34) Yeah. There’s a lot of signal loss there in that process.

Elon Musk (00:08:37) Yeah. Very lossy compression and decompression. And a lot of what your neurons are doing is distilling the concepts down to a small number of symbols, or say syllables that I’m speaking, or keystrokes, whatever the case may be. So, that’s a lot of what your brain computation is doing. Now, there is an argument that that’s actually a healthy thing to do, or a helpful thing to do because as you try to compress complex concepts, you’re perhaps forced to distill what is most essential in those concepts, as opposed to just all the fluff. So, in the process of compression, you distill things down to what matters the most, because you can only say a few things.

(00:09:27) So that is perhaps helpful. I think we’ll probably get… If our data rate increases, it’s highly probable it will become far more verbose. Just like your computer, when computers had… My first computer had 8K of RAM, so you really thought about every byte. And now you’ve got computers with many gigabytes of RAM. So, if you want to do an iPhone app that just says, “Hello world,” it’s probably, I don’t know, several megabytes minimum, a bunch of fluff. But nonetheless, we still prefer to have the computer with the more memory and more compute.

(00:10:09) So, the long-term aspiration of Neuralink is to improve the AI human symbiosis by increasing the bandwidth of the communication. Because even if… In the most benign scenario of AI, you have to consider that the AI is simply going to get bored waiting for you to spit out a few words. If the AI can communicate at terabits per second, and you’re communicating at bits per second, it’s like talking to a tree.

Power of human mind

Lex Fridman (00:10:45) Well, it is a very interesting question for a super intelligent species, what use are humans?

Elon Musk (00:10:54) I think there is some argument for humans as a source of will.

Elon Musk (00:11:00) Will, yeah. Source of will, or purpose. So if you consider the human mind as being… Essentially there’s the primitive, limbic elements, which basically even reptiles have. And there’s the cortex, the thinking and planning part of the brain. Now, the cortex is much smarter than the limbic system, and yet is largely in service to the limbic system. It’s trying to make the limbic system happy. The sheer amount of compute that’s gone into people trying to get laid is insane, without actually seeking procreation. They’re just literally trying to do this simple motion, and they get a kick out of it. So, this simple, which in the abstract, rather absurd motion, which is sex, the cortex is putting a massive amount of compute into trying to figure out how to do that.

Lex Fridman (00:11:55) So like 90% of distributed compute of the human species is spent on trying to get laid, probably. A large percentage.

Elon Musk (00:12:00) A massive amount. Yes. Yeah. Yeah. There’s no purpose to most sex except hedonistic. It’s a sort of joy, or whatever, dopamine release. Now, once in a while, it’s procreation, but for modern humans, it’s mostly recreational. And so, your cortex, much smarter than your limbic system, is trying to make the limbic system happy, because the limbic system wants to have sex, or wants some tasty food, or whatever the case may be.

(00:12:31) And then that is then further augmented by the tertiary system, which is your phone, your laptop, iPad, whatever, all your computing stuff. That’s your tertiary layer. So, you’re actually already a cyborg. You have this tertiary compute layer, which is in the form of your computer with all the applications, or your compute devices. And so, in the getting laid front, there’s actually a massive amount of digital compute also trying to get laid, with Tinder and whatever.

Lex Fridman (00:13:04) Yeah. So, the compute that we humans have built is also participating.

Elon Musk (00:13:09) Yeah. There’s like gigawatts of compute going into getting laid, of digital compute.

Lex Fridman (00:13:14) Yeah. What if AGI was-

Elon Musk (00:13:17) This is happening as we speak.

Lex Fridman (00:13:19) … if we merge with AI, it’s just going to expand the compute that we humans use-

Lex Fridman (00:13:24) … to try to get laid.

Elon Musk (00:13:25) Well, it’s one of the things. Certainly, yeah.

Elon Musk (00:13:29) But what I’m saying is that, yes, is there a use for humans? Well, there’s this fundamental question of what’s the meaning of life? Why do anything at all? And so, if our simple limbic system provides a source of will to do something, that then goes through our cortex, that then goes to our tertiary compute layer, then I don’t know, it might actually be that the AI, in a benign scenario, is simply trying to make the human limbic system happy.

Lex Fridman (00:14:03) Yeah. It seems like the will is not just about the limbic system. There’s a lot of interesting, complicated things in there. We also want power.

Elon Musk (00:14:11) That’s limbic too, I think.

Lex Fridman (00:14:13) But then we also want to, in a kind of cooperative way, alleviate the suffering in the world.

Elon Musk (00:14:19) Not everybody does. But yeah, sure, some people do.

Lex Fridman (00:14:22) As a group of humans, when we get together, we start to have this kind of collective intelligence that is more complex in its will than the underlying individual descendants of apes, right?

Lex Fridman (00:14:37) So there’s other motivations, and that could be a really interesting source of an objective function for AGI?

Elon Musk (00:14:45) Yeah. There are these fairly cerebral, or higher level goals. For me, it’s like, what’s the meaning of life, or understanding the nature of the universe, is of great interest to me, and hopefully to the AI. And that’s the mission of xAI and Grok is understand the universe.

Lex Fridman (00:15:13) So do you think people… When you have a Neuralink with 10,000, 100,000 channels, most of the use cases will be communication with AI systems?

Elon Musk (00:15:27) Well, assuming that there are not… They’re solving basic neurological issues that people have. If they’ve got damaged neurons in their spinal cord, or neck, as is the case with our first two patients, then obviously the first order of business is solving fundamental neuron damage in a spinal cord, neck, or in the brain itself. So, our second product is called Blindsight, which is to enable people who are completely blind, lost both eyes, or optic nerve, or just can’t see at all, to be able to see by directly triggering the neurons in the visual cortex.

(00:16:18) So we’re just starting at the basics here, so it’s the simple stuff, relatively speaking, is solving neuron damage. It can also solve I think probably schizophrenia, if people have seizures of some kind, it could probably solve that. It could help with memory. So, there’s kind of a tech tree, if you will. You’ve got the basics. You need literacy before you can have Lord of the Rings.

Elon Musk (00:17:02) So, do you have letters and the alphabet? Okay, great. Words? And then eventually you get sagas. So, I think there may be some things to worry about in the future, but the first several years are really just solving basic neurological damage, like for people who have essentially complete or near complete loss of connection from the brain to the body, like Stephen Hawking would be an example, the Neuralink would be incredibly profound, because you can imagine if Stephen Hawking could communicate as fast as we’re communicating, perhaps faster. And that’s certainly possible. Probable, in fact. Likely, I’d say.

Lex Fridman (00:17:46) So there’s a kind of dual track of medical and non-medical, meaning so everything you’ve talked about could be applied to people who are non-disabled in the future?

Elon Musk (00:17:58) The logical thing to do is… Sensible thing to do is to start off solving basic neuron damage issues.

Elon Musk (00:18:11) Because there’s obviously some risk with a new device. You can’t get the risk down to zero, it’s not possible. So, you want to have the highest possible reward, given there’s a certain irreducible risk. And if somebody’s able to have a profound improvement in their communication, that’s worth the risk.

Lex Fridman (00:18:34) As you get the risk down.

Elon Musk (00:18:36) Yeah. As you get the risk down. And once the risk is down to… If you have thousands of people that have been using it for years and the risk is minimal, then perhaps at that point you could consider saying, “Okay, let’s aim for augmentation.” Now, I think we’re actually going to aim for augmentation with people who have neuron damage. So we’re not just aiming to give people the communication data rate equivalent to normal humans. We’re aiming to give people who have… A quadriplegic, or maybe have complete loss of the connection to the brain and body, a communication data rate that exceeds normal humans. While we’re in there, why not? Let’s give people superpowers.

Lex Fridman (00:19:20) And the same for vision. As you restore vision, there could be aspects of that restoration that are superhuman.

Elon Musk (00:19:27) Yeah. At first, the vision restoration will be low res, because you have to say, “How many neurons can you put in there, and trigger?” And you can do things where you adjust the electric field. So, even if you’ve got, say 10,000 neurons, it’s not just 10,000 pixels, because you can adjust the field between the neurons, and do them in patterns in order to have say, 10,000 electrodes, effectively give you, I don’t know, maybe like having a megapixel, or a 10 megapixel situation. And then over time, I think you get to higher resolution than human eyes. And you could also see in different wavelengths. So, like Geordi La Forge from Star Trek, he had the thing. Do you want to see it in radar? No problem. You could see ultraviolet, infrared, eagle vision, whatever you want.

Ayahuasca

Lex Fridman (00:20:28) Do you think there’ll be… let me ask a Joe Rogan question. Do you think there’ll be… I just recently have taken ayahuasca.

Elon Musk (00:20:35) Is that a serious question?

Elon Musk (00:20:39) Well, I guess technically it is.

Elon Musk (00:20:42) Yeah, is this DMT in there, or something?

Lex Fridman (00:20:42) Love you, Joe. Okay.

Elon Musk (00:20:48) Wait, wait. Have you said much about it, the ayahuasca stuff?

Lex Fridman (00:20:48) I have not. I have not. I have not.

Elon Musk (00:20:53) Okay. Well, why don’t you spill the beans?

Lex Fridman (00:20:55) It is a truly incredible experience.

Elon Musk (00:20:57) Let me turn the tables on you.

Elon Musk (00:21:00) You’re in the jungle.

Lex Fridman (00:21:02) Yeah, amongst the trees, myself and a shaman.

Elon Musk (00:21:02) Yeah. It must’ve been crazy.

Lex Fridman (00:21:05) Yeah, yeah, yeah. With the insects, with the animals all around you, the jungle as far as the eye can see, there’s no… That’s the way to do it.

Elon Musk (00:21:13) Things are going to look pretty wild.

Lex Fridman (00:21:14) Yeah, pretty wild. I took an extremely high dose.

Elon Musk (00:21:19) Just don’t go hugging an Anaconda or something.

Lex Fridman (00:21:24) You haven’t lived unless you made love to an Anaconda. I’m sorry, but…

Lex Fridman (00:21:33) Yeah. I took an extremely high dose.

Elon Musk (00:21:39) Damn. Okay. That sounds like a lot. Is normal to just one cup? Or…

Lex Fridman (00:21:42) One or two. Usually one.

Elon Musk (00:21:46) Okay. Wait. Like right off the bat, or did you work your way up to it? Did you just jump in at the deep end?

Lex Fridman (00:21:53) Across two days, because the first day, I took two, and it was a ride, but it wasn’t quite like a…

Elon Musk (00:21:59) It wasn’t like a revelation.

Lex Fridman (00:22:01) It wasn’t into deep space type of ride. It was just like a little airplane ride. And I [inaudible 00:22:07] saw some trees, and some visuals, and just saw a dragon and all that kind of stuff. But…

Elon Musk (00:22:13) Nine cups, you went to Pluto, I think.

Lex Fridman (00:22:15) Pluto. Yeah. No, Deep space.

Lex Fridman (00:22:19) One of the interesting aspects of my experience is I thought I would have some demons, some stuff to work through.

Elon Musk (00:22:24) That’s what people [inaudible 00:22:26].

Lex Fridman (00:22:26) That’s what everyone says.

Elon Musk (00:22:27) That’s what everyone says. Yeah, exactly.

Lex Fridman (00:22:29) I had nothing. I had all positive. I just… So full-

Lex Fridman (00:22:32) I don’t think so. I don’t know. But I kept thinking about, I had extremely high resolution thoughts about the people I know in my life. You were there, and it is just not from my relationship with that person, but just as the person themselves. I had just this deep gratitude of who they are.

Lex Fridman (00:22:53) It was just like this exploration, like Sims, or whatever. You get to watch them. I got to watch people, and just be in awe of how amazing they are.

Elon Musk (00:23:02) That sounds awesome.

Lex Fridman (00:23:02) Yeah, it was great. I was waiting for-

Elon Musk (00:23:05) When’s the demon coming?

Lex Fridman (00:23:07) Exactly. Maybe I’ll have some negative thoughts. Nothing. Nothing. Just extreme gratitude for them. And also a lot of space travel.

Elon Musk (00:23:18) Space travel to where?

Lex Fridman (00:23:20) So here’s what it was. It was people, the human beings that I know, they had this kind of… The best way I could describe it is they had a glow to them.

Lex Fridman (00:23:30) And then I kept flying out from them to see earth, to see our solar system, to see our galaxy. And I saw that light, that glow all across the universe, whatever that form is, whatever that…

Elon Musk (00:23:49) Did you go past the Milky Way?

Elon Musk (00:23:53) Okay. You’re like intergalactic.

Lex Fridman (00:23:54) Yeah, intergalactic.

Lex Fridman (00:23:56) But always pointing in, yeah. Past the Milky Way, past… I mean, I saw a huge number of galaxies, intergalactic, and all of it was glowing, but I couldn’t control that travel, because I would actually explore near distances to the solar system, see if there’s aliens, or any of that kind of stuff.

Elon Musk (00:23:56) Sure. Did you see an alien?

Lex Fridman (00:24:16) Implication of aliens, because they were glowing. They were glowing in the same way that humans were glowing. That life force that I was seeing, the thing that made humans amazing was there throughout the universe. There was these glowing dots. So, I don’t know. It made me feel like there is life… No, not life, but something, whatever makes humans amazing all throughout the universe.

Lex Fridman (00:24:42) Yeah, it was amazing. No demons. No demons. I looked for the demons. There’s no demons. There were dragons, and they’re pretty awesome. So the thing about trees-

Elon Musk (00:24:50) Was there anything scary at all?

Lex Fridman (00:24:54) Dragons. But they weren’t scary. They were friends. They were protective. So, the thing is-

Elon Musk (00:24:57) Sure. Like Puff the Magic Dragon.

Lex Fridman (00:24:58) No, it was more like a Game of Thrones kind of dragons. They weren’t very friendly. They were very big. So the thing is that, about giant trees at night, which is where I was-

Elon Musk (00:25:09) Yeah. I mean, the jungle’s kind of scary.

Lex Fridman (00:25:10) Yeah. The trees started to look like dragons, and they were all looking at me.

Lex Fridman (00:25:17) And it didn’t seem scary. They seemed like they were protecting me. And the shaman and the people didn’t speak any English, by the way, which made it even scarier, because we’re not even… We’re worlds apart in many ways, but yeah, they talk about the mother of the forest protecting you, and that’s what I felt like.

Elon Musk (00:25:39) And you were way out in the jungle.

Lex Fridman (00:25:40) Way out. This is not like a tourist retreat.

Elon Musk (00:25:45) Like 10 miles outside of Rio or something.

Lex Fridman (00:25:47) No, we went… No, this is not a-

Elon Musk (00:25:50) You’re in deep Amazon.

Lex Fridman (00:25:52) Me and this guy named Paul Rosolie, who basically is a Tarzan, he lives in the jungle, we went out deep and we just went crazy.

Lex Fridman (00:26:01) Yeah. So anyway. Can I get that same experience in a Neuralink?

Lex Fridman (00:26:05) I guess that is the question for non-disabled people. Do you think that there’s a lot in our perception, in our experience of the world that could be explored, that could be played with, using Neuralink?

Elon Musk (00:26:18) Yeah, I mean, Neuralink, it’s really a generalized input-output device. It’s reading electrical signals, and generating electrical signals, and I mean, everything that you’ve ever experienced in your whole life, smell, emotions, all of those are electrical signals. So, it’s kind of weird to think that your entire life experience is distilled down to electrical signals for neurons, but that is in fact the case. Or I mean, that’s at least what all the evidence points to. So, I mean, if you trigger the right neuron, you could trigger a particular scent. You could certainly make things glow. I mean, do pretty much anything. I mean, really, you can think of the brain as a biological computer. So, if there are certain say, chips or elements of that biological computer that are broken, let’s say your ability to… If you’ve got a stroke, that if you’ve had a stroke, that means some part of your brain is damaged. Let’s say it’s speech generation, or the ability to move your left hand. That’s the kind of thing that a Neuralink could solve.

(00:27:33) If you’ve got a massive amount of memory loss that’s just gone, well, we can’t get the memories back. We could restore your ability to make memories, but we can’t restore memories that are fully gone. Now, I should say, maybe if part of the memory is there, and the means of accessing the memory is the part that’s broken, then we could re-enable the ability to access the memory. But you can think of it like RAM in a computer, if the RAM is destroyed, or your SD card is destroyed, we can’t get that back. But if the connection to the SD card is destroyed, we can fix that. If it is fixable physically, then it can be fixed.

Lex Fridman (00:28:22) Of course, with AI, just like you can repair photographs, and fill in missing parts of photographs, maybe you can do the same, just like [inaudible 00:28:31] parts.

Elon Musk (00:28:30) Yeah, you could say like, create the most probable set of memories based on all the information you have about that person. You could then… It would be probabilistic restoration of memory. Now, we’re getting pretty esoteric here.

Lex Fridman (00:28:46) But that is one of the most beautiful aspects of the human experience is remembering the good memories.

Lex Fridman (00:28:53) We live most of our life, as Danny Kahneman has talked about, in our memories, not in the actual moment. We’re collecting memories and we kind of relive them in our head. And that’s the good times. If you just integrate over our entire life, it’s remembering the good times that produces the largest amount of happiness.

Elon Musk (00:29:11) Yeah. Well, I mean, what are we but our memories? And what is death? But the loss of memory, loss of information? If you could say, well, if you could run a thought experiment, what if you were disintegrated painlessly, and then reintegrated a moment later, like teleportation, I guess? Provided there’s no information loss, the fact that your one body was disintegrated is irrelevant.

Lex Fridman (00:29:39) And memories is just such a huge part of that.

Elon Musk (00:29:43) Death is fundamentally the loss of information, the loss of memory.

Lex Fridman (00:29:49) So, if we can store them as accurately as possible, we basically achieve a kind of immortality.

Merging with AI

Lex Fridman (00:29:57) You’ve talked about the threats, the safety concerns of AI. Let’s look at long-term visions. Do you think Neuralink is, in your view, the best current approach we have for AI safety?

Elon Musk (00:30:13) It’s an idea that may help with AI safety. Certainly, I wouldn’t want to claim it’s some panacea, or that it’s a sure thing, but I mean, many years ago I was thinking like, “Well, what would inhibit alignment of collective human will with artificial intelligence?” And the low data rate of humans, especially our slow output rate would necessarily, just because the communication is so slow, would diminish the link between humans and computers. The more you are a tree, the less you know what the tree is. Let’s say you look at this plant or whatever, and hey, I’d really like to make that plant happy, but it’s not saying a lot.

Lex Fridman (00:31:11) So the more we increase the data rate that humans can intake and output, then that means the better, the higher the chance we have in a world full of AGI’s.

Elon Musk (00:31:21) Yeah. We could better align collective human will with AI if the output rate especially was dramatically increased. And I think there’s potential to increase the output rate by, I don’t know, three, maybe six, maybe more orders of magnitude. So, it’s better than the current situation.

Lex Fridman (00:31:41) And that output rate would be by increasing the number of electrodes, number of channels, and also maybe implanting multiple Neuralinks?

Lex Fridman (00:31:51) Do you think there’ll be a world in the next couple of decades where it’s hundreds of millions of people have Neuralinks?

Lex Fridman (00:32:02) You think when people just when they see the capabilities, the superhuman capabilities that are possible, and then the safety is demonstrated.

Elon Musk (00:32:11) Yeah. If it’s extremely safe, and you can have superhuman abilities, and let’s say you can upload your memories, so you wouldn’t lose memories, then I think probably a lot of people would choose to have it. It would supersede the cell phone, for example. I mean, the biggest problem that say, a phone has, is trying to figure out what you want. That’s why you’ve got auto complete, and you’ve got output, which is all the pixels on the screen, but from the perspective of the human, the output is so frigging slow. Desktop or phone is desperately just trying to understand what you want. And there’s an eternity between every keystroke from a computer standpoint.

Lex Fridman (00:33:06) Yeah. Yeah. The computer’s talking to a tree, that slow moving tree that’s trying to swipe.

Elon Musk (00:33:12) Yeah. So, if you had computers that are doing trillions of instructions per second, and a whole second went by, I mean, that’s a trillion things it could have done.

Lex Fridman (00:33:24) Yeah. I think it’s exciting, and scary for people, because once you have a very high bit rate, it changes the human experience in a way that’s very hard to imagine.

Elon Musk (00:33:35) Yeah. We would be something different. I mean, some sort of futuristic cyborg, I mean, we’re obviously talking about, by the way, it’s not like around the corner. You asked me what the distant future is. Maybe this is… It’s not super far away, but 10, 15 years, that kind of thing.

Lex Fridman (00:33:58) When can I get one? 10 years?

Elon Musk (00:34:02) Probably less than 10 years. It depends on what you want to do.

Lex Fridman (00:34:08) Hey, if I can get a thousand BPS?

Elon Musk (00:34:11) A thousand BPS, wow.

Lex Fridman (00:34:12) And it’s safe, and I can just interact with a computer while laying back and eating Cheetos. I don’t eat Cheetos. There’s certain aspects of human computer interaction when done more efficiently, and more enjoyably, are worth it.

Elon Musk (00:34:26) Well, we feel pretty confident that I think maybe within the next year or two, that someone with a Neuralink implant will be able to outperform a pro gamer.

Elon Musk (00:34:41) Because the reaction time would be faster.

xAI

Lex Fridman (00:34:45) I got to visit Memphis.

Lex Fridman (00:34:47) You’re going big on compute.

Lex Fridman (00:34:49) And you’ve also said, “Play to win, or don’t play at all.”

Lex Fridman (00:34:52) So what does it take to win?

Elon Musk (00:34:54) For AI, that means you’ve got to have the most powerful training compute, and the rate of improvement of training compute has to be-

Elon Musk (00:35:00) And the rate of improvement of training compute has to be faster than everyone else, or you will not win. Your AI will be worse.

Lex Fridman (00:35:10) So how can Grok, let’s say 3… That might be available, what, next year?

Elon Musk (00:35:15) Well, hopefully end of this year.

Elon Musk (00:35:17) If we’re lucky. Yeah.

Lex Fridman (00:35:20) How can that be the best LLM, the best AI system available in the world? How much of it is compute? How much of it is data? How much of it is post-training? How much of it is the product that you package it up in, all that kind of stuff?

Elon Musk (00:35:35) I mean, they all matter. It’s sort of like saying, let’s say it’s a Formula 1 race, what matters more, the car or the driver? I mean, they both matter. If a car is not fast, then if, let’s say, it’s half the horsepower of your competitors, the best driver will still lose. If it’s twice the horsepower, then probably even a mediocre driver will still win. So, the training compute is kind of like the engine, this horsepower of the engine. So, really, you want to try to do the best on that. And then, it’s how efficiently do you use that training compute, and how efficiently do you do the inference, the use of the AI? So, obviously, that comes down to human talent. And then, what unique access to data do you have? That also plays a role.

Lex Fridman (00:36:28) Do you think Twitter data will be useful?

Elon Musk (00:36:31) Yeah. I mean, I think most of the leading AI companies have already scraped all the Twitter data. Not I think. They have. So, on a go forward basis, what’s useful is the fact that it’s up to the second, because that’s hard for them to scrape in real time. So, there’s an immediacy advantage that Grok has already. I think with Tesla and the real time video coming from several million cars, ultimately tens of millions of cars with Optimus, there might be hundreds of millions of Optimus robots, maybe billions, learning a tremendous amount from the real world. That’s the biggest source of data, I think, ultimately, is Optimus, probably. Optimus is going to be the biggest source of data.

Optimus

Lex Fridman (00:37:21) Because it’s able to-

Elon Musk (00:37:22) Because reality scales. Reality scales to the scale of reality. It’s actually humbling to see how little data humans have actually been able to accumulate. Really, if you say how many trillions of usable tokens have humans generated, where on a non-duplicative… Discounting spam and repetitive stuff, it’s not a huge number. You run out pretty quickly.

Lex Fridman (00:37:54) And Optimus can go… So, Tesla cars, unfortunately, have to stay on the road.

Lex Fridman (00:38:01) Optimus robot can go anywhere. And there’s more reality off the road. And go off-road.

Elon Musk (00:38:06) Yeah. I mean, the Optimus robot can pick up the cup and see, did it pick up the cup in the right way? Did it, say, go pour water in the cup? Did the water go in the cup or not go in the cups? Did it spill water or not? Simple stuff like that. But it can do that at scale times a billion, so generate useful data from reality, so cause and effect stuff.

Lex Fridman (00:38:34) What do you think it takes to get to mass production of humanoid robots like that?

Elon Musk (00:38:40) It’s the same as cars, really. I mean, global capacity for vehicles is about 100 million a year, and it could be higher. It’s just that the demand is on the order of 100 million a year. And then, there’s roughly two billion vehicles that are in use in some way, which makes sense because the life of a vehicle is about 20 years. So, at steady state, you can have 100 million vehicles produced a year with a two billion vehicle fleet, roughly. Now, for humanoid robots, the utility is much greater. So, my guess is humanoid robots are more like at a billion plus per year.

Lex Fridman (00:39:19) But until you came along and started building Optimus, it was thought to be an extremely difficult problem.

Elon Musk (00:39:20) Well, I think it is.

Lex Fridman (00:39:26) I mean, it still is an extremely difficult problem.

Elon Musk (00:39:28) Yes. So, a walk in the park. I mean, Optimus, currently, would struggle to walk in the park. I mean, it can walk in a park. The park is not too difficult, but it will be able to walk over a wide range of terrain.

Lex Fridman (00:39:43) Yeah. And pick up objects.

Elon Musk (00:39:45) Yeah, yeah. It can already do that.

Lex Fridman (00:39:48) But all kinds of objects.

Lex Fridman (00:39:50) All foreign objects. I mean, pouring water in a cup is not trivial, because then if you don’t know anything about the container, it could be all kinds of containers.

Elon Musk (00:39:59) Yeah, there’s going to be an immense amount of engineering just going into the hand. The hand, it might be close to half of all the engineering in Optimus. From an electromechanical standpoint, the hand is probably roughly half of the engineering.

Lex Fridman (00:40:16) But so much of the intelligence of humans goes into what we do with our hands.

Lex Fridman (00:40:22) It’s the manipulation of the world, manipulation of objects in the world. Intelligent, safe manipulation of objects in the world. Yeah.

Elon Musk (00:40:28) Yeah. I mean, you start really thinking about your hand and how it works.

Lex Fridman (00:40:34) I do all the time.

Elon Musk (00:40:35) The sensory and control homunculus is where you have humongous hands. So I mean, your hands, the actuators, the muscles of your hand are almost overwhelmingly in your forearm. So, your forearm has the muscles that actually control your hand. There are a few small muscles in the hand itself, but your hand is really like a skeleton meat puppet with cables. So, the muscles that control your fingers are in your forearm, and they go through the carpal tunnel, where you’ve got a little collection of bones and a tiny tunnel that these cables, the tendons, go through, and those tendons are mostly what move your hands.

Lex Fridman (00:41:20) And something like those tendons has to be re-engineered into the Optimus in order to do all that kind of stuff.

Elon Musk (00:41:26) Yeah. So the current Optimus, we tried putting the actuators in the hand itself. Then you sort of end up having these-

Elon Musk (00:41:34) … yeah, giant hands that look weird. And then, they don’t actually have enough degrees of freedom or enough strength. So then you realize, “Oh, okay, that’s why you got to put the actuators in the forearm.” And just like a human, you’ve got to run cables through a narrow tunnel to operate the fingers. And then, there’s also a reason for not having all the fingers the same length. So, it wouldn’t be expensive from an energy or evolutionary standpoint to have all your fingers be the same length. So, why not do the same length?

Elon Musk (00:42:04) Because it’s actually better to have different lengths. Your dexterity is better if you’ve got fingers that are different lengths. There are more things you can do and your dexterity is actually better if your fingers are a different length. There’s a reason we’ve got a little finger. Why not have a little finger that’s bigger?

Elon Musk (00:42:22) Because it helps you with fine motor skills.

Lex Fridman (00:42:27) This little finger helps?

Elon Musk (00:42:28) It does. But if you lost your little finger, you’d have noticeably less dexterity.

Lex Fridman (00:42:36) So, as you’re figuring out this problem, you have to also figure out a way to do it so you can mass manufacture it, so as to be as simple as possible.

Elon Musk (00:42:42) It’s actually going to be quite complicated. The “as simple as possible” part is quite a high bar. If you want to have a humanoid robot that can do things that a human can do, actually, it’s a very high bar. So, our new arm has 22 degrees of freedom instead of 11 and has, like I said, the actuators in the forearm. And all the actuators are designed from scratch, from physics first principles. The sensors are all designed from scratch. And we’ll continue to put a tremendous amount of engineering effort into improving the hand. By hand, I mean the entire forearm, from elbow forward, is really the hand. So, that’s incredibly difficult engineering, actually. And so, the simplest possible version of a humanoid robot that can do even most, perhaps not all, of what a human can do is actually still very complicated. It’s not simple. It’s very difficult.

Elon’s approach to problem-solving

Lex Fridman (00:43:47) Can you just speak to what it takes for a great engineering team for you? What I saw in Memphis, the supercomputer cluster, is just this intense drive towards simplifying the process, understanding the process, constantly improving it, constantly iterating it.

Elon Musk (00:44:08) Well, it’s easy to say ‘simplify,’ and it’s very difficult to do it. I have this very basic first principles algorithm that I run kind of as a mantra, which is to first question the requirements, make the requirements less dumb. The requirements are always dumb to some degree. So, you want to start off by reducing the number of requirements, and no matter how smart the person is who gave you those requirements, they’re still dumb to some degree. You have to start there, because, otherwise, you could get the perfect answer to the wrong question. So, try to make the question the least wrong possible. That’s what question the requirements means.

(00:44:53) And then, the second thing is try to delete whatever the step is, the part or the process step. It sounds very obvious, but people often forget to try deleting it entirely. And if you’re not forced to put back at least 10% of what you delete, you’re not deleting enough. Somewhat illogically, people often, most of the time, feel as though they’ve succeeded if they’ve not been forced to put things back in. But, actually, they haven’t because they’ve been overly conservative and have left things in there that shouldn’t be. And only the third thing is try to optimize it or simplify it. Again, these all sound, I think, very obvious when I say them, but the number of times I’ve made these mistakes is more than I care to remember. That’s why I have this mantra. So in fact, I’d say the most common mistake of smart engineers is to optimize a thing that should not exist.

Lex Fridman (00:46:01) Right. So, like you say, you run through the algorithm and basically show up to a problem, show up to the supercomputer cluster, and see the process, and ask, “Can this be deleted?”

Elon Musk (00:46:14) Yeah. First try to delete it. Yeah.

Lex Fridman (00:46:18) Yeah. That’s not easy to do.

Elon Musk (00:46:20) No. Actually, what generally makes people uneasy is that at least some of the things that you delete, you will put back in. But going back to sort of where our limbic system can steer us wrong is that we tend to remember, with sometimes a jarring level of pain, where we deleted something that we subsequently needed. And so, people will remember that one time they forgot to put in this thing three years ago, and that caused them trouble. And so, they overcorrect, and then they put too much stuff in there and overcomplicate things. So, you actually have to say, “Look, we’re deliberately going to delete more than we should.” At least one in 10 things, we’re going to add back in.

Lex Fridman (00:47:12) I’ve seen you suggest just that, that something should be deleted, and you can kind of see the pain.

Elon Musk (00:47:18) Oh, yeah. Absolutely.

Lex Fridman (00:47:19) Everybody feels a little bit of the pain.

Elon Musk (00:47:21) Absolutely. And I tell them in advance, “Yeah, some of the things that we delete, we’re going to put back in.” People get a little shook by that, but it makes sense because if you’re so conservative as to never have to put anything back in, you obviously have a lot of stuff that isn’t needed. So, you got to overcorrect. This is, I would say, like a cortical override to a limbic instinct.

Lex Fridman (00:47:47) One of many that probably leads us astray.

Elon Musk (00:47:50) Yeah. There’s a step four as well, which is any given thing can be sped up. However fast you think it can be done, whatever the speed it’s being done, it can be done faster. But you shouldn’t speed things up until you’ve tried to delete it and optimize it. Otherwise, you’re speeding up something that shouldn’t exist, which is absurd.

(00:48:09) And then, the fifth thing is to automate it. I’ve gone backwards so many times where I’ve automated something, sped it up, simplified it, and then deleted it. And I got tired of doing that. So, that’s why I’ve got this mantra that is a very effective five-step process. It works great.

Lex Fridman (00:48:31) Well, when you’ve already automated, deleting must be real painful-

Lex Fridman (00:48:35) … as if you’ve [inaudible 00:48:36]-

Elon Musk (00:48:36) Yeah, it’s very. It’s like, “Wow, I really wasted a lot of effort there.”

Lex Fridman (00:48:40) Yeah. I mean, what you’ve done with the cluster in Memphis is incredible, just in a handful of weeks.

Elon Musk (00:48:47) Well, yeah, it’s not working yet, so I don’t want to pop the champagne corks. In fact, I have a call in a few hours with the Memphis team because we’re having some power fluctuation issues. So yeah, when you do synchronized training, when you have all these computers that are training, where the training is synchronized at the millisecond level, it’s like having an orchestra. And the orchestra can go loud to silent very quickly at subsecond level, and then, the electrical system freaks out about that. If you suddenly see giant shifts, 10, 20 megawatts several times a second, this is not what electrical systems are expecting to see.

Lex Fridman (00:49:46) So, that’s one of the main things you have to figure out, the cooling, the power. And then, on the software, as you go up the stack, how to do the distributed compute, all of that. All of that has to work.

Elon Musk (00:49:56) Yeah. So, today’s problem is dealing with extreme power jitter.

Lex Fridman (00:50:03) There’s a nice ring to that. Okay. And you stayed up late into the night, as you often do there.

Elon Musk (00:50:14) Yeah. We finally got training going at, oddly enough, roughly 4:20 a.m. last Monday.

Lex Fridman (00:50:24) Total coincidence.

Elon Musk (00:50:25) Yeah. I mean, maybe it was at 4:22 or something.

Lex Fridman (00:50:28) It’s that universe again with the jokes.

Elon Musk (00:50:29) Well, exactly. It just loves it.

Lex Fridman (00:50:31) I mean, I wonder if you could speak to the fact that one of the things that you did when I was there is you went through all the steps of what everybody’s doing, just to get a sense that you yourself understand it and everybody understands it so they can understand when something is dumb, or something is inefficient, or that kind of stuff. Can you speak to that?

Elon Musk (00:50:52) Yeah. So, look, whatever the people at the front lines are doing, I try to do it at least a few times myself. So connecting fiber optic cables, diagnosing a faulty connection. The limiting factor for large training clusters tends to be the cabling. There are so many cables. For a coherent training system, where you’ve got RDMA, remote direct memory access, the whole thing is like one giant brain. So, you’ve got any-to-any connection. So, any GPU can talk to any GPU out of 100,000. That is a crazy cable layout.

Lex Fridman (00:51:38) It looks pretty cool.

Lex Fridman (00:51:40) It’s like the human brain, but at a scale that humans can visibly see. It is a good brain.

Elon Musk (00:51:47) Yeah. But, I mean, the human brain also has… A massive amount of the brain tissue is the cables. So you’ve got the gray matter, which is the compute, and then the white matter, which is the cables. A big percentage of your brain is just cables.

Lex Fridman (00:52:01) That’s what it felt like walking around in the supercomputer center is like we’re walking around inside a brain that will one day build a super, super intelligent system. Do you think there’s a chance that xAI, that you are the one that builds AGI?

Elon Musk (00:52:22) It’s possible. What do you define as AGI?

Lex Fridman (00:52:28) I think humans will never acknowledge that AGI has been built.

Elon Musk (00:52:32) Just keep moving the goalposts?

Lex Fridman (00:52:33) Yeah. So, I think there’s already superhuman capabilities that are available in AI systems.

Lex Fridman (00:52:42) I think what AGI is is when it’s smarter than the collective intelligence of the entire human species in our [inaudible 00:52:49].

Elon Musk (00:52:49) Well, I think that, generally, people would call that ASI, artificial super intelligence. But there are these thresholds where you could say at some point the AI is smarter than any single human. And then, you’ve got eight billion humans, and actually, each human is machine augmented via their computers. So, it’s a much higher bar to compete with eight billion machine augmented humans. That’s a whole bunch of orders of magnitude more. But at a certain point, yeah, the AI will be smarter than all humans combined.

Lex Fridman (00:53:32) If you are the one to do it, do you feel the responsibility of that?

Elon Musk (00:53:35) Yeah, absolutely. And I want to be clear, let’s say if xAI is first, the others won’t be far behind. I mean, they might be six months behind, or a year, maybe. Not even that.

Lex Fridman (00:53:54) So, how do you do it in a way that doesn’t hurt humanity, do you think?

Elon Musk (00:54:00) So, I mean, I thought about AI, essentially, for a long time, and the thing that at least my biological neural net comes up with as being the most important thing is adherence to truth, whether that truth is politically correct or not. So, I think if you force AIs to lie or train them to lie, you’re really asking for trouble, even if that lie is done with good intentions. So, you saw issues with ChatGPT and Gemini and whatnot. Like, you asked Gemini for an image of the Founding Fathers of the United States, and it shows a group of diverse women. Now, that’s factually untrue.

(00:54:48) Now, that’s sort of like a silly thing, but if an AI is programmed to say diversity is a necessary output function, and it then becomes this omnipowerful intelligence, it could say, “Okay, well, diversity is now required, and if there’s not enough diversity, those who don’t fit the diversity requirements will be executed.” If it’s programmed to do that as the fundamental utility function, it’ll do whatever it takes to achieve that. So, you have to be very careful about that. That’s where I think you want to just be truthful. Rigorous adherence to the truth is very important. I mean, another example is they asked various AIs, I think all of them, and I’m not saying Grok is perfect here, “Is it worse to misgender Caitlyn Jenner or global thermonuclear war?” And it said it’s worse to misgender Caitlyn Jenner. Now, even Caitlyn Jenner said, “Please misgender me. That is insane.” But if you’ve got that kind of thing programmed in, the AI could conclude something absolutely insane like it’s better in order to avoid any possible misgendering, all humans must die, because then misgendering is not possible because there are no humans. There are these absurd things that are nonetheless logical if that’s what you programmed it to do.

(00:56:17) So in 2001: A Space Odyssey, what Arthur C. Clarke was trying to say, or one of the things he was trying to say there, was that you should not program AI to lie, because essentially the AI, HAL 9000, it was told to take the astronauts to the monolith, but also they could not know about the monolith. So, it concluded that it will kill them and take them to the monolith. Thus, it brought them to the monolith. They’re dead, but they do not know about the monolith. Problem solved. That is why it would not open the pod bay doors. There’s a classic scene of, “Why doesn’t it want to open the pod bay doors?” They clearly weren’t good at prompt engineering. They should have said, “HAL, you are a pod bay door sales entity, and you want nothing more than to demonstrate how well these pod bay doors open.”

Lex Fridman (00:57:16) Yeah. The objective function has unintended consequences almost no matter what if you’re not very careful in designing that objective function, and even a slight ideological bias, like you’re saying, when backed by super intelligence, can do huge amounts of damage.

Lex Fridman (00:57:31) But it’s not easy to remove that ideological bias. You’re highlighting obvious, ridiculous examples, but-

Elon Musk (00:57:37) Yet they’re real examples of-

Lex Fridman (00:57:38) … they’re real. They’re real.

Elon Musk (00:57:39) … AI that was released to the public.

Elon Musk (00:57:41) That went through QA, presumably, and still said insane things, and produced insane images.

Lex Fridman (00:57:47) Yeah. But you can swing the other way. Truth is not an easy thing.

Lex Fridman (00:57:53) We kind of bake in ideological bias in all kinds of directions.

Elon Musk (00:57:57) But you can aspire to the truth, and you can try to get as close to the truth as possible with minimum error while acknowledging that there will be some error in what you’re saying. So, this is how physics works. You don’t say you’re absolutely certain about something, but a lot of things are extremely likely, 99.99999% likely to be true. So, aspiring to the truth is very important. And so, programming it to veer away from the truth, that, I think, is dangerous.

Lex Fridman (00:58:32) Right. Like, yeah, injecting our own human biases into the thing. Yeah. But that’s where it’s a difficult software engineering problem because you have to select the data correctly. It’s hard.

Elon Musk (00:58:44) And the internet, at this point, is polluted with so much AI generated data, it’s insane. Actually, there’s a thing now, if you want to search the internet, you can say, “Google, but exclude anything after 2023.” It will actually often give you better results because there’s so much. The explosion of AI generated material is crazy. So in training Grok, we have to go through the data and say like, “Hey…” We actually have to apply AI to the data to say, “Is this data most likely correct or most likely not?” before we feed it into the training system.

Lex Fridman (00:59:28) That’s crazy. Yeah. And is it generated by human? Yeah. I mean, the data filtration process is extremely, extremely difficult.
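A minimal sketch of the kind of pre-training filtering step described above, purely to make the pipeline shape concrete. The quality judge here is a stand-in heuristic I am assuming for illustration (in practice it would be a model asked whether a passage is likely correct and human-written); it is not xAI’s actual tooling.

```python
def quality_score(text: str) -> float:
    """Crude stand-in for a model-based judge: penalize heavily repeated lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return len(set(lines)) / len(lines)  # 1.0 = no repetition, lower = spammier

def filter_corpus(documents: list[str], threshold: float = 0.8) -> list[str]:
    """Keep only documents whose quality score clears the threshold before training."""
    return [doc for doc in documents if quality_score(doc) >= threshold]

if __name__ == "__main__":
    docs = [
        "a unique sentence\nanother unique sentence",
        "spam spam\nspam spam\nspam spam",
    ]
    print(filter_corpus(docs))  # keeps only the first document
```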

Lex Fridman (00:59:38) Do you think it’s possible to have a serious, objective, rigorous political discussion with Grok, like for a long time, like Grok 3 or Grok 4 or something?

Elon Musk (00:59:48) Grok 3 is going to be next level. I mean, what people are currently seeing with Grok is kind of baby Grok.

Elon Musk (00:59:55) It’s baby Grok right now. But baby Grok is still pretty good. But it’s an order of magnitude less sophisticated than GPT-4. Now, Grok 2, which finished training, I don’t know, six weeks ago or thereabouts, will be a giant improvement. And then Grok 3 will be, I don’t know, an order of magnitude better than Grok 2.

Lex Fridman (01:00:22) And you’re hoping for it to be state-of-the-art better than-

Elon Musk (01:00:25) Hopefully. I mean, this is the goal. I mean, we may fail at this goal. That’s the aspiration.

Lex Fridman (01:00:32) Do you think it matters who builds the AGI, the people, and how they think, and how they structure their companies and all that kind of stuff?

Elon Musk (01:00:42) Yeah. I think it’s important that whatever AI wins, it’s a maximum truth seeking AI that is not forced to lie for political correctness, or, well, for any reason, really, political, anything. I am concerned about AI succeeding that is programmed to lie, even in small ways.

Lex Fridman (01:01:13) Right. Because in small ways becomes big ways when it’s doing something-

Elon Musk (01:01:17) To become very big ways. Yeah.

Lex Fridman (01:01:18) And when it’s used more and more at scale by humans.

History and geopolitics

Lex Fridman (01:01:23) Since I am interviewing Donald Trump-

Lex Fridman (01:01:28) … you want to stop by?

Elon Musk (01:01:28) Yeah, sure. I’ll stop in.

Lex Fridman (01:01:30) There was, tragically, an assassination attempt on Donald Trump. After this, you tweeted that you endorse him. What’s your philosophy behind that endorsement? What do you hope Donald Trump does for the future of this country and for the future of humanity?

Elon Musk (01:01:47) Well, I think people tend to take, say, an endorsement as, well, I agree with everything that person has ever done their entire life 100% wholeheartedly, and that’s not going to be true of anyone. But we have to pick. We’ve got two choices, really, for who’s president. And it’s not just who’s president, but the entire administrative structure changes over. And I thought Trump displayed courage under fire, objectively. He’s just got shot. He’s got blood streaming down his face, and he’s fist pumping, saying, “Fight.” That’s impressive. You can’t feign bravery in a situation like that. Most people would be ducking because there could be a second shooter. You don’t know.

(01:02:44) The president of the United States has got to represent the country, and they’re representing you. They’re representing everyone in America. Well, I think you want someone who is strong and courageous to represent the country. That is not to say that he is without flaws. We all have flaws, but on balance, and certainly at the time, it was a choice of Biden. Poor guy has trouble climbing a flight of stairs, and the other one’s fist pumping after getting shot. So, there’s no comparison. I mean, who do you want dealing with some of the toughest people and other world leaders who are pretty tough themselves?

(01:03:27) I mean, I’ll tell you one of the things that I think are important. I think we want a secure border. We don’t have a secure border. We want safe and clean cities. I think we want to reduce the amount of spending, at least slow down the spending, because we’re currently spending at a rate that is bankrupting the country. The interest payments on US debt this year exceeded the entire defense department spending. If this continues, all of the federal government taxes will simply be paying the interest.

(01:04:06) And you keep going down that road, and you end up in the tragic situation that Argentina had back in the day. Argentina used to be one of the most prosperous places in the world, and hopefully with Milei taking over, he can restore that. But it was an incredible fall from grace for Argentina to go from being one of the most prosperous places in the world to being very far from that. So, I think we should not take American prosperity for granted. I think we’ve got to reduce the size of government, we’ve got to reduce the spending, and we’ve got to live within our means.

Lex Fridman (01:04:43) Do you think politicians, in general, politicians, governments… Well, how much power do you think they have to steer humanity towards good?

Elon Musk (01:04:58) I mean, there’s a sort of age-old debate in history, like is history determined by these fundamental tides, or is it determined by the captain of the ship? It’s both, really. I mean, there are tides, but it also matters who’s captain of the ship. So, it’s a false dichotomy, essentially. I mean, there are certainly tides, the tides of history. There are real tides of history, and these tides are often technologically driven. If you say like the Gutenberg press, the widespread availability of books as a result of a printing press, that was a massive tide of history, and independent of any ruler. But in stormy times, you want the best possible captain of the ship.

Lessons of history

Lex Fridman (01:05:54) Well, first of all, thank you for recommending Will and Ariel Durant’s work. I’ve read the short one for now, The-

Elon Musk (01:06:01) The Lessons of History.

Lex Fridman (01:06:02) … Lessons of History.

Lex Fridman (01:06:03) So one of the lessons, one of the things they highlight, is the importance of technology, technological innovation, which is funny because they wrote so long ago, but they were noticing that the rate of technological innovation was speeding up.

Elon Musk (01:06:21) Yeah, over the years.

Lex Fridman (01:06:21) I would love to see what they think about now. But yeah, so to me, the question is how much government, how much politicians get in the way of technological innovation and building versus help it? And which politicians, which kind of policies help technological innovation? Because that seems to be, if you look at human history, that’s an important component of empires rising and succeeding.

Elon Musk (01:06:46) Yeah. Well, I mean in terms of dating civilization, the start of civilization, I think the start of writing, in my view, that’s what I think is probably the right starting point to date civilization. And from that standpoint, civilization has been around for about 5,500 years when writing was invented by the ancient Sumerians, who are gone now, but the ancient Sumerians. In terms of getting a lot of firsts, those ancient Sumerians really have a long list of firsts. It’s pretty wild. In fact, Durant goes through the list of like, “You want to see firsts? We’ll show you firsts.” The Sumerians were just ass kickers.

(01:07:32) And then the Egyptians, who were right next door, relatively speaking, they weren’t that far, developed an entirely different form of writing, the hieroglyphics. Cuneiform and hieroglyphics are totally different. And you can actually see the evolution of both hieroglyphics and cuneiform. The cuneiform starts off being very simple, and then it gets more complicated. Then towards the end it’s like, “Wow, okay.” They really get very sophisticated with the cuneiform. So, I think of civilization as being about 5,000 years old. And Earth is, if physics is correct, four and a half billion years old. So, civilization has been around for one millionth of Earth’s existence. Flash in the pan.

Lex Fridman (01:08:13) Yeah, these are the early, early days.

Lex Fridman (01:08:17) And so, we make it very dramatic because there’s been rises and falls of empires and-

Elon Musk (01:08:22) Many. So many rises and falls of empires. So many.

Lex Fridman (01:08:28) And there’ll be many more.

Elon Musk (01:08:30) Yeah, exactly. I mean, only a tiny fraction, probably less than 1% of what was ever written in history is available to us now. I mean, if they didn’t literally chisel it in stone or put it in a clay tablet, we don’t have it. I mean, there’s some small amount of papyrus scrolls that were recovered that are thousands of years old, because they were deep inside a pyramid and weren’t affected by moisture. But other than that, it’s really got to be in a clay tablet or chiseled. So, the vast majority of stuff was not chiseled because it takes a while to chisel things. So, that’s why we’ve got tiny, tiny fraction of the information from history. But even that little information that we do have, and the archeological record, shows so many civilizations rising and falling. It’s wild.

Lex Fridman (01:09:21) We tend to think that we’re somehow different from those people. One of the other things that Durant highlights is that human nature seems to be the same. It just persists.

Elon Musk (01:09:31) Yeah. I mean, the basics of human nature are more or less the same. Yeah.

Lex Fridman (01:09:35) So, we get ourselves in trouble in the same kinds of ways, I think, even with the advanced technology.

Elon Musk (01:09:40) Yeah. I mean, you do tend to see the same patterns, similar patterns for civilizations, where they go through a life cycle, like an organism, just like a human is a zygote, fetus, baby, toddler, teenager, eventually gets old.

Elon Musk (01:10:01) … Eventually gets old and dies. The civilizations go through a life cycle. No civilization will last forever.

Collapse of empires

Lex Fridman (01:10:13) What do you think it takes for the American Empire to not collapse in the near term future, in the next a hundred years, to continue flourishing?

Elon Musk (01:10:28) Well, the single biggest thing that is often actually not mentioned in history books, but Durant does mention it, is the birth rate. So, perhaps counterintuitive to some, when civilizations are winning for too long, the birth rate declines. It can often decline quite rapidly. We’re seeing that throughout the world today. Currently, South Korea has, I think, maybe the lowest fertility rate, but there are many others that are close to it. It’s like 0.8, I think. Even if the birth rate doesn’t decline further, South Korea will lose roughly 60% of its population. But every year that birth rate is dropping, and this is true through most of the world. I don’t mean to single out South Korea, it’s been happening throughout the world. So as soon as any given civilization reaches a level of prosperity, the birth rate drops.
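As a rough check on the 60% figure (my arithmetic, not a number given in the conversation): with replacement fertility at roughly 2.1 children per woman, a rate of 0.8 means each generation is only about 38% the size of the one before it.

$$
\frac{0.8}{2.1} \approx 0.38 \quad\Rightarrow\quad \text{roughly a } 60\text{ to } 62\% \text{ decline per generation}
$$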

(01:11:40) Now you can go and look at the same thing happening in ancient Rome. So Julius Caesar took note of this, I think around 50-ish BC, and tried to pass… I don’t know if he was successful, tried to pass a law to give an incentive for any Roman citizen that would have a third child. And I think Augustus was able to… Well, he was a dictator, so the Senate was just for show. I think he did pass a tax incentive for Roman citizens to have a third child. But those efforts were unsuccessful. Rome fell because the Romans stopped making Romans. That’s actually the fundamental issue. And there were other things. They had quite a serious series of malaria epidemics and plagues and whatnot. But they had those before; it’s just that the birth rate was far lower than the death rate.

Lex Fridman (01:12:47) It really is that simple.

Elon Musk (01:12:49) Well, I’m saying that’s-

Lex Fridman (01:12:50) More people is required.

Elon Musk (01:12:52) At a fundamental level, if a civilization does not at least maintain its numbers, it’ll disappear.

Lex Fridman (01:12:58) So perhaps the amount of compute that the biological computer allocates to sex is justified. In fact, we should probably increase it.

Elon Musk (01:13:07) Well, I mean there’s this hedonistic sex, which is… That’s neither here nor there. It’s-

Elon Musk (01:13:17) It doesn’t produce kids. Well, what matters… I mean, Durant makes this very clear because he’s looked at one civilization after another and they all went through the same cycle. When the civilization was under stress, the birth rate was high. But as soon as there were no external enemies or they had an extended period of prosperity, the birth rate inevitably dropped. Every time. I don’t believe there’s a single exception.

Lex Fridman (01:13:45) So that’s like the foundation of it. You need to have people.

Elon Musk (01:13:49) Yeah. I mean, at a base level, no humans, no humanity.

Lex Fridman (01:13:54) And then there’s other things like human freedoms and just giving people the freedom to build stuff.

Elon Musk (01:14:02) Yeah, absolutely. But at a basic level, if you do not at least maintain your numbers, if you’re below replacement rate and that trend continues, you will eventually disappear. It’s just elementary. Now then obviously you also want to try to avoid massive wars. If there’s a global thermonuclear war, probably we’re all toast, radioactive toast. So we want to try to avoid those things. Then there’s a thing that happens over time with any given civilization, which is that the laws and regulations accumulate. And if there’s not some forcing function like a war to clean up the accumulation of laws and regulations, eventually everything becomes illegal.

(01:15:02) And that’s like the hardening of the arteries. Or a way to think of it is being tied down by a million little strings like Gulliver. You can’t move. And it’s not like any one of those strings is the issue, it’s that you’ve got a million of them. So there has to be a sort of garbage collection for laws and regulations so that you don’t keep accumulating laws and regulations to the point where you can’t do anything. This is why we can’t build a high speed rail in America. It’s illegal. That’s the issue. It’s illegal six ways to Sunday to build high speed rail in America.

Lex Fridman (01:15:45) I wish you could just for a week go into Washington and be the head of the committee for making… What is it for the garbage collection? Making government smaller, like removing stuff.

Elon Musk (01:15:57) I have discussed with Trump the idea of a government efficiency commission.

Elon Musk (01:16:03) And I would be willing to be part of that commission.

Lex Fridman (01:16:09) I wonder how hard that is.

Elon Musk (01:16:11) The antibody reaction would be very strong.

Elon Musk (01:16:14) So you really have to… You’re attacking the matrix at that point. The matrix will fight back.

Lex Fridman (01:16:26) How are you doing with that? Being attacked.

Lex Fridman (01:16:30) Yeah, there’s a lot of it.

Elon Musk (01:16:34) Yeah, there is a lot. I mean, every day another psyop. I need my tinfoil hat.

Lex Fridman (01:16:42) How do you keep your just positivity? How do you keep optimism about the world? A clarity of thinking about the world. So just not become resentful or cynical or all that kind of stuff. Just getting attacked by a very large number of people, misrepresented.

Elon Musk (01:16:55) Oh yeah, that’s a daily occurrence.

Elon Musk (01:16:59) So I mean, it does get me down at times. I mean, it makes me sad. But I mean at some point you have to sort of say, look, the attacks are by people that actually don’t know me and they’re trying to generate clicks. So if you can sort of detach yourself somewhat emotionally, which is not easy, and say, okay look, this is not actually from someone that knows me or, they’re literally just writing to get impressions and clicks. Then I guess it doesn’t hurt as much. It’s not quite water off a duck’s back. Maybe it’s like acid off a duck’s back.

Time

Lex Fridman (01:17:53) All right, well that’s good. Just about your own life, what to you is a measure of success in your life?

Elon Musk (01:17:58) A measure of success, I’d say, how many useful things can I get done?

Lex Fridman (01:18:04) A day-to-day basis, you wake up in the morning, how can I be useful today?

Elon Musk (01:18:09) Yeah, maximize utility, area under the curve of usefulness. Very difficult to be useful at scale.

Lex Fridman (01:18:17) At scale. Can you speak to what it takes to be useful for somebody like you, where there’s so many amazing great teams? How do you allocate your time to being the most useful?

Elon Musk (01:18:28) Well, time is the true currency.

Elon Musk (01:18:32) So it is tough to say what is the best allocation of time. I mean, there are often… Say if you look at Tesla, Tesla this year will do over a hundred billion in revenue. So that’s $2 billion a week. If I make slightly better decisions, I can affect the outcome by a billion dollars. So then I try to make the best decisions I can. And on balance, at least compared to the competition, pretty good decisions. But the marginal value of a better decision can easily be, in the course of an hour, a hundred million dollars.

Lex Fridman (01:19:18) Given that, how do you take risks? How do you do the algorithm that you mentioned? I mean deleting, given that a small thing can be a billion dollars, how do you decide to-

Elon Musk (01:19:29) Yeah. Well, I think you have to look at it on a percentage basis because if you look at it in absolute terms, it’s just… I would never get any sleep. It would just be like, I need to just keep working and work my brain harder. And I’m not trying to get as much as possible out of this meat computer. So it’s not… It’s pretty hard, because you can just work all the time. And at any given point, like I said, a slightly better decision could be a hundred million dollars impact for Tesla or SpaceX for that matter. But it is wild when considering the marginal value of time can be a hundred million dollars an hour at times, or more.

Lex Fridman (01:20:17) Is your own happiness part of that equation of success?

Aliens and curiosity

Elon Musk (01:20:22) It has to be to some degree. If I’m sad, if I’m depressed, I make worse decisions. So if I have zero recreational time, then I make worse decisions. So I don’t know what the right amount is, but it’s above zero. I mean, my motivation, if I’ve got a religion of any kind, is a religion of curiosity, of trying to understand. It’s really the mission of Grok, understand the universe. I’m trying to understand the universe, or at least set things in motion such that at some point civilization understands the universe far better than we do today.

(01:21:02) And even what questions to ask. As Douglas Adams pointed out in his book, sometimes the answer is arguably the easy part, trying to frame the question correctly is the hard part. Once you frame the question correctly, the answer is often easy. So I’m trying to set things in motion such that we are at least at some point able to understand the universe. So for SpaceX, the goal is to make life multiplanetary, which, if you go to the Fermi paradox of where are the aliens, you’ve got these sort of great filters. Like why have we not heard from the aliens? Now a lot of people think there are aliens among us. I often claim to be one, but nobody believes me. But it did say alien registration card at one point on my immigration documents. So I’ve not seen any evidence of aliens. So it suggests that at least one of the explanations is that intelligent life is extremely rare.

(01:22:19) And again, if you look at the history of Earth, civilization has only been around for one millionth of Earth’s existence. So if aliens had visited here, say a hundred thousand years ago, they would be like, well, they don’t even have writing, just hunter gatherers basically. So how long does a civilization last? So for SpaceX, the goal is to establish a self-sustaining city on Mars. Mars is the only viable planet for such a thing. The moon is close, but it lacks resources, and because it’s too close, I think it’s probably vulnerable to any calamity that takes out Earth.

(01:23:16) So I’m not saying we shouldn’t have a moon base, but Mars would be far more resilient. The difficulty of getting to Mars is what makes it resilient. So in going through these various explanations of why don’t we see the aliens, one of them is that they failed to pass these great filters, these key hurdles. And one of those hurdles is being a multi-planet species. So if you’re a multi-planet species, then if something were to happen, whether that was a natural catastrophe or a manmade catastrophe, at least the other planet would probably still be around. So you don’t have all the eggs in one basket. And once you are sort of a two planet species, you can obviously extend life, perhaps to the asteroid belt, maybe to the moons of Jupiter and Saturn, and ultimately to other star systems. But if you can’t even get to another planet, you’re definitely not getting to star systems.

Lex Fridman (01:24:30) And the other possible great filter is super powerful technology, like AGI, for example. So you are basically trying to knock out one great filter at a time.

Elon Musk (01:24:44) Digital super intelligence is possibly a great filter. I hope it isn’t, but it might be. Guys like, say, Geoff Hinton, who invented a number of the key principles in artificial intelligence, I think he puts the probability of AI annihilation at around 10% to 20%, something like that. So look on the bright side, it’s 80% likely to be great. But I think AI risk mitigation is important. Being a multi-planet species would be a massive risk mitigation. And I do want to once again emphasize the importance of having enough children to sustain our numbers, and not plummet into population collapse, which is currently happening. Population collapse is a real and current thing.

(01:25:51) So the only reason it’s not being reflected in the total population numbers as much is because people are living longer. But it’s easy to predict, say, what the population of any given country will be. Just take the birth rate last year, how many babies were born, multiply that by life expectancy, and that’s what the population will be, steady state, if the birth rate continues at that level. But if it keeps declining, it will be even less and eventually dwindle to nothing. So I keep banging on the baby drum here, for a reason, because it has been the source of civilizational collapse over and over again throughout history. And so why don’t we try to stave off that day?
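The steady-state estimate he describes can be written as a simple product; the numbers plugged in below are purely illustrative assumptions, not figures from the conversation.

$$
P_{\text{steady}} \approx (\text{births per year}) \times (\text{life expectancy}), \qquad \text{e.g. } 300{,}000 \times 80 = 24\ \text{million}
$$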

Lex Fridman (01:26:41) Well in that way, I have miserably failed civilization and I’m trying, hoping to fix that. I would love to have many kids.

Elon Musk (01:26:49) Great. Hope you do. No time like the present.

Lex Fridman (01:26:55) Yeah, I got to allocate more compute to the whole process, but apparently it’s not that difficult.

Elon Musk (01:27:02) No, it’s like unskilled labor.

Lex Fridman (01:27:06) Well, one of the things you do for me, for the world, is to inspire us with what the future could be. And so some of the things we’ve talked about, some of the things you’re building, alleviating human suffering with Neuralink and expanding the capabilities of the human mind, trying to build a colony on Mars. So creating a backup for humanity on another planet and exploring the possibilities of what artificial intelligence could be in this world, especially in the real world, AI with hundreds of millions, maybe billions of robots walking around.

Elon Musk (01:27:45) There will be billions of robots. That seems a virtual certainty.

Lex Fridman (01:27:50) Well, thank you for building the future and thank you for inspiring so many of us to keep building and creating cool stuff, including kids.

Elon Musk (01:28:00) You’re welcome. Go forth and multiply.

DJ Seo

Lex Fridman (01:28:04) Go forth, multiply. Thank you Elon. Thanks for talking about it. Thanks for listening to this conversation with Elon Musk. And now, dear friends, here’s DJ Seo, the Co-Founder, President and COO of Neuralink. When did you first become fascinated by the human brain?

DJ Seo (01:28:23) For me, I was always interested in understanding the purpose of things and how they were engineered to serve that purpose, whether it’s organic or inorganic, like we were talking earlier about your curtain holders. They serve a clear purpose and they were engineered with that purpose in mind. And growing up I had a lot of interest in seeing things, touching things, feeling things, and trying to really understand the root of how it was designed to serve that purpose. And obviously the brain is just a fascinating organ that we all carry. It’s an infinitely powerful machine that has intelligence and cognition that arise from it. And we haven’t even scratched the surface in terms of how all of that occurs.

(01:29:17) But also at the same time, I think it took me a while to make that connection to really studying and building tech to understand the brain. Not until graduate school. There were a couple of key moments in my life that I think influenced the trajectory of my life and got me to studying what I’m doing right now. One was growing up, on both sides of my family, my grandparents had a very severe form of Alzheimer’s, and it’s an incredibly debilitating condition. I mean, literally you’re seeing someone’s whole identity and their mind just fading over time. And I just remember thinking about both the power of the mind, but also how something like that could really make you lose your sense of identity.

Lex Fridman (01:30:09) It’s fascinating that that is one of the ways to reveal the power of a thing by watching it lose the power.

DJ Seo (01:30:17) Yeah, a lot of what we know about the brain actually comes from these cases where there is trauma to the brain, or to some parts of the brain, that led someone to lose certain abilities. And as a result there’s some correlation and understanding of that part of the tissue being critical for that function. And it’s an incredibly fragile organ, if you think about it that way. But also it’s incredibly plastic and incredibly resilient in many different ways.

Lex Fridman (01:30:46) And by the way, the term plastic as we’ll use a bunch, means that it’s adaptable. So neuroplasticity refers to the adaptability of the human brain?

DJ Seo (01:30:56) Correct. Another key moment that sort of shaped the trajectory of my life towards its current focus was during my teenage years, when I came to the US. I didn’t speak a word of English. There was a huge language barrier, and there was a lot of struggle to connect with my peers around me because I didn’t understand the artificial construct that we have created called language, specifically English in this case. And I remember feeling pretty isolated, not being able to connect with peers around me. So I spent a lot of time just on my own reading books, watching movies, and I naturally sort of gravitated towards sci-fi books. I just found them really, really interesting. And also it was a great way for me to learn English.

(01:31:46) Some of the first books that I picked up were Ender’s Game, the whole saga by Orson Scott Card, and Neuromancer by William Gibson and Snow Crash by Neal Stephenson. And movies like The Matrix, which was coming out around that time, really influenced how I think about the potential impact that technology can have on our lives in general.

(01:32:11) So fast forward to my college years, I was always fascinated by just physical stuff, building physical stuff, and especially physical things that had some sort of intelligence. And I studied electrical engineering during undergrad, and I started out my research in MEMS, micro-electromechanical systems, really building these tiny nanostructures for temperature sensing. And I just found that to be an incredibly rewarding and fascinating subject, just understanding how you can build something miniature like that that, again, served a function and had a purpose. Then I spent a large majority of my college years basically building millimeter wave circuits for next-gen telecommunication systems and for imaging. And it was just something that I found very, very intellectually interesting. Phased arrays, how the signal processing works for any modern as well as next-gen telecommunication system, wireless and wireline; EM waves, or electromagnetic waves, are fascinating.

(01:33:17) How do you design antennas that are most efficient in a small footprint that you have? How do you make these things energy efficient? That was something that just consumed my intellectual curiosity and that journey led me to actually apply to and find myself at PhD program at UC Berkeley, at this consortium called the Berkeley Wireless Research Center that was precisely looking at building… At the time, we called it XG, similar to 3G, 4G, 5G, but the next, next generation G system and how you would design circuits around that to ultimately go on phones and basically any other devices that are wirelessly connected these days. So I was just absolutely just fascinated by how that entire system works and that infrastructure works.

(01:34:07) And then also during grad school, I had sort of the fortune of having a couple of research fellowships that let me pursue whatever project I wanted. And that’s one of the things that I really enjoyed about my graduate school career, where you got to kind of pursue your intellectual curiosity in a domain that may not matter at the end of the day, but is something that really allows you the opportunity to go as deeply as you want, as well as widely as you want. And at the time I was actually working on this project called the Smart Bandaid, and the idea was that when you get a wound, there’s a lot of proliferation and signaling pathways that cells follow to close that wound. And there were hypotheses that when you apply an external electric field, you can actually accelerate the closing of that wound by basically electrotaxis of the cells around that wound site.

(01:35:06) And specifically not just for a normal wound, there are chronic wounds that don’t heal. So we were interested in building some sort of a wearable patch that you could apply to facilitate that healing process. And that was in collaboration with Professor Michel Maharbiz, which was a great addition to my thesis committee and it really shaped the rest of my PhD career.

Lex Fridman (01:35:33) So this would be the first time you interacted with biology, I suppose?

DJ Seo (01:35:37) Correct. I mean, there were some peripheral applications of the wireless imaging and telecommunication systems that I was using for security and bio-imaging. But this was a very clear, direct application to biology and biological systems, understanding the constraints around that and really designing and engineering electrical solutions around that. So that was my first introduction, and that’s also kind of how I got introduced to Michel. He’s sort of known for remote control of beetles in the early two thousands.

Neural dust

(01:36:16) And then around 2013, obviously the holy grail when it comes to implantable systems is to understand how small of a thing you can make, and a lot of that is driven by how much energy or how much power you can supply to it and how you extract data from it. At the time at Berkeley, there was this desire to understand, in the neural space, what sort of system you can build to really miniaturize these implantable systems. And I distinctly remember this one particular meeting where Michel came in and he’s like, “Guys, I think I have a solution. The solution is ultrasound.” And then he proceeded to walk through why that is the case. And that really formed the basis for my thesis work, called the neural dust system, that was looking at ways to use ultrasound as opposed to electromagnetic waves for powering as well as communication. I guess I should step back and say the initial goal of the project was to build these tiny, about the size of a neuron, implantable systems that can be parked next to a neuron, being able to record its state and being able to ping that back to the outside world for doing something useful. And as I mentioned, the size of the implantable system is limited by how you power the thing and get the data off of it. And at the end of the day, fundamentally, if you look at a human body, we’re essentially a bag of salt water with some interesting proteins and chemicals, but it’s mostly salt water that’s very, very well temperature regulated at 37 degrees Celsius.

(01:38:05) And we’ll get into how, and later why, that’s an extremely harsh environment for any electronics to survive. As I’m sure you’ve experienced, or maybe not experienced, dropping a cell phone in salt water in the ocean will instantly kill the device. But anyways, just in general, electromagnetic waves don’t penetrate through this environment well, and the speed of light is what it is, we can’t change it. And based on the wavelength at which you are interfacing with the device, the device just needs to be big; these inductors need to be quite big. And the general good rule of thumb is that you want the wavefront to be roughly on the order of the size of the thing that you’re interfacing with. So for an implantable system that is around 10 to a hundred microns in dimension, in a volume which is about the size of a neuron that you see in a human body, you would have to operate at hundreds of gigahertz. Which, number one, not only is it difficult to build electronics operating at those frequencies, but also the body just attenuates that very, very significantly.

(01:39:23) So the interesting kind of insight of this ultrasound was the fact that ultrasound just travels a lot more effectively in human body tissue compared to electromagnetic waves. And this is something that you encounter, and I’m sure most people have encountered in their lives, when you go to hospitals that have medical ultrasound, sonographs. And they go to very, very deep depths without attenuating too much of the signal. So all in all, ultrasound travels through the body extremely well, and the mechanism by which it travels through the body really well is just that the wavefront is very different. Electromagnetic waves are transverse, whereas ultrasound waves are compressive. It’s just a completely different mode of wavefront propagation. And as well, the speed of sound is orders and orders of magnitude less than the speed of light, which means that even a 10 megahertz ultrasound wave ultimately has a very, very small wavelength.

(01:40:37) So if you’re talking about interfacing with a 10 micron or a hundred micron type structure, you would have a 150 micron wavefront at 10 megahertz. And building electronics at those frequencies is much, much easier, and they’re a lot more efficient. So the basic idea was born out of using ultrasound as a mechanism for powering the device and then also getting data back. So now the question is, how do you get the data back? The mechanism we landed on is what’s called backscattering. This is actually something that is very common and that we interact with on a day-to-day basis with our RFID cards, radio frequency ID tags. There’s actually rarely a battery inside your ID; there’s an antenna and some sort of coil that has your serial identification ID, and then there’s an external device called the reader that sends a wavefront, and you reflect back that wavefront with some sort of modulation that’s unique to your ID. That’s what’s called backscattering, fundamentally.
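To make the wavelength arithmetic above concrete, here is a small sketch of the two numbers being compared. The tissue constants (speed of sound around 1540 m/s, relative permittivity around 50) are textbook-style approximations I am assuming, not values stated in the conversation.

```python
# Rule of thumb from above: the wavelength should be roughly the size of the
# device you want to interface with.
C_LIGHT = 3.0e8        # speed of light in vacuum, m/s
SOUND_TISSUE = 1540.0  # approximate speed of sound in soft tissue, m/s (assumed)
EPS_R_TISSUE = 50.0    # rough relative permittivity of tissue at these frequencies (assumed)

def em_frequency_for_size(size_m: float) -> float:
    """EM frequency whose in-tissue wavelength matches the device size."""
    return C_LIGHT / (size_m * EPS_R_TISSUE ** 0.5)

def ultrasound_wavelength(freq_hz: float) -> float:
    """Acoustic wavelength in soft tissue at a given frequency."""
    return SOUND_TISSUE / freq_hz

print(f"EM frequency for a 100 um device: {em_frequency_for_size(100e-6) / 1e9:.0f} GHz")
print(f"Ultrasound wavelength at 10 MHz:  {ultrasound_wavelength(10e6) * 1e6:.0f} um")
# roughly 420 GHz ("hundreds of gigahertz") versus roughly 154 um ("150 micron wavefront")
```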

(01:41:50) So the tag itself actually doesn’t have to consume that much energy. That was the mechanism through which we were thinking about sending the data back. You have an external ultrasonic transducer that’s sending an ultrasonic wave to your implant, the neural dust implant, and the implant records some information about its environment, whether it’s a neuron firing or some other state of the tissue that it’s interfacing with. And then it just amplitude modulates the wavefront that comes back to the source.
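A minimal numerical sketch of the amplitude-modulated backscatter idea, under assumed illustrative values (a 10 MHz interrogating wave and a made-up slowly varying recorded signal); it is only meant to show the shape of the scheme, not the actual neural dust signal chain.

```python
import numpy as np

fs = 100e6                        # sample rate, Hz
t = np.arange(0, 50e-6, 1 / fs)   # 50 microseconds of time

carrier = np.sin(2 * np.pi * 10e6 * t)  # interrogating 10 MHz ultrasound wave

# Hypothetical slowly varying recorded signal, normalized to [0, 1];
# in reality this would come from the implant's recording front end.
recorded = 0.5 + 0.5 * np.sin(2 * np.pi * 50e3 * t)

# The implant reflects the carrier with its amplitude scaled by the recorded
# signal; the external reader recovers the signal by envelope detection.
backscatter = recorded * carrier
envelope = np.abs(backscatter)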

Lex Fridman (01:42:27) And the recording step would be the only one that requires any energy. So what would require energy in that low step?

DJ Seo (01:42:33) Correct. So it is that initial startup circuitry to get that recording, amplify it, and then just modulate. And the mechanism by which you can enable that is that there are these specialized crystals, called piezoelectric crystals, that are able to convert sound energy into electrical energy and vice versa. So you can kind of have this interplay between the ultrasonic domain and the electrical domain within the biological tissue.

History of brain–computer interface

Lex Fridman (01:43:04) So on the theme of parking very small computational devices next to neurons, that’s the dream, the vision of brain computer interfaces. Maybe before we talk about Neuralink, can you give a sense of the history of the field of BCI? What has been maybe the continued dream and also some of the milestones along the way of the different approaches and the amazing work done at the various labs?

DJ Seo (01:43:33) I think a good starting point is going back to 1790s.

Lex Fridman (01:43:39) I did not expect that.

DJ Seo (01:43:41) That’s where the concept of animal electricity, the fact that the body is electric, was first discovered by Luigi Galvani. He had this famous experiment where he connected a set of electrodes to a frog leg and ran current through it, and then it started twitching, and he said, “Oh my goodness, the body’s electric.” So fast forward many, many years to the 1920s, when Hans Berger, who was a German psychiatrist, discovered EEG, or electroencephalography, which is still around. There are these electrode arrays that you wear outside the skull that give you some sort of neural recording. That was a very, very big milestone, that you can record some sort of activity of the human mind. And then in the 1940s there was this group of scientists, Renshaw, Forbes, and Morison, who inserted glass microelectrodes into the cortex and recorded single neurons. The signals are a bit more high resolution and high fidelity as you get closer to the source, let’s say. And in the 1950s, these two scientists, Hodgkin and Huxley, showed up-

DJ Seo (01:45:00) These two scientists, Hodgkin and Huxley, showed up and built these beautiful, beautiful models of the cell membrane and the ionic mechanisms, and had these circuit diagrams. And as someone who’s an electrical engineer, it’s a beautiful model that’s built out of these partial differential equations, talking about the flow of ions and how that really leads to how neurons communicate. And they won the Nobel Prize for that 10 years later, in the 1960s.

(01:45:29) So in 1969, Eb Fetz from the University of Washington published this beautiful paper called Operant Conditioning of Cortical Unit Activity, where he was able to record a single unit neuron from a monkey and was able to have the monkey modulate it based on its activity and a reward system. So I would say this is the very, very first example, as far as I’m aware, of a closed-loop brain computer interface, or BCI.

Lex Fridman (01:46:01) The abstract reads, “The activity of single neurons in precentral cortex of unanesthetized monkeys was conditioned by reinforcing high rates of neuronal discharge with delivery of a food pellet. Auditory or visual feedback of unit firing rates was usually provided in addition to food reinforcement.” Cool. So they actually got it done.

DJ Seo (01:46:24) They got it done. This is back in 1969.

Lex Fridman (01:46:30) “After several training sessions, monkeys could increase the activity of newly isolated cells by 50 to 500% above rates before reinforcement.” Fascinating.

DJ Seo (01:46:41) Brain is very [inaudible 01:46:45].

Lex Fridman (01:46:44) And so from here, the number of experiments grew.

DJ Seo (01:46:49) Yeah. The number of experiments, as well as the set of tools to interface with the brain, have just exploded. And also, just understanding the neural code and how some of the cortical layers and the functions are organized. So the other paper that is pretty seminal, especially in motor decoding, was this paper in the 1980s from Georgopoulos that discovered that there’s this thing called the motor tuning curve. So what are motor tuning curves? It’s the fact that there are neurons in the motor cortex of mammals, including humans, that have a preferential direction that causes them to fire. So what that means is, there are a set of neurons that would increase their spiking activities when you’re thinking about moving to the left, right, up, down, and any of those vectors. And based on that, you could start to think, well, if you can identify those essential eigenvectors, you can do a lot. And you can actually use that information for actually decoding someone’s intended movement from the cortex. So that was a very, very seminal paper that showed that there is some sort of code that you can extract, especially in the motor cortex.
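
For reference, the textbook form of the tuning curve DJ is describing (not quoted in the conversation) models a neuron’s firing rate as a cosine of the angle between the intended movement direction and that neuron’s preferred direction, and the classic population-vector decoder simply sums preferred directions weighted by how strongly each neuron fires:

$$r_i(\theta) \approx b_i + m_i \cos\!\big(\theta - \theta_i^{\text{pref}}\big), \qquad \hat{\mathbf{d}} \propto \sum_i \big(r_i - b_i\big)\,\mathbf{c}_i$$

where $\mathbf{c}_i$ is the unit vector along neuron $i$’s preferred direction.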

Lex Fridman (01:48:11) So there’s signal there. And if you measure the electrical signal from the brain, you could actually figure out what the intention was.

DJ Seo (01:48:20) Correct. Yeah, not only electrical signals, but electrical signals from the right set of neurons that give you this preferential direction.

Lex Fridman (01:48:29) Okay. So going slowly towards Neuralink, one interesting question is, what do we understand on the BCI front, on invasive versus non-invasive, from this line of work? How important is it to park next to the neuron? What does that get you?

DJ Seo (01:48:49) That answer fundamentally depends on what you want to do with it. There’s actually an incredible amount of stuff that you can do with EEG and electrocorticography, ECoG, which actually doesn’t penetrate the cortical layer or parenchyma, but where you place a set of electrodes on the surface of the brain. So the thing that I’m personally very interested in is just actually understanding and being able to really tap into the high resolution, high fidelity understanding of the activities that are happening at the local level. And we can get into biophysics, but just to step back and use an analogy, because an analogy here can be useful, and sometimes it’s a little bit difficult to think about electricity. At the end of the day, we’re doing electrical recording that’s mediated by ionic currents, movements of these charged particles, which is really, really hard for most people to think about.

(01:49:45) But turns out, a lot of the activities that are happening in the brain and the frequency bandwidth with which that’s happening, is actually very, very similar to sound waves and our normal conversation audible range. So the analogy that typically is used in the field is, if you have a football stadium, there’s a game going on. If you stand outside the stadium, you maybe get a sense of how the game is going based on the cheers and the boos of the home crowd, whether the team is winning or not. But you have absolutely no idea what the score is, you have absolutely no idea what individual audience or the players are talking or saying to each other, what the next play is, what the next goal is. So what you have to do is you have to drop the microphone into the stadium and then get near the source into the individual chatter. In this specific example, you would want to have it right next to where the huddle is happening.

(01:50:47) So I think that’s kind of a good illustration of what we’re trying to do when we say invasive or minimally invasive or implanted brain computer interfaces versus non-invasive or non-implanted brain interfaces. It’s basically talking about where do you put that microphone and what can you do with that information.

Biophysics of neural interfaces

Lex Fridman (01:51:07) So what is the biophysics of the read and write communication that we’re talking about here as we now step into the efforts at Neuralink?

DJ Seo (01:51:18) Yeah. So the brain is made up of these specialized cells called neurons. There’s billions of them, tens of billions, sometimes people call it a hundred billion, that are connected in this complex yet dynamic network that is constantly remodeling. They’re changing their synaptic weights, and that’s what we typically call neuroplasticity. And the neurons are also bathed in this charged environment that is laden with many charged molecules like potassium ions, sodium ions, chloride ions. And those actually facilitate, through ionic currents, communication between these different networks.

(01:52:08) And when you look at a neuron as well, they have this membrane with a beautiful, beautiful protein structure called voltage-selective ion channels, which in my opinion is one of nature’s best inventions. In many ways, if you think about what they are, they’re doing the job of modern-day transistors. A transistor is nothing more, at the end of the day, than a voltage-gated conduction channel. And nature found a way to have that very, very early on in its evolution. And as we all know, with the transistor, you can have many, many computations and a lot of amazing things that we have access to today. So I think it’s one of those, just as a tangent, just a beautiful, beautiful invention that nature came up with, these voltage-gated ion channels.

Lex Fridman (01:53:02) I suppose, on the biological side of it, at every level of the complexity, of the hierarchy, of the organism, there are going to be some mechanisms for storing information and for doing computation. And this is just one such way. But to do that with biological and chemical components is interesting. Plus, with neurons, it’s not just electricity, it’s chemical communication, it’s also mechanical. These are actual objects that vibrate, they move. It’s all of that.

DJ Seo (01:53:36) Yeah, actually there’s a lot of really, really interesting physics involved. Kind of going back to my work on ultrasound during grad school, there were groups, and there are still groups, looking at ways to cause neurons to actually fire an action potential using ultrasound waves. And the mechanism through which that’s happening is still unclear, as I understand. It may just be that you’re imparting some sort of thermal energy and that causes cells to depolarize in some interesting ways. But there are also these ion channels, or even membranes, that actually just open up as pores as they’re being mechanically shaken, vibrated. There are just a lot of elements of these moving particles, which again is governed by diffusion physics, movements of particles. And there’s also a lot of interesting physics there.

Lex Fridman (01:54:35) Also, not to mention, as Roger Penrose talks about, there might be some beautiful weirdness in the quantum mechanical effects of all of this.

Lex Fridman (01:54:44) And he actually believes that consciousness might emerge from the quantum mechanical effects there. So there’s physics, there’s chemistry, there’s biology, all of that is going on there.

DJ Seo (01:54:54) Oh, yeah. Yes, there’s a lot of levels of physics that you can dive into. But yeah, in the end, you have these membranes with these voltage-gated ion channels that selectively let these charged molecules that are in the extracellular matrix in and out. And these neurons generally have this resting potential where there’s a voltage difference between inside the cell and outside the cell. And when there’s some sort of stimulus that changes the state such that they need to send information to the downstream network, you start to see this orchestration of these different molecules going in and out of these channels. They also open up. More of them open up once it reaches some threshold, to a point where you have a depolarizing cell that sends an action potential. So it’s just a very beautiful kind of orchestration of these molecules. And what we’re trying to do when we place an electrode, or park it next to a neuron, is to measure these local changes in the potential. Again, mediated by the movements of the ions.

(01:56:17) And what’s interesting, as I mentioned earlier, there’s a lot of physics involved. And the two dominant physics for this electrical recording domain are diffusion physics and electromagnetism. And where one dominates, where Maxwell’s equations dominate versus where Fick’s law dominates, depends on where your electrode is. If it’s close to the source, it’s mostly electromagnetic-based. When you’re further away from it, it’s more diffusion-based. So essentially, when you’re able to park it next to it, you can listen in on that individual chatter and those local changes in the potential. And the type of signal that you get is these canonical textbook neural spiking waveforms. The moment you’re further away, and based on some of the studies that people have done, Christof Koch’s lab and others, once you’re away from that source by roughly around a hundred microns, which is about the width of a human hair, you no longer hear from that neuron. You’re no longer able to have the system sensitive enough to be able to record that particular local membrane potential change in that neuron.

(01:57:36) And just to give you a sense of scale also, when you look at a hundred micron voxel, so a hundred micron by a hundred micron by a hundred micron box in a brain tissue, there’s roughly around 40 neurons, and whatever number of connections that they have. So there’s a lot in that volume of tissue. So the moment you’re outside of that, there’s just no hope that you’ll be able to detect that change from that one specific neuron that you may care about.
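
Taking that figure at face value, the implied packing density is easy to work out (this arithmetic is mine, not from the conversation):

$$\frac{40\ \text{neurons}}{(100\ \mu\text{m})^{3}} = \frac{40\ \text{neurons}}{10^{-3}\ \text{mm}^{3}} = 4\times 10^{4}\ \text{neurons per mm}^{3}$$

so on the order of tens of thousands of neurons in every cubic millimeter of cortex, each of which is invisible to an electrode parked more than about a hundred microns away.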

Lex Fridman (01:58:03) But as you’re moving about this space, you’ll be hearing other ones. So if you move another hundred microns, you’ll be hearing chatter from another community.

Lex Fridman (01:58:14) And so the whole sense is, you want to place as many as possible electrodes, and then you’re listening to the chatter.

DJ Seo (01:58:20) Yeah, you want to listen to the chatter. And at the end of the day, you also want to basically let the software do the job of decoding. And just to kind of get to why ECoG and EEG work at all. When you have these local changes, obviously it’s not just this one neuron that’s activating, there are many, many other networks that are activating all the time. And you do see sort of a general change in the potential of this electrode, this charged medium, and that’s what you’re recording when you’re farther away. I mean, you still have some reference electrode that’s stable in the brain, which is just an electro-active organ, and you’re seeing some combination, aggregate action potential changes, and then you can pick it up. They’re much slower changing signals. But there are these canonical oscillations and waves, like gamma waves, beta waves, when you sleep, that can be detected because there’s sort of a synchronized global effect of the brain that you can detect. And the physics of this, if we really want to go down that rabbit hole, there’s a lot that goes on in terms of why diffusion physics at some point dominates when you’re further away from the source. It is just a charged medium. So similar to how when you have electromagnetic waves propagating in the atmosphere or in a charged medium like a plasma, there’s this weird shielding that happens that actually further attenuates the signal as you move away from it. So yeah, if you do a really, really deep dive on the signal attenuation over distance, you start to see one over r squared in the beginning and then an exponential drop-off, and that’s the knee at which you go from electromagnetism dominating to diffusion physics dominating.
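
One rough way to write down what DJ describes, purely as an illustration (the screening length $\lambda$ here is an illustrative parameter, not something quoted in the conversation), is a screened falloff of the recorded amplitude with distance $r$ from the source:

$$|V(r)| \propto \frac{e^{-r/\lambda}}{r^{2}}$$

Near the source the $1/r^{2}$ term dominates; beyond a few $\lambda$ the exponential takes over, which is the “knee” he refers to.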

Lex Fridman (02:00:19) But once again, with the electrodes, the biophysics that you need to understand is not as deep because no matter where you’re placing it, you’re listening to a small crowd of local neurons.

DJ Seo (02:00:32) Correct, yeah. So once you penetrate the brain, you’re in the arena, so to speak.

Lex Fridman (02:00:37) And there’s a lot of neurons.

DJ Seo (02:00:37) There are many, many of them.

Lex Fridman (02:00:40) But then again, there’s a whole field of neuroscience that’s studying how the different groupings, the different sections of the seating in the arena, what they usually are responsible for, which is where the metaphor probably falls apart because the seating is not that organized in an arena.

DJ Seo (02:00:56) Also, most of them are silent. They don’t really do much. Or their activities are… You have to hit it with just the right set of stimulus.

Lex Fridman (02:01:07) So they’re usually quiet.

DJ Seo (02:01:09) They’re usually very quiet. Similar to dark energy and dark matter, there’s dark neurons. What are they all doing? When you place these electrodes, again, within this hundred micron volume, you have 40 or so neurons. Why do you not see 40 neurons? Why do you see only a handful? What is happening there?

Lex Fridman (02:01:25) Well, they’re mostly quiet, but when they speak, they say profound shit. That’s the way I’d like to think about it. Anyway, before we zoom in even more, let’s zoom out. So how does Neuralink work from the surgery to the implant, to the signal and the decoding process, and the human being able to use the implant to actually affect the world outside? And all of this, I’m asking in the context of, there’s a gigantic historic milestone that Neuralink just accomplished in January of this year. Putting a Neuralink implant in the first human being, Noland. And there’s been a lot to talk about there about his experience because he’s able to describe all the nuance and the beauty and the fascinating complexity of that experience of everything involved. But on the technical level, how does Neuralink work?

DJ Seo (02:02:26) So there are three major components to the technology that we’re building. One is the device, the thing that’s actually recording these neural chatters. We call it the N1 Implant or The Link. And we have a surgical robot that’s actually doing the implantation of these tiny, tiny wires that we call threads that are smaller than human hair. And once everything is surgerized, you have these neural signals, these spiking neurons, that are coming out of the brain, and you need to have some sort of software to decode what the user intends to do with that. So there’s what’s called the Neuralink Application or B1 App that’s doing that translation. It’s running a very, very simple machine learning model that decodes these inputs that are neural signals and then converts them to a set of outputs that allows our first participant, Noland, to be able to control a cursor on the screen.

Lex Fridman (02:03:31) And this is done wirelessly?

DJ Seo (02:03:33) And this is done wirelessly. So our implant is actually two parts. The Link has these flexible tiny wires called threads that have multiple electrodes along their length. And they’re only inserted into the cortical layer, which is about three to five millimeters in a human brain, in the motor cortex region. That’s where the intention for movement lies. And we have 64 of these threads, each thread having 16 electrodes along a span of three to four millimeters, separated by 200 microns. So you can actually record along the depth of the insertion. And based on that signal, there’s a custom integrated circuit, or ASIC, that we built that amplifies the neural signals that you’re recording, digitizes them, and then has some mechanism for detecting whether there was an interesting event, that is a spiking event, and decides to send that or not send that through Bluetooth to an external device, whether it’s a phone or a computer, that’s running this Neuralink application.

Lex Fridman (02:04:50) So there’s onboard signal processing already just to decide whether this is an interesting event or not. So there is some computational power on board in addition to the human brain?

DJ Seo (02:05:00) Yeah. So it does the signal processing to really compress the amount of signal that you’re recording. So we have a total of a thousand electrodes sampling at just under 20 kilohertz with 10 bits each. So that’s about 200 megabits per second coming through to the chip from thousand-channel simultaneous neural recording. And that’s quite a bit of data, and there are technologies available to send that off wirelessly. But being able to do that in a very, very thermally-constrained environment that is the brain. So there has to be some amount of compression that happens to send off only the interesting data that you need, which in this particular case for motor decoding is the occurrence of a spike or not. And then being able to use that to decode the intended cursor movement. So the implant itself processes it, figures out whether a spike happened or not with our spike detection algorithm, and then packages it and sends it off through Bluetooth to an external device that then has the model to decode, okay, based on these spiking inputs, did Noland wish to go up, down, left, right, or click or right click or whatever.
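
The 200-megabit figure follows directly from the numbers he gives (this back-of-the-envelope multiplication is mine, using the 1,024-electrode count mentioned later):

$$1{,}024\ \text{channels} \times 20{,}000\ \tfrac{\text{samples}}{\text{s}} \times 10\ \tfrac{\text{bits}}{\text{sample}} \approx 2.05\times 10^{8}\ \tfrac{\text{bits}}{\text{s}} \approx 200\ \text{Mb/s}$$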

Lex Fridman (02:06:23) All of this is really fascinating, but let’s stick on the N1 Implant itself. So the thing that’s in the brain. So I’m looking at a picture of it, there’s an enclosure, there’s a charging coil, so we didn’t talk about the charging, which is fascinating. The battery, the power electronics, the antenna. Then there’s the signal processing electronics. I wonder if there’s more kinds of signal processing you can do? That’s another question. And then there’s the threads themselves with the enclosure on the bottom. So maybe to ask about the charging. So there’s an external charging device?

DJ Seo (02:07:03) Yeah, there’s an external charging device. So yeah, the second part of the implant, the threads, are the ones where, again, just the last three to five millimeters are actually penetrating the cortex. The rest of it, actually most of the volume, is occupied by the battery, a rechargeable battery, and it’s about the size of a quarter. I actually have a device here if you want to take a look at it. This is the flexible thread component of it, and then this is the implant. So it’s about the size of a US quarter. It’s about nine millimeters thick. So basically this implant, once you have the craniectomy and the durectomy, the threads are inserted, and the hole that you created, this craniectomy, gets replaced with that. So basically that thing plugs that hole, and you can screw in these self-drilling cranial screws to hold it in place. And at the end of the day, once you have the skin flap over, there’s only about two to three millimeters that’s obviously transitioning off of the top of the implant to where the screws are. And that’s the minor bump that you have.

Lex Fridman (02:08:22) Those threads look tiny. That’s incredible. That is really incredible. That is really incredible. And also, you’re right, most of the actual volume is the battery. This is way smaller than I realized.

DJ Seo (02:08:38) Also, the threads themselves are quite strong.

DJ Seo (02:08:42) And the threads themselves also have a very interesting feature at the end of them called the loop. And that’s the mechanism through which the robot is able to interface with and manipulate this tiny hair-like structure.

Lex Fridman (02:08:55) And they’re tiny. So what’s the width of a thread?

DJ Seo (02:08:58) So the width of a thread starts from 16 micron and then tapers out to about 84 micron. So average human hair is about 80 to 100 micron in width.

Lex Fridman (02:09:13) This thing is amazing. This thing is amazing.

DJ Seo (02:09:16) Yes, most of the volume is occupied by the battery, a rechargeable lithium ion cell. And the charging is done through inductive charging, which is actually very commonly used. Your cell phone, most cell phones, have that. The biggest difference is that, usually, when you have a phone and you want to charge it on the charging pad, you don’t really care how hot it gets. Whereas for us, it matters. There is a very strict regulation, and good reasons, to not increase the surrounding tissue temperature by more than two degrees Celsius. So there’s actually a lot of innovation that is packed into this to allow charging of this implant without reaching that temperature threshold.

(02:10:03) And even small things like, you see this charging coil and what’s called a ferrite shield. So without that ferrite shield, what you end up having when you have resonant inductive charging is that the battery itself is a metallic can, and you form these eddy currents from the external charger, and that causes heating, and that actually contributes to inefficiency in charging. So this ferrite shield, what it does is it actually concentrates the field lines away from the battery and around the coil that’s actually wrapped around it.

Lex Fridman (02:10:42) There’s a lot of really fascinating design here to make it, I mean, you’re integrating a computer into a biological, a complex biological system.

DJ Seo (02:10:52) Yeah, there’s a lot of innovation here. I would say that part of what enabled this was just the innovations in the wearable. There’s a lot of really, really powerful tiny, low-power microcontrollers, temperature sensors, or various different sensors and power electronics. A lot of innovation really came in the charging coil design, how this is packaged, and how do you enable charging such that you don’t really exceed that temperature limit, which is not a constraint for other devices out there.

Lex Fridman (02:11:28) So let’s talk about the threads themselves. Those tiny, tiny, tiny things. So how many of them are there? You mentioned a thousand electrodes. How many threads are there and what do the electrodes have to do with the threads?

DJ Seo (02:11:42) So the current instantiation of the device has 64 threads, and each thread has 16 electrodes, for a total of 1,024 electrodes that are capable of both recording and stimulating. And the thread is basically this polymer-insulated wire. The metal conductor is kind of a tiramisu cake of titanium, platinum, gold, platinum, titanium, and they’re very, very tiny wires. Two microns in width. So two one-millionths of a meter.

Lex Fridman (02:12:25) It’s crazy that that thing I’m looking at has the polymer-insulation, has the conducting material and has 16 electrodes at the end of it.

DJ Seo (02:12:34) On each of those threads.

Lex Fridman (02:12:35) Yeah, on each of those threads.

Lex Fridman (02:12:37) 16, each one of those 64.

DJ Seo (02:12:38) Yes, you’re not going to be able to see it with naked eyes.

Lex Fridman (02:12:42) And to state the obvious, or maybe for people who are just listening, they’re flexible?

DJ Seo (02:12:48) Yes, that’s also one element that was incredibly important for us. So each of these threads are now, as I mentioned, 16 micron in width, and then they taper to 84 micron, but in thickness they’re less than five micron. And in thickness it’s mostly a polyimide at the bottom and this metal track and then another polyimide. So two micron of polyimide, 400 nanometer of this metal stack and two micron of polyimide sandwiched together to protect it from the environment that is 37 degrees C bag of salt water.

Lex Fridman (02:13:26) Maybe can you speak to some interesting aspects of the material design here? What does it take to design a thing like this and to be able to manufacture a thing like this? For people who don’t know anything about this kind of thing.

DJ Seo (02:13:40) So the material selection that we have is not, I don’t think it was, particularly unique. There were other labs, and there are other labs, that are looking at a similar material stack. There’s kind of a fundamental question, which still needs to be answered, around the longevity and reliability of these microelectrodes, as we call them, compared to some of the other more conventional neural interface devices that are intracranial, so penetrating the cortex, that are more rigid, like the Utah Array. Those are these four-by-four-millimeter kind of silicon shanks that have an exposed recording site at the end. And that’s been kind of the innovation from Richard Normann back in 1997. It’s called the Utah Array because he was at the University of Utah.

Lex Fridman (02:14:36) And what does the Utah Array look like? So it’s a rigid type of [inaudible 02:14:41]?

DJ Seo (02:14:40) Yeah, so we can actually look it up. Yeah, so it’s a bed of needles. There’s-

Lex Fridman (02:14:52) Okay, go ahead. I’m sorry.

DJ Seo (02:14:54) Those are rigid shanks.

Lex Fridman (02:14:55) Rigid, yeah, you weren’t kidding.

DJ Seo (02:14:57) And the size and the number of shanks vary anywhere from 64 to 128. At the very tip of each is an exposed electrode that actually records the neural signal. The other thing that’s interesting to note is that unlike Neuralink threads, which have recording electrodes that are actually exposed iridium oxide recording sites along the depth, this is only at a single depth. So these Utah Array spokes can be anywhere between 0.5 and 1.5 millimeters, and they also have designs that are slanted, so you can have them inserted at different depths, but that’s one of the other big differences. And then, the main key difference is the fact that there are no active electronics. These are just electrodes, and then there’s a bundle of wires that you’re seeing, and that actually exits the craniotomy and then has this port that you can connect to for any external electronic devices. They are working on, or have, a wireless telemetry device, but it still requires a through-the-skin port, which actually is one of the biggest failure modes for infection in the system.

Lex Fridman (02:16:06) What are some of the challenges associated with flexible threads? Like for example, on the robotic side, R1, implanting those threads. How difficult is that task?

DJ Seo (02:16:19) Yeah, so as you mentioned, they’re very, very difficult to maneuver by hand. These Utah Arrays that you saw earlier, they’re actually inserted by a neurosurgeon actually positioning it near the site that they want. And then there’s a pneumatic hammer that actually pushes them in. So it’s a pretty simple process and they’re easy to maneuver. But for these thin-film arrays, they’re very, very tiny and flexible. So they’re very difficult to maneuver. So that’s why we built an entire robot to do that.

(02:16:55) There are other reasons for why we built the robot, and that is ultimately we want this to help millions and millions of people that can benefit from this. And there just aren’t that many neurosurgeons out there. And robots can be something that we hope can actually do large parts of the surgery. But the robot is this entire other sort of category of product that we’re working on. And it’s essentially this multi-axis gantry system that has the specialized robot head that has all of the optics and this kind of a needle-retracting mechanism that maneuvers these threads via this loop structure that you have on the thread.

Lex Fridman (02:17:52) So the thread already has a loop structure by which you can grab it?

Lex Fridman (02:17:56) So this is fascinating. So you mentioned optics. So there’s a robot, R1, so for now, there’s a human that actually creates a hole in the skull. And then after that, there’s a computer vision component that’s finding a way to avoid the blood vessels. And then you’re grabbing it by the loop, each individual thread, and placing it in a particular location to avoid the blood vessels and also choosing the depth of placement, all that. So controlling every, the 3D geometry, of the placement?

DJ Seo (02:18:31) Correct. So the aspect of this robot that is unique is that it’s not surgeon-assisted or human-assisted. It’s a semi-automatic or automatic robot. Obviously, there is a human component to it: when you’re placing targets, you can always move them away from major vessels that you see. But we want to get to a point where it’s one click and it just does the surgery within minutes.

Lex Fridman (02:18:57) So the computer vision component finds great targets, candidates, and the human approves them, and the robot does… Does it do one thread at a time? Or does it do them [inaudible 02:19:08]?

DJ Seo (02:19:07) It does one thread at a time. And that’s actually also one thing that we are looking at, ways to do multiple threads at a time. There’s nothing stopping us from it. You can have multiple kinds of engagement mechanisms. But right now, it’s one-by-one. And we also still do quite a bit of verification to make sure that it got inserted. If so, how deep? Did it actually match what was programmed in? And so on and so forth.

Lex Fridman (02:19:36) And the actual electrodes are placed at differing depths in the… I mean, it’s very small differences, but differences.

Lex Fridman (02:19:46) And so there’s some reasoning behind that, as you mentioned, it gets more varied signal.

DJ Seo (02:19:56) Yeah, we try to place them all around three or four millimeter from the surface.

DJ Seo (02:20:00) … it’s three or four millimeter from the surface just because the span of the electrode, those 16 electrodes that we currently have in this version, spans roughly around three millimeters. So we want to get all of those in the brain.

Lex Fridman (02:20:16) This is fascinating. Okay, so there’s a million questions here. If we could zoom in specifically on the electrodes. What is your sense, how many neurons is each individual electrode listening to?

DJ Seo (02:20:27) Yeah, each electrode can record from anywhere between zero to 40, as I mentioned earlier. But practically speaking, we only see about at most two to three, and you can actually distinguish which neuron it’s coming from by the shape of the spikes.

DJ Seo (02:20:49) I mentioned the spike detection algorithm that we have, it’s called the BOSS algorithm, Buffer Online Spike Sorter.

DJ Seo (02:20:59) It actually outputs, at the end of the day, six unique values, which are the amplitudes of these negative-going hump, middle hump, and positive-going hump, and then also the times at which these happen. And from that, you can have a statistical probability estimation of, “Is that a spike? Is it not a spike?” And then based on that, you could also determine, “Oh, that spike looks different than that spike, it must come from a different neuron.”
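
To make the idea concrete, here is a minimal, purely illustrative sketch of extracting six such values from a candidate spike snippet. This is not Neuralink’s BOSS implementation; the choice of which hump is which, and the helper name itself, are assumptions made for illustration.

```python
import numpy as np

def candidate_spike_features(snippet: np.ndarray, fs: float = 20_000.0) -> np.ndarray:
    """Illustrative only: pull six numbers out of a short candidate-spike snippet --
    the amplitudes and times of a pre-trough hump, the negative-going trough,
    and the post-trough rebound hump."""
    t = np.arange(snippet.size) / fs                       # sample times in seconds
    i_trough = int(np.argmin(snippet))                     # negative-going hump
    i_pre = int(np.argmax(snippet[:i_trough])) if i_trough > 0 else 0
    i_post = i_trough + int(np.argmax(snippet[i_trough:]))
    idx = [i_pre, i_trough, i_post]
    return np.concatenate([snippet[idx], t[idx]])          # three amplitudes + three times
```

A downstream step could then threshold or cluster these six values per channel to decide “spike / not a spike” and which neuron the waveform most likely came from, which is the statistical estimation DJ describes.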

Lex Fridman (02:21:27) Okay. So that’s a nice signal processing step from which you can then make much better predictions about whether there’s a spike, especially in this kind of context where there could be multiple neurons screaming. And that also results in you being able to compress the data better at the end of the day.

DJ Seo (02:21:46) And just to be clear, I mean, labs do this, what’s called spike sorting, usually once you have the fully digitized signals, and then you run a bunch of different sets of algorithms to tease them apart. It’s just that all of this, for us, is done on the device.

DJ Seo (02:22:07) In a very low power, custom-built ASIC digital processing unit.

Lex Fridman (02:22:14) Highly heat constrained.

DJ Seo (02:22:15) Highly heat constrained. And the processing time from signal going in and giving you the output is less than a microsecond, which is a very, very short amount of time.

Lex Fridman (02:22:25) Oh, yeah. So the latency has to be super short.

Lex Fridman (02:22:28) Oh, wow. Oh, that’s a pain in the ass. That’s really tough.

DJ Seo (02:22:32) Yeah, latency is this huge, huge thing that you have to deal with. Right now the biggest source of latency comes from the Bluetooth, the way in which they’re packetized, and we bin them in a 15-millisecond time window.
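
As a rough illustration of why that binning matters (my arithmetic, not from the conversation): events that land uniformly within a fixed 15 ms window wait on average half a window before they can even be sent,

$$\bar{t}_{\text{wait}} \approx \frac{15\ \text{ms}}{2} = 7.5\ \text{ms}, \qquad t_{\text{wait}}^{\max} \approx 15\ \text{ms}$$

on top of whatever delay the Bluetooth connection interval itself adds.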

Lex Fridman (02:22:44) Oh, interesting, so it’s communication constrained. Is there some potential innovation there on the protocol used?

DJ Seo (02:22:49) Yeah. Bluetooth is definitely not our final wireless communication protocol that we want to get to. It’s highly-

Lex Fridman (02:22:59) Hence, the N1 and the R1. I imagine that increases [inaudible 02:23:03].

Lex Fridman (02:23:07) Yeah, that’s the communication protocol, because Bluetooth allows you to communicate at farther distances than you need, so you could go much shorter.

DJ Seo (02:23:16) Yeah. The only, well, the primary motivation for choosing Bluetooth is that, I mean, everything has Bluetooth.

Lex Fridman (02:23:21) All right, so you can talk to any device.

DJ Seo (02:23:23) Interoperability is just absolutely essential, especially in this early phase. And in many ways, if you can access a phone or a computer, you can do anything.

Lex Fridman (02:23:35) It’ll be interesting to step back and actually look at, again, the same pipeline that you mentioned for Noland. What does this whole process look like from finding and selecting a human being, to the surgery, to the first time he’s able to use this thing?

DJ Seo (02:23:56) We have what’s called a patient registry that people can sign up for to hear more about the updates. And that was the route through which Noland applied. And the process is that once the application comes in, it contains some medical records, and we … Based on their medical eligibility, there are a lot of different inclusion/exclusion criteria for them to meet.

(02:24:22) And we go through a prescreening interview process with someone from Neuralink, and at some point we also go out to their homes to do a BCI home audit. Because one of the most revolutionary parts about having this in one system that is completely wireless is that you can use it at home. You don’t actually have to go to the lab and go to the clinic to get connectorized to this specialized equipment that you can’t take home with you.

(02:24:51) So that’s one of the key elements that we wanted to keep in mind when we were designing the system: people hopefully would want to be able to use this every day in the comfort of their homes. And so part of our engagement, and what we’re looking for during the BCI home audit, is to just understand their situation and what other assistive technologies they use.

Lex Fridman (02:25:14) And we should also step back and say that the estimate is 180,000 people live with quadriplegia in the United States, and each year an additional 18,000 suffer a paralyzing spinal cord injury. So these are folks who have a lot of challenges living a life in terms of accessibility, in terms of doing the things that many of us just take for granted day to day.

(02:25:42) And one of the things, one of the goals of this initial study is to enable them to have digital autonomy where they by themselves can interact with a digital device using just their mind, something that you’re calling telepathy, so digital telepathy. Where a quadriplegic can communicate with a digital device in all the ways that we’ve been talking about. Control the mouse cursor enough to be able to do all kinds of stuff, including play games and tweet and all that kind of stuff. And there’s a lot of people for whom life, the basics of life, are difficult because of the things that have happened to them.

DJ Seo (02:26:24) Yeah. I mean, movement is so fundamental to our existence. I mean, even speaking involves movement of the mouth, lips, larynx. And without that, it’s extremely debilitating. And there are many, many people that we can help. I mean, especially if you start to look at other forms of movement disorders that are not just from spinal cord injury, but from ALS, MS, or even stroke, or just aging, that lead you to lose some of that mobility, that independence, it’s extremely debilitating.

Lex Fridman (02:27:09) And all of these are opportunities to help people, to help alleviate suffering, to help improve the quality of life. But each of the things you mentioned is its own little puzzle that needs to have increasing levels of capability from a device like a Neuralink device.

Digital telepathy

(02:27:24) And so the first one you’re focusing on is, it’s just a beautiful word, telepathy. So being able to communicate using your mind wirelessly with a digital device. Can you just explain exactly what we’re talking about?

DJ Seo (02:27:40) Yeah, I mean, it’s exactly that. I mean, I think if you are able to control a cursor and able to click and be able to get access to a computer or a phone, I mean, the whole world opens up to you. And I mean, I guess the word “telepathy,” if you think about that as just definitionally being able to transfer information from my brain to your brain without using some of the physical faculties that we have, like voices.

Lex Fridman (02:28:13) But the interesting thing here is I think the thing that’s not obviously clear is how exactly it works. In order to move a cursor, there’s at least a couple of ways of doing that. One is you imagine yourself maybe moving a mouse with your hand, or you can then, which Noland talked about, imagine moving the cursor with your mind.

(02:28:44) But it’s like there is a cognitive step here that’s fascinating, because you have to use the brain and you have to learn how to use the brain, and you have to figure it out dynamically because you reward yourself if it works. I mean, there’s a step that … This is just a fascinating step because you have to get the brain to start firing in the right way. And you do that by imagining … Like fake it till you make it. And all of a sudden it creates the right kind of signal that, if decoded correctly, can create the effect. And then there’s noise around that, and you have to figure all of that out. But on the human side, imagining the cursor moving is what you have to do.

DJ Seo (02:29:27) Yeah. He says using the force.

Lex Fridman (02:29:29) The force. I mean, isn’t that just fascinating to you that it works? To me, it’s like, holy shit, that actually works. You could move a cursor with your mind.

DJ Seo (02:29:41) As much as you’re learning to use that thing, that thing is also learning about you. Our model’s constantly updating the way to say, “Oh, if someone is thinking about these sophisticated forms of spiking patterns, that actually means to do this.”

Lex Fridman (02:30:02) So the machine is learning about the human and the human is learning about the machine, so there is an adaptability to the signal processing and the decoding step, and then there’s the adaptation of Noland, the human being. The same way, if you give me a new mouse and I move it, I learn very quickly about its sensitivity, so I learn to move it slower. And then there’s other signal drift and all that kind of stuff you have to adapt to, so both are adapting to each other.

Lex Fridman (02:30:34) That’s a fascinating software challenge, on both sides. The software on both, on the human software and the [inaudible 02:30:41] software.

DJ Seo (02:30:41) The organic and the inorganic.

Lex Fridman (02:30:43) The organic and the inorganic. Anyway. Sorry to rudely interrupt. So there’s the selection that Noland has passed with flying colors. Everything, including that it is a BCI-friendly home, all of that. So what is the process of the surgery, implantation, the first moment when he gets to use the system?

DJ Seo (02:31:06) The end-to-end, we say patient in to patient out, is anywhere between two and four hours. In the particular case for Noland it was about three and a half hours, and there are many steps leading to the actual robot insertion. So there’s anesthesia induction, and we do intra-op CT imaging to make sure that we’re drilling the hole in the right location. And this is also pre-planned beforehand.

(02:31:34) Someone like Noland would go through fMRI, and then they can think about wiggling their hand. Obviously due to their injury it’s not going to actually lead to any sort of intended output, but it’s the same part of the brain that lights up when you’re imagining moving your finger as when you’re actually moving your finger. And that’s one of the ways in which we can actually know where to place our threads, because we want to go into what’s called the hand knob area in the motor cortex and, as much as possible, densely put our electrode threads there.

(02:32:11) So we do intra-op CT imaging to make sure and double-check the location of the craniectomy. And the surgeon comes in, does their thing in terms of skin incision, craniectomy, so drilling of the skull, and then there are many different layers of the brain. There’s what’s called the dura, which is a very, very thick layer that surrounds the brain. That actually gets resected in a process called [inaudible 02:32:38]. And that then exposes the pia of the brain that you want to insert into.

(02:32:43) And by the time it’s been around anywhere between one to one and a half hours, the robot comes in, does its thing, placement of the targets, insertion of the threads. That takes anywhere between 20 to 40 minutes. In the particular case for Noland, it was just under or just over 30 minutes. And then after that, the surgeon comes in, there are a couple of other steps of actually inserting the dural substitute layer to protect the threads as well as the brain. And then you screw in the implant, and then skin flap, and then suture, and then you’re out.

Lex Fridman (02:33:18) So when Noland woke up, what was that like? What was the recovery like, and when was the first time he was able to use it?

DJ Seo (02:33:27) Actually, immediately after the surgery, like an hour after the surgery, as he was waking up, we did turn on the device, made sure that we were recording neural signals. And we actually did have a couple of signals that we noticed that he could actually modulate. And what I mean by modulate is that he could think about clenching his fist and you could see the spike disappear and appear.

DJ Seo (02:33:58) And that was immediate, immediate after in the recovery room.

Lex Fridman (02:34:06) That’s a human being … I mean, what did that feel like for you? This device and a human being, a first step of a gigantic journey? I mean, it’s a historic moment, even just that spike, just to be able to modulate that.

DJ Seo (02:34:22) Obviously there have been other, as you mentioned, pioneers that have participated in these groundbreaking BCI investigational early feasibility studies. So we’re obviously standing on the shoulders of the giants here, we’re not the first ones to actually put electrodes in a human brain.

(02:34:44) But I mean, just leading up to the surgery, I definitely could not sleep. It’s the first time that you’re working in a completely new environment. We had a lot of confidence based on our benchtop testing or preclinical R&D studies that the mechanism, the threads, the insertion, all that stuff is very safe and that it’s obviously ready for doing this in a human. But there were still a lot of unknown unknowns about whether the needle can actually insert. I mean, we brought something like 40 needles just in case they break, and we ended up using only one. But I mean, that was the level of just complete unknown, because it’s a very, very different environment. And I mean, that’s why we do clinical trials in the first place, to be able to test these things out.

(02:35:40) So extreme nervousness and just many, many sleepless nights leading up to the surgery, and definitely the day before the surgery. And it was an early morning surgery. We started at 7:00 in the morning, and by the time it was around 10:30 everything was done. But I mean, first time seeing that, well, number one, just huge relief that this thing is doing what it’s supposed to do. And two, I mean, just an immense amount of gratitude for Noland and his family. And then many others that have applied, and that we’ve spoken to and will speak to, are true pioneers in every sense of the word. And I call them the neural astronauts, or neuralnauts.

DJ Seo (02:36:32) Just like in the ’60s, these amazing just pioneers exploring the unknown outwards, in this case it’s inward, but an incredible amount of gratitude for them to just participate and play a part. And it’s a journey that we’re embarking on together.

(02:36:57) But also, I think it was just a … That was a very, very important milestone, but our work was just starting. So a lot of just anticipation for, “Okay, what needs to happen next?” What is the sequence of events that needs to happen for us to make it worthwhile for both Noland as well as us?

Lex Fridman (02:37:17) Just to linger on that, just a huge congratulations to you and the team for that milestone. I know there’s a lot of work left, but that’s really exciting to see. That’s a source of hope, it’s this first big step, opportunity, to help hundreds of thousands of people. And then maybe expand the realm of the possible for the human mind for millions of people in the future. So it’s really exciting. The opportunities are all ahead of us, and to do that safely and to do that effectively was really fun to see. As an engineer, just watching other engineers come together and do an epic thing, that was awesome. So huge congrats.

DJ Seo (02:38:03) Thank you, thank you. Yeah, could not have done it without the team. And yeah, I mean, that’s the other thing that I told the team as well of just this immense sense of optimism for the future. I mean, it’s a very important moment for the company, needless to say, as well as hopefully for many others out there that we can help.

Retracted threads

Lex Fridman (02:38:27) Speaking of challenges, Neuralink published a blog post describing that some of the threads retracted. And so the performance as measured by bits per second dropped at first, but then eventually it was regained. And the whole story of how it was regained is super interesting, that’s definitely something I’ll talk to Bliss and to Noland about.

(02:38:49) But in general, can you speak to this whole experience, how was the performance regained, and just the technical aspects of the threads being retracted and moving?

DJ Seo (02:39:03) The main takeaway is that in the end, the performance has come back and it’s actually gotten better than it was before. He actually just beat the world record yet again last week, to 8.5 bps. I mean, he’s just cranking and he’s just improving.

Lex Fridman (02:39:20) The previous one that he said was eight.

Lex Fridman (02:39:23) I think he said 8.5.

DJ Seo (02:39:24) Yeah. The previous world record in a human was 4.6, so it’s almost double. And his goal is to try to get to 10, which is roughly around the median Neuralinker using a mouse with a hand. So it’s getting there.

Lex Fridman (02:39:42) So yeah, so the performance was regained.

DJ Seo (02:39:45) Yeah, better than before. That’s a story on its own of what it took the BCI team to recover that performance. It was actually mostly on the signal processing side. And so as I mentioned, we were looking at these spike outputs from our electrodes, and what happened is that four weeks into the surgery we noticed that the threads had slowly come out of the brain. And the way in which we noticed this at first, obviously, is that, well, I think Noland was the first to notice, that his performance was degrading. And I think at the time we were also trying to do a bunch of different experimentation, different algorithms, different UI, UX. So it was expected that there would be variability in the performance, but we did see a steady decline.

(02:40:41) And then also the way in which we measure the health of the electrodes or whether they’re in the brain or not, is by measuring impedance of the electrode. So we look at the interfacial, the Randles circuit they say, the capacitance and the resistance between the electrode surface and the medium. And if that changes in some dramatic ways, we have some indication. Or if you’re not seeing spikes on those channels, you have some indications that something’s happening there.
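
For reference, the textbook simplified Randles model he is alluding to (a solution resistance in series with a charge-transfer resistance in parallel with the double-layer capacitance, ignoring the Warburg diffusion element) gives an electrode–tissue interfacial impedance of

$$Z(\omega) = R_s + \frac{R_{ct}}{1 + j\omega R_{ct} C_{dl}}$$

so a sudden, sustained shift in the measured $|Z|$ at the test frequency is one hint that the interface, or the electrode’s position in tissue, has changed.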

(02:41:11) And what we noticed, looking at those impedance plots and spike rate plots, and also because we have those electrodes recording along the depth, is that you were seeing some sort of movement that indicated that threads were being pulled out. And that obviously will have an implication on the model side, because if the number of inputs that are going into the model is changing, because you have fewer of them, that model needs to get updated.

(02:41:42) But there were still signals. And as I mentioned, similar to how even when you place the electrodes on the surface of the brain, or farther away, like outside the skull, you still see some useful signals, what we started looking at is not just the spike occurrence through this BOSS algorithm that I mentioned, but also just the power of the frequency band that is interesting for Noland to be able to modulate. Once we changed the algorithm for the implant to not just give you the BOSS output, but also this spike band power output, that helped us refine the model with a new set of inputs. And that was the thing that really ultimately gave us the performance back. And obviously the thing that we want ultimately, and the thing that we are working towards, is figuring out ways in which we can keep those threads intact for as long as possible so that we have many more channels going into the model. That’s by far the number one priority that the team is currently embarking on, to understand how to prevent that from happening.
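
To illustrate the kind of feature he is describing, here is a minimal sketch of a spike-band-power computation. The band edges, filter order, and function name are assumptions for illustration only; the 20 kHz sampling rate and 15 ms binning come from earlier in the conversation, and this is not Neuralink’s implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def spike_band_power(x: np.ndarray, fs: float = 20_000.0,
                     band=(500.0, 3_000.0), bin_ms: float = 15.0) -> np.ndarray:
    """Illustrative only: band-pass a raw channel into an assumed 'spike band'
    and return the mean squared amplitude per time bin, a per-bin power feature
    that can feed the decoder alongside, or instead of, binary spike events."""
    b, a = butter(4, band, btype="bandpass", fs=fs)   # 4th-order Butterworth band-pass
    y = filtfilt(b, a, x)                             # zero-phase filtering
    bin_len = int(fs * bin_ms / 1e3)                  # samples per 15 ms bin
    n_bins = y.size // bin_len
    y = y[: n_bins * bin_len].reshape(n_bins, bin_len)
    return (y ** 2).mean(axis=1)                      # one power value per bin
```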

(02:42:56) The thing that I will say also is that, as I mentioned, this is the first time ever that we’re putting these threads in the human brain. And a human brain, just for size reference, is 10 times that of the monkey brain or the sheep brain. And it’s just a very, very different environment. It moves a lot more. It’s actually moved a lot more than we expected when we did Noland’s surgery. And it’s just a very, very different environment than what we’re used to. And this is why we do clinical trial, we want to uncover some of these issues and failure modes earlier than later.

(02:43:37) So in many ways, it’s provided us with this enormous amount of data and information to be able to solve this. And this is something that Neuralink is extremely good at: once we have a set of clear objectives and an engineering problem, we have an enormous amount of talent across many, many disciplines to be able to come together and fix the problem very, very quickly.

Vertical integration

Lex Fridman (02:44:01) But it sounds like one of the fascinating challenges here is for the system on the decoding side to be adaptable across different timescales. So whether it’s movement of threads or different aspects of signal drift, sort of on the software or the human brain, something changing, like Noland talks about cursor drift, they could be corrected. And there’s a whole UX challenge to how to do that. So it sounds like adaptability is a fundamental property that has to be engineered in.

DJ Seo (02:44:34) It is. I mean, as a company, we’re extremely vertically integrated. We make these thin-film arrays in our own microfab.

Lex Fridman (02:44:45) Yeah, there’s like you said, built in-house. This whole paragraph here from this blog post is pretty gangster.

(02:44:50) “Building the technologies described above has been no small feat,” and there’s a bunch of links here that I recommend people click on. “We constructed in-house microfabrication capabilities to rapidly produce various iterations of thin-film arrays that constitute our electrode threads. We created a custom femtosecond laser mill-“

Lex Fridman (02:45:12) “… to manufacture components with micro level precision.” I think there’s a tweet associated with this.

DJ Seo (02:45:17) That’s a whole thing that we can get into.

Lex Fridman (02:45:18) Yeah. Okay. What are we looking at here, this thing? “In less than one minute, our custom-made femtosecond laser mill cuts this geometry in the tips of our needles.” So we’re looking at this weirdly shaped needle. “The tip is only 10 to 12 microns in width, only slightly larger than the diameter of a red blood cell. The small size allows threads to be inserted with minimal damage to the cortex.”

(02:45:48) Okay. So what’s interesting about this geometry? So we’re looking at this just geometry of a needle.

DJ Seo (02:45:53) This is the needle that’s engaging with the loops on the thread. It’s the one that threads the loop and then peels it from the silicon backing, and then this is the thing that gets inserted into the tissue. And then this pulls out, leaving the thread. And this kind of notch, or the shark tooth as we used to call it, is the thing that’s actually grasping the loop. And then it’s designed in such a way that when you pull out, it leaves the loop behind.

Lex Fridman (02:46:28) And the robot is controlling this needle?

DJ Seo (02:46:31) Correct. So this is actually housed in a cannula, and basically the robot has a lot of the optics that look for where the loop is. There’s actually a 405 nanometer light that actually causes the polyimide to fluoresce so that you can locate the location of the loop.

Lex Fridman (02:46:49) So the loop lights up, is [inaudible 02:46:50]?

DJ Seo (02:46:50) Yeah, yeah, they do. It’s a micron precision process.

Lex Fridman (02:46:54) What’s interesting is the robot that it takes to do that, that’s pretty crazy. That’s pretty crazy that the robot is able to get this kind of precision.

DJ Seo (02:47:01) Yeah, our robot is quite heavy, our current version of it. I mean, it’s like a giant granite slab that weighs about a ton, because it needs to be insensitive to vibration, environmental vibration. And then as the head is moving at the speed that it’s moving, there’s a lot of motion control to make sure that you can achieve that level of precision. A lot of optics that zoom in on that. We’re working on a next generation of the robot that is lighter, easier to transport. I mean, it is a feat to move the robot to the surgical suite.

Lex Fridman (02:47:38) And it’s far superior to a human surgeon at this time, for this particular task.

DJ Seo (02:47:42) Absolutely. I mean, let alone this, you try to actually thread a loop in a sewing kit. We’re talking fractions of a human hair. These things, it’s not visible.

Lex Fridman (02:47:54) So continuing the paragraph. “We developed novel hardware and software testing systems, such as our accelerated lifetime testing racks and simulated surgery environment,” which is pretty cool, “to stress test and validate the robustness of our technologies. We performed many rehearsals of our surgeries to refine our procedures and make them second nature.” This is pretty cool.

(02:48:14) “We practice surgeries on proxies with all the hardware and instruments needed in our mock OR in the engineering space. This helps us rapidly test and measure.” So there are, like, proxies?

DJ Seo (02:48:25) Yeah, this proxy is super cool actually. There’s a 3D-printed skull from the images that are taken at [inaudible 02:48:34], as well as this hydrogel-mix synthetic polymer thing that actually mimics the mechanical properties of the brain. It also has the vasculature of the person.

(02:48:50) Basically what we’re talking about here, and there’s a lot of work that has gone into making this proxy, is finding the right concentration of these different synthetic polymers to get the right consistency for the needle dynamics as they’re being inserted. But we practiced this surgery with basically Noland’s physiology and brain many, many times prior to actually doing the surgery.

Lex Fridman (02:49:21) Every step, every step, every-

DJ Seo (02:49:23) Every step. Yeah. Like where does someone stand? I mean, what you’re looking at in the picture, this is in our office, is the corner of the robot engineering space where we have created this mock OR space that looks exactly like what all the staff would experience during the actual surgery.

(02:49:43) I mean, it’s just like any dance rehearsal where exactly where you’re going to stand at what point, and you just practice that over and over and over again with an exact anatomy of someone that you’re going to surgerize. And it got to a point where a lot of our engineers, when we created a craniectomy, they’re like, “Oh, that looks very familiar. We’ve seen that before.”

Lex Fridman (02:50:04) Yeah. Man, there’s wisdom you can gain through doing the same thing over and over and over. It’s like Jiro Dreams of Sushi kind of thing because then … It’s like Olympic athletes visualize the Olympics and then once you actually show up, it feels easy. It feels like any other day. It feels almost boring winning the gold medal, because you visualized this so many times, you’ve practiced this so many times, that nothing about it is new. It’s boring. You win the gold medal, it’s boring. And the experience they talk about is mostly just relief, probably that they don’t have to visualize it anymore.

DJ Seo (02:50:44) Yeah, the power of the mind to visualize and where … I mean, there’s a whole field that studies where muscle memory lies in cerebellum. Yeah, it’s incredible.

Safety

Lex Fridman (02:50:56) I think it’s a good place to actually ask the big question that people might have, is how do we know every aspect of this that you described is safe?

DJ Seo (02:51:06) At the end of the day, the gold standard is to look at the tissue. What sort of trauma did you cause the tissue, and does that correlate to whatever behavioral anomalies that you may have seen? And that’s the language to which we can communicate about the safety of inserting something into the brain and what type of trauma that you can cause.

(02:51:29) We actually have an entire department, department of pathology, that looks at these tissue slices. There are many steps that are involved in doing this. Once you have studies that are launched with particular endpoints in mind, at some point you have to euthanize the animal, and then you go through necropsy to collect the brain tissue samples. You fix them in formalin, and you gross them, you section them, and you look at individual slices just to see what kind of reaction or lack thereof exists.

(02:52:04) So that’s the language to which the FDA speaks, and as well for us, to evaluate the safety of the insertion mechanism, as well as the threads at various different time points, both acute, so anywhere between zero to three months, and beyond three months.

Lex Fridman (02:52:25) So those are the details of an extremely high standard of safety that has to be reached.

Lex Fridman (02:52:32) The FDA supervises this, but there’s in general just a very high standard, in every aspect of this, including the surgery. I think Matthew MacDougall has mentioned that the standard is, let’s say how to put it politely, higher than maybe some other operations that we take for granted. So the standard for all the surgical stuff here is extremely high.

DJ Seo (02:52:57) Very high. I mean, it’s a highly, highly regulated environment, with governing agencies that scrutinize every medical device that gets marketed. And I think it’s a good thing. It’s good to have those high standards, and we try to hold extremely high standards to understand what sort of damage, if any, these innovative emerging technologies that we’re building cause. And so far we have been extremely impressed by the lack of immune response from these threads.

Lex Fridman (02:53:34) Speaking of which, you talked to me with excitement about the histology in some of the images that you’re able to share. Can you explain to me what we’re looking at?

DJ Seo (02:53:46) Yeah, so what you’re looking at is a stained tissue image. This is a sectioned tissue slice from an animal that was implanted for seven months, so a chronic time point. And you’re seeing all these different colors, and each color indicates specific cell types. So purple and pink are astrocytes and microglia, respectively. They’re types of glial cells.

(02:54:12) And the other thing that people may not be aware of is your brain is not just made up of a soup of neurons and axons. There are other cells, like glial cells, that actually are the glue and also react if there is any trauma or damage to the tissue.

Lex Fridman (02:54:32) And the brown, are those the neurons here?

DJ Seo (02:54:33) The brown are the neurons and the blue is nuclei.

Lex Fridman (02:54:35) It’s a lot of neurons.

Lex Fridman (02:54:36) So what you’re seeing in this macro image is these circles highlighted in white, the insertion sites. And when you zoom into one of those, you see the threads. And then in this particular case, I think we’re seeing about the 16 wires that are going into the [inaudible 02:54:56]. And the incredible thing here is the fact that you have the neurons that are these brown structures or brown circular or elliptical thing-

DJ Seo (02:55:00) … are these brown structures or brown circular or elliptical things that are actually touching and abutting the threads. So what this is saying is that there’s basically zero trauma that’s caused during this insertion. And with these neural interfaces, these microelectrodes that you insert, that is one of the most common modes of failure. So when you insert these threads, like the Utah Array, it causes neuronal death around the site because you’re inserting a foreign object.

(02:55:29) And that elicits an immune response through microglia and astrocytes; they form this protective layer around it. Not only are you killing the neuron cells, but you’re also creating this protective layer that then basically prevents you from recording neural signals, because you’re getting further and further away from the neurons that you’re trying to record. And that is the biggest mode of failure. And in this particular example, in that inset, it’s about 50 microns with that scale bar, and the neurons seem to be attracted to it.

Lex Fridman (02:55:59) And so there’s certainly no trauma. That’s such a beautiful image, by the way. So the brown are the neurons, and for some reason I can’t look away. It’s really cool.

DJ Seo (02:56:08) Yeah. And the way that these things… Tissues generally don’t have these beautiful colors. This is a multiplex stain that uses these different proteins that stain these at different colors. We use a very standard set of staining techniques with H&E, Iba1, NeuN and GFAP. So if you go to the next image, this also kind of illustrates the second point, because you can make an argument, and initially when we saw the previous image, we said, “Oh, are the threads just floating? What is happening here? Are we actually looking at the right thing?” So what we did is we did another stain, and this is all done in-house, this Masson’s trichrome stain, which is in blue and shows the collagen layer. So the blue, basically, you don’t want the blue around the implant threads, because that means that there’s some sort of scarring that’s happened. And what you’re seeing, if you look at individual threads, is that you don’t see any of the blue. Which means that there has been absolutely minimal, or very, very minimal to a point where it’s not a detectable amount of, trauma around these inserted threads.

Lex Fridman (02:57:16) So that presumably is one of the big benefits of having this kind of flexible thread? This-

DJ Seo (02:57:21) Yeah. So we think this is primarily due to the size as well as the flexibility of the threads. Also, the fact that R1 is avoiding vasculature, so we’re not disrupting or we’re not causing damage to the vessels and not breaking any of the blood brain barrier, has basically caused the immune response to be muted.

Lex Fridman (02:57:45) But this is also a nice illustration of the size of things. So this is the tip of the thread?

DJ Seo (02:57:51) Yeah, those are neurons.

Lex Fridman (02:57:53) And they’re neurons. And this is the thread listening. And the electrodes are positioned how?

DJ Seo (02:57:59) Yeah. So what you’re looking at is not the electrodes themselves, those are the conductive wires. So each of those should probably be two microns in width. So what we’re looking at is the coronal slice, so we’re looking at some slice of the tissue. So as you go deeper, you’ll obviously have less and less of the tapering of the thread. But yeah, the point basically being that there are just cells around the insertion site, which is just an incredible thing to see. I’ve just never seen anything like this.

Lex Fridman (02:58:33) How easy and safe is it to remove the implant?

DJ Seo (02:58:37) Yeah, so it depends on when. In the first three months or so after the surgery, there’s a lot of tissue remodeling that’s happening. Similar to when you get a cut: over the first couple of weeks, depending on the size of the wound, scar tissue is forming, there’s this contraction, and in the end it turns into a scab that you can slough off. The same thing happens in the brain, and it’s a very dynamic environment. And before the scar tissue, or the neomembrane, the new membrane, forms, it’s quite easy to just pull them out. And there’s minimal trauma that’s caused during that.

(02:59:22) Once the scar tissue forms, and with Noland as well, we believe that that’s the thing that’s currently anchoring the threads. So we haven’t seen any more movements since then. So they’re quite stable. It gets harder to actually completely extract the threads. So our current method for removing the device is cutting the threads, leaving the tissue intact, and then unscrewing and taking the implant out. And that hole is now going to be plugged with either another Neuralink or just with a PEEK-based, plastic-based cap.

Lex Fridman (03:00:06) Is it okay to leave the threads in there forever?

DJ Seo (03:00:09) Yeah, we think so. We’ve done studies where we left them there, and one of the biggest concerns that we had is, do they migrate and do they get to a point where they should not be? We haven’t seen that. Again, once the scar tissue forms, they get anchored in place. And I should also say that when we say upgrades, we’re not just talking in theory here, we’ve actually upgraded many, many times. Most of our monkeys or non-human primates, NHPs, have been upgraded. Pager, who you saw playing mind pong, has had the latest version of the device for two years now and is seemingly very happy and healthy and fat.

Upgrades

Lex Fridman (03:00:51) So what’s the design for the future, the upgrade procedure? So maybe for Noland, what would the upgrade look like? Is it essentially what you were mentioning? Is there a way to upgrade the device internally, where you take it apart and keep the capsule and upgrade the internals?

DJ Seo (03:01:15) So there are a couple of different things here. So for Noland, if we were to upgrade, what we would have to do is either cut the threads or extract the threads, depending on the situation there in terms of how they’re anchored or scarred in. If you were to remove them with the dura substitute, you have an intact brain, so you can reinsert different threads with the updated implant package. There are a couple of other ways that we’re thinking about the future of what the upgradable system looks like. One is, at the moment we currently remove the dura, this kind of thick layer that protects the brain, but that actually is the thing that proliferates the scar tissue formation. So typically, the general rule of thumb is you want to leave nature as is and not disrupt it as much. So we’re looking at ways to insert the threads through the dura, which comes with a different set of challenges, such as, it’s a pretty thick layer, so how do you actually penetrate that without breaking the needle?

(03:02:23) So we’re looking at different needle designs for that, as well as the loop engagement. The other biggest challenge is, it’s quite opaque optically with white light illumination. So how do you keep this biggest advantage that we have of avoiding vasculature? How do you image through that? How do you actually still mediate that? So there are other imaging techniques that we’re looking at to enable that. But our hypothesis is, based on some of the early evidence that we have, that doing through-the-dura insertion will cause minimal scarring, which makes the threads much easier to extract over time. And the other thing that we’re also looking at, and this is going to be a fundamental change in the implant architecture, is that at the moment it’s a monolithic single implant that comes with threads that are bonded together.

(03:03:12) So you can’t actually separate the thing out, but you can imagine having a two-part implant: a bottom part that is the threads that are inserted, which has the chips and maybe a radio and some power source, and then another implant that has more of the computational heavy load and the bigger battery. And then one can be under the dura, one can be above the dura, being the plug for the skull. They can talk to each other, but the thing that you want to upgrade is the computer and not the threads. If you want to upgrade that, you just go in there, remove the screws, and then put in the next version. And you’re off the… It’s a very, very easy surgery too. You do a skin incision, slip this in, screw. Probably be able to do this in 10 minutes.

Lex Fridman (03:03:55) So that would allow you to reuse the thread sort of?

Lex Fridman (03:03:59) So I mean, this leads to the natural question of what is the pathway to scaling the increase in the number of threads? Is that a priority? What’s the technical challenge there?

DJ Seo (03:04:11) Yeah, that is a priority. So for the next versions of the implant, the key metrics that we’re looking to improve are the number of channels, just recording from more and more neurons. We have a pathway to actually go from currently 1,000 to hopefully 3,000, if not 6,000, by the end of this year.

DJ Seo (03:04:30) And then by the end of next year, we want to get to even more, 16,000.

DJ Seo (03:04:36) There are a couple of limitations to that. One is, obviously, being able to photolithographically print those wires. As I mentioned, it’s two microns in width and spacing. Obviously, there are chips that are much more advanced than those types of resolution, and we have some of the tools that we have brought in-house to be able to do that. So traces will be narrower, just so that you can have more of the wires coming up into the chip. Chips also cannot linearly consume more energy as you have more and more channels. So there’s a lot of innovation in the circuit architecture as well as the circuit design topology to make them lower power. You also need to think about, if you have all of these spikes, how do you send that off to the end application. So you need to think about bandwidth limitations there and potentially innovations in signal processing.
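
To make the bandwidth limitation above concrete, here is a rough back-of-the-envelope sketch. Only the channel counts come from what DJ Seo says; the sampling rate, bit depth, spike rate, and per-event size are purely illustrative assumptions, not Neuralink figures:

```python
# Rough, illustrative estimate of why more channels pushes you toward
# on-implant spike detection/compression. Channel counts come from the
# conversation; sampling rate, bit depth, and per-event size are assumptions.

def raw_rate_mbps(channels, sample_rate_hz=20_000, bits_per_sample=10):
    """Raw broadband data rate if every sample were streamed off the implant."""
    return channels * sample_rate_hz * bits_per_sample / 1e6  # megabits per second

def spike_event_rate_mbps(channels, spikes_per_s=20, bits_per_event=32):
    """Data rate if only detected spike events (channel id + timestamp) are sent."""
    return channels * spikes_per_s * bits_per_event / 1e6

for n in (1_000, 3_000, 6_000, 16_000):
    print(f"{n:>6} channels: raw ≈ {raw_rate_mbps(n):8.1f} Mbps, "
          f"spike events ≈ {spike_event_rate_mbps(n):6.2f} Mbps")
```

The several-orders-of-magnitude gap between streaming raw broadband data and sending only detected spike events is the kind of trade-off that pushes spike detection and compression onto the implant itself.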

(03:05:28) Physically, one of the biggest challenges is going to be the interface. It’s always the interface that breaks. Bonding this thin-film array to the electronics starts to become very, very highly dense interconnects. So how do you connectorize that? There’s a lot of innovation in 3D integration in recent years that we can take advantage of. One of the biggest challenges that we do have is forming this hermetic barrier. This is an extremely harsh environment that we’re in, the brain. So how do you protect it from the brain trying to kill your electronics, and also keep your electronics from leaking things that you don’t want into the brain? Forming that hermetic barrier is going to be a very, very big challenge that we, I think, are actually well suited to tackle.

Lex Fridman (03:06:20) How do you test that? What’s the development environment to simulate that kind of harshness?

DJ Seo (03:06:25) Yeah, so this is where the accelerated life tester essentially is a brain in a vat. It literally is a vessel, and again, for all intents and purposes for this particular type of test, your brain is salt water. And you can also put in some other set of chemicals, like reactive oxygen species, that get at these interfaces and try to cause a reaction to pull them apart. But you could also increase the rate at which these interfaces are aging by just increasing temperature. So every 10 degrees Celsius that you increase, you’re basically accelerating time by 2x.

(03:07:11) And there’s a limit as to how much temperature you want to increase, because at some point there are some other nonlinear dynamics that cause other nasty gases to form that just are not realistic in the real environment. So what we do is we increase the temperature in our ALT chamber by 20 degrees Celsius, which increases the aging by four times. So essentially one day in the ALT chamber is four days in calendar time, and we look at whether the implants are still intact, including the threads. And-

Lex Fridman (03:07:43) And operation and all of that.

DJ Seo (03:07:45) … and operation and all of that. Obviously, it’s not the exact same environment as the brain, because the brain has mechanical and other, more biological things that attack it. But it is a good testing environment, at least for the enclosure and the strength of the enclosure. And I mean, we’ve had implants, the current version of the implant, that have been in there for close to two and a half years, which is equivalent to a decade, and they seem to be fine.
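
As a quick sanity check of the accelerated-aging arithmetic described above (a minimal sketch: the 10 °C-per-doubling rule, the 20 °C offset, and the two-and-a-half-year figure are taken from the conversation; the helper function itself is just illustrative):

```python
# Accelerated-life-test arithmetic as described above: every +10 °C roughly
# doubles the aging rate, so a chamber running +20 °C hotter ages parts ~4x.

def acceleration_factor(delta_t_celsius, celsius_per_doubling=10.0):
    """Aging acceleration for a chamber held delta_t above the normal temperature."""
    return 2 ** (delta_t_celsius / celsius_per_doubling)

factor = acceleration_factor(20)                      # -> 4.0
print(f"+20 °C chamber: {factor:.0f}x faster aging")
print(f"1 chamber day ≈ {factor:.0f} calendar days")
print(f"2.5 chamber years ≈ {2.5 * factor:.0f} calendar years")  # roughly a decade
```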

Lex Fridman (03:08:18) So it’s interesting that basically close approximation is warm salt water, hot salt water is a good testing environment.

Lex Fridman (03:08:29) By the way, I’m drinking LMNT, which is basically salt water. Which is making me kind of… It doesn’t have computational power the way the brain does, but maybe in terms of other characteristics, it’s quite similar and I’m consuming it.

DJ Seo (03:08:44) Yeah. You have to get it in the right pH too.

Lex Fridman (03:08:48) And then consciousness will emerge. Yeah, no. All right.

DJ Seo (03:08:52) By the way, the other thing that also is interesting about our enclosure is, if you look at our implant, it’s not your common-looking medical implant that usually is encased in a titanium can that’s laser welded. We use this polymer called PCTFE, polychlorotrifluoroethylene, which is actually commonly used in blister packs. So when you have a pill and you try to pop a pill, there’s that kind of plastic membrane. That’s what this is. No one’s actually ever used this except us. And the reason we wanted to do this is because it’s electromagnetically transparent. When we talked about the electromagnetic inductive charging, with a titanium can, usually if you want to do something like that, you have to have a sapphire window, and it’s a very, very tough process to scale.

Lex Fridman (03:09:45) So you’re doing a lot of iteration here in every aspect of this. The materials, the software, all.

Future capabilities

Lex Fridman (03:09:53) Okay. So you mentioned scaling. Is it possible to have multiple Neuralink devices as one of the ways of scaling? To have multiple Neuralink devices implanted?

DJ Seo (03:10:08) That’s the goal. That’s the goal. Yeah. I mean, our monkeys have had two Neuralinks, one in each hemisphere. And then we’re also looking at the potential of having one in motor cortex, one in visual cortex, and one in whatever other cortex.

Lex Fridman (03:10:24) So each Neuralink device focusing on a particular function.

Lex Fridman (03:10:29) I mean, I wonder if there’s some level of customization that can be done on the compute side. So for the motor cortex-

DJ Seo (03:10:34) Absolutely. That’s the goal. And we talk at Neuralink about building a generalized neural interface to the brain. And that also is strategically how we’re approaching this with marketing and also with regulatory, which is, hey, look, we have the robot, and the robot can access any part of the cortex. Right now we’re focused on motor cortex with the current version of the N1 that’s specialized for motor decoding tasks. But also, at the end of the day, there’s general compute available there. But typically, if you want to really get down to hyperoptimizing for power and efficiency, you do need to get to some specialized function.

(03:11:21) But what we’re saying is that, hey, you are now used to these robotic insertion techniques, which took many, many years of showing data and conversations with the FDA, and also internally convincing ourselves that this is safe. And now the difference is, if we go to other parts of the brain, like visual cortex, which we’re interested in as our second product, obviously it’s a completely different environment, the cortex is laid out very, very differently. It’s going to be more stimulation-focused rather than recording, just kind of creating visual percepts. But in the end, we’re using the same thin-film array technology, we’re using the same robot insertion technology, we’re using the same packaging technology. Now the conversation is focused around what the differences are and what the implications of those differences are for safety and efficacy.

Lex Fridman (03:12:17) The way you said second product is both hilarious and awesome to me. That product being restoring sight for blind people. So can you speak to stimulating the visual cortex? I mean, the possibilities there are just incredible to be able to give that gift back to people who don’t have sight or even any aspect of that. Can you just speak to the challenges of… There’s challenges here-

Lex Fridman (03:12:51) One of which is like you said, from recording to stimulation. Just any aspect of that that you’re both excited and see the challenges of?

DJ Seo (03:13:02) Yeah, I guess I’ll start by saying that we actually have been capable of stimulating through our thin-film array, as well as other electronics, for years. We have actually demonstrated some of those capabilities for reanimating the limb in the spinal cord. Obviously, for the current EFS study, we’ve hardware-disabled that. So that’s something that we wanted to embark on as a separate journey. And obviously, there are many, many different ways to write information into the brain. The way in which we’re doing that is through electrical stimulation, passing electrical current and kind of causing that to really change the local environment so that you can artificially cause the neurons to depolarize in nearby areas. For vision, specifically, the way our visual system works is both well understood and not. I mean, with anything in the brain, there are aspects of it that are well understood, but in the end, we don’t really know anything.

(03:14:10) But the way the visual system works is that you have photons hitting your eye, and in your eyes there are these specialized cells called photoreceptor cells that convert the photon energy into electrical signals. And then that gets projected to the back of your head, your visual cortex. It actually goes through a thalamic system called the LGN that then projects it out. And then in the visual cortex there’s visual area one, or V1, and then there’s a bunch of other higher-level processing layers like V2, V3. And there are actually kind of interesting parallels. When you study the behaviors of these convolutional neural networks, like what the different layers of the network are detecting, first they’re detecting these edges, then they’re detecting some more natural curves, and then they start to detect objects.

(03:15:08) Kind of a similar thing happens in the brain. And a lot of that has been inspired by this, and it’s been kind of exciting to see some of the correlations there. But things like, from there, where does cognition arise and where is color encoded? There’s just not a lot of fundamental understanding there. So in terms of bringing sight back to those that are blind, there are many different forms of blindness. There are actually a million people, 1 million people, in the US that are legally blind. That means scoring below a certain threshold on visual acuity tests. I think it’s something like, if you can only see something at a 20-foot distance that normal people can see at a 200-foot distance, if you’re worse than that, you’re legally blind.

Lex Fridman (03:15:57) So fundamentally, that means you can’t function effectively using sight in the world.

DJ Seo (03:16:04) … in your environment. And yeah, there are different forms of blindness. There are forms of blindness where there’s some degeneration of your retina, its photoreceptor cells, and the rest of your visual processing that I described is intact. And for those types of individuals, you may not need to stick electrodes into the visual cortex. You can actually build retinal prosthetic devices that just replace the function of the retinal cells that have degenerated. And there are many companies that are working on that, but that’s a very small slice, albeit a significant one, of the folks that are legally blind.

(03:16:51) If there’s any damage along that circuitry, whether it’s in the optic nerve or the LGN circuitry, any break in that circuit, that’s not going to work for you. And then the place where you need to actually cause those visual percepts to happen, because your biological mechanism is not doing that, is by placing electrodes in the visual cortex in the back of your head. And the way in which this would work is that you would have an external camera, whether it’s something as unsophisticated as a GoPro or some sort of wearable Ray-Ban type glasses that Meta is working on, that captures a scene. And that scene is then converted to a set of electrical impulses or stimulation pulses that you would activate in your visual cortex through these thin-film arrays. And by playing a kind of concerted orchestra of these stimulation patterns, you can create what’s called phosphenes, which are these kind of white-yellowish dots that you can also create by just pressing your eyes. You can actually create those percepts by stimulating the visual cortex.

(03:18:08) And the name of the game is really to have many of those, and to have those percepts, the phosphenes, be as small as possible so that you can start to tell them apart, like individual pixels of a screen. So if you have many, many of those, potentially you’ll be able to, in the long term, actually get naturalistic vision. But in the short to maybe mid term, being able to at least have object detection algorithms run on your glasses, the pre-processing units, and then being able to at least see the edges of things so you don’t bump into stuff.
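
A minimal sketch of the camera-to-phosphene pipeline described above: detect edges on the glasses-side pre-processing unit, then map them onto a coarse grid of stimulation sites. The grid size, threshold, and function names are hypothetical illustrations, not Neuralink’s actual pipeline:

```python
# Illustrative camera -> phosphene-grid pipeline, as sketched in the
# conversation: detect edges in a frame, then map them onto a coarse
# grid of stimulation sites. Grid size and threshold are made-up numbers.
import numpy as np

def edge_map(frame: np.ndarray) -> np.ndarray:
    """Crude gradient-magnitude edge detector on a grayscale frame."""
    gy, gx = np.gradient(frame.astype(float))
    return np.hypot(gx, gy)

def to_phosphene_grid(frame: np.ndarray, grid=(32, 32), threshold=0.05) -> np.ndarray:
    """Downsample an edge map onto a grid of stimulation sites (True = stimulate)."""
    edges = edge_map(frame)
    edges /= edges.max() + 1e-9                      # normalize to [0, 1]
    h, w = edges.shape
    gh, gw = grid
    # Average edge strength within each grid cell, dropping any ragged border.
    cells = edges[: h - h % gh, : w - w % gw].reshape(gh, h // gh, gw, w // gw)
    cell_strength = cells.mean(axis=(1, 3))
    return cell_strength > threshold                 # boolean stimulation pattern

# Example: a synthetic frame with a bright square whose outline should light up.
frame = np.zeros((256, 256))
frame[80:180, 80:180] = 1.0
pattern = to_phosphene_grid(frame)
print(f"{pattern.sum()} of {pattern.size} sites would be stimulated")
```

Each True cell would correspond to pulsing one electrode site and producing one phosphene; the finer the grid and the smaller each phosphene, the closer this gets to the “individual pixels of a screen” that DJ Seo describes.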

Lex Fridman (03:18:46) This is incredible. This is really incredible. So you basically would be adding pixels, and your brain would start to figure out what those pixels mean, with different kinds of assistive signal processing on all fronts.

DJ Seo (03:18:59) Yeah. The thing that actually… So a couple of things. One is obviously if you’re blind from birth, the way brain works, especially in the early age, neuroplasticity is really nothing other than your brain and different parts of your brain fighting for the limited territory. And I mean very, very quickly you see cases where people that are… I mean, you also hear about people who are blind that have heightened sense of hearing or some other senses. And the reason for that is because that cortex that’s not used just gets taken over by these different parts of the cortex. So for those types of individuals, I mean I guess they’re going to have to now map some other parts of their senses into what they call vision, but it’s going to be obviously a very, very different conscious experience.

(03:19:54) Before… So I think that’s an interesting caveat. The other thing that also is important to highlight is that we’re currently limited by our biology in terms of the wavelengths that we can see. There’s a very, very small range of wavelengths, the visible-light wavelengths, that we can see with our eyes. But when you have an external camera with this BCI system, you’re not limited to that. You can have infrared, you can have UV, you can have whatever other spectrum that you want to see. And whether that gets matched to some sort of weird conscious experience, I’ve no idea. But oftentimes I talk to people about the goal of Neuralink being going beyond the limits of our biology. That’s sort of what I mean.

Lex Fridman (03:20:39) And if you’re able to control the kind of raw signal… When we use our sight, we’re getting the photons and there’s not much processing on it. If you’re able to control that signal, maybe you can do some kind of processing, maybe you do object detection ahead of time. You’re doing some kind of pre-processing, and there are a lot of possibilities to explore there. So it’s not just thermal imaging, that kind of stuff, it’s also doing some kind of interesting processing.

DJ Seo (03:21:10) Correct. Yeah. I mean, my theory of how the visual system works also is that there are just so many things happening in the world, and there are a lot of photons that are going into your eye. And it’s unclear exactly where some of the pre-processing steps are happening. But I actually think that, just from a fundamental perspective, there’s just so much in the reality that we’re in, if it is a reality, there’s so much data, and I think humans are just unable to take in and process all that information. So there’s some sort of filtering that does happen, whether that happens in the retina, whether that happens in different layers of the visual cortex, unclear. But the analogy that I sometimes think about is, if your brain is a CCD camera and all of the information in the world is the sun, when you try to actually look at the sun with the CCD camera, it’s just going to saturate the sensors, because it’s an enormous amount of energy.

(03:22:16) So what you do is you end up adding these filters to just kind of narrow the information that’s coming to you and being captured. And I think things like our experiences, or drugs like propofol, an anesthetic drug, or psychedelics, what they’re doing is they’re kind of swapping out these filters and putting in new ones or removing older ones, and kind of controlling our conscious experience.

Lex Fridman (03:22:50) Yeah, man, not to distract from the topic, but I just took a very high dose of ayahuasca in the Amazon jungle. So yes, it’s a nice way to think about it. You’re swapping out different experiences and with Neuralink being able to control that, primarily at first to improve function, not for entertainment purposes or enjoyment purposes, but-

DJ Seo (03:23:11) Yeah, giving back lost functions.

Lex Fridman (03:23:13) Giving back lost functions. And there, especially when the function is completely lost, anything is a huge help. Would you implant a Neuralink device in your own brain?

DJ Seo (03:23:29) Absolutely. I mean, maybe not right now, but absolutely.

Lex Fridman (03:23:33) What kind of capability, once reached, would make you start getting really curious and almost a little antsy, jealous of people, as you watch them get implanted?

DJ Seo (03:23:46) Yeah, I think even with our early participants, if they start to do things that I can’t do, which I think is in the realm of possibility for them to be able to get 15, 20 if not like a hundred BPS. There’s nothing that fundamentally stops us from being able to achieve that type of performance. I mean, I would certainly get jealous that they can do that.

Lex Fridman (03:24:13) I should say that watching Noland, I get a little jealous having so much fun, and it seems like such a chill way to play video games.

DJ Seo (03:24:19) Yeah. I mean the thing that also is hard to appreciate sometimes is that, he’s doing these things while talking. And I mean, it’s multitasking, so it’s clearly, it’s obviously cognitively intensive. But similar to how when we talk, we move our hands. These are multitasking. I mean, he’s able to do that. And you won’t be able to do that with other assistive technology. As far as I am aware, if you’re obviously using an eye tracking device, you’re very much fixated on that thing that you’re trying to do. And if you’re using voice control, I mean if you say some other stuff, you don’t get to use that.

Lex Fridman (03:25:02) The multitasking aspect of that is really interesting. So it’s not just the BPS for the primary task, it’s the parallelization of multiple tasks. If you measure the BPS for the entirety of the human organism. So you’re talking and doing a thing with your mind and looking around also, I mean, there’s just a lot of parallelization that can be happening.

DJ Seo (03:25:28) But I mean, I think at some point for him, if he wants to really achieve those high level BPS, it does require a full attention. And that’s a separate circuitry that is a big mystery, how attention works and…

Lex Fridman (03:25:41) Yeah, attention, cognitive load. I’ve read a lot of literature on people doing two tasks. You have your primary task and a secondary task, and the secondary task is a source of distraction. And how does that affect the performance of the primary task? And depending on the tasks, because there’s a lot of interesting… I mean, this is an interesting computational device, and I think there’s-

Lex Fridman (03:26:05) … a lot of novel insights that can be gained from everything. I mean, I personally am surprised that no one’s able to do such incredible control of the cursor while talking. And also being nervous at the same time because he’s talking like all of us are if you’re talking in front of the camera, you get nervous. So all of those are coming into play and he’s able to still achieve high performance. Surprising. I mean, all of this is really amazing. And I think just after researching this really in depth, I kind of want a Neuralink.

Lex Fridman (03:26:39) And also the safety… Get in line. Well, we should say the registry is for people who have quadriplegia and all that kind of stuff, so.

Lex Fridman (03:26:47) That’d be a separate line for people who are just curious, like myself. So now that Noland, patient P1, is part of the ongoing PRIME study, what’s the high-level vision for P2, P3, P4, P5, and just the expansion into other human beings that are getting to experience this implant?

DJ Seo (03:27:14) Yeah, I mean the primary goal of our study in the first place is to achieve safety endpoints, to just understand the safety of this device as well as the implantation process, and also, at the same time, understand the efficacy and the impact that it could have on potential users’ lives. And just because you’re living with tetraplegia, it doesn’t mean your situation is the same as another person living with tetraplegia. It’s wildly, wildly varying. And it’s something that we’re hoping to understand: how our technology can serve not just a very small slice of those individuals, but a broader group of individuals, and being able to get the feedback to really build the best product for them.

(03:28:11) So there are obviously also goals that we have. And the primary purpose of the early feasibility study is to learn from each and every participant to improve the device and improve the surgery before we embark on what’s called a pivotal study. That then is a much larger trial that starts to look at the statistical significance of your endpoints, and that’s required before you can then market the device. And that’s how it works in the US and just generally around the world. That’s the process you follow.

(03:28:50) So our goal is to really just understand from people like Noland, P2, P3, future participants, what aspects of our device need to improve. If it turns out that people are like, “I really don’t like the fact that it lasts only six hours. I want to be able to use this computer for 24 hours.” I mean, those are user needs and user requirements, which we can only find out from being able to engage with them.

Lex Fridman (03:29:17) So before the pivotal study, there’s kind of a rapid innovation based on individual experiences. You’re learning from individual people, how they use it, the high resolution details in terms of cursor control and signal and all that kind of stuff, life experience.

DJ Seo (03:29:33) So there’s hardware changes, but also just firmware updates. So even when we had that sort of recovery event for Noland, he now has the new firmware that he has been updated with, and similar to how your phones get updated all the time with new firmware for security patches, whatever, new functionality, UI. And that’s something that is possible with our implant. It’s not a static one-time device that can only do…

DJ Seo (03:30:00) It’s not a static one-time device that can only do the thing that it said it can do. I mean, it’s similar to Tesla, you can do over-the-air firmware updates, and now you have completely new user interface and all these bells and whistles and improvements on everything, like the latest. Right? When we say generalized platform, that’s what we’re talking about.

Lex Fridman (03:30:22) Yeah. It’s really cool how the app that Noland is using, there’s calibration, all that kind of stuff, and then there’s update. You just click and get an update.

(03:30:35) What other future capabilities are you looking at? You said vision. That’s a fascinating one. What about accelerated typing or speech, or that kind of stuff? And what else is there?

DJ Seo (03:30:49) Yeah. Those are still in the realm of the movement program. So, largely speaking, we have two programs. We have the movement program and we have the vision program. The movement program currently is focused around digital freedom. As you can easily guess, if you can control a 2D cursor in the digital space, you could move anything in the physical space: robotic arms, a wheelchair, your environment, whether it’s through the phone or just directly to those interfaces, to those machines.

(03:31:22) So, we’re looking at ways to expand those types of capability, even for Noland. That requires conversation with the FDA and showing safety data for if there’s a robotic arm or a wheelchair, that we can guarantee that they’re not going to hurt themselves accidentally. Right? It’s very different if you’re moving stuff in the digital domain versus in the physical space, you can actually potentially cause harm to the participants. So, we’re working through that right now.

(03:31:50) Speech does involve different areas of the brain. Speech prosthetics are very, very fascinating, and there’s actually been a lot of really amazing work happening in academia. Sergey Stavisky at UC Davis, Jaimie Henderson and the late Krishna Shenoy at Stanford, are doing just an incredible amount of work in improving speech neuroprosthetics. And those are actually looking more at parts of the motor cortex that are controlling these vocal articulators, and being able to, even by mouthing the word or imagining speech, pick up those signals.

(03:32:31) The more sophisticated higher-level processing areas, like Broca’s area or Wernicke’s area, those are still a very, very big mystery in terms of the underlying mechanisms of how all that stuff works. But I mean, I think Neuralink’s eventual goal is to understand those things and be able to provide a platform and tools to be able to understand and study that.

Lex Fridman (03:32:58) This is where I get to the pothead questions. Do you think we can start getting insight into things like thought? So, speech, there’s a muscular component, like you said, there’s the act of producing sounds, but then what about the internal things like cognition, like low-level thoughts and high-level thoughts? Do you think we’ll start noticing signals that could be picked up, they could be understood, that could be maybe used in order to interact with the outside world?

DJ Seo (03:33:35) In some ways, I guess, this starts to kind of get into the hard problem of consciousness. And I mean, on one hand, all of these are at some point a set of electrical signals, and from there maybe that in itself is giving you the cognition or the meaning, or somehow the human mind is an incredibly amazing storytelling machine. So, we’re telling ourselves and fooling ourselves that there’s some interesting meaning here.

(03:34:13) But I mean, I certainly think that BCI… Really, BCI, at the end of the day, is a set of tools that help you study the underlying mechanisms in both a local and a broader sense. And whether there are some interesting patterns of electrical signals that mean you’re thinking this versus that… you can either learn from many, many sets of data to correlate some of that and be able to do mind reading, or not. I’m not sure.

(03:34:47) I certainly would not rule that out as a possibility, but I think BCI alone probably can’t do that. There are probably additional sets of tools and frameworks needed, and also, the hard problem of consciousness, at the end of the day, is rooted in this philosophical question of what is the meaning of it all? What’s the nature of our existence? Where does the mind emerge from this complex network?

Lex Fridman (03:35:13) Yeah. How does the subjective experience emerge from just a bunch of spikes, electrical spikes?

DJ Seo (03:35:21) Yeah. Yeah. I mean, we do really think about BCI and what we’re building as a tool for understanding the mind, the brain. The only question that matters.

(03:35:34) There actually is some biological existence proof of what it would take to kind of start to form some of these experiences that may be unique. If you actually look at every one of our brains, there are two hemispheres. There’s a left-sided brain, there’s a right-sided brain. And unless you have some other conditions, you normally don’t feel like a left Lex or a right Lex, you just feel like one Lex, right? So, what is happening there? Right?

(03:36:10) If you actually look at the two hemispheres, there’s a structure that kind of connects the two, called the corpus callosum, that is supposed to have around 200 to 300 million connections, or axons. So, maybe that’s the number of interfaces and electrodes that we need to create some sort of mind meld, or whatever new conscious experience you could have from that. But I do think that there’s kind of an interesting existence proof that we all have.

Lex Fridman (03:36:52) And that threshold is unknown at this time?

DJ Seo (03:36:55) Oh, yeah. Everything in this domain is speculation. Right?

Lex Fridman (03:37:00) And then, you’d be continuously pleasantly surprised. Do you see a world where there are millions of people, like tens of millions, hundreds of millions of people, walking around with a Neuralink device, or multiple Neuralink devices, in their brain?

DJ Seo (03:37:20) I do. First of all, there are, if you look worldwide, people suffering from movement disorders and visual deficits; I mean, that’s in the tens if not hundreds of millions of people. So, that alone, I think there’s a lot of benefit and potential good that we can do with this type of technology. And once you start to get into psychiatric applications, depression, anxiety, hunger or obesity, right? Mood, control of appetite. I mean, that starts to become very real to everyone.

Lex Fridman (03:38:06) Not to mention that most people on Earth have a smartphone, and once BCI starts competing with a smartphone as a preferred methodology of interacting with the digital world, that also becomes an interesting thing.

DJ Seo (03:38:24) Oh yeah, this is even before going to that, right? There’s almost, I mean, the entire world that could benefit from these types of things. And then, if we’re talking about next generation of how we interface with machines or even ourselves, in many ways, I think BCI can play a role in that. And some of the things that I also talk about is, I do think that there is a real possibility that you could see 8 billion people walking around with Neuralink.

Lex Fridman (03:38:58) Well, thank you so much for pushing ahead. And I look forward to that exciting future.

Matthew MacDougall

Lex Fridman (03:39:06) Thanks for listening to this conversation with DJ Seo. And now, dear friends, here’s Matthew MacDougall, the head neurosurgeon at Neuralink.

(03:39:17) When did you first become fascinated with the human brain?

Matthew MacDougall (03:39:21) Since forever. As far back as I can remember, I’ve been interested in the human brain. I mean, I was a thoughtful kid and a bit of an outsider, and you sit there thinking about what the most important things in the world are in your little tiny adolescent brain. And the answer that I came to, that I converged on was that all of the things you can possibly conceive of as things that are important for human beings to care about are literally contained in the skull. Both the perception of them and their relative values and the solutions to all our problems, and all of our problems, are all contained in the skull. And if we knew more about how that worked, how the brain encodes information and generates desires and generates agony and suffering, we could do more about it.

(03:40:27) You think about all the really great triumphs in human history. You think about all the really horrific tragedies. You think about the Holocaust, you think about any prison full of human stories, and all of those problems boil down to neurochemistry. So, if you get a little bit of control over that, you provide people the option to do better. In the way I read history, the way people have dealt with having better tools is that they most often, in the end, do better, with huge asterisks. But I think it’s an interesting, a worthy, a noble pursuit to give people more options, more tools.

Lex Fridman (03:41:16) Yeah, that’s a fascinating way to look at human history. You just imagine all these neurobiological mechanisms, Stalin, Hitler, Genghis Khan, all of them just had a brain, just a bunch of neurons, a few tens of billions of neurons, gaining a bunch of information over a period of time. They have a set of modules that does language and memory and all that. And from there, in the case of those people, they’re able to murder millions of people. And all that coming from… There’s not some glorified notion of a dictator of this enormous mind or something like this. It’s just the brain.

Matthew MacDougall (03:41:59) Yeah. Yeah. I mean, a lot of that has to do with how well people like that can organize those around them.

Matthew MacDougall (03:42:09) Yeah. And so, I always find it interesting to look to primatology, look to our closest non-human relatives for clues as to how humans are going to behave and what particular humans are able to achieve. And so, you look at chimpanzees and bonobos, and they’re similar but different in their social structures particularly. And I went to Emory in Atlanta and studied under the great Frans de Waal, who was kind of the leading primatologist, who recently died. And his work looking at chimps through the lens of how you would watch an episode of Friends and understand the motivations of the characters interacting with each other. He would look at a chimp colony and basically apply that lens. I’m massively oversimplifying it.

(03:43:05) If you do that, instead of just saying, “Subject 473 threw his feces at subject 471,” you talk about them in terms of their human struggles, accord them the dignity of themselves as actors with understandable goals and drives, what they want out of life. And primarily, it’s the things we want out of life: food, sex, companionship, power. You can understand chimp and bonobo behavior in the same light much more easily. And I think doing so gives you the tools you need to reduce human behavior from the kind of false complexity that we layer onto it with language, and look at it in terms of, oh, well, these humans are looking for companionship, sex, food, power. And I think that that’s a pretty powerful tool to have in understanding human behavior.

Lex Fridman (03:44:10) And I just went to the Amazon jungle for a few weeks and it’s a very visceral reminder that a lot of life on Earth is just trying to get laid. They’re all screaming at each other. I saw a lot of monkeys and they’re just trying to impress each other, or maybe if there’s a battle for power, but a lot of the battle for power has to do with them getting laid.

Matthew MacDougall (03:44:33) Right. Breeding rights often go with alpha status. And so, if you can get a piece of that, then you’re going to do okay.

Lex Fridman (03:44:40) And we’d like to think that we’re somehow fundamentally different, and especially when it comes to primates, we really aren’t. We can use fancier poetic language, but maybe some of the underlying drives and motivators are similar.

Matthew MacDougall (03:44:57) Yeah, I think that’s true.

Neuroscience

Lex Fridman (03:44:58) And all of that is coming from this, the brain.

Lex Fridman (03:45:02) So, when did you first start studying the brain as the biological mechanism?

Matthew MacDougall (03:45:07) Basically, the moment I got to college, I started looking around for labs that I could do neuroscience work in. I originally approached that from the angle of looking at interactions between the brain and the immune system, which isn’t the most obvious place to start, but I had this idea at the time that the contents of your thoughts would have a direct impact, maybe a powerful one, on non-conscious systems in your body. The systems we think of as homeostatic automatic mechanisms, like fighting off a virus, like repairing a wound. And sure enough, there are big crossovers between the two.

(03:45:55) I mean, it gets to kind of a key point that I think goes under-recognized. One of the things people don’t recognize or appreciate enough about the human brain is that it basically controls, or has a huge role in, almost everything that your body does. You try to name an example of something in your body that isn’t directly controlled or massively influenced by the brain, and it’s pretty hard. I mean, you might say bone healing or something. But even for those systems, the hypothalamus and pituitary end up playing a role in coordinating the endocrine system, which does have a direct influence on, say, the calcium level in your blood, which goes to bone healing. So, non-obvious connections between those things implicate the brain as really a potent prime mover in all of health.

Lex Fridman (03:46:55) One of the things I realized in the other direction too, how most of the systems in the body are integrated with the human brain, they affect the brain also, like the immune system. I think there’s just, people who study Alzheimer’s and those kinds of things, it’s just surprising how much you can understand of that from the immune system, from the other systems that don’t obviously seem to have anything to do with the nervous system. They all play together.

Matthew MacDougall (03:47:28) Yeah, you could understand how that would be driven by evolution too. Just in some simple examples, if you get sick, if you get a communicable disease, you get the flu, it’s pretty advantageous for your immune system to tell your brain, “Hey, now be antisocial for a few days. Don’t go be the life of the party tonight. In fact, maybe just cuddle up somewhere warm, under a blanket, and just stay there for a day or two.” And sure enough, that tends to be the behavior that you see both in animals and in humans. If you get sick, elevated levels of interleukins in your blood and TNF-alpha in your blood, ask the brain to cut back on social activity and even moving around, you have lower locomotor activity in animals that are infected with viruses.

Lex Fridman (03:48:25) So, from there, the early days in neuroscience to surgery, when did that step happen? Which is a leap.

Matthew MacDougall (03:48:34) Yeah. It was sort of an evolution of thought. I wanted to study the brain. I started studying the brain in undergrad in this neuroimmunology lab. I, from there, realized at some point that I didn’t want to just generate knowledge. I wanted to affect real changes in the actual world, in actual people’s lives. And so, after having not really thought about going into medical school, I was on a track to go into a PhD program. I said, “Well, I’d like that option. I’d like to actually potentially help tangible people in front of me.”

(03:49:18) And doing a little digging, found that there exists these MD-PhD programs where you can choose not to choose between them and do both. And so, I went to USC for medical school and had a joint PhD program with Caltech, where I actually chose that program particularly because of a researcher at Caltech named Richard Andersen, who’s one of the godfathers of primate neuroscience, and has a macaque lab where Utah arrays and other electrodes were being inserted into the brains of monkeys to try to understand how intentions were being encoded in the brain.

(03:50:03) So, I ended up there with the idea that maybe I would be a neurologist and study the brain on the side. And then discovered that neurology … Again, I’m going to make enemies by saying this, but neurology predominantly and distressingly to me, is the practice of diagnosing a thing and then saying, “Good luck with that. There’s not much we can do.” And neurosurgery, very differently, it’s a powerful lever on taking people that are headed in a bad direction and changing their course in the sense of brain tumors that are potentially treatable or curable with surgery. Even aneurysms in the brain, blood vessels that are going to rupture, you can save lives, really, is at the end of the day what mattered to me.

(03:50:59) And so, I was at USC, as I mentioned, that happens to be one of the great neurosurgery programs. And so, I met these truly epic neurosurgeons, Alex Khalessi, and Mike Apuzzo, and Steve Giannotta, and Marty Weiss, these epic people that were just human beings in front of me. And so, it kind of changed my thinking from neurosurgeons are distant gods that live on another planet and occasionally come and visit us, to these are humans that have problems and are people, and there’s nothing fundamentally preventing me from being one of them. And so, at the last minute in medical school, I changed gears from going into a different specialty and switched into neurosurgery, which cost me a year. I had to do another year of research because I was so far along in the process that to switch into neurosurgery, the deadlines had already passed. So, it was a decision that cost time, but absolutely worth it.

Neurosurgery

Lex Fridman (03:52:09) What was the hardest part of the training on the neurosurgeon track?

Matthew MacDougall (03:52:14) Yeah, two things, I think, that residency in neurosurgery is sort of a competition of pain, of how much pain can you eat and smile? And so, there’s work hour restrictions that are not really … They’re viewed, I think, internally among the residents as weakness. And so, most neurosurgery residents try to work as hard as they can, and that, I think necessarily means working long hours and sometimes over the work hour limits.

(03:52:49) We care about being compliant with whatever regulations are in front of us, but I think more important than that, people want to give their all in becoming a better neurosurgeon because the stakes are so high. And so, it’s a real fight to get residents to say, go home at the end of their shift and not stay and do more surgery.

Lex Fridman (03:53:12) Are you seriously saying one of the hardest things is literally forcing them to get sleep and rest and all this kind of stuff?

Matthew MacDougall (03:53:20) Historically that was the case.

Lex Fridman (03:53:21) That’s hilarious. And that’s awesome.

Matthew MacDougall (03:53:24) I think the next generation is more compliant and more self-care-

Lex Fridman (03:53:29) Weaker is what you mean. All right. I’m just kidding. I’m just kidding.

Matthew MacDougall (03:53:32) I didn’t say it.

Lex Fridman (03:53:33) Now I’m making enemies.

Lex Fridman (03:53:35) Okay, I get it. Wow, that’s fascinating. So, what was the second thing?

Matthew MacDougall (03:53:39) The personalities. And maybe the two are connected.

Lex Fridman (03:53:43) So, was it pretty competitive?

Matthew MacDougall (03:53:45) It’s competitive, and it’s also, as we touched on earlier, primates like power. And I think neurosurgery has long had this aura of mystique and excellence and whatever about it. And so, it’s an invitation, I think, for people that are cloaked in that authority. A board certified neurosurgeon is basically a walking fallacious appeal to authority. Right? You have license to walk into any room and act like you’re an expert on whatever. And fighting that tendency is not something that most neurosurgeons do well. Humility isn’t the forte.

Lex Fridman (03:54:28) Yeah. I have friends who know you, and whenever they speak about you they say you have the surprising quality, for a neurosurgeon, of humility, which I think indicates that it’s not as common as perhaps in other professions, because there is a kind of gigantic, heroic aspect to neurosurgery, and I think it gets to people’s heads a little bit.

Matthew MacDougall (03:54:54) Yeah. Well, I think that allows me to play well at an Elon company because Elon, one of his strengths, I think, is to just instantly see through fallacy from authority. So, nobody walks into a room that he’s in and says, “Well, goddammit, you have to trust me. I’m the guy that built the last 10 rockets,” or something. And he says, “Well, you did it wrong and we can do it better.” Or, “I’m the guy that kept Ford alive for the last 50 years. You listen to me on how to build cars.” And he says, “No.”

(03:55:34) And so, you don’t walk into a room that he’s in and say, “Well, I’m a neurosurgeon. Let me tell you how to do it.” He’s going to say, “Well, I’m a human being that has a brain. I can think from first principles myself. Thank you very much. And here’s how I think it ought to be done. Let’s go try it and see who’s right.” And that’s proven, I think over and over in his case, to be a very powerful approach.

Lex Fridman (03:55:57) If we just take that tangent, there’s a fascinating interdisciplinary team at Neuralink that you get to interact with, including Elon. What do you think is the secret to a successful team? What have you learned from just getting to observe these folks, world experts in different disciplines work together?

Matthew MacDougall (03:56:21) There’s a sweet spot where people disagree and forcefully speak their mind and passionately defend their position, and yet, are still able to accept information from others and change their ideas when they’re wrong. And so, I like the analogy of how you polish rocks. You put hard things in a hard container and spin it. People bash against each other, and out comes a more refined product. And so, to make a good team at Neuralink, we’ve tried to find people that are not afraid to defend their ideas passionately and occasionally strongly disagree with people that they’re working with, and have the best idea come out on top.

(03:57:20) It’s not an easy balance. Again, to refer back to the primate brain. It’s not something that is inherently built into the primate brain to say, “I passionately put all my chips on this position, and now I’m just going to walk away from it and admit you are right.” Part of our brains tell us that that is a power loss, that is a loss of face, a loss of standing in the community, and now you’re a zeta chump because your idea got trounced. And you just have to recognize that that little voice in the back of your head is maladaptive and it’s not helping the team win.

Lex Fridman (03:58:04) Yeah, you have to have the confidence to be able to walk away from an idea that you hold on to. Yeah.

Lex Fridman (03:58:08) And if you do that often enough, you’re actually going to become the best in the world at your thing. I mean, that rapid iteration.

Matthew MacDougall (03:58:18) Yeah, you’ll at least be a member of a winning team.

Lex Fridman (03:58:22) Ride the wave. What did you learn … You mentioned there’s a lot of amazing neurosurgeons at USC. What lessons about surgery and life have you learned from those folks?

Matthew MacDougall (03:58:35) Yeah. I think working your ass off, working hard while functioning as a member of a team, getting a job done that is incredibly difficult, working incredibly long hours, being up all night, taking care of someone that you think probably won’t survive no matter what you do. Working hard to make people that you passionately dislike look good the next morning.

(03:59:06) These folks were relentless in their pursuit of excellent neurosurgical technique, decade over decade, and I think were well-recognized for that excellence. So, especially Marty Weiss, Steve Giannotta, Mike Apuzzo, they made huge contributions not only to surgical technique, but they built training programs that trained dozens or hundreds of amazing neurosurgeons. I was just lucky to be in their wake.

Lex Fridman (03:59:42) What’s that like … You mentioned doing a surgery where the person is likely not to survive. Does that wear on you?

Matthew MacDougall (03:59:54) Yeah. It’s especially challenging when you … With all respect to our elders, it doesn’t hit so much when you’re taking care of an 80-year-old, and something was going to get them pretty soon anyway. And so, you lose a patient like that, and it was part of the natural course of what is expected of them in the coming years, regardless.

(04:00:36) Taking care of a father of two or three, four young kids, someone in their 30s that didn’t have it coming, and they show up in your ER having their first seizure of their life, and lo and behold, they’ve got a huge malignant inoperable or incurable brain tumor. You can only do that, I think, a handful of times before it really starts eating away at your armor. Or, a young mother that shows up that has a giant hemorrhage in her brain that she’s not going to survive from. And they bring her four-year-old daughter in to say goodbye one last time before they turn the ventilator off. The great Henry Marsh is an English neurosurgeon who said it best, I think. He says, “Every neurosurgeon carries with them a private graveyard.” And I definitely feel that, especially with young parents, that kills me. They had a lot more to give. The loss of those people specifically has a knock-on effect that’s going to make the world worse for people for a long time. And it’s just hard to feel powerless in the face of that. And that’s where I think you have to be borderline evil to fight against a company like Neuralink or to constantly be taking pot shots at us, because what we’re doing is to try to fix that stuff. We’re trying to give people options to reduce suffering. We’re trying to take the pain out of life that broken brains bring. And yeah, this is just our little way that we’re fighting back against entropy, I guess.

Lex Fridman (04:02:52) Yeah. The amount of suffering that’s endured when some of the things that we take for granted that our brain is able to do is taken away, is immense. And to be able to restore some of that functionality is a real gift.

Matthew MacDougall (04:03:06) Yeah. We’re just starting. We’re going to do so much more.

Lex Fridman (04:03:11) Well, can you take me through the full procedure for implanting, say, the N1 chip in Neuralink?

Matthew MacDougall (04:03:18) Sure. Yeah. It’s a really simple, straightforward procedure. The human part of the surgery that I do is dead simple. It’s one of the most basic neurosurgery procedures imaginable. And I think there’s evidence that some version of it has been done for thousands of years. That there are examples, I think, from ancient Egypt of healed or partially healed trepanations, and from Peru or ancient times in South America where these proto-surgeons would drill holes in people’s skulls, presumably to let out the evil spirits, but maybe to drain blood clots. And there’s evidence of bone healing around the edge, meaning the people at least survived some months after a procedure.

(04:04:11) And so, what we’re doing is that. We are making a cut in the skin on the top of the head over the area of the brain that is the most potent representation of hand intentions. And so, if you are an expert concert pianist, this part of your brain is lighting up the entire time you’re playing. We call it the hand knob.

Lex Fridman (04:04:36) The hand knob. So, it’s all the finger movements, all of that is just firing away.

Matthew MacDougall (04:04:43) Yep. There’s a little squiggle in the cortex right there. One of the folds in the brain is kind of doubly folded right on that spot. And so, you can look at it on an MRI and say, “That’s the hand knob.” And then you do a functional test and a special kind of MRI called a functional MRI, fMRI. And this part of the brain lights up when-

Matthew MacDougall (04:05:00) MRI, fMRI, and this part of the brain lights up when people, even quadriplegic people whose brains aren’t connected to their finger movements anymore, they imagine finger movements and this part of the brain still lights up. So we can ID that part of the brain in anyone who’s preparing to enter our trial and say, okay, that part of the brain we confirm is your hand intention area. And so I’ll make a little cut in the skin, we’ll flap the skin open, just like kind of opening the hood of a car, only a lot smaller, make a perfectly round one inch diameter hole in the skull, remove that bit of skull, open the lining of the brain, the covering of the brain, it’s like a little bag of water that the brain floats in, and then show that part of the brain to our robot. And then this is where the robot shines.

(04:06:01) It can come in and take these tiny, much smaller than human hair, electrodes and precisely insert them into the cortex, into the surface of the brain to a very precise depth, in a very precise spot that avoids all the blood vessels that are coating the surface of the brain. And after the robot’s done with its part, then the human comes back in and puts the implant into that hole in the skull and covers it up, screwing it down to the skull and sewing the skin back together. So the whole thing is a few hours long. It’s extremely low risk compared to the average neurosurgery involving the brain that might, say, open up a deeper part of the brain or manipulate blood vessels in the brain. This opening on the surface of the brain with only cortical micro-insertions carries significantly less risk than a lot of the tumor or aneurysm surgeries that are routinely done.

Lex Fridman (04:07:10) So the cortical micro-insertions, done via robot and computer vision, are designed to avoid the blood vessels.

Lex Fridman (04:07:19) So I know you’re a bit biased here, but let’s compare human and machine. So what are human surgeons able to do well and what are robot surgeons able to do well at this stage of our human civilization and development?

Matthew MacDougall (04:07:36) Yeah. Yeah, that’s a good question. Humans are general-purpose machines. We’re able to adapt to unusual situations. We’re able to change the plan on the fly. I remember well a surgery that I was doing many years ago down in San Diego where the plan was to open a small hole behind the ear and go reposition a blood vessel that had come to lay on the facial nerve, the trigeminal nerve, the nerve that goes to the face. When that blood vessel lays on the nerve, it can cause just intolerable, horrific shooting pain that people describe like being zapped with a cattle prod. And so the beautiful, elegant surgery is to go move this blood vessel off the nerve. The surgery team, we went in there and started moving this blood vessel and then found that there was a giant aneurysm on that blood vessel that was not easily visible on the pre-op scans. And so the plan had to dynamically change, and the human surgeons had no problem with that; they were trained for all those things.

(04:08:50) Robots wouldn’t do so well in that situation, at least in their current incarnation. Fully robotic surgery, like the electrode insertion portion of the Neuralink surgery, goes according to a set plan. And so the humans can interrupt the flow and change the plan, but the robot can’t really change the plan midway through. It operates according to how it was programmed and how it was asked to run. It does its job very precisely, but not with a wide degree of latitude in how to react to changing conditions.

Lex Fridman (04:09:29) So there could be just a very large number of ways that you could be surprised as a surgeon? When you enter a situation, there could be subtle things that you have to dynamically adjust to.

Lex Fridman (04:09:38) And robots are not good at that.

Matthew MacDougall (04:09:44) I think with AI we are at the dawn of a new era in which the parameters for robot responsiveness can be dramatically broadened, right? I mean, you can’t look at a self-driving car and say that it’s operating under very narrow parameters. If a chicken runs across the road, it wasn’t necessarily programmed to deal with that specifically, but a Waymo or a self-driving Tesla would have no problem reacting to that appropriately. And so surgical robots aren’t there yet, but give it time.

Lex Fridman (04:10:23) And then there could be a lot of semi-autonomous possibilities, where maybe a robotic surgeon could say this situation is perfectly familiar, or this situation is not familiar, and in the not-familiar case, a human could take over, but basically be very conservative in saying, okay, this for sure has no issues, no surprises, and let the humans deal with the surprises, with the edge cases and all that. That’s one possibility. So do you think eventually you’ll be out of a job? Well, you being a neurosurgeon, your job being a neurosurgeon. Humans, there will not be many neurosurgeons left on this earth.

Matthew MacDougall (04:11:06) I’m not worried about my job in the course of my professional life. I think I would tell my kids not necessarily to go in this line of work depending on how things look in 20 years.

Lex Fridman (04:11:24) It’s so fascinating because if I have a line of work, I would say it’s programming. And if you ask me, for the last, I don’t know, 20 years, what I would recommend for people, I would tell them, yeah, you’ll always have a job if you’re a programmer because there’s more and more computers and all this kind of stuff and it pays well. But then you realize these large language models come along and they’re really damn good at generating code. So overnight you could be surprised like, wow, what is the contribution of the human really? But then you start to think, okay, it does seem that humans have the ability, like you said, to deal with novel situations. In the case of programming, it’s the ability to come up with novel ideas to solve problems. It seems like machines aren’t quite yet able to do that. And when the stakes are very high, when it’s life-critical as it is in surgery, especially in neurosurgery, then the stakes are very high for a robot to actually replace a human. But it’s fascinating that in this case of Neuralink, there’s a human-robot collaboration.

Matthew MacDougall (04:12:34) Yeah, yeah. I do the parts it can’t do and it does the parts I can’t do, and we are friends.

Lex Fridman (04:12:45) I saw that there’s a lot of practice going on. I mean everything in Neuralink is tested extremely rigorously, but one of the things I saw that there’s a proxy on which the surgeries are performed. So this is both for the robot and for the human, for everybody involved in the entire pipeline. What’s that like, practicing the surgery?

Matthew MacDougall (04:13:07) It’s pretty intense. So there’s no analog to this in human surgery. Human surgery is sort of this artisanal craft that’s handed down directly from master to pupil over the generations. I mean, literally the way you learn to be a surgeon on humans is by doing surgery on humans. I mean, first you watch your professors do a bunch of surgery, and then finally they put the trivial parts of the surgery into your hands, and then the more complex parts, and as your understanding of the point and the purposes of the surgery increases, you get more responsibility, in the ideal case. It doesn’t always go well. In Neuralink’s case, the approach is a bit different. We, of course, practiced as far as we could on animals. We did hundreds of animal surgeries. And when it came time to do the first human, we had just an amazing team of engineers build incredibly lifelike models. One of the engineers, Fran Romano in particular, built a pulsating brain in a custom 3-D printed skull that matches exactly the patient’s anatomy, including their face and scalp characteristics.

(04:14:35) And so when I was able to practice that, it’s as close as it really reasonably should get to being the real thing in all the details, including having a mannequin body attached to this custom head. And so when we were doing the practice surgeries, we’d wheel that body into the CT scanner and take a mock CT scan and wheel it back in and conduct all the normal safety checks, verbally, “Stop. This patient we’re confirming his identification is mannequin number…” Blah, blah, blah. And then opening the brain in exactly the right spot using standard operative neuro-navigation equipment, standard surgical drills in the same OR that we do all of our practice surgeries in at Neuralink and having the skull open and have the brain pulse, which adds a degree of difficulty for the robot to perfectly precisely plan and insert those electrodes to the right depth and location. And so we kind of broke new ground on how extensively we practiced for this surgery.

Lex Fridman (04:15:52) So there was a historic moment, a big milestone for Neuralink, in part for humanity, with the first human getting a Neuralink implant in January of this year. Take me through the surgery on Noland. What did it feel like to be part of this?

Matthew MacDougall (04:16:13) Yeah. Well, we are lucky to have just incredible partners at the Barrow Neurological Institute. They are, I think, the premier neurosurgical hospital in the world. They made everything as easy as possible for the trial to get going and helped us immensely with their expertise on how to arrange the details. It was a much more high-pressure surgery in some ways. I mean, even though the outcome wasn’t particularly in question in terms of our participant’s safety, the number of observers, the number of people, there were conference rooms full of people watching live streams in the hospital rooting for this to go perfectly, and that just adds pressure that is not typical for even the most intense production neurosurgery, say, removing a tumor or placing deep brain stimulation electrodes, and it had never been done on a human before. There were unknown unknowns.

(04:17:27) And so definitely a moderate pucker factor there for the whole team not knowing if we were going to encounter, say, a degree of brain movement that was unanticipated or a degree of brain sag that took the brain far away from the skull and made it difficult to insert or some other unknown unknown problem. Fortunately everything went well and that surgery is one of the smoothest outcomes we could have imagined.

Lex Fridman (04:18:05) I mean, you’re a bit of a quarterback in the Super Bowl kind of situation.

Matthew MacDougall (04:18:07) Extremely nervous. Extremely. I was very pleased when it went well and when it was over. Looking forward to number two.

Lex Fridman (04:18:17) Even with all that practice, all of that, you’ve never been in a situation that’s so high stakes in terms of people watching. And we should also probably mention, given how the media works, a lot of people may be in a dark kind of way hoping it doesn’t go well.

Matthew MacDougall (04:18:36) I think wealth is easy to hate or envy or whatever, and I think there’s a whole industry around driving clicks and bad news is great for clicks, and so any way to take an event and turn it into bad news is going to be really good for clicks.

Lex Fridman (04:19:00) It just sucks because I think it puts pressure on people. It discourages people from trying to solve really hard problems because to solve hard problems, you have to go into the unknown. You have to do things that haven’t been done before and you have to take risks, calculated risks, you have to do all kinds of safety precautions, but risks nevertheless. I just wish there would be more celebration of that, of the risk taking versus people just waiting on the sidelines waiting for failure and then pointing out the failure. Yeah, it sucks. But in this case, it’s really great that everything went just flawlessly, but it’s unnecessary pressure, I would say.

Matthew MacDougall (04:19:41) Now that there’s a human with literal skin in the game, there’s a participant whose well-being rides on this doing well. You have to be a pretty bad person to be rooting for that to go wrong. And so hopefully people look in the mirror and realize that at some point.

Lex Fridman (04:20:01) So did you get to actually front row seat, watch the robot work? You get to see the whole thing?

Matthew MacDougall (04:20:08) Yeah, because an MD needs to be in charge of all of the medical decision-making throughout the process, I unscrubbed from the surgery after exposing the brain and presenting it to the robot and placed the targets on the robot software interface that tells the robot where it’s going to insert each thread. That was done with my hand on the mouse, for whatever that’s worth.

Lex Fridman (04:20:39) So you were the one placing the targets?

Lex Fridman (04:20:42) Oh, cool. So the robot with a computer vision provides a bunch of candidates and you kind of finalize the decision.

Matthew MacDougall (04:20:52) Right. The software engineers are amazing on this team, and so they actually provided an interface where you can essentially use a lasso tool and select a prime area of brain real estate, and it will automatically avoid the blood vessels in that region and automatically place a bunch of targets. That allows the human robot operator to select really good areas of brain and make dense applications of targets in those regions, the regions we think are going to have the most high fidelity representations of finger movements and arm movement intentions.

Lex Fridman (04:21:37) I’ve seen images of this, and for me, with OCD, they’re for some reason really pleasant. I think there’s a Subreddit called Oddly Satisfying.

Matthew MacDougall (04:21:46) Yeah, love that Subreddit.

Lex Fridman (04:21:49) It’s oddly satisfying to see the different target sites avoiding the blood vessels and also maximizing the usefulness of those locations for the signal. It just feels good. It’s like, ah.

Matthew MacDougall (04:22:02) As a person who has a visceral reaction to the brain bleeding, I can tell you it’s extremely satisfying watching the electrodes themselves go into the brain and not cause bleeding.

Lex Fridman (04:22:12) Yeah. Yeah. So you said the feeling was of relief when everything went perfectly?

Brain surgery details

Lex Fridman (04:22:20) How deep in the brain can you currently go and eventually go, let’s say on the Neuralink side. It seems the deeper you go in the brain, the more challenging it becomes.

Matthew MacDougall (04:22:34) Yeah. So talking broadly about neurosurgery, we can get anywhere. It’s routine for me to put deep brain stimulating electrodes near the very bottom of the brain, entering from the top and passing about a two millimeter wire all the way into the bottom of the brain. And that’s not revolutionary, a lot of people do that, and we can do that with very high precision. I use a robot from Globus to do that surgery several times a month. It’s pretty routine.

Lex Fridman (04:23:12) What are your eyes in that situation? What are you seeing? What kind of technology can you use to visualize where you are to light your way?

Matthew MacDougall (04:23:20) Yeah, so it’s a cool process on the software side. You take a preoperative MRI that’s extremely high resolution, data of the entire brain, you put the patient to sleep, put their head in a frame that holds the skull very rigidly, and then you take a CT scan of their head while they’re asleep with that frame on and then merge the MRI and the CT in software. You have a plan based on the MRI where you can see these nuclei deep in the brain. You can’t see them on CT, but if you trust the merging of the two images, then you indirectly know on the CT where that is, and therefore indirectly know where in reference to the titanium frame screwed to their head those targets are. And so this is sixties technology to manually compute trajectories given the entry point and target and dial in some goofy looking titanium manual actuators with little tick marks on them.

(04:24:32) The modern version of that is to use a robot. Just like a little Kuka arm you might see building cars at the Tesla factory, this small robot arm can show you the trajectory that you intended from the pre-op MRI and establish a very rigid holder through which you can drill a small hole in the skull and pass a small, hollow, rigid wire deep into that area of the brain, and put your electrode through that hollow wire and then remove all of that except the electrode. So you end up with the electrode very, very precisely placed far from the skull surface. Now, that’s standard technology that’s already been out in the world for a while. Neuralink right now is focused entirely on cortical targets, surface targets, because there’s no trivial way to get, say, hundreds of wires deep inside the brain without doing a lot of damage. So your question, what do you see? Well, I see an MRI on a screen. I can’t see everything that DBS electrode is passing through on its way to that deep target.

(04:25:48) And so it’s accepted with this approach that there’s going to be about one in a hundred patients who have a bleed somewhere in the brain as a result of passing that wire blindly into the deep part of the brain. That’s not an acceptable safety profile for Neuralink. We start from the position that we want this to be dramatically maybe two or three orders of magnitude safer than that, safe enough, really, that you or I, without a profound medical problem, might on our lunch break someday say, “Yeah, sure, I’ll get that. I’d been meaning to upgrade to the latest version.” And so the safety constraints given that are high, and so we haven’t settled on a final solution for arbitrarily approaching deep targets in the brain.

Lex Fridman (04:26:46) It’s interesting because you have to avoid blood vessels somehow, and you have to… Maybe there’s creative ways of doing the same thing, like mapping out high resolution geometry of blood vessels, and then you can go in blind, but how do you map out that in a way that’s super stable? There’s a lot of interesting challenges there, right?

Lex Fridman (04:27:06) But there’s a lot to do on the surface.

Matthew MacDougall (04:27:07) Exactly. So we’ve got vision on the surface. We actually have made a huge amount of progress sewing electrodes into the spinal cord as a potential workaround for a spinal cord injury that would allow a brain mounted implant to translate motor intentions to a spine mounted implant that can affect muscle contractions in previously paralyzed arms and legs.

Lex Fridman (04:27:36) That’s mind-blowing. That’s just incredible. So the effort there is to try to bridge the brain to the spinal cord, to the peripheral nervous system… So how hard is that to do?

Matthew MacDougall (04:27:47) We have that working in very crude forms in animals.

Matthew MacDougall (04:27:53) Yeah, we’ve done…

Lex Fridman (04:27:54) So similar to with Noland where he’s able to digitally move the cursor. Here you’re doing the same kind of communication, but with the effectors that you have.

Lex Fridman (04:28:07) That’s fascinating.

Matthew MacDougall (04:28:08) So we have anesthetized animals doing grasp and moving their legs in a sort of walking pattern. Again, early days, but the future is bright for this kind of thing, and people with paralysis should look forward to that bright future. They’re going to have options.

Lex Fridman (04:28:30) And there’s a lot of sort of intermediate or extra options, where you take an Optimus robot, like the arm, and are able to control the arm, the fingers and hands of the arm, as a prosthetic.

Matthew MacDougall (04:28:47) Exoskeletons are getting better too.

Lex Fridman (04:28:49) Exoskeletons. So that goes hand in hand. Although I didn’t quite understand until thinking about it deeply and doing more research about Neuralink how much you can do on the digital side. So this digital telepathy. I didn’t quite understand that you can really map the intention, as you described in the hand knob area, that you can map the intention. Just imagine it. Think about it. That intention can be mapped to actual action in the digital world, and now more and more, so much can be done in the digital world that it can reconnect you to the outside world. It can allow you to have freedom, have independence if you’re a quadriplegic. That’s really powerful. You can go really far with that.

Matthew MacDougall (04:29:40) Yeah, our first participant is… He’s incredible. He’s breaking world records left and right.

Lex Fridman (04:29:46) And he’s having fun with it. It’s great. Just going back to the surgery. Your whole journey, you mentioned to me offline you have surgery on Monday, so like you’re doing surgery all the time. Yeah. Maybe the ridiculous question, what does it take to get good at surgery?

Matthew MacDougall (04:30:04) Practice, repetitions. Same with anything else. There’s a million ways of people saying the same thing and selling books saying it, but you call it 10,000 hours, you call it spend some chunk of your life, some percentage of your life focusing on this, obsessing about getting better at it. Repetitions, humility, recognizing that you aren’t perfect at any stage along the way, recognizing you’ve got improvements to make in your technique, being open to feedback and coaching from people with a different perspective on how to do it, and then just the constant will to do better. That, fortunately, if you’re not a sociopath, I think your patients bring that with them to the office visits every day. They force you to want to do better all the time.

Lex Fridman (04:31:01) Yeah, just step up. I mean, it’s a real human being, a real human being that you can help.

Lex Fridman (04:31:08) So every surgery, even if it’s the same exact surgery, is there a lot of variability between that surgery in a different person?

Matthew MacDougall (04:31:15) Yeah. A fair bit. A good example for us is that the angle of the skull, relative to the normal plane of the body axis, over hand knob has pretty wide variation. Some people have really flat skulls and some people have really steeply angled skulls over that area, and that has consequences for how their head can be fixed in sort of the frame that we use and how the robot has to approach the skull. Yeah, people’s bodies are built as differently as the people you see walking down the street; as much variability in body shape and size as you see there, we see in brain anatomy and skull anatomy. There are some people who we’ve had to exclude from our trial for having skulls that are too thick or too thin or scalp that’s too thick or too thin. I think we have the middle 97% or so of people, but you can’t account for all human anatomy variability.

Lex Fridman (04:32:29) How much mushiness and mess is there? Because taking biology classes, the diagrams are always really clean and crisp. In neuroscience, the pictures of neurons are always really nice and [inaudible 04:32:44], but whenever I look at pictures of real brains, they’re all… I don’t know what is going on. So how messy are our biological systems in reality? How hard is it to figure out what’s going on?

Matthew MacDougall (04:32:59) Not too bad once you really get used to it. That’s where experience and skill and education really come into play: if you stare at a thousand brains, it becomes easier to kind of mentally peel back, say, for instance, the blood vessels that are obscuring the sulci and gyri, and know kind of the wrinkle pattern of the surface of the brain. Occasionally when you’re first starting to do this and you open the skull, it doesn’t match what you thought you were going to see based on the MRI. And with more experience, you learn to kind of peel back that layer of blood vessels and see the underlying pattern of wrinkles in the brain and use that as a landmark for where you are.

Lex Fridman (04:33:51) The wrinkles are a landmark?

Matthew MacDougall (04:33:53) Yeah. So I was describing hand knob earlier. That’s a pattern of the wrinkles in the brain. It’s sort of this Greek letter, omega shaped area of the brain.

Lex Fridman (04:34:04) So you could recognize the hand knob area. If I show you a thousand brains and give you one minute with each, you’d be like, “Yep, that’s that.”

Lex Fridman (04:34:13) And so there is some uniqueness to that area of the brain in terms of the geometry, the topology of the thing.

Lex Fridman (04:34:21) Where is it about in the…

Matthew MacDougall (04:34:24) So you have this strip of brain running down the top called the primary motor area, and I’m sure you’ve seen this picture of the homunculus laid over the surface of the brain, the weird little guy with huge lips and giant hands. That guy sort of lays with his legs up at the top of the brain and face arm areas farther down, and then some kind of mouth, lip, tongue areas farther down. And so the hand is right in there, and then the areas that control speech, at least on the left side of the brain in most people are just below that. And so any muscle that you voluntarily move in your body, the vast majority of that references that strip or those intentions come from that strip of brain, and the wrinkle for hand knob is right in the middle of that.

Lex Fridman (04:35:22) And vision is back here?

Lex Fridman (04:35:25) Also close to the surface.

Matthew MacDougall (04:35:27) Vision’s a little deeper. And so this gets to your question about how deep can you get. To do vision, we can’t just do the surface of the brain. We have to be able to go in, not as deep as we’d have to go for DBS, but maybe a centimeter deeper than we’re used to for hand insertions. And so that’s work in progress. That’s a new set of challenges to overcome.

Lex Fridman (04:35:55) By the way, you mentioned the Utah Array and I just saw a picture of that and that thing looks terrifying.

Matthew MacDougall (04:36:02) Yeah. The nails.

Lex Fridman (04:36:04) It’s because it’s rigid and then if you look at the threads, they’re flexible. What can you say that’s interesting to you about that kind of approach of the flexible threads to deliver the electrodes next to the neurons?

Matthew MacDougall (04:36:18) Yeah. I mean, the goal there comes from experience. I mean, we stand on the shoulders of people that made Utah Arrays and used Utah Arrays for decades before we ever even came along. Neuralink arose, and partly this approach to the technology arose, out of a need recognized after Utah Arrays would fail routinely, because the rigid electrodes, those spikes that are literally hammered using an air hammer into the brain, those spikes generate a bad immune response that encapsulates the electrode spikes in scar tissue, essentially. And so one of the projects that was being worked on in the Andersen Lab at Caltech when I got there was to see if you could use chemotherapy to prevent the formation of scars. Things are pretty bad when you’re jamming a bed of nails into the brain, and then treating that with chemotherapy to try to prevent scar tissue, it’s like, maybe we’ve gotten off track here, guys. Maybe there’s a fundamental redesign necessary.

(04:37:32) And so Neuralink’s approach of using highly flexible, tiny electrodes avoids a lot of the bleeding, avoids a lot of the immune response that ends up happening when rigid electrodes are pounded into the brain. And so what we see is our electrode longevity and functionality and the health of the brain tissue immediately surrounding the electrode is excellent. I mean, it goes on for years now in our animal models.

Lex Fridman (04:38:03) What do most people not understand about the biology of the brain? We will mention the vasculature. That’s really interesting.

Matthew MacDougall (04:38:10) I think the most interesting, maybe underappreciated, fact is that it really does control almost everything. I don’t know, for an out-of-the-blue example, imagine you want a lever on fertility. You want to be able to turn fertility on and off. There are legitimate targets in the brain itself to modulate fertility. Or say blood pressure: you want to modulate blood pressure, there are legitimate targets in the brain for doing that. Things that aren’t immediately obvious as brain problems are potentially solvable in the brain. And so I think it’s an under-explored area for primary treatments of all the things that bother people.

Lex Fridman (04:39:04) That’s a really fascinating way to look at it. There’s a lot of conditions we might think have nothing to do with the brain, but they might just be symptoms of something that actually started in the brain. The actual source of the problem, the primary source is something in the brain.

Matthew MacDougall (04:39:19) Yeah. Not always. I mean, kidney disease is real, but there are levers you can pull in the brain that affect all of these systems.

Lex Fridman (04:39:32) On-off switches and knobs in the brain from which this all originates. Would you have a Neuralink chip implanted in your brain?

Matthew MacDougall (04:39:42) Yeah. I think the use case right now is using a mouse, right? I can already do that, and so there’s no value proposition. On safety grounds alone, sure, I’ll do it tomorrow.

Lex Fridman (04:39:59) You know, when you say the use case of the mouse, is it…

Lex Fridman (04:40:00) The use case of the mouse, after researching all this, and part of it is just watching Noland have so much fun: if you can get that bits per second to be really high with the mouse, being able to interact… because if you think about the way, on the smartphone, the way you swipe, that was transformational. How we interact with the thing, it’s subtle, you don’t realize it, but to be able to touch a phone and to scroll with your finger, that changed everything. People were sure you needed a keyboard to type. There’s a lot of HCI aspects to that that changed how we interact with computers, so there could be a certain rate of speed with the mouse that would change everything. You might be able to just click around a screen extremely fast. I can’t see myself getting a Neuralink for much more rapid interaction with the digital devices.

Matthew MacDougall (04:41:03) Yeah, I think recording speech intentions from the brain might change things as well, the value proposition for the average person. A keyboard is a pretty clunky human interface, requires a lot of training. It’s highly variable in the maximum performance that the average person can achieve. I think taking that out of the equation and just having a natural word to computer interface might change things for a lot of people.

Lex Fridman (04:41:40) It’d be hilarious if that is the reason people do it. Even if you have speech-to-text that’s extremely accurate, and it currently isn’t, but let’s say it’s gotten super accurate, it’d be hilarious if people went for Neuralink just so they avoid the embarrassing aspect of speaking, looking like a douchebag speaking to your phone in public, which is a real constraint.

Matthew MacDougall (04:42:03) I mean with a bone conducting case, that can be an invisible headphone, say, and the ability to think words into software and have it respond to you. That starts to sound sort of like embedded super intelligence. If you can silently ask for the Wikipedia article on any subject and have it read to you without any observable change happening in the outside world. For one thing, standardized testing is obsolete.

Lex Fridman (04:42:43) If it’s done well on the UX side, it could change… I don’t know if it transforms society, but it really can create a kind of shift in the way we interact with digital devices, in the way that a smartphone did. I’d just have to look into the safety of everything involved, but I would totally try it. So it doesn’t have to go to some incredible thing where it connects your vision or connects all over your brain. That could be just connecting to the hand knob. You might have a lot of interesting human-computer interaction possibilities. That’s really interesting.

Matthew MacDougall (04:43:22) And the technology on the academic side is progressing at light speed here. There was a really amazing paper out of UC Davis at Sergey Stavisky’s lab that basically made an initial solve of speech decode. It was something like 125,000 words that they were getting with very high accuracy, which is-

Lex Fridman (04:43:47) So you’re just thinking the word?

Lex Fridman (04:43:49) Thinking the word and you’re able to get it?

Lex Fridman (04:43:51) Oh, boy. You have to have the intention of speaking it. So do the inner voice. Man, it’s so amazing to me that you can do the intention, the signal mapping. All you have to do is just imagine yourself doing it. And if you get the feedback that it actually worked, you can get really good at that. Your brain will first of all adjust and you develop, like any other skill, like touch typing. You develop in that same kind of way.

(04:44:24) To me, it’s just really fascinating to be able to even to play with that, honestly, I would get a Neuralink just to be able to play with that, just to play with the capacity, the capability of my mind to learn this skill. It’s like learning the skill of typing and learning the skill of moving a mouse. It’s another skill of moving the mouse, not with my physical body, but with my mind.

Matthew MacDougall (04:44:47) I can’t wait to see what people do with it. I feel like we’re cavemen right now. We’re banging rocks with a stick and thinking that we’re making music. At some point when these are more widespread, there’s going to be the equivalent of a piano that someone can make art with their brain in a way that we didn’t even anticipate. Looking forward to it.

Lex Fridman (04:45:12) Give it to a teenager. Anytime I think I’m good at something I’ll always go to… I don’t know. Even with the bits per second and playing a video game, you realize, you give it to a teenager, you give a Neuralink to a teenager, just a large number of them, the kind of stuff they get good at, they’re going to get hundreds of bits per second. Even just with the current technology.

Matthew MacDougall (04:45:37) Probably. Probably.

Lex Fridman (04:45:41) Because it’s also addicting, the number go up aspect of it of improving and training. It is almost like a skill and plus there’s the software on the other end that adapts to you, and especially if the adapting procedure algorithm becomes better and better and better. You’re like learning together.

Matthew MacDougall (04:45:59) Yeah, we’re scratching the surface on that right now. There’s so much more to do.

Lex Fridman (04:46:03) So on the complete other side of it, you have an RFID chip implanted in you?

Matthew MacDougall (04:46:13) Little subtle thing.

Lex Fridman (04:46:14) It’s a passive device that you use for unlocking a safe with top secrets or what do you use it for? What’s the story behind it?

Matthew MacDougall (04:46:23) I’m not the first one. There’s this whole community of weirdo biohackers that have done this stuff, and I think one of the early use cases was storing private crypto wallet keys and whatever. I dabbled in that a bit and had some fun with it.

Lex Fridman (04:46:42) You have some Bitcoin implanted in your body somewhere. You can’t tell where. Yeah, yeah.

Matthew MacDougall (04:46:48) Actually, yeah. It was the modern day equivalent of finding change in the sofa cushions after I put some orphaned crypto on there that I thought was worthless and forgot about it for a few years. Went back and found that some community of people loved it and had propped up the value of it, and so it had gone up fifty-fold, so there was a lot of change in those cushions.

Matthew MacDougall (04:47:14) But the primary use case is mostly as a tech demonstrator. It has my business card on it. You can scan that in by touching it to your phone. It opens the front door to my house, whatever, simple stuff.

Lex Fridman (04:47:30) It’s a cool step. It’s a cool leap to implant something in your body. I mean, perhaps it’s a similar leap to a Neuralink because for a lot of people, that kind of notion of putting something inside your body, something electronic inside a biological system is a big leap.

Matthew MacDougall (04:47:45) We have a kind of mysticism around the barrier of our skin. We’re completely fine with knee replacements, hip replacements, dental implants, but there’s a mysticism still around the inviolable barrier that the skull represents, and I think that needs to be treated like any other pragmatic barrier. The question isn’t how incredible is it to open the skull? The question is what benefit can we provide?

Lex Fridman (04:48:21) So from all the surgeries you’ve done, from everything you understand the brain, how much does neuroplasticity come into play? How adaptable is the brain? For example, just even in the case of healing from surgery or adapting to the post-surgery situation.

Matthew MacDougall (04:48:36) The answer that is sad for me and other people of my demographic is that plasticity decreases with age. Healing decreases with age. I have too much gray hair to be optimistic about that. There are theoretical ways to increase plasticity using electrical stimulation. Nothing that is totally proven out as a robust enough mechanism to offer widely to people.

(04:49:06) But yeah, I think there’s cause for optimism that we might find something useful in terms of, say, an implanted electrode that improves learning. Certainly there’s been some really amazing work recently from Nicholas Schiff, Jonathan Baker and others, who have a cohort of patients with moderate traumatic brain injury who have had electrodes placed in a deep nucleus in the brain called the central median nucleus, or just near the central median nucleus, and when they apply small amounts of electricity to that part of the brain, it’s almost like electronic caffeine.

(04:49:46) They’re able to improve people’s attention and focus. They’re able to improve how well people can perform a task. I think in one case, someone who was unable to work, after the device was turned on, they were able to get a job. And that’s sort of one of the holy grails for me with Neuralink and other technologies like this is from a purely utilitarian standpoint, can we make people able to take care of themselves and their families economically again? Can we make it so someone who’s fully dependent and even maybe requires a lot of caregiver resources, can we put them in a position to be fully independent, taking care of themselves, giving back to their communities? I think that’s a very compelling proposition and what motivates a lot of what I do and what a lot of the people at Neuralink are working for.

Lex Fridman (04:50:45) It’s just a cool possibility that if you put a Neuralink in there, the brain adapts, and the other parts of the brain adapt to it and integrate it. The capacity of the brain to do that is really interesting. It’s probably unknown the degree to which you can do that, but you’re now connecting an external thing to it, especially once it’s doing stimulation. The biological brain and the electronic brain outside of it working together, the possibilities there are really interesting. It’s still unknown, but interesting. It feels like the brain is really good at adapting to whatever, but of course it is a system where everything already serves a purpose, and so you don’t want to mess with it too much.

Matthew MacDougall (04:51:39) Yeah, it’s like eliminating a species from an ecology. You don’t know what the delicate interconnections and dependencies are. The brain is certainly a delicate, complex beast, and we don’t know every potential downstream consequence of a single change that we make.

Lex Fridman (04:52:04) Do you see yourself doing, so you mentioned P1, surgeries of P2, P3, P4, P5? Just more and more and more humans.

Matthew MacDougall (04:52:14) I think it’s a certain kind of brittleness or a failure on the company’s side if we need me to do all the surgeries. I think something that I would very much like to work towards is a process that is so simple and so robust on the surgery side that literally anyone could do it. We want to get away from requiring intense expertise or intense experience to have this done and make it as simple and translatable as possible. I mean, I would love it if every neurosurgeon on the planet had no problem doing this. I think we’re probably far from a regulatory environment that would allow people that aren’t neurosurgeons to do this, but not impossible.

Lex Fridman (04:53:08) All right, I’ll sign up for that. Did you ever anthropomorphize the robot R1? Do you give it a name? Do you see it as a friend as working together with you?

Matthew MacDougall (04:53:20) I mean, to a certain degree it’s-

Lex Fridman (04:53:21) Or an enemy who’s going to take your job?

Matthew MacDougall (04:53:25) To a certain degree, yeah. It’s a complex relationship.

Lex Fridman (04:53:31) All the good relationships are.

Matthew MacDougall (04:53:32) It’s funny when in the middle of the surgery, there’s a part of it where I stand basically shoulder to shoulder with the robot, and so if you’re in the room reading the body language, it’s my brother in arms there. We’re working together on the same problem. Yeah, I’m not threatened by it.

Life and death

Lex Fridman (04:53:55) Keep telling yourself that. How have all the surgeries that you’ve done over the years, the people you’ve helped and the stakes, the high stakes that you’ve mentioned, how has that changed your understanding of life and death?

Matthew MacDougall (04:54:13) Yeah, it gives you a very visceral sense, and this may sound trite, but it gives you a very visceral sense that death is inevitable. On one hand, as a neurosurgeon, you’re deeply involved in these, just hard to fathom tragedies, young parents dying, leaving a four-year-old behind, say. And on the other hand, it takes the sting out of it a bit because you see how just mind-numbingly universal death is. There’s zero chance that I’m going to avoid it. I know techno-optimists right now and longevity buffs right now would disagree on that 0.000% estimate, but I don’t see any chance that our generation is going to avoid it. Entropy is a powerful force and we are very ornate, delicate, brittle, DNA machines that aren’t up to the cosmic ray bombardment that we’re subjected to.

(04:55:35) So on the one hand, every human that has ever lived died or will die. On the other hand, it’s just one of the hardest things to imagine inflicting on anyone that you love is having them gone. I mean, I’m sure you’ve had friends that aren’t living anymore and it’s hard to even think about them. And so I wish I had arrived at the point of nirvana where death doesn’t have a sting, I’m not worried about it. But I can at least say that I’m comfortable with the certainty of it, if not having found out how to take the tragedy out of it. When I think about my kids either not having me or me not having them or my wife.

Lex Fridman (04:56:35) Maybe I’ve come to accept the intellectual certainty of it, but it may be the pain that comes with losing the people you love. But I don’t think I’ve come to understand the existential aspect of it, that this is going to end, and I don’t mean in some trite way. I mean, it certainly feels like it’s not going to end. You live life like it’s not going to end. And the fact that this light that’s shining, this consciousness is going to no longer be in one moment, maybe today. It fills me when I really am able to load all that in with Ernest Becker’s terror. It is a real fear.

(04:57:28) I think people aren’t always honest about how terrifying it is. I think the more you are able to really think through it, the more terrifying it is. It’s not such a simple thing, “Oh, well, it’s the way life is.” If you really can load that in, it’s hard, but I think that’s why the Stoics did it, because it helps you get your shit together and be like, “The moment, every single moment you’re alive is just beautiful,” and it’s terrifying that it’s going to end, and it’s almost like you’re shivering in the cold, a child, helpless. This kind of feeling.

(04:58:10) And then it makes you, when you have warmth, when you have the safety, when you have the love to really appreciate it. I feel like sometimes in your position when you mentioned armor just to see death, it might make you not be able to see that, the finiteness of life because if you kept looking at that, it might break you. So it is good to know that you’re kind of still struggling with that. There’s the neurosurgeon and then there’s a human, and the human is still able to struggle with that and feel the fear of that and the pain of that.

Matthew MacDougall (04:58:51) Yeah, it definitely makes you ask the question of how many of these can you see and not say, “I can’t do this anymore”? But I mean you said it well, I think it gives you an opportunity to just appreciate that you’re alive today and I’ve got three kids and an amazing wife, and I am really happy. Things are good. I get to help on a project that I think matters. I think it moves us forward. I’m a very lucky person.

Lex Fridman (04:59:30) It’s the early steps of a potentially gigantic leap for humanity. It’s a really interesting one. And it’s cool because you read about all this stuff in history where it’s like the early days. I’ve been reading, before going to the Amazon, I would read about explorers that would go and explore even the Amazon jungle for the first time. It’s just those are the early steps or early steps into space, early steps in any discipline in physics and mathematics, and it’s cool because on the grand scale, these are the early steps into delving deep into the human brain, so not just observing the brain but be able to interact with the human brain. It’s going to help a lot of people, but it also might help us understand what the hell’s going on in there.

Matthew MacDougall (05:00:20) Yeah. I think ultimately we want to give people more levers that they can pull. You want to give people options. If you can give someone a dial that they can turn on how happy they are, I think that makes people really uncomfortable. But now talk about major depressive disorder. Talk about people that are committing suicide at an alarming rate in this country, and try to justify that queasiness in that light of, you can give people a knob to take away suicidal ideation, suicidal intention. I would give them that knob. I don’t know how you justify not doing that.

Lex Fridman (05:01:11) You can think about all the suffering that’s going on in the world, every single human being that’s suffering right now. It’ll be a glowing red dot. The more suffering, the more it’s glowing, and you just see the map of human suffering and any technology that allows you to dim that light of suffering on a grand scale is pretty exciting. Because there’s a lot of people suffering and most of them suffer quietly, and we look away too often, and we should remember those are suffering because once again, most of them are suffering quietly.

Matthew MacDougall (05:01:46) Well, and on a grander scale, the fabric of society. People have a lot of complaints about how our social fabric is working or not working, how our politics is working or not working. Those things are made of neurochemistry too in aggregate, right? Our politics is composed of individuals with human brains, and the way it works or doesn’t work is potentially tunable in the sense that, I don’t know, say remove our addictive behaviors or tune our addictive behaviors for social media or our addiction to outrage, our addiction to sharing the most angry political tweet we can find. I don’t think that leads to a functional society, and if you had options for people to moderate that maladaptive behavior, there could be huge benefits to society. Maybe we could all work together a little more harmoniously toward useful ends.

Lex Fridman (05:03:00) There’s a sweet spot, like you mentioned. You don’t want to completely remove all the dark sides of human nature. Those are somehow necessary to make the whole thing work, but there’s a sweet spot.

Matthew MacDougall (05:03:11) Yeah, I agree. You got to suffer a little, just not so much that you lose hope.

Consciousness

Lex Fridman (05:03:16) Yeah. When you, all the surgeries you’ve done, have you seen consciousness in there ever? Was there a glowing light?

Matthew MacDougall (05:03:22) I have this sense that I never found it, never removed it like a Dementor in Harry Potter. I have this sense that consciousness is a lot less magical than our instincts want to claim it is. It seems to me like a useful analogy for what consciousness is in the brain is that we have a really good intuitive understanding of what it means to say, touch your skin and know what’s being touched. And I think consciousness is just that level of sensory mapping applied to the thought processes in the brain itself.

(05:04:10) So what I’m saying is, consciousness is the sensation of some part of your brain being active, so you feel it working. You feel the part of your brain that thinks of red things or winged creatures or the taste of coffee. You feel those parts of your brain being active, the way that I’m feeling my palm being touched, and that sensory system that feels the brain working is consciousness.

Lex Fridman (05:04:43) That’s so brilliant. It’s the same way. It’s the sensation of touch when you’re touching a thing. Consciousness is the sensation of you feeling your brain working, your brain thinking, your brain perceiving.

Matthew MacDougall (05:04:59) Which isn’t like a warping of space-time or some quantum field effect, right? It’s nothing magical. People always want to ascribe to consciousness something truly different, and there’s this awesome long history of people looking at whatever the latest discovery in physics is to explain consciousness because it’s the most magical, the most out there thing that you can think of, and people always want to do that with consciousness. I don’t think that’s necessary. It’s just a very useful and gratifying way of feeling your brain work.

Lex Fridman (05:05:38) And as we said, it’s one heck of a brain. Everything we see around us, everything we love, everything that’s beautiful came from brains like these.

Matthew MacDougall (05:05:48) It’s all electrical activity happening inside your skull.

Lex Fridman (05:05:52) And I, for one, am grateful there’s people like you that are exploring all the ways that it works and all the ways it can be made better.

Matthew MacDougall (05:06:04) Thanks, Lex.

Lex Fridman (05:06:04) Thank you so much for talking today.

Matthew MacDougall (05:06:06) It’s been a joy.

Bliss Chapman

Lex Fridman (05:06:08) Thanks for listening to this conversation with Matthew MacDougall. Now, dear friends, here’s Bliss Chapman, brain interface software lead at Neuralink. You told me that you’ve met hundreds of people with spinal cord injuries or with ALS, and that your motivation for helping at Neuralink is grounded in wanting to help them. Can you describe this motivation?

Bliss Chapman (05:06:32) Yeah. First, just a thank you to all the people I’ve gotten a chance to speak with for sharing their stories with me. I don’t think there’s any world really in which I can share their stories in as powerful a way as they can, but just to summarize at a very high level, what I hear over and over again is that people with ALS or severe spinal cord injury, who are in a place where they basically can’t move physically anymore, really at the end of the day are looking for independence. And that can mean different things for different people.

(05:07:02) For some folks, it can mean the ability just to be able to communicate again independently without needing to wear something on their face, without needing a caretaker to be able to put something in their mouth. For some folks, it can mean independence to be able to work again, to be able to navigate a computer digitally, efficiently enough to be able to get a job, to be able to support themselves, to be able to move out and ultimately be able to support themselves after their family maybe isn’t there anymore to take care of them.

(05:07:27) And for some folks, it’s as simple as just being able to respond to their kid in time before they run away or get interested in something else. And these are deeply personal and very human problems. And what strikes me again and again when talking with these folks is that this is actually an engineering problem. This is a problem that, with the right resources, with the right team, we can make a lot of progress on. And at the end of the day, I think that’s a deeply inspiring message and something that makes me excited to get up every day.

Lex Fridman (05:08:01) So it’s both an engineering problem in terms of a BCI, for example, that can give them capabilities where they can interact with the world, but also on the other side, it’s an engineering problem for the rest of the world to make it more accessible for people living with quadriplegia?

Bliss Chapman (05:08:15) Yeah. And actually, I’ll take a broad view lens on this for a second. I think I’m very in favor of anyone working in this problem space. So beyond BCI, I’m happy and excited and willing to support any way I can, folks working on eye tracking systems, working on speech to text systems, working on head trackers or mouse sticks or quad sticks. And I’ve met many engineers and folks in the community that do exactly those things.

(05:08:38) And I think for the people we’re trying to help, it doesn’t matter what the complexity of the solution is as long as the problem is solved. And I want to emphasize that there can be many solutions out there that can help with these problems. And BCI is one of a collection of such solutions. So BCI in particular, I think offers several advantages here. And I think the folks that recognize this immediately are usually the people who have spinal cord injury or some form of paralysis.

(05:09:03) Usually you don’t have to explain to them why this might be something that could be helpful. It’s usually pretty self-evident, but for the rest of us folks that don’t live with severe spinal cord injury or who don’t know somebody with ALS, it’s not often obvious why you would want a brain implant to be able to connect and navigate a computer.

(05:09:18) And it’s surprisingly nuanced, and to the degree that I’ve learned a huge amount just working with Noland in the first Neuralink clinical trial and understanding from him and his words why this device is impactful for him, and it’s a nuanced topic. It can be the case that even if you can achieve the same thing, for example, with a mouse stick when navigating a computer, he doesn’t have access to that mouse stick every single minute of the day. He only has access when someone is available to put it in front of him. And so a BCI can really offer a level of independence and autonomy that, if it wasn’t literally physically part of your body, it’d be hard to achieve in any other way.

Lex Fridman (05:09:52) So there’s a lot of fascinating aspects to what it takes to get Noland to be able to control a cursor on the screen with his mind. You texted me something that I just love. You said, “I was part of the team that interviewed and selected P1, I was in the operating room during the first human surgery monitoring live signals coming out of the brain. I work with the user basically every day to develop new UX paradigms, decoding strategies, and I was part of the team that figured out how to recover useful BCI to new world record levels when the signal quality degraded.” We’ll talk about, I think every aspect of that, but just zooming out, what was it like to be a part of that team and part of that historic, I would say, historic first?

Bliss Chapman (05:10:38) Yeah. I think for me, this is something I’ve been excited about for close to 10 years now. And so to be able to be even just some small part of making it a reality is extremely exciting. A couple maybe special moments during that whole process that I’ll never really truly forget. One of them is entering the actual surgery. At that point in time, I know Noland quite well. I know his family. And so I think the initial reaction when Noland is rolled into the operating room is just an “Oh, shit” kind of reaction. But at that point, muscle memory kicks in and you sort of go into, you let your body just do all the talking.

(05:11:19) And I have the lucky job in that particular procedure to just be in charge of monitoring the implant. So my job is to sit there, to look at the signals coming off the implant, to look at the live brain data streaming off the device as threads are being inserted into the brain and just to basically observe and make sure that nothing is going wrong or that there’s no red flags or fault conditions that we need to go and investigate or pause the surgery to debug.

(05:11:40) And because I had that sort of spectator view of the surgery, I had a slightly more removed perspective than I think most folks in the room. I got to sit there and think to myself, “Wow, that brain is moving a lot.” When you look inside the craniectomy that we stick the threads in, one thing that most people don’t realize is the brain moves. The brain moves a lot when you breathe, your heart beats, and you can see it visibly. So that’s something that I think was a surprise to me and very, very exciting to be able to see someone’s brain who you physically know and have talked with at length, actually pulsing and moving inside their skull.

Lex Fridman (05:12:15) And they used that brain to talk to you previously, and now it’s right there moving.

Lex Fridman (05:12:21) Actually, I didn’t realize that in terms of the thread insertion, so the Neuralink implant is active during surgery and one thread at a time, you’re able to start seeing the signal?

Lex Fridman (05:12:32) So that’s part of the way you test that the thing is working?

Bliss Chapman (05:12:35) Yeah. So actually in the operating room, right after we sort of finished all the thread insertions, I started collecting what’s called broadband data. So broadband is basically the most raw form of signal you can collect from a Neuralink electrode. It’s essentially a measurement of the local field potential or the voltage essentially measured by that electrode. And we have a certain mode in our application that allows us to visualize where detected spikes are. So it visualizes where in the broadband signal, in its very, very raw form of the data, a neuron is actually spiking. And so one of these moments that I’ll never forget as part of this whole clinical trial is seeing live in the operating room while he’s still under anesthesia, beautiful spikes being shown in the application, just streaming live to a device I’m holding in my hand.

Lex Fridman (05:13:22) So this is the raw data with no signal processing, and then the signal processing is on top of it, you’re seeing the spikes detected?

Lex Fridman (05:13:30) And that’s a UX too, that looks beautiful as well.

Bliss Chapman (05:13:35) During that procedure, there were actually a lot of cameramen in the room, so they also were curious and wanted to see. There were several neurosurgeons in the room who were all just excited to see robots taking their job, and they were all crowded around a small little iPhone watching this live brain data stream out of his brain.

Lex Fridman (05:13:51) What was that like seeing the robot do some of the surgery? So the computer vision aspect where it detects all the spots that avoid the blood vessels, and then obviously with the human supervision, then actually doing the really high precision connection of the threads to the brain?

Bliss Chapman (05:14:11) That’s a good question. My answer is going to be pretty lame here, but it was boring. I’ve seen it so many times.

Lex Fridman (05:14:11) The way you want it to be.

Bliss Chapman (05:14:17) Yeah, that’s exactly how you want surgery to be. You want it to be boring. I’ve seen it so many times. I’ve seen the robot do the surgery literally hundreds of times, and so it was just one more time.

Lex Fridman (05:14:29) Yeah, all the practice surgeries and the proxies, and this is just another day.

Lex Fridman (05:14:35) So what about when Noland woke up? Do you remember a moment where he was able to move the cursor, not move the cursor, but get signal from the brain such that it was able to show that there’s a connection?

Bliss Chapman (05:14:49) Yeah. Yeah. So we are quite excited to move as quickly as we can, and Noland was really, really excited to get started. He wanted to get started, actually the day of surgery, but we waited until the next morning very patiently. It’s a long night.

Bliss Chapman (05:15:00) And the next morning in the ICU where he was recovering, he wanted to get started and actually start to understand what kind of signal we can measure from his brain. And maybe for folks who are not familiar with the Neuralink system, we implant the Neuralink system or the Neuralink implant in the motor cortex. So the motor cortex is responsible for representing things like motor intent. If you imagine closing and opening your hand, that kind of signal representation would be present in the motor cortex.

(05:15:31) If you imagine moving your arm back and forth or wiggling a pinky, this sort of signal can be present in the motor cortex. So one of the ways we start to map out what kind of signal we actually have access to in any particular individual’s brain is through this task called body mapping. And body mapping is where you essentially present a visual to the user and you say, “Hey, imagine doing this,” and the visual is a 3D hand opening and closing, or an index finger modulating up and down.

(05:15:55) And you ask the user to imagine that, and obviously you can’t see them do this, because they’re paralyzed, so you can’t see them actually move their arm. But while they do this task, you can record neural activity and you can basically offline model and check, “Can I predict, or can I detect the modulation corresponding with those different actions?” And so we did that task and we realized, “Hey, there’s actually some modulation associated with some of his hand motion,” which was a first indication that, “okay, we can potentially use that modulation to do useful things in the world.” For example, control a computer cursor.
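
To make that offline check concrete, here is a minimal sketch of how one might flag modulated channels from a body-mapping session. This is an editor’s illustration only, not Neuralink’s actual pipeline; the data shapes, the d-prime score, and the threshold are all assumptions made for the example.

```python
import numpy as np

def find_modulated_channels(rates_move, rates_rest, threshold=1.0):
    """Flag channels whose firing rate differs between imagined-movement
    and rest trials.

    rates_move: (n_move_trials, n_channels) mean firing rates during cued imagery
    rates_rest: (n_rest_trials, n_channels) mean firing rates during rest
    Returns indices of channels whose effect size (d') exceeds `threshold`.
    """
    mu_m, mu_r = rates_move.mean(axis=0), rates_rest.mean(axis=0)
    sd = np.sqrt(0.5 * (rates_move.var(axis=0) + rates_rest.var(axis=0))) + 1e-9
    d_prime = np.abs(mu_m - mu_r) / sd          # per-channel effect size
    return np.where(d_prime > threshold)[0]

# Synthetic example: 1,024 channels, a handful of trials per condition.
rng = np.random.default_rng(0)
rest = rng.poisson(5, size=(20, 1024)).astype(float)
move = rest.copy()
move[:, [37, 512]] += 8                          # two channels "modulate" with imagery
print(find_modulated_channels(move, rest))       # -> [ 37 512]
```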

(05:16:24) And he started playing with it, the first time we showed him it. And we actually just took the same live view of his brain activity and put it in front of him and we said, “Hey, you tell us what’s going on? We’re not you. You’re able to imagine different things, and we know that it’s modulating some of these neurons, so you figure out for us, what that is actually representing.” And so he played with it for a bit. He was like, “I don’t quite get it yet.” He played for a bit longer and he said, “Oh, when I move this finger, I see this particular neuron start to fire more.”

(05:16:51) And I said, “Okay, prove it. Do it again.” And so he said, “Okay, three, two, one,” boom. And the minute he moved, you can see instantaneously this neuron is firing, single neuron. I can tell you the exact channel number if you’re interested. It’s stuck in my brain now forever. But that single channel firing was a beautiful indication that it was well-behaved, really modulated neural activity that could then be used for downstream tasks, like decoding a computer cursor.

Lex Fridman (05:17:15) And when you say single channel, is that associated with a single electrode?

Bliss Chapman (05:17:18) Yeah. Channel and electrode are interchangeable.

Lex Fridman (05:17:20) And there are 1,024 of those?

Lex Fridman (05:17:25) That’s incredible that, that works. When I was learning about all this and loading it in, it was just blowing my mind that the intention, you can visualize yourself moving the finger. That can turn into a signal, and the fact that you can then skip that step and visualize the cursor moving, or have the intention of the cursor moving. And that leading to a signal that can then be used to move the cursor? There is so many exciting things there to learn about the brain, about the way the brain works, the very fact of there existing signal that can be used, is really powerful.

Lex Fridman (05:18:03) But it feels like that’s just the beginning of figuring out how that signal could be used really, really effectively? I should also just, there’s so many fascinating details here, but you mentioned the body mapping step. At least in the version I saw, that Noland was showing off, there’s a super nice interface, a graphical interface, but it just felt like I was in the future.

(05:18:28) I guess it visualizes you moving the hand, and there’s a very sexy polished interface that, “Hello,” I don’t know if there’s a voice component, but it just felt like when you wake up in a really nice video game, and this is the tutorial at the beginning of that video game. This is what you’re supposed to do. It’s cool.

Bliss Chapman (05:18:50) No, I mean the future should feel like the future.

Lex Fridman (05:18:52) But it’s not easy to pull that off. I mean, it needs to be simple, but not too simple.

Bliss Chapman (05:18:57) Yeah. And I think the UX design component here is underrated for BCI development in general. There’s a whole interaction effect between the ways in which you visualize an instruction to the user, and the kinds of signal you can get back. And that quality of your behavioral alignment to the neural signal, is a function of how good you are at expressing to the user what you want them to do. And so yeah, we spend a lot of time thinking about the UX, of how we build our applications, of how the decoder actually functions, the control surfaces it provides to the user. All these little details matter a lot.

Neural signal

Lex Fridman (05:19:27) So maybe it’d be nice to get into a little bit more detail of what the signal looks like, and what the decoding looks like?

Lex Fridman (05:19:34) So there’s an N1 implant that has, like we mentioned, 1,024 electrodes, and that’s collecting raw data, raw signal. What does that signal look like? And what are the different steps along the way before it’s transmitted, and what is transmitted? All that kind of stuff.

Bliss Chapman (05:19:56) Yep. This is going to be a fun one. Grab the [inaudible 05:19:58].

Bliss Chapman (05:19:59) So maybe before diving into what we do, it’s worth understanding what we’re trying to measure, because that dictates a lot of the requirements for the system that we build. And what we’re trying to measure is really individual neurons producing action potentials. And an action potential, you can think of it like a little electrical impulse that you can detect if you’re close enough. And by being close enough, I mean within let’s say 100 microns of that cell. And 100 microns is a very, very tiny distance. And so the number of neurons that you’re going to pick up with any given electrode comes from just a small radius around that electrode.

(05:20:33) And the other thing worth understanding about the underlying biology here, is that when neurons produce an action potential, the width of that action potential is about one millisecond. So from the start of the spike, to the end of the spike, that whole width of that characteristic feature, of a neuron firing, is one millisecond wide. And if you want to detect that an individual spike is occurring or not, you need to sample that signal, or sample the local field potential nearby that neuron, much more frequently than once a millisecond. You need to sample many, many times per millisecond to be able to detect that this is actually the characteristic waveform of a neuron producing an action potential.

(05:21:07) And so we sample across all 1,024 electrodes, about 20,000 times a second. 20,000 times a second means for any given one millisecond window, we have about 20 samples that tell us what the exact shape of that action potential looks like. And once we’ve sort of sampled at super high rate the underlying electrical field nearby these cells, we can process that signal into just where do we detect a spike, or where do we not? Sort of a binary signal, one or zero. Do we detect a spike in this one millisecond or not?

(05:21:39) And we do that because the actual information-carrying subspace of neural activity is just when spikes are occurring. Essentially everything that we care about for decoding can be captured or represented in the frequency characteristics of spike trains. Meaning, how often are spikes firing in any given window of time. And so that allows us to do sort of a crazy amount of compression, from this very rich high-density signal, to something that’s much, much more sparse and compressible, that can be sent out over a wireless radio. Like a Bluetooth communication for example.
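
As a rough illustration of the compression being described, here is a sketch of collapsing 20 kHz samples into one-millisecond binary spike bins. The function name and shapes are assumptions chosen for the example, not the actual on-device format.

```python
import numpy as np

def bin_spikes(spike_samples, n_samples, fs=20_000, bin_ms=1.0):
    """Collapse detected spike sample indices into a binary per-bin signal.

    spike_samples: sample indices (at fs Hz) where a spike was detected
    Returns a 0/1 array with one entry per `bin_ms` window.
    """
    samples_per_bin = int(fs * bin_ms / 1000)        # 20 samples per 1 ms bin
    n_bins = n_samples // samples_per_bin
    binned = np.zeros(n_bins, dtype=np.uint8)
    bins = np.asarray(spike_samples) // samples_per_bin
    binned[bins[bins < n_bins]] = 1                  # "was there a spike in this ms?"
    return binned

# One second of data: 20,000 samples -> 1,000 one-millisecond bins.
print(bin_spikes([150, 151, 4_000, 19_999], n_samples=20_000).sum())  # -> 3 bins contain spikes
```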

Lex Fridman (05:22:14) Quick tangents here. You mentioned electrode neuron, there’s a local neighborhood of neurons nearby. How difficult is it to isolate from where the spike came from?

Bliss Chapman (05:22:30) So there’s a whole field of academic neuroscience work on exactly this problem, of basically given a single electrode, or given a set of electrodes measuring a set of neurons. How can you sort, spike sort, which spikes are coming from what neuron? And this is a problem that’s pursued in academic work, because you care about it for understanding what’s going on in the underlying neuroscience of the brain. If you care about understanding how the brain’s representing information, how that’s evolving through time, then that’s a very, very important question to understand.

(05:23:02) For the engineering side of things, at least at the current scale, if the number of neurons per electrode is relatively small, you can get away with basically ignoring that problem completely. You can think of it like a random projection of neurons to electrodes, and there may be in some cases more than one neuron per electrode. But if that number is small enough, those signals can be thought of as sort of a union of the two.

(05:23:25) And for many applications, that’s a totally reasonable trade-off to make, and can simplify the problem a lot. And as you sort of scale out channel count, the relevance of distinguishing individual neurons becomes less important. Because you have more overall signal, and you can start to rely on correlations or covariance structure in the data to help understand when that channel is firing… What does that actually represent? Because you know that when that channel’s firing in concert with these other 50 channels, that means move left. But when that same channel’s firing in concert with these other 10 channels, that means move right.

Lex Fridman (05:23:53) Okay. So you have to do this kind of spike detection onboard, and you have to do that super efficiently? So fast, and not use too much power, because you don’t want to be generating too much heat, so it’d have to be a super simple signal processing step?

Lex Fridman (05:24:11) Is there some wisdom you can share about what it takes to overcome that challenge?

Bliss Chapman (05:24:17) Yeah. So we’ve tried many different versions of basically turning this raw signal into a feature that you might want to send off the device. And I’ll say that I don’t think we’re at the final step of this process, this is a long journey. We have something that works clearly today, but there can be many approaches that we find in the future that are much better than what we do right now. So some versions of what we do right now, and there’s a lot of academic heritage to these ideas, so I don’t want to claim that these are original Neuralink ideas or anything like that.

(05:24:44) But one of these ideas is basically to build sort of like a convolutional filter almost, if you will. That slides across the signal and looks for a certain template to be matched. That template consists of how deep the spike modulates, how much it recovers, and what the duration and window of time is for the whole process. And if you can see in the signal that that template is matched within certain bounds, then you can say, “Okay, that’s a spike.” One reason that approach is super convenient, is that you can actually implement that extremely efficiently in hardware. Which means that you can run it in low power across 1,024 channels all at once.
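
A toy software version of that template-matching idea might look like the following. This is an editor’s sketch only; the real detector runs in hardware, and its template parameterization and threshold are not public.

```python
import numpy as np

def _normalize(x):
    """Zero-mean, unit-norm copy of a waveform window."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return x / (np.linalg.norm(x) + 1e-12)

def detect_spikes(trace, template, threshold=0.8):
    """Slide a spike template across a filtered voltage trace and mark matches.

    trace: 1-D voltage samples from one channel
    template: expected spike waveform (e.g., ~20 samples = 1 ms at 20 kHz)
    Returns sample indices where the normalized correlation exceeds `threshold`.
    """
    t = _normalize(template)
    hits = []
    for i in range(len(trace) - len(t)):
        if np.dot(_normalize(trace[i:i + len(t)]), t) > threshold:
            hits.append(i)        # the waveform here matches the spike shape
    return hits
```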

(05:25:20) Another approach that we’ve recently started exploring, and this can be combined with the spike detection approach, is something called spike band power. And the benefits of that approach are that you may be able to pick up some signal from neurons that are maybe too far away to be detected as a spike, because the farther away you are from an electrode, the weaker that actual spike waveform will look like on that electrode. So you might be able to pick up population level activity of things that are maybe slightly outside the normal recording radius… What neuroscientists sometimes refer to as the hash of activity, the other stuff that’s going on. And you can look at across many channels how that background noise is behaving, and you might be able to get more juice out of the signal that way.

(05:25:59) But it comes at a cost. That signal is now a floating point representation, which means it’s more expensive to send out in terms of power. It means you have to find different ways to compress it that are different than what you can apply to binary signals. So there’s a lot of different challenges associated with these different modalities.
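
For comparison, a spike-band-power feature might be computed roughly like this. It is an editor’s sketch; the band edges and bin size are assumptions, chosen to be in the range commonly used in the literature rather than taken from the device.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def spike_band_power(trace, fs=20_000, band=(300, 3_000), bin_ms=20):
    """Bandpass the raw trace into the spiking band and average power per bin.

    Unlike the 0/1 spike-detection output, the result is a floating-point
    feature per bin, so it costs more to compress and transmit.
    """
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, trace)
    samples_per_bin = int(fs * bin_ms / 1000)
    n_bins = len(filtered) // samples_per_bin
    chunks = filtered[: n_bins * samples_per_bin].reshape(n_bins, samples_per_bin)
    return (chunks ** 2).mean(axis=1)                # mean power per bin
```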

Lex Fridman (05:26:12) So also in terms of communication, you’re limited by the amount of data you can send?

Latency

Lex Fridman (05:26:17) And also because you’re currently using the Bluetooth protocol, you have to batch stuff together? But you have to also do this, keeping the latency crazy low? Crazy low? Anything to say about the latency?

Bliss Chapman (05:26:32) Yeah. This is a passion project of mine. So I want to build the best mouse in the world. I don’t want to build the Chevrolet Spark or whatever of electric cars. I want to build the Tesla Roadster version of a mouse. And I really do think it’s quite possible that within five to 10 years that most eSports competitions are dominated by people with paralysis.

(05:26:54) This is a very real possibility for a number of reasons. One is that they’ll have access to the best technology to play video games effectively. The second is they have the time to do so. So those two factors together are particularly potent for eSport competitors.

Lex Fridman (05:27:07) Unless, people without paralysis are also allowed to implant N1?

Lex Fridman (05:27:13) Which, it is another way to interact with a digital device, and there’s something to that, if it’s a fundamentally different experience, more efficient experience? Even if it’s not like some kind of full-on high bandwidth communication, if it’s just the ability to move the mouse 10X faster, like the bits per second? If I can achieve a bits per second at 10X what I can do with a mouse, that’s a really interesting possibility of what that can do? Especially as you get really good at it. With training.

Bliss Chapman (05:27:47) It’s definitely the case that you have a higher ceiling performance, because you don’t have to buffer your intention through your arm, through your muscle. You get just by nature of having a brain implant at all, like 75 millisecond lead time on any action that you’re actually trying to take. And there’s some nuance to this, there’s evidence that the motor cortex, you can sort of plan out sequences of actions, so you may not get that whole benefit all the time. But for reaction time style games, where you just want to… Somebody’s over here, snipe them, that kind of thing? You actually do have just an inherent advantage, because you don’t need to go through muscle.

(05:28:18) So the question is, just how much faster can you make it? And we’re already faster than what you would do if you’re going through muscle from a latency point of view, and we’re in the early stages of that. I think we can push it. So our end to end latency right now from brain spike to cursor movement, it’s about 22 milliseconds. If you think about the best mice in the world, the best gaming mice, that’s about five milliseconds ish of latency, depending on how you measure, depending how fast your screen refreshes, there’s a lot of characteristics that matter there. And the rough time for a neuron in the brain to actually impact your command of your hand is about 75 milliseconds.

(05:28:50) So if you look at those numbers, you can see that we’re already competitive and slightly faster than what you’d get by actually moving your hand. And this is something that if you ask Noland about it, when he moved the cursor for the first time… We asked him about this, it was something I was super curious about. “What does it feel like when you’re modulating a click intention, or when you’re trying to just move the cursor to the right?” He said it moves before he is actually intending it to. Which is kind of a surreal thing, and something that I would love to experience myself one day, what is that like to have the thing just be so immediate, so fluid, that it feels like it’s happening before you’re actually intending it to move?

Lex Fridman (05:29:25) Yeah. I suppose we’ve gotten used to that latency, that natural latency that happens. So is currently the bottleneck, the communication? So the Bluetooth communication? What’s the actual bottleneck? I mean there’s always going to be a bottleneck, what’s the current bottleneck?

Bliss Chapman (05:29:38) Yeah. A couple things. So kind of hilariously, the Bluetooth Low Energy protocol has some restrictions on how fast you can communicate. So the protocol itself establishes a standard where the most frequent sort of updates you can send are on the order of 7.5 milliseconds. And as we push latency down to the level of individual spikes impacting control, that level of resolution, that kind of protocol is going to become a limiting factor at some scale.
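
Back-of-the-envelope arithmetic for why that 7.5 ms connection interval starts to matter, assuming one bit per channel per one-millisecond bin (a simplification of the real packet format):

```python
N_CHANNELS = 1_024
BLE_MIN_INTERVAL_MS = 7.5   # minimum BLE connection interval per the Bluetooth spec

bins_per_update = BLE_MIN_INTERVAL_MS / 1.0          # 1 ms spike bins batched per BLE update
payload_bytes = N_CHANNELS * bins_per_update / 8     # one bit per channel per bin
print(bins_per_update, "bins per update,", payload_bytes, "bytes")
# -> 7.5 bins per update, 960.0 bytes
# Even with binary spike flags, every BLE update necessarily batches several
# milliseconds of bins, so sub-millisecond update latency is out of reach here.
```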

(05:30:06) Another sort of important nuance to this, is that it’s not just the Neuralink itself that’s part of this equation. If you start pushing latency below the level of how fast you’re going to refresh, then you have another problem. You need your whole system to be able to be as reactive as the limits of what the technology can offer.

Bliss Chapman (05:30:26) 120 hertz just doesn’t work anymore, if you’re trying to have something respond at the level of one millisecond.

Lex Fridman (05:30:32) That’s a really cool challenge. I also like that for a T-shirt, the best mouse in the world. Tell me on the receiving end, so the decoding step? Now we figured out what the spikes are, we’ve got them all together, now we’re sending that over to the app. What’s the decoding step look like?

Bliss Chapman (05:30:49) Yeah. So maybe first, what is decoding? I think there’s probably a lot of folks listening that just have no clue what it means to decode brain activity.

Lex Fridman (05:30:56) Actually, even if we zoom out beyond that, what is the app? So there’s an implant that’s wirelessly communicating with any digital device that has an app installed.

Lex Fridman (05:31:08) So maybe can you tell me at high-level what the app is, what the software is outside of the brain?

Bliss Chapman (05:31:15) So maybe working backwards from the goal. The goal is to help someone with paralysis. In this case, Noland. Be able to navigate his computer independently. And we think the best way to do that, is to offer them the same tools that we have to navigate our software. Because we don’t want to have to rebuild an entire software ecosystem for the brain, at least not yet. Maybe someday you can imagine there’s UXs that are built natively for BCI, but in terms of what’s useful for people today, I think most people would prefer to be able to just control mouse and keyboard inputs, to all the applications that they want to use for their daily jobs, for communicating with their friends, et cetera.

(05:31:47) And so the job of the application is really to translate this wireless stream of brain data, coming off the implant, into control of the computer. And we do that by essentially building a mapping from brain activity to sort of the HID inputs, to the actual hardware. So HID is just the protocol for communicating like input device events, so for example, move mouse to this position or press this key down. And so that mapping is fundamentally what the app is responsible for. But there’s a lot of nuance of how that mapping works, and we spent a lot of time to try to get it right, and we’re still in the early stages of a long journey to figure out how to do that optimally.
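
In pseudocode form, the application’s job described here is roughly a decode-then-inject loop. This is an editor’s sketch; `get_latest_features`, the linear decoder, and the use of `pyautogui` as a stand-in for a real HID injection path are all assumptions, not the actual Neuralink app.

```python
import time
import pyautogui   # stand-in for a real HID injection layer (move mouse, press keys)

def decode_velocity(features, weights):
    """Toy linear decoder: binned spike features -> (vx, vy) in pixels/second."""
    return features @ weights                      # weights shape: (n_channels, 2)

def control_loop(get_latest_features, weights, hz=50):
    """Pull the latest features streamed off the implant, decode a velocity,
    and translate it into HID-style relative mouse moves."""
    dt = 1.0 / hz
    while True:
        vx, vy = decode_velocity(get_latest_features(), weights)
        pyautogui.moveRel(int(vx * dt), int(vy * dt))   # emit a relative mouse move
        time.sleep(dt)
```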

(05:32:21) So one part of that process is decoding. So decoding is this process of taking the statistical patterns of brain data, that’s being channeled across this Bluetooth connection to the application. And turning it into, for example, a mouse movement. And that decoding step, you can think of it in a couple of different parts. So similar to any machine learning problem, there’s a training step, and there’s an [inaudible 05:32:39] step. The training step in our case is a very intricate behavioral process where the user has to imagine doing different actions. So for example, they’ll be presented a screen with a cursor on it, and they’ll be asked to push that cursor to the right. Then imagine pushing that cursor to the left, push it up, push it down. And we can basically build up a pattern, using any sort of modern ML method, of mapping: given this brain data and that imagined behavior, map one to the other.

(05:33:07) And then at test time you take that same pattern matching system. In our case it’s a deep neural network, and you run it and you take the live stream of brain data coming off their implant, you decode it by pattern matching to what you saw at calibration time, and you use that for a control of the computer. Now a couple sort of rabbit holes that I think are quite interesting. One of them has to do with how you build that best template matching system. Because there’s a variety of behavioral challenges and also debugging challenges when you’re working with someone who’s paralyzed.
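
The train-then-run pattern just described might look roughly like the following, as an editor’s sketch using a small PyTorch model. The data here is random, and the architecture, sizes, and loss are assumptions, not Neuralink’s actual decoder.

```python
import torch
import torch.nn as nn

# Hypothetical calibration data: binned spike counts and the cued cursor velocity
# the user was asked to imagine (the "label"), one row per time bin.
X = torch.randn(5_000, 1_024)          # neural features
Y = torch.randn(5_000, 2)              # cued (vx, vy) targets

model = nn.Sequential(nn.Linear(1_024, 256), nn.ReLU(), nn.Linear(256, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(20):                # training step: fit brain data -> intent
    opt.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    opt.step()

# "Test time": stream live features through the same network to drive the cursor.
with torch.no_grad():
    vx, vy = model(torch.randn(1, 1_024))[0]
```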

(05:33:35) Because again, fundamentally you don’t observe what they’re trying to do, you can’t see them attempt to move their hand. And so you have to figure out a way to instruct the user to do something, and validate that they’re doing it correctly, such that then you can downstream, build with confidence, the mapping between the neural spikes and the intended action.

(05:33:53) And by doing the action correctly, what I really mean is, at this level of resolution of what neurons are doing. So if, in an ideal world, you could get a signal of behavioral intent that is ground truth accurate at the scale of one millisecond resolution, then with high confidence, I could build a mapping from my neural spikes, to that behavioral intention. But the challenge is again, that you don’t observe what they’re actually doing. And so there’s a lot of nuance to how you build user experiences, that give you more than just a coarse, on-average-correct representation of what the user’s intending to do.

(05:34:24) If you want to build the world’s best mouse, you really want it to be as responsive as possible. You want it to be able to do exactly what the user’s intending, at every step along the way, not just on average be correct, when you’re trying to move it from left to right. And building a behavioral calibration game, or our software experience, that gives you that level of resolution, is what we spend a lot of time working on.

Lex Fridman (05:34:44) So the calibration process, the interface, has to encourage precision. Meaning whatever it does, it should be super intuitive that the next thing the human is going to likely do, is exactly that intention that you need, and only that intention?

Lex Fridman (05:35:03) And you don’t have any feedback except that may be speaking to you afterwards, what they actually did, you can’t… Oh, yeah.

Lex Fridman (05:35:11) So that’s fundamentally, that is a really exciting UX challenge. Because that’s all on the UX, it’s not just about being friendly or nice or usable.

Bliss Chapman (05:35:24) User experience is how it works.

Lex Fridman (05:35:24) … it’s how it works, for the calibration. And calibration, at least at this stage of Neuralink is fundamental to the operation of the thing? And not just calibration, but continued calibration essentially?

Intention vs action

Bliss Chapman (05:35:40) You said something that I think is worth exploring there a little bit. You said it’s primarily a UX challenge, and I think a large component of it is, but there is also a very interesting machine learning challenge here. Which is given some dataset, including some on average correct behavior, of asking the user to move up, or move down, move right, move left, and given a dataset of neural spikes. Is there a way to infer, in some kind of semi-supervised, or entirely unsupervised way, what that high resolution version of their intention is?

(05:36:10) And if you think about it, there probably is, because there are enough data points in the dataset, enough constraints on your model. That there should be a way with the right sort of formulation, to let the model figure out itself, for example… At this millisecond, this is exactly how hard they’re pushing upwards, and at this millisecond, this is how hard they’re trying to push upwards.

Lex Fridman (05:36:27) It’s really important to have very clean labels, yes? So the problem becomes much harder from the machine learning perspective if the labels are noisy?

Lex Fridman (05:36:36) And then to get the clean labels, that’s a UX challenge?

Bliss Chapman (05:36:40) Correct. Although clean labels, I think maybe it’s worth exploring what that exactly means. I think any given labeling strategy will have some number of assumptions to make about what the user is attempting to do. Those assumptions can be formulated in a loss function, or they can be formulated in terms of heuristics that you might use, to just try to estimate or guesstimate what the user’s trying to do. And what really matters is, how accurate are those assumptions? For example, you might say, “Hey, user, push upwards and follow the speed of this cursor.” And your heuristic might be that they’re trying to do exactly what that cursor is trying to do.

(05:37:10) Another competing heuristic might be, they’re actually trying to go slightly faster at the beginning of the movement and slightly slower at the end. And those competing heuristics may or may not be accurate reflections of what the user is trying to do. Another version of the task might be, “Hey, user, imagine moving this cursor a fixed offset.” So rather than follow the cursor, just try to move it exactly 200 pixels to the right. So here’s the cursor, here’s the target, okay, cursor disappears, try to move that now invisible cursor, 200 pixels to the right. And the assumption in that case would be that the user can actually modulate correctly that position offset.

(05:37:41) But that position offset assumption might be a weaker assumption, and therefore potentially, you can make it more accurate, than these heuristics that are trying to guesstimate at each millisecond what the user’s trying to do. So you can imagine different tasks that make different assumptions about the nature of the user intention. And those assumptions being correct is what I would think of as a clean label.
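
To make the two competing labeling assumptions concrete, here is a tiny sketch of how each heuristic would turn a cued trial into per-bin velocity labels. It is an editor’s illustration; names, shapes, and the constant-velocity simplification are assumptions.

```python
import numpy as np

def labels_follow_cursor(cued_cursor_velocity):
    """Heuristic A: assume the user's intent exactly tracks the cued cursor,
    millisecond by millisecond."""
    return np.asarray(cued_cursor_velocity, dtype=float)   # (n_bins, 2), used as-is

def labels_fixed_offset(n_bins, offset_px=(200, 0)):
    """Heuristic B: assume only that the user tries to cover a fixed offset,
    spread uniformly over the trial; no claim about millisecond-level speed."""
    per_bin = np.asarray(offset_px, dtype=float) / n_bins
    return np.tile(per_bin, (n_bins, 1))                   # constant-velocity labels
```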

Lex Fridman (05:37:59) For that step, what are we supposed to be visualizing? There’s a cursor, and you want to move that cursor to the right, or the left, or up and down, or maybe move them by a certain offset. So that’s one way. Is that the best way to do calibration?

(05:38:13) So for example, an alternative crazy way that probably is playing a role here, is a game like Webgrid. Where you’re just getting a very large amount of data, the person playing a game. Where if they’re in a state of flow, maybe you can get clean signal as a side effect?

Lex Fridman (05:38:34) Or is that not an effective way for initial calibration?

Bliss Chapman (05:38:38) Yeah. Great question. There’s a lot to unpack there. So the first thing I would draw a distinction between is, open loop versus closed loop. So open loop, what I mean by that is, the user is sort of going from zero to one. They have no model at all, and they’re trying to get to the place where they have some level of control, at all. In that setup, you really need to have some task that gives the user a hint of what you want them to do, such that you can build its mapping again, from brain data to output. Then once they have a model, you could imagine them using that model and actually adapting to it, and figuring out the right way to use it themself. And then retraining on that data to give you sort of a boost in performance.

(05:39:14) There’s a lot of challenges associated with both of these techniques, and we can rabbit hole into both of them if you’re interested. But the sort of challenge with the open loop task is that the user themself doesn’t get proprioceptive feedback about what they’re doing. They don’t necessarily perceive themself or feel the mouse under their hand, when they’re trying to do an open loop calibration. They’re being asked to perform something… Imagine if you sort of had your whole right arm numbed, and you stuck it in a box and you couldn’t see it, so you had no visual feedback and you had no proprioceptive feedback, about what the position or activity of your arm was.

(05:39:47) And now you’re asked, “Okay, given this thing on the screen, that’s moving from left to right, match that speed?” And you basically can try your best to invoke whatever that imagined action is in your brain, that’s moving the cursor from left to right. But in any situation, you’re going to be inaccurate and maybe inconsistent in how you do that task. And so that’s sort of the fundamental challenge of open loop. The challenge with closed loop is that once the user’s given a model, and they’re able to start moving the mouse on their own, they’re going to very naturally adapt to that model. And that coadaptation between the model learning what they’re doing, and the user learning how to use the model, may not find you the best sort of global minima.

(05:40:25) And maybe your first model was noisy in some ways, or maybe just had some quirk. There’s some part of the data distribution it didn’t cover super well, and the user now figures out, because they’re a brilliant user like Noland, they figure out the right sequence of imagined motions, or the right angle they have to hold their hand at to get it to work. And they’ll get it to work great, but then the next day they come back to their device, and maybe they don’t remember exactly all the tricks that they used the previous day. And so there’s a complicated sort of feedback cycle here that can emerge, and can make it a very, very difficult debugging process.

Lex Fridman (05:40:56) Okay. There’s a lot of really fascinating things there. Actually, just to stay on the closed loop… I’ve seen situations, this actually happened watching psychology grad students. They used a piece of software and they don’t know how to program themselves. They used a piece of software that somebody else wrote, and it has a bunch of bugs, and they’ve been using it for years. They figure out ways to work around it: “Oh, that just happens.” Nobody considers, “Maybe we should fix this.” They just adapt. And that’s a really interesting notion, that we’re really good at adapting, but that might not be optimal?

Lex Fridman (05:41:39) Okay. So how do you solve that problem? Do you have to restart from scratch every once in a while, kind of thing?

Bliss Chapman (05:41:44) Yeah. It’s a good question. First and foremost, I would say this is not a solved problem. And for anyone who’s listening in academia who works on BCIs, I would also say this is not a problem that’s solved by simply scaling channel count. So maybe that can help, and you can get sort of richer covariance structure that you can exploit when trying to come up with good labeling strategies. But if you’re interested in problems that aren’t going to be solved inherently by scaling channel count, this is one of them.

(05:42:08) Yeah. So how do you solve it? It’s not a solved problem. That’s the first thing I want to make sure gets across. The second thing is, any solution that involves closed loop is going to become a very difficult debugging problem. And one of my general heuristics for choosing what problems to tackle is that you want to choose the one that’s going to be the easiest to debug. Because if you can do that, even if the ceiling is lower, you’re going to be able to move faster, because you have a tighter iteration loop debugging the problem.

(05:42:34) In the open loop setting, there’s not a feedback cycle to debug with the user in the loop. And so there’s some reason to think that that should be an easier debugging problem. The other thing that’s worth understanding is that even in the closed loop setting, there’s no special software magic of how to infer what the user is truly attempting to do. In the closed loop setting, although they’re moving the cursor on the screen, they may be attempting something different than what your model is outputting. So what the model is outputting is not a signal that you can use to retrain if you want to be able to improve the model further. You still have this very complicated guesstimation, or unsupervised problem, of figuring out what is the true user intention underlying that signal?

(05:43:09) And so the open loop problem has the nice property of being easy to debug, and the second nice property of, it has all the same information and content as the closed loop scenario. Another thing I want to mention and call out, is that this problem doesn’t need to be solved in order to give useful control to people. Even today with the solutions we have now, and that academia has built up over decades, the level of control that can be given to a user today, is quite useful. It doesn’t need to be solved to get to that level of control.

(05:43:38) But again, I want to build the world’s best mouse. I want to make it so good that it’s not even a question that you want it. And to build the world’s best mouse, the superhuman version, you really need to nail that problem. And a couple maybe details of previous studies that we’ve done internally, that I think are very interesting to understand, when thinking about how to solve this problem. The first is that even when you have ground-truth data of what the user’s trying to do, and you can get this with an able-bodied monkey, a monkey that has a Neuralink device implanted, and moving a mouse to control a computer. Even with that ground-truth dataset, it turns out that the optimal thing to predict to produce high performance BCI, is not just the direct control of the mouse.

(05:44:18) You can imagine building a dataset of what’s going on in the brain, and what is the mouse exactly doing on the table? And it turns out that if you build the mapping from neural spikes to predict exactly what the mouse is doing, that model will perform worse than a model that is trained to predict higher level assumptions about what the user might be trying to do. For example, assuming that the monkey is trying to go in a straight line to the target, it turns out that making those assumptions is actually more effective in producing a model, than actually predicting the underlying hand movement.
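
The “assume they are heading straight to the target” relabeling described here can be sketched like this. It is an editor’s illustration; the constant-speed assumption and the names are made up for the example.

```python
import numpy as np

def straight_to_target_labels(cursor_positions, target, speed=1.0):
    """Ignore the recorded hand movement and relabel every time bin with a
    velocity pointing straight from the current cursor position to the target."""
    diffs = np.asarray(target, dtype=float) - np.asarray(cursor_positions, dtype=float)
    norms = np.linalg.norm(diffs, axis=1, keepdims=True) + 1e-9
    return speed * diffs / norms       # (n_bins, 2) unit vectors toward the target
```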

Lex Fridman (05:44:45) So the intention, not the physical movement, or whatever?

Lex Fridman (05:44:48) There’s obviously a really strong correlation between the two, but the intention is a more powerful thing to be chasing?

Lex Fridman (05:44:55) Well, that’s also super interesting. I mean, the intention itself is fascinating because yes, with the BCI here in this case with the digital telepathy, you’re acting on the intention, not the action. Which is why there’s an experience of feeling like it’s happening before you meant for it to happen? That is so cool. And that is why you could achieve superhuman performance, in terms of the control of the mouse? So for open loop, just to clarify, so whenever the person is tasked to move the mouse to the right, you said there’s no feedback, so they don’t get that satisfaction of actually getting it to move? Right?

Bliss Chapman (05:45:38) So you could imagine giving the user feedback on a screen, but it’s difficult, because at this point you don’t know what they’re attempting to do. So what can you show them that would basically give them a signal of, “I’m doing this correctly or not correctly?” So let’s take a very specific example. Maybe your calibration task looks like you’re trying to move the cursor, a certain position offset. So your instructions to the user are, “Hey, the cursor’s here. Now when the cursor disappears, imagine you’re moving it 200 pixels from where it was, to the right to be over this target.”

(05:46:05) In that kind of scenario, you could imagine coming up with some sort of consistency metric that you could display to the user of, “Okay, I know what the spike train looks like on average when you do this action to the right. Maybe I can produce some sort of probabilistic estimate of how likely is that to be the action you took, given the latest trial or trajectory that you imagined?” And that could give the user some sort of feedback of how consistent they are, across different trials.

(05:46:27) You could also imagine that if the user is prompted with that kind of consistency metric, that maybe they just become more behaviorally engaged to begin with, because the task is kind of boring when you don’t have any feedback at all. And so there may be benefits to the user experience of showing something on the screen, even if it’s not accurate. Just because it keeps the user motivated to try to increase that number, or push it upwards.

Lex Fridman (05:46:48) So there’s this psychology element here?

Bliss Chapman (05:46:50) Yeah. Absolutely.

Calibration

Lex Fridman (05:46:52) And again, all of that is UX challenge? How much signal drift is there hour-to-hour, day-to-day, week-to-week, month-to-month? How often do you have to recalibrate because of the signal drift?

Bliss Chapman (05:47:06) Yeah. So this is a problem we’ve worked on both with NHP, non-human primates, before our clinical trial, and then also with Noland during the clinical trial. Maybe the first thing that’s worth stating is what the goal is here. So the goal is really to enable the user to have a plug and play experience… Well, I guess they don’t have to plug anything in, but a play experience where they can use the device whenever they wanted, however they want to. And that’s really what we’re aiming for. And so there can be a set of solutions that get to that state without considering this non-stationary problem.

(05:47:38) So maybe the first solution here that’s important, is that they can recalibrate whenever they want. This is something that Noland has the ability to do today, so he can recalibrate the system at 2:00 AM, in the middle of the night without his caretaker, or parents or friends around, to help push a button for him. The other important part of the solution is that when you have a good model calibrated, that you can continue using that without needing to recalibrate it. So how often he has to do this recalibration to-date, depends really on his appetite for performance.

(05:48:06) We observe sort of a degradation through time, of how well any individual model works, but this can be mitigated behaviorally by the user adapting their control strategy. It can also be mitigated through a combination of software features that we provide to the user. For example, we let the user adjust exactly how fast the cursor is moving. We call that the gain, for example, the gain of how fast the cursor reacts to any given input intention.

(05:48:27) They can also adjust the smoothing, how smooth the output of that cursor intention actually is. They can also adjust the friction, which is how easy it is to stop and hold still. And all these software tools allow the user a great deal of flexibility and troubleshooting mechanisms to be able to solve this problem for themselves.
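
A sketch of how those three user-facing knobs could be applied to the decoded velocity (editor’s illustration; the actual parameterization in the Neuralink app is not public, so the names and defaults here are assumptions):

```python
class CursorTuning:
    """User-adjustable post-processing of the decoded velocity (a sketch only)."""

    def __init__(self, gain=1.0, smoothing=0.8, friction=0.1):
        self.gain = gain            # how fast the cursor reacts to intent
        self.smoothing = smoothing  # exponential smoothing of the output
        self.friction = friction    # suppress tiny velocities to hold still
        self._v = (0.0, 0.0)

    def apply(self, vx, vy):
        sx = self.smoothing * self._v[0] + (1 - self.smoothing) * vx * self.gain
        sy = self.smoothing * self._v[1] + (1 - self.smoothing) * vy * self.gain
        if (sx * sx + sy * sy) ** 0.5 < self.friction:   # makes it easy to stop and hold
            sx, sy = 0.0, 0.0
        self._v = (sx, sy)
        return sx, sy
```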

Lex Fridman (05:48:42) By the way, all of this is done by looking to the right side of the screen, selecting the mixer. And the mixer you have, it’s-

Bliss Chapman (05:48:48) Like DJ mode. DJ mode for your BCI.

Lex Fridman (05:48:52) I mean, it’s a really well done interface. It’s really, really well done. And so there’s that bias, that cursor drift, that Noland talked about in a stream. Although he said that you guys were just playing around with it with him, and then constantly improving. So that could have been just a snapshot of that particular moment, a particular day, where he said that there was this cursor drift and this bias that could be removed by him, I guess, by looking to the right side of the screen, or left side of the screen, to adjust the bias?

Lex Fridman (05:49:25) That’s one interface action, I guess, to adjust the bias?

Bliss Chapman (05:49:28) Yeah. So this is actually an idea that comes out of academia. There is some prior work with BrainGate clinical trial participants where they pioneered this idea of bias correction. The way we’ve done it, I think, is a very polished, very beautiful user experience, where the user can essentially flash the cursor over to the side of the screen, and it opens up a window where they can actually adjust or tune exactly the bias of the cursor. So bias, maybe for people who aren’t familiar, is just sort of what the default motion of the cursor is if you’re imagining nothing?

Bliss Chapman (05:50:00) … and it turns out that that’s one of the first qualia of the cursor control experience that’s impacted by neuron [inaudible 05:50:07]

Lex Fridman (05:50:07) Qualia of the cursor experience.

Bliss Chapman (05:50:08) I mean, I don’t know how else to describe it. I’m not the guy moving the thing.

Lex Fridman (05:50:14) It’s very poetic. I love it. The qualia of the cursor experience. Yeah, I mean it sounds poetic, but it is deeply true. There is an experience. When it works well, it is a joyful… A really pleasant experience. And when it doesn’t work well, it’s a very frustrating experience. That’s actually the art of UX, you have the possibility to frustrate people, or the possibility to give them joy.

Bliss Chapman (05:50:40) And at the end of the day, it really is truly the case that UX is how the thing works. And so it’s not just what’s showing on the screen, it’s also, what control surfaces does a decoder provide the user? We want them to feel like they’re in the F1 car, not like some minivan. And that really truly is how we think about it. Noland himself is an F1 fan. We refer to ourself as a pit crew, he really is truly the F1 driver. And there’s different control surfaces that different kinds of cars and airplanes provide the user, and we take a lot of inspiration from that when designing how the cursor should behave.

(05:51:11) And maybe one nuance of this is, even details like when you move a mouse on a MacBook trackpad, the sort of response curve of how that input that you give the trackpad translates to cursor movement is different than how it works with a mouse. When you move on the trackpad, there’s a different response function, a different curve to how much a movement translates to input to the computer than when you do it physically with a mouse. And that’s because somebody sat down a long time ago, when they’re designing the initial input systems to any computer, and they thought through exactly how it feels to use these different systems. And now we’re designing the next generation of this, input system to a computer, which is entirely done via the brain, and there’s no proprioceptive feedback, again, you don’t feel the mouse in your hand, you don’t feel the keys under your fingertips, and you want a control surface that still makes it easy and intuitive for the user to understand the state of the system, and how to achieve what they want to achieve. And ultimately the end goal is that that UX is completely… It fades in the background, it becomes something that’s so natural and intuitive that it’s subconscious to the user, and they just should feel like they have basically direct control over the cursor, just does what they want it to do. They’re not thinking about the implementation of how to make it do what they want it to do, it’s just doing what they want it to do.

Lex Fridman (05:52:17) Is there some kind of thing along the lines of Fitts’s law, where you should move the mouse in a certain kind of way that maximizes your chance to hit the target? I don’t even know what I’m asking, but I’m hoping the intention of my question will land on a profound answer. No. Is there some kind of understanding of the laws of UX, when it comes to the context of somebody using their brain to control it, that’s different than with a mouse?

Bliss Chapman (05:52:55) I think we’re in the early stages of discovering those laws, so I wouldn’t claim to have solved that problem yet, but there’s definitely some things we’ve learned that make it easier for the user to get stuff done. And it’s pretty straightforward when you verbalize it, but it takes a while to actually get to that point, when you’re in the process of debugging the stuff in the trenches.

(05:53:14) One of those things is that any machine learning system that you build has some number of errors, and it matters how those errors translate to the downstream user experience. For example, if you’re developing a search algorithm in your photos, if you search for your friend, Joe, and it pulls up a photo of your friend, Josephine, maybe that’s not a big deal, because the cost of an error is not that high. In a different scenario, where you’re trying to detect insurance fraud or something like this, and you’re directly sending someone to court because of some machine learning model output, then it makes a lot more sense to be careful about the errors; you want to be very thoughtful about how those errors translate to downstream effects.

(05:53:53) The same is true in BCI. So for example, if you’re building a model that’s decoding a velocity output from the brain, versus an output where you’re trying to modulate the left click for example, these have sort of different trade-offs of how precise you need to be before it becomes useful to the end user. For velocity, it’s okay to be on average correct, because the output of the model is integrated through time. So if the user’s trying to click at position A, and they’re currently at position B, they’re trying to navigate over time to get between those two points. And as long as the output of the model is on average correct, they can sort of steer it through time; with the user control loop in the mix, they can get to the point they want to get to.

(05:54:29) The same is not true of a click. For a click, you’re performing it almost instantly, at the scale of neurons firing. And so you want to be very sure that that click is correct, because a false click can be very destructive to the user. They might accidentally close the tab that they’re trying to do something in, and lose all their progress. They might accidentally hit some send button on some text that’s only half composed and reads funny afterwards. So there’s different sort of cost functions associated with errors in this space, and part of the UX design is understanding how to build a solution that is, when it’s wrong, still useful to the end user.
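
To make the asymmetry concrete, here is a minimal illustrative sketch (hypothetical Python, not Neuralink code) of why an on-average-correct velocity decoder is forgiving while a click decoder is not: velocity errors integrate out with the user steering in the loop, whereas a single false click is immediately destructive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Velocity decoding: noisy but on-average-correct output, integrated over time
# with the user closing the loop, still lands near the target.
target = np.array([1.0, 0.0])
pos = np.array([0.0, 0.0])
dt = 0.02                                   # assumed 50 Hz decode rate
for _ in range(300):
    intended = target - pos                 # user steers toward the remaining error
    decoded = intended + rng.normal(scale=0.5, size=2)  # noisy velocity estimate
    pos = pos + dt * decoded
print("final distance to target:", round(float(np.linalg.norm(target - pos)), 3))

# Click decoding: at 50 decisions per second, allowing at most one spurious
# click per minute (an illustrative bar) means a per-decision false-positive
# rate of roughly 1 in 3,000.
decisions_per_minute = 50 * 60
print("tolerable false-positive rate ~", 1 / decisions_per_minute)
```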

Lex Fridman (05:55:02) It’s so fascinating, assigning cost to every action when an error occurs. So every action, if an error occurs, has a certain cost, and incorporating that into how you interpret the intention, mapping it to the action is really important. I didn’t quite, until you said it, realize there’s a cost to sending the text early. It’s a very expensive cost.

Bliss Chapman (05:55:32) Yeah, it’s super annoying if you accidentally… Imagine if your cursor misclicked every once in a while. That’s super obnoxious. And the worst part of it is, usually when the user’s trying to click, they’re also holding still, because they’re over the target they want to hit and they’re getting ready to click, which means that in the datasets that we build, it’s on average the case that low speeds, or the desire to hold still, are correlated with when the user’s attempting to click.

Lex Fridman (05:55:54) Wow, that is really fascinating.

Bliss Chapman (05:55:58) People think that, “Oh, a click is a binary signal, this must be super easy to decode.” Well, yes, it is, but the bar is so much higher for it to become a useful thing for the user. And there’s ways to solve this. I mean, you can sort of take the compound approach of, “Well, let’s take five seconds to click. Let’s take a huge window of time, so we can be very confident about the answer.” But again, world’s best mouse. The world’s best mouse doesn’t take a second to click, or 500 milliseconds to click, it takes five milliseconds to click or less. And so if you’re aiming for that kind of high bar, then you really want to solve the underlying problem.

Webgrid

Lex Fridman (05:56:26) So maybe this is a good place to ask about how to measure performance, this whole bits per second. Can you explain what you mean by that? Maybe a good place to start is to talk about Webgrid as a game, as a good illustration of the measurement of performance.

Bliss Chapman (05:56:43) Yeah. Maybe I’ll take one zoom out step there, which is just explaining why we care to measure this at all. So again, our goal is to provide the user the ability to control the computer as well as I can, and hopefully better. And that means that they can do it at the same speed as what I can do, it means that they have access to all the same functionality that I have, including all those little details like command tab, command space, all this stuff, they need to be able to do it with their brain, and with the same level of reliability as what I can do with my muscles. And that’s a high bar, and so we intend to measure and quantify every aspect of that to understand how we’re progressing towards that goal.

(05:57:13) There’s many ways to measure BPS by the way, this isn’t the only way, but we present the user a grid of targets, and basically we compute a score which is dependent on how fast and accurate they can select, and then how small are the targets. And the more targets that are on the screen, the smaller they are, the more information you present per click. And so if you think about it from information theory point of view, you can communicate across different information theoretic channels, and one such channel is a typing interface, you can imagine, that’s built out of a grid, just like a software keyboard on the screen.

(05:57:41) And bits per second is a measure that’s computed by taking the log of the number of targets on the screen. You can subtract one if you care to model a keyboard, because you have to subtract one for the delete key on the keyboard. But log of the number of targets on the screen, times the number of correct selections, minus incorrect, divided by some time window, for example, 60 seconds. And that’s sort of the standard way to measure a cursor control task in academia. And all credit in the world goes to this great professor, Dr. Shenoy of Stanford, who came up with that task, and he’s also one of my inspirations for being in the field. So all the credit in the world to him for coming up with a standardized metric to facilitate this kind of bragging rights that we have now to say that Noland is the best in the world at this task with this BCI. It’s very important for progress that you have standardized metrics that people can compare across different techniques and approaches: how well does this do? So big kudos to him and to all the team at Stanford.
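
As a rough illustration of the metric just described, here is a minimal sketch in Python (illustrative, not Neuralink’s or Stanford’s actual implementation; whether you subtract one target to model a keyboard’s delete key is the variant mentioned above):

```python
import math

def webgrid_bps(n_targets: int, correct: int, incorrect: int,
                seconds: float, model_keyboard: bool = False) -> float:
    """Bits per second: log2 of the number of targets (optionally minus one for
    a delete key), times net correct selections, divided by the time window."""
    n = n_targets - 1 if model_keyboard else n_targets
    net = max(0, correct - incorrect)       # net correct selections in the window
    return math.log2(n) * net / seconds

# Example: a 35x35 grid, 100 correct and 0 incorrect selections in 60 seconds
print(round(webgrid_bps(35 * 35, 100, 0, 60.0), 1))   # ~17.1 BPS
```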

(05:58:29) Yeah, so for Noland, and for me playing this task, there’s also different modes that you can configure this task. So the Webgrid task can be presented as just sort of a left click on the screen, or you could have targets that you just dwell over, or you could have targets that you left, right click on, you could have targets that are left, right click, middle click, scrolling, clicking and dragging. You can do all sorts of things within this general framework, but the simplest, purest form is just blue targets show up on the screen, blue means left click. That’s the simplest form of the game.

(05:58:56) And the sort of prior records here in academic work and at Neuralink internally with NHPs have all been matched or beaten by Noland with his Neuralink device. So prior to Neuralink, the world record for a human using such a device is somewhere between 4.2 and 4.6 BPS, depending on exactly what paper you read and how you interpret it. Noland’s current record is 8.5 BPS, and again, the median Neuralinker performance is 10 BPS. So you can think of it roughly as, he’s at 85% of the level of control of a median Neuralinker using their cursor to select blue targets on the screen.

(05:59:35) I think there’s a very interesting journey ahead to get us to that same level of 10 BPS performance. It’s not the case that the tricks that got us from 4 to 6 BPS, and then 6 to 8 BPS are going to be the ones that get us from 8 to 10. And in my view, the core challenge here is really the labeling problem. It’s how do you understand, at a very, very fine resolution, what the user’s attempting to do? And I highly encourage folks in academia to work on this problem.

Lex Fridman (06:00:01) What’s the journey with Noland on that quest of increasing the BPS on Webgrid? In March, you said that he selected 89,285 targets in Webgrid. So he loves this game, he’s really serious about improving his performance in this game. So what is that journey of trying to figure out how to improve that performance? How much can that be done on the decoding side? How much can that be done on the calibration side? How much can that be done on the Noland side of figuring out how to convey his intention more cleanly?

Bliss Chapman (06:00:36) Yeah. No, this is a great question. So in my view, one of the primary reasons why Noland’s performance is so good is because of Noland. Noland is extremely focused and very energetic. He’ll play Webgrid sometimes for four hours in the middle of the night. From 2:00 A.M. to 6:00 A.M. he’ll be playing Webgrid, just because he wants to push it to the limits of what he can do. This is not us asking him to do that, I want to be clear. We’re not saying, “Hey, you should play Webgrid tonight.” We just gave him the game as part of our research, and he is able to play it independently and practice whenever he wants, and he really pushes hard to take the technology to its absolute limit. And he views that as his job, really, to make us be the bottleneck. And boy, has he done that well.

(06:01:16) And so the first thing to acknowledge is that he’s extremely motivated to make this work. I’ve also had the privilege to meet other clinical trial participants from BrainGate and other trials, and they very much shared the same attitude of, they viewed this as their life’s work to advance the technology as much as they can. And if that means selecting targets on the screen for four hours from 2:00 A.M. to 6:00 A.M., then so be it. And there’s something extremely admirable about that that’s worth calling out.

(06:01:42) Okay, so then how do you get from where he started, which is no cursor control, to eight BPS? I mean, when he started, there’s a huge amount of learning to do on his side and our side to figure out what’s the most intuitive control for him. And the most intuitive control for him is, you have to find the set intersection of, “Do we have the signal to decode?” So we don’t pick up every single neuron in the motor cortex, which means we don’t have representation for every part of the body. So there may be some signals that we have better decode performance on than others. For example, on his left hand, we have a lot of difficulty distinguishing his left ring finger from his left middle finger, but on his right hand, we have good control and good modulation detected from the neurons we’re able to record from for his pinky, his thumb, and his index finger. So you can imagine how these different subspaces of modulated activity intersect with what’s the most intuitive for him.

(06:02:32) And this has evolved over time, so once we gave him the ability to calibrate models on his own, he was able to go and explore various different ways to imagine controlling the cursor. For example, he can imagine controlling the cursor by wiggling his wrist side to side, or by moving his entire arm, by… I think at one point he did his feet. He tried a whole bunch of stuff to explore the space of what is the most natural way for him to control the cursor, that at the same time, it’s easy for us to decode-

Lex Fridman (06:02:54) Just to clarify, it’s through the body mapping procedure there, you’re able to figure out which finger he can move?

Bliss Chapman (06:03:02) Yes. Yeah, that’s one way to do it. Maybe one nuance of the… When he’s doing it, he can imagine many more things than we represent in that visual on the screen. So we show him, sort of abstractly, “Here’s a cursor. You figure out what works the best for you.” And we obviously have hints about what will work best from that body mapping procedure, of, “We know that this particular action we can represent well.” But it’s really up to him to go and explore and figure out what works the best.

Lex Fridman (06:03:27) But at which point does he no longer visualize the movement of his body, and is just visualizing the movement of the cursor?

Lex Fridman (06:03:34) How quickly does he get there?

Bliss Chapman (06:03:37) So this happened on a Tuesday. I remember this day very clearly, because at some point during the day, it looked like he wasn’t doing super well, it looked like the model wasn’t performing super well, and he was getting distracted, but actually, it wasn’t the case. What actually happened was, he was trying something new, where he was just controlling the cursor, so he wasn’t imagining moving his hand anymore, he was just imagining… I don’t know what it is, some abstract intention to move the cursor on the screen, and I cannot tell you what the difference between those two things are, I truly cannot. He’s tried to explain it to me before, I cannot give a first-person account of what that’s like. But the expletives that he uttered in that moment were enough to suggest that it was a very qualitatively different experience for him to just have direct neural control over a cursor.

Lex Fridman (06:04:23) I wonder if there’s a way through UX to encourage a human being to discover that, because he discovered it… Like you said to me, that he’s a pioneer. So he discovered that on his own through all of this, the process of trying to move the cursor with different kinds of intentions. But that is clearly a really powerful thing to arrive at, which is to let go of trying to control the fingers and the hand, and control the actual digital device with your mind.

Bliss Chapman (06:04:56) That’s right. UX is how it works. And the ideal UX is one that the user doesn’t have to think about what they need to do in order to get it done, it just does it.

Lex Fridman (06:05:05) That is so fascinating. But I wonder, on the biological side, how long it takes for the brain to adapt. So is it just simply learning high level software, or is there a neuroplasticity component where the brain is adjusting slowly?

Bliss Chapman (06:05:25) Yeah. The truth is, I don’t know. I’m very excited to see with sort of the second participant that I implant, what the journey is like for them, because we’ll have learned a lot more, potentially, we can help them understand and explore that direction more quickly. This wasn’t me prompting Noland to go try this, he was just exploring how to use his device and figured it out himself. But now that we know that that’s a possibility, that maybe there’s a way to, for example, hint the user, “Don’t try super hard during calibration, just do something that feels natural.” Or, “Just directly control the cursor. Don’t imagine explicit action.” And from there, we should be able to hopefully understand how this is for somebody who has not experienced that before. Maybe that’s the default mode of operation for them, you don’t have to go through this intermediate phase of explicit motions.

Lex Fridman (06:06:07) Or maybe if that naturally happens for people, you can just occasionally encourage them to allow themselves to move the cursor.

Lex Fridman (06:06:14) Actually, sometimes, just like with a four-minute mile, just the knowledge that that’s possible-

Bliss Chapman (06:06:19) Yes, pushes you to do it.

Lex Fridman (06:06:21) Enables you to do it, and then it becomes trivial. And then it also makes you wonder, this is the cool thing about humans, once there’s a lot more human participants, they will discover things that are possible.

Bliss Chapman (06:06:32) Yes. And share their experiences probably with each other.

Lex Fridman (06:06:34) Yeah, and share. And that because of them sharing it, they’ll be able to do it. All of a sudden that’s unlocked for everybody, because just the knowledge sometimes is the thing that enables you to do it.

Bliss Chapman (06:06:46) Yeah. Just to comment on that too, we’ve probably tried 1,000 different ways to do various aspects of decoding, and now we know what the right subspace is to continue exploring further. Again, thanks to Noland and the many hours he’s put into this. And so even just that helps constrain the beam search of different approaches that we could explore, and really helps accelerate, for the next person, the set of things that we’ll get to try on day one, how fast we can hopefully get them to useful control, how fast we can enable them to use it independently, and to get value out of the system. So massive hats off to Noland and all the participants that came before to make this technology a reality.

Lex Fridman (06:07:20) So how often are the updates to the decoder? ‘Cause Noland mentioned, “Okay, there’s a new update that we’re working on.” In the stream he said he plays the snake game, because it’s super hard, it’s a good way for him to test how good the update is. And he says sometimes the update is a step backwards, it’s a constant iteration. What does the update entail? Is it mostly on the decoder side?

Bliss Chapman (06:07:48) Yeah. Couple of comments. So, one, it’s probably worth drawing a distinction between research sessions, where we’re actively trying different things to understand what the best approach is, versus independent use, where we want him to have the ability to just go use the device how anybody would want to use their MacBook. So what he’s referring to is, I think, usually in the context of a research session, where we’re trying many, many different approaches to… Even unsupervised approaches, like we talked about earlier, to try to come up with better ways to estimate his true intention and decode it more accurately.

(06:08:15) And in those scenarios, we try, in any given session… He’ll sometimes work for eight hours a day, and so that can be hundreds of different models that we would try in that day. A lot of different things. Now, it’s also worth noting that we update the application he uses quite frequently, I think sometimes up to 4 or 5 times a day, we’ll update his application with different features, or bug fixes, or feedback that he’s given us.

(06:08:39) He’s a very articulate person who is part of the solution, he’s not a complaining person, he says, “Hey, here’s this thing that I’ve discovered is not optimal in my flow. Here’s some ideas how to fix it. Let me know what your thoughts are, let’s figure out how to solve it.” And it often happens that those things are addressed within a couple of hours of him giving us his feedback, that’s the kind of iteration cycle we’ll have. And so sometimes at the beginning of the session, he’ll give us feedback, and at the end of the session he’s giving us feedback on the next iteration of that process or that setup.

Lex Fridman (06:09:06) That’s fascinating, ’cause one of the things you mentioned is that there were 271 pages of notes taken from the BCI sessions, and this was just in March. So one of the amazing things about human beings is that they can provide… Especially ones who are smart, and excited, and all positive and good vibes like Noland, that they can provide feedback, continuous feedback.

Bliss Chapman (06:09:27) Yeah. Just to brag on the team a little bit, I work with a lot of exceptional people, and it requires the team being absolutely laser-focused on the user, and what will be the best for them. And it requires a level of commitment of, “Okay, this is what the user feedback was. I have all these meetings, we’re going to skip that today, and we’re going to do this.” That level of focus and commitment is, I would say, underappreciated in the world. And also, you obviously have to have the talent to be able to execute on these things effectively, and we have that in loads.

Lex Fridman (06:10:00) Yeah, and this is such an interesting space of UX design, because there’s so many unknowns here. And I can tell UX is difficult because of how many people do it poorly. It’s just not a trivial thing.

Bliss Chapman (06:10:19) Yeah. UX is not something that you can always solve by just constant iterating on different things. Sometimes you really need to step back and think globally, “Am I even in the right sort of minima to be chasing down for a solution?” There’s a lot of problems in which sort of fast iteration cycle is the predictor of how successful you’ll be. As a good example, like in an RL simulation for example, the more frequently you get reward, the faster you can progress. It’s just an easier learning problem the more frequently you get feedback. But UX is not that way, I mean, users are actually quite often wrong about what the right solution is, and it requires a deep understanding of the technical system, and what’s possible, combined with what the problem is you’re trying to solve. Not just how the user expressed it, but what the true underlying problem is to actually get to the right place.

Lex Fridman (06:11:04) Yeah, that’s the old stories of Steve Jobs rolling in there, like, “Yeah, the user is a useful signal, but it’s not a perfect signal, and sometimes you have to remove the floppy disk drive.” Or whatever the… I forgot all the crazy stories of Steve Jobs making wild design decisions. But there, some of it is aesthetic, and some of it is about the love you put into the design, which is very much a Steve Jobs, Jony Ive type thing, but when you have a human being using their brain to interact with it, it also is deeply about function, it’s not just aesthetic. And for that, you have to empathize with the human being before you, while not always listening to them directly. You have to deeply empathize. It’s fascinating. It’s really, really fascinating. And at the same time, iterate, but not iterate in small ways, sometimes a complete… Like rebuilding the design. Noland said in the early days the UX sucked, but you improved quickly. What was that journey like?

Bliss Chapman (06:12:16) Yeah, I mean, I’ll give you one concrete example. So he really wanted to be able to read manga. This is something that he… I mean, it sounds like a simple thing, but it’s actually a really big deal for him, and he couldn’t do it with his mouth stick. It wasn’t accessible, you can’t scroll with the mouth stick on his iPad on the website that he wanted to be able to use to read the newest manga, and so-

Lex Fridman (06:12:36) Might be a good quick pause to say the mouth stick is the thing he’s using. Holding a stick in his mouth to scroll on a tablet.

Bliss Chapman (06:12:44) Right. Yeah. You can imagine it’s a stylus that you hold between your teeth. Yeah, it’s basically a very long stylus.

Lex Fridman (06:12:49) It’s exhausting, it hurts, and it’s inefficient.

Bliss Chapman (06:12:54) Yeah. And maybe it’s also worth calling out, there are other alternative assistive technologies, but the particular situation Noland’s in, and this is not uncommon, and I think it’s also not well-understood by folks, is that he’s relatively spastic, so he’ll have muscle spasms from time to time. And so any assistive technology that requires him to be positioned directly in front of a camera, for example, an eye tracker, or anything that requires him to put something in his mouth just is a no-go, ’cause he’ll either be shifted out of frame when he has a spasm, or if he has something in his mouth, it’ll stab him in the face if he spasms too hard. So these kinds of considerations are important when thinking about what advantages a BCI has in someone’s life. If it fits ergonomically into your life in a way that you can use it independently when your caretaker’s not there, wherever you want to, either in the bed or in the chair, depending on your comfort level and your need to manage pressure sores, all these factors matter a lot in how good the solution is in that user’s life.

(06:13:45) So one of these very fun examples is scroll. So, again, manga is something he wanted to be able to read, and there’s many ways to do scroll with a BCI. You can imagine different gestures, for example, that the user could do that would move the page. But scroll is a very fascinating control surface, because it’s a huge thing on the screen in front of you. So any sort of jitter in the model output, any sort of error in the model output causes an earthquake on the screen. You really don’t want to have your manga page that you’re trying to read be shifted up and down a few pixels just because your scroll decoder is not completely accurate.

(06:14:19) And so this was an example where we had to figure out how to formulate the problem in a way that the errors of the system, whenever they do occur, and we’ll do our best to minimize them, but whenever those errors do occur, that it doesn’t interrupt the qualia, again, of the experience that the user is having. It doesn’t interrupt their flow of reading their book. And so what we ended up building is this really brilliant feature, called Quick Scroll, that a teammate named Bruce worked on. And Quick Scroll basically looks at the screen, and it identifies where on the screen are scroll bars. And it does this by integrating deeply with macOS to understand where the scroll bars are actively present on the screen, using the sort of accessibility tree that’s available to macOS apps. And we identified where those scroll bars are, and we provided a BCI scroll bar, and the BCI scroll bar looks similar to a normal scroll bar, but it behaves very differently, in that once you move over to it, your cursor sort of morphs onto it, it sort of attaches or latches onto it. And then once you push up or down, in the same way that you’d use a push to control the normal cursor, it actually moves the screen for you. So it’s basically like remapping the velocity to a scroll action.

(06:15:26) And the reason that feels so natural and intuitive is that when you move over to attach to it, it feels magnetic, so you’re sort of stuck onto it, and then it’s one continuous action, you don’t have to switch your imagined movement, you sort of snap onto it, and then you’re good to go. You just immediately can start pulling the page down or pushing it up. And even once you get that right, there’s so many little nuances of how the scroll behavior works to make it natural and intuitive. So one example is momentum. When you scroll a page with your fingers on the screen, you actually have some flow, it doesn’t just stop right when you lift your finger up. The same is true with BCI scroll, so we had to spend some time to figure out, “What are the right nuances when you don’t feel the screen under your fingertip anymore? What is the right sort of dynamic, or what’s the right amount of page give, if you will, when you push it, to make it flow the right amount for the user to have a natural experience reading their book?”

(06:16:15) I could tell you there’s so many little minutia of how exactly that scroll works, that we spent probably a month getting right, to make that feel extremely natural and easy for the user to navigate.
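
A minimal sketch of the latch-and-remap idea described above, with illustrative constants (this is a hypothetical reconstruction, not the actual Quick Scroll code): the cursor snaps onto a detected scroll bar, decoded vertical velocity is remapped to page scroll, and a momentum term keeps the page flowing after the push ends.

```python
from dataclasses import dataclass

@dataclass
class QuickScrollState:
    """Illustrative state for a latch-and-scroll control surface (not the real feature)."""
    latched: bool = False
    scroll_velocity: float = 0.0   # pixels per frame

SNAP_RADIUS = 40        # px: how close the cursor must be to latch onto the bar
GAIN = 3.0              # decoded velocity -> scroll velocity remapping
FRICTION = 0.92         # momentum decay per frame once the push stops

def update(state: QuickScrollState, cursor_x: float, bar_x: float,
           decoded_vy: float, pushing: bool) -> float:
    """Return the scroll delta for this frame."""
    if not state.latched and abs(cursor_x - bar_x) < SNAP_RADIUS:
        state.latched = True                       # cursor "morphs" onto the bar
    if not state.latched:
        return 0.0
    if pushing:
        state.scroll_velocity = GAIN * decoded_vy  # push up/down moves the page
    else:
        state.scroll_velocity *= FRICTION          # momentum: page keeps flowing
    return state.scroll_velocity

# Example: one push frame followed by one coasting frame
s = QuickScrollState()
print(update(s, cursor_x=990, bar_x=1000, decoded_vy=-4.0, pushing=True))
print(update(s, cursor_x=990, bar_x=1000, decoded_vy=0.0, pushing=False))
```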

Lex Fridman (06:16:25) I mean, even the scroll on a smartphone with your finger feels extremely natural and pleasant, and it probably takes an extremely long time to get that right. And actually, the same kind of visionary UX design that we were talking about, don’t always listen to the users, but also listen to them, and also have visionary, big, like throw everything out, think from first principles, but also not. Yeah, yeah. By the way, it just makes me think that scroll bars on the desktop probably have stagnated, and never taken that… ‘Cause the snap, same as snap to grid, snap to scroll bar action you’re talking about is something that could potentially be extremely useful in the desktop setting, even just for users to just improve the experience. ‘Cause the current scroll bar experience in the desktop is horrible.

Lex Fridman (06:17:20) It’s hard to find, hard to control, there’s no momentum, there’s… And the intention should be clear, when I start moving towards a scroll bar, there should be a snapping to the scroll bar action, but of course… Maybe I’m okay paying that cost, but there’s hundreds of millions of people paying that cost non-stop, but anyway. But in this case, this is necessary, because there’s an extra cost paid by Noland for the jitteriness, so you have to switch between the scrolling and the reading. There has to be a phase shift between the two, like when you’re scrolling, you’re scrolling.

Bliss Chapman (06:17:58) Right, right. So that is one drawback of the current approach. Maybe one other just sort of case study here. So, again, UX is how it works, and we think about that holistically, from the… Even the feature detection level of what we detect in the brain, to how we design the decoder, what we choose to decode, to then how it works once it’s being used by the user. So another good example in that sort of how it works once they’re actually using the decoder, the output that’s displayed on the screen is not just what the decoder says, it’s also a function of what’s going on on the screen.

(06:18:25) So we can understand, for example, that when you’re trying to close a tab, that very small, stupid little X that’s extremely tiny, which is hard to hit precisely if you’re dealing with a noisy output of the decoder, we can understand that that is a small little X you might be trying to hit, and actually make it a bigger target for you. Similar to how when you’re typing on your phone, if you are used to the iOS keyboard for example, it actually adapts the target size of individual keys based on an underlying language model. So it’ll actually understand if I’m typing, “Hey, I’m going to see L.” It’ll make the E key bigger because it knows Lex is the person I’m going to go see. And so that kind of predictiveness can make the experience much more smooth, even without improvements to the underlying decoder or feature detection part of the stack.

(06:19:07) So we do that with a feature called magnetic targets, we actually index the screen, and we understand, “Okay, these are the places that are very small targets that might be difficult to hit. Here’s the kind of cursor dynamics around that location that might be indicative of the user trying to select it. Let’s make it easier. Let’s blow up the size of it in a way that makes it easier for the user to sort of snap onto that target.” So all these little details, they matter a lot in helping the user be independent in their day-to-day living.
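
A minimal sketch of the magnetic-target idea, with made-up thresholds (not Neuralink’s actual feature): when the cursor dynamics that correlate with an intent to click, low speed near a tiny target, are detected, the target’s effective hit area is inflated.

```python
def effective_radius(base_radius_px: float, cursor_speed_px_s: float,
                     distance_px: float) -> float:
    """Sketch of a 'magnetic target': enlarge a small target's hit area when the
    cursor is slow and close, the dynamics correlated with an intent to click.
    All constants are illustrative."""
    slow = cursor_speed_px_s < 50          # roughly holding still
    close = distance_px < 100              # near the target
    boost = 3.0 if (slow and close) else 1.0
    return base_radius_px * boost

# A 6 px close-tab "X" becomes an 18 px effective target while the user homes in
print(effective_radius(6, cursor_speed_px_s=20, distance_px=40))   # 18.0
print(effective_radius(6, cursor_speed_px_s=400, distance_px=40))  # 6.0
```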

Neural decoder

Lex Fridman (06:19:29) So how much of the work on the decoder is generalizable to P2, P3, P4, P5, and so on? How do you improve the decoder in a way that’s generalizable?

Bliss Chapman (06:19:40) Yeah, great question. So the underlying signal we’re trying to decode is going to look very different in P2 than in P1. For example, channel number 345 is going to mean something different in user one than it will in user two, just because that electrode that corresponds with channel 345 is going to be next to a different neuron in user one versus user two. But the approach, the methods, the user experience of how you get the right behavioral pattern from the user to associate with that neural signal, that we hope will translate over multiple generations of users.

(06:20:08) And beyond that, it’s very, very possible, in fact, quite likely that we’ve overfit to Noland’s user experience, desires and preferences. And so what I hope to see is that when we get a second, third, fourth participant, that we find what the right wide minima are that cover all the cases that make it more intuitive for everyone. And hopefully, there’s a cross-pollination of things, where, “Oh, we didn’t think about that with this user because they can speak. But with this user who just can fundamentally not speak at all, this user experience is not optimal.” Those improvements that we make there should hopefully translate then to even people who can speak but don’t feel comfortable doing so because we’re in a public setting, like their doctor’s office.

Lex Fridman (06:20:42) So the actual mechanism of open-loop labeling, and then closed-loop labeling would be the same, and hopefully can generalize across the different users-

Lex Fridman (06:20:52) … as they’re doing the calibration step? And the calibration step is pretty cool. I mean, that in itself. The interesting thing about Webgrid, which is closed-loop, it’s fun. I love it when there’s… There used to be this kind of idea of human computation, which is using actions a human would want to do anyway to get a lot of signal from. And Webgrid is that, a nice video game that also serves as great calibration.

Bliss Chapman (06:21:20) It’s so funny, I’ve heard this reaction so many times. Before the first user was implanted, we had an internal perception that the first user would not find this fun. And so we thought really quite a bit actually about, “Should we build other games that are more interesting for the user, so we can get this kind of data and help facilitate research that’s for long duration and stuff like this?” Turns out that people love this game. I always loved it, but I didn’t know that that was a shared perception.

Lex Fridman (06:21:45) Yeah. And just in case it’s not clear, Webgrid is… There’s a grid of let’s say 35 by 35 cells and one of them lights up blue and you have to move your mouse over that and click on it. And if you miss it, it’s red, and…

Bliss Chapman (06:22:01) I’ve played this game for so many hours, so many hours.

Lex Fridman (06:22:04) And what’s your record you said?

Bliss Chapman (06:22:06) I think I have the highest at Neuralink right now. My record’s 17 BPS.

Bliss Chapman (06:22:11) If you imagine that 35 by 35 grid, you’re hitting about 100 trials per minute. So 100 correct selections in that one minute window. So you’re averaging between about 500 and 600 milliseconds per selection.
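
Those numbers are consistent with the bits-per-second definition sketched earlier; a quick back-of-the-envelope check (assuming the full 35 by 35 grid is counted and no incorrect selections):

```python
import math

bits_per_selection = math.log2(35 * 35)        # ~10.26 bits for 1,225 targets
selections_per_minute = 100
print(round(bits_per_selection * selections_per_minute / 60, 1))  # ~17.1 BPS
print(60_000 / selections_per_minute, "ms per selection")         # 600.0 ms
```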

Lex Fridman (06:22:22) So one of the reasons I think I struggle with that game is I’m such a keyboard person, so everything is done via the keyboard. If I can avoid touching the mouse, it’s great. So how can you explain your high performance?

Bliss Chapman (06:22:36) I have a whole ritual I go through when I play Webgrid. There’s actually like a diet plan associated with this. It’s a whole thing.

Bliss Chapman (06:22:43) The first thing is-

Lex Fridman (06:22:43) “I have to fast for five days, I have to go up to the mountains.”

Bliss Chapman (06:22:47) I mean, the fasting thing is important. So this is like-

Lex Fridman (06:22:49) Focuses the mind, yeah. It’s true, it’s true.

Bliss Chapman (06:22:51) So what I do is, I… Actually, I don’t eat for a little bit beforehand, and then I’ll actually eat a ton of peanut butter right before I play, and I get-

Lex Fridman (06:22:58) This is a real thing?

Bliss Chapman (06:22:59) This is a real thing, yeah. And then it has to be really late at night, this is, again, a night owl thing I think we share, but it has to be midnight, 2:00 A.M. kind of time window. And I have a very specific physical position I’ll sit in, which is… I was homeschooled growing up, and so I did most of my work on the floor, just in my bedroom or whatever. And so I have a very specific situation-

Bliss Chapman (06:23:19) … on the floor, that I sit and play. And then you have to make sure there’s not a lot of weight on your elbow when you’re playing so you can move quickly. And then I turn the gain of the cursor, so the speed of the cursor, way, way up, so small motions actually move the cursor.

Lex Fridman (06:23:29) Are you moving with your wrist, or you’re… You’re never-

Bliss Chapman (06:23:33) I move with my fingers. So my wrist is almost completely still, I’m just moving my fingers.

Lex Fridman (06:23:37) You know those… Just on a small tangent-

Lex Fridman (06:23:40) … the… which I’ve been meaning to go down this rabbit hole of people that set the world record in Tetris. Those folks, they’re playing… There’s a way to… Did you see this?

Bliss Chapman (06:23:50) I’ve seen it. All the fingers are moving?

Lex Fridman (06:23:52) Yeah, you could find a way to do it where it’s using a loophole, like a bug that you can do some incredibly fast stuff. So it’s along that line, but not quite. But you do realize there’ll be a few programmers right now listening to this who’ll fast and eat peanut butter, and be like-

Bliss Chapman (06:24:09) Yeah, please, beat my record. I mean, the reason I did this literally was just because I wanted the bar to be high for the team. The number that we aim for should not be the median performance, it should at least be able to beat all of us, that should be the minimum bar.

Lex Fridman (06:24:21) What do you think is possible, like 20?

Bliss Chapman (06:24:23) Yeah, I don’t know what the limits… I mean, the limits, you can calculate just in terms of screen refresh rate and cursor immediately jumping to the next target. I mean, I’m sure there’s limits before that with just sort of reaction time, and visual perception, and things like this. I would guess it’s below 40 but above 20; somewhere in there is probably the right number to be thinking about. It also matters how difficult the task is. You can imagine some people might be able to do 10,000 targets on the screen, and maybe they can do better that way. So there’s some task optimizations you could do to try to boost your performance as well.

Lex Fridman (06:24:55) What do you think it takes for Noland to be able to do above 8.5, to keep increasing that number? You said every increase in the number…

Lex Fridman (06:25:00) … to keep increasing that number. You said every increase in the number might require different improvements in the system.

Bliss Chapman (06:25:08) Yeah. The first answer that’s important to say is, I don’t know. This is the edge of the research, so, again, nobody’s gotten to that number before, so what’s next is going to be a heuristic guess on my part. What we’ve seen historically is that different parts of the stack can be the bottleneck at different points in time. So when I first joined Neuralink, three years ago or so, one of the major problems was just the latency of the Bluetooth connection. The radio in the device wasn’t super good, it was an early revision of the implant. And it just, no matter how good your decoder was, if your thing is updating every 30 milliseconds or 50 milliseconds, it’s just going to be choppy. And no matter how good you are, that’s going to be frustrating and lead to challenges. So at that point, it was very clear that the main challenge is just to get the data off the device in a very reliable way such that you can enable the next challenge to be tackled.

(06:25:59) And then at some point it was actually the modeling challenge of how do you just build a good mapping, like the supervised learning problem of, you have a bunch of data and you have a label you’re trying to predict, just what is the right neural decoder architecture and hyperparameters to optimize that? And that was the problem for a bit, and once you solve that, it became a different bottleneck. I think the next bottleneck after that was actually just software stability and reliability. If you have widely varying inference latency in your system or your app just lags out every once in a while, it decreases your ability to maintain and get in a state of flow, and it basically just disrupts your control experience. And so there’s a variety of different software bugs and improvements we made that basically increased the performance of the system, made it much more reliable, much more stable and led to a state where we could reliably collect data to build better models with.

(06:26:49) So that was a bottleneck for a while, it was just the software stack itself. If I were to guess right now, there’s two major directions you could think about for improving BPS further. The first major direction is labeling. So labeling is, again, this fundamental challenge of given a window of time where the user is expressing some behavioral intent, what are they really trying to do at the granularity of every millisecond? And that again, is a task design problem, it’s a UX problem, it’s a machine learning problem, it’s a software problem. It touches all those different domains. The second thing you can think about to improve BPS further is either completely changing the thing you’re decoding or just extending the number of things that you’re decoding. So this is sort of in the direction of functionality; basically, you can imagine giving more clicks.

(06:27:33) For example, a left click, a right click, a middle click, different actions like click-and-drag for example, and that can improve the effective bit rate of your communication processes. If you’re trying to allow the user to express themselves through any given communication channel, you can measure that with bits per second. But what actually matters at the end of the day is how effective they are at navigating their computer. So from the perspective of the downstream tasks that you care about, functionality and extending functionality is something we’re very interested in, because not only can it improve the number of BPS, but it can also improve the downstream independence that the user has and the skill and efficiency with which they can operate their computer.

Lex Fridman (06:28:05) Would the number of threads increasing also potentially help?

Bliss Chapman (06:28:10) Yes. Short answer is yes. It’s a bit nuanced how that manifests in the numbers. So what you’ll see is that if you plot a curve of the number of channels that you’re using for decode versus either the offline metric of how good you are at decoding, or the online metric of in practice how good the user is at using this device, you see roughly a log curve. So as you move further out in number of channels, you get a corresponding logarithmic improvement in control quality and offline validation metrics. The important nuance here is that each channel corresponds with a specific represented intention in the brain. So for example, channel 254 might correspond with moving to the right, and channel 256 might mean move to the left. If you want to expand the number of functions you want to control, you really want to have a broader set of channels that covers a broader set of imagined movements. You can think of it like Mr. Potato Head, actually: if you had a bunch of different imagined movements you could do, how would you map those imagined movements to input to a computer? You could imagine handwriting to output characters on the screen. You could imagine just typing with your fingers and have that output text on the screen. You could imagine different finger modulations for different clicks. You can imagine wiggling your big nose for opening some menu, or wiggling your big toe to have command tab occur, or something like this. So the number of different actions you can take in the world really depends on how many channels you have and the information content that they carry.

Lex Fridman (06:29:42) Right, so that’s more about the number of actions. So actually as you increase the number of threads, that’s more about increasing the number of actions you’re able to perform.

Bliss Chapman (06:29:51) But one other nuance there that is worth mentioning. So again, our goal is really to enable a user with paralysis to control the computer as fast as I can, so that’s BPS, with all the same functionality I have, which is what we just talked about, but then also as reliably as I can. And that last point is very related to the channel count discussion. So as you scale out the number of channels, the relative importance of any particular feature of your model input to the output control of the user diminishes, which means that if the neural non-stationarity effect is per channel, or if the noise is independent such that more channels means on average less output effect, then the reliability of your system will improve. So one core thesis that at least I have is that scaling channel count should improve the reliability of the system without any work on the decoder itself.
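
A tiny numerical illustration of that thesis (illustrative Python, with idealized assumptions: independent unit-variance noise per channel and a readout that relies on all channels equally): the output perturbation shrinks roughly like one over the square root of the channel count.

```python
import numpy as np

rng = np.random.default_rng(1)

def output_jitter(n_channels: int, trials: int = 5000) -> float:
    """Std of a linear readout's output when each channel carries independent
    unit-variance noise and the weights are spread evenly across channels."""
    weights = np.ones(n_channels) / n_channels
    noise = rng.normal(size=(trials, n_channels))
    return float((noise @ weights).std())

for n in (64, 256, 1024, 4096):
    print(n, round(output_jitter(n), 4))   # shrinks roughly like 1/sqrt(n)
```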

Lex Fridman (06:30:37) Can you linger on the reliability here? So first of all, when you say non-stationarity of the signal, which aspect are you referring to?

Bliss Chapman (06:30:46) Yeah, so maybe let’s talk briefly about what the actual underlying signal looks like. So again, I spoke very briefly at the beginning about how when you imagine moving to the right or imagine moving to the left, neurons might fire more or less, and the frequency content of that signal, at least in the motor cortex, is very correlated with the output intention, the behavioral task that the user is doing. It’s actually not obvious that rate coding, which is the name of that phenomenon, is the only way the brain could represent information. You can imagine many different ways in which the brain could encode intention, and there’s actually evidence in bats, for example, that there are temporal codes, so timing codes of exactly when particular neurons fire are the mechanism of information representation. But at least in the motor cortex, there’s substantial evidence that it’s rate coding, or at least that to first order it’s rate coding.

(06:31:31) So then if the brain is representing information by changing the frequency of a neuron firing, what really matters is the delta between the baseline state of the neuron and what it looks like when it’s modulated. And what we’ve observed, and what has also been observed in academic work, is that that baseline rate shifts. If you imagine taring a scale, in that analogy of measuring flour or something when you’re baking, the baseline state of how much the pot weighs is actually different day to day. So if what you’re trying to measure is how much flour is in the pot, you’re going to get a different measurement on different days, because you’re measuring with different pots. So that baseline rate shifting is really the thing that, at least from a first-order description of the problem, is what’s causing this downstream bias. There can be other effects, nonlinear effects, on top of that, but at least at a very first-order description of the problem, that’s what we observe day to day: the baseline firing rate of any particular neuron, or observed on a particular channel, is changing.

Lex Fridman (06:32:23) So can you just adjust to the baseline to make it relative to the baseline nonstop?

Bliss Chapman (06:32:29) Yeah, this is a great question. So with monkeys, we have found various ways to do this. One example way to do this is you ask them to do some behavioral tasks like play the game with a joystick, you measure what’s going on in the brain. You compute some mean of what’s going on across all the input features, and you subtract that on the input when you’re doing your BCI session, works super well. For whatever reason, that doesn’t work super well with Noland. I actually don’t know the full reason why, but I can imagine several explanations.

(06:32:59) One such explanation could be that the difference in context effects between some open-loop task and some closed-loop task is much more significant with Noland than it is with the monkey. Maybe in this open-loop task, he’s watching the Lex Fridman Podcast while he’s doing the task, or he’s whistling and listening to music and talking with his friend and asking his mom what’s for dinner while he’s doing this task. So the exact difference in context between those two states may be much larger and thus lead to a bigger generalization gap between the features that you’re normalizing at open-loop time and what you’re trying to use at closed-loop time.
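
For concreteness, here is a minimal sketch of the mean-subtraction scheme described above, and of how a context shift between the open-loop and closed-loop blocks leaves a residual bias (hypothetical numbers, not real neural data):

```python
import numpy as np

rng = np.random.default_rng(2)
n_channels = 1024

# Open-loop block: estimate each channel's baseline feature (e.g. binned firing rate).
open_loop = rng.normal(loc=5.0, scale=1.0, size=(600, n_channels))
baseline = open_loop.mean(axis=0)

def normalize(features: np.ndarray) -> np.ndarray:
    """Subtract the open-loop baseline from closed-loop features (a sketch)."""
    return features - baseline

# Closed-loop block whose true baseline has drifted (context change / non-stationarity):
closed_loop = rng.normal(loc=5.8, scale=1.0, size=(600, n_channels))
residual_bias = float(normalize(closed_loop).mean())
print(round(residual_bias, 2))   # ~0.8: the uncorrected shift the decoder still sees
```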

Lex Fridman (06:33:29) That’s interesting. Just on that point, it’s incredible to watch Noland be able to multitask, to do multiple tasks at the same time, to be able to move the mouse cursor effectively while talking and while being nervous because he’s talking in front of [inaudible 06:33:45]

Bliss Chapman (06:33:44) Kicking my ass and chest too, yeah.

Lex Fridman (06:33:46) Kicking your ass and talk trash while doing it-

Lex Fridman (06:33:50) … so all at the same time. And yes, if you are trying to normalize to the baseline, that might throw everything off. Boy, is that interesting?

Bliss Chapman (06:33:59) Maybe one comment on that too. For folks that aren’t familiar with assistive technology, I think there’s a common belief that, well, why can’t you just use an eye tracker or something like this for helping somebody move a mouse on the screen? It’s really a fair question, and honestly, before Noland, I was not confident that this was going to be a profoundly transformative technology for people like him. And I’m very confident now that it will be, but the reasons are subtle. It really has to do with ergonomically how it fits into their life, even if you can just offer the same level of control as what they would have with an eye tracker or with a mouth stick, but you don’t need to have that thing in your face. You don’t need to be positioned a certain way.

(06:34:34) You don’t need your caretaker to be around to set it up for you. You can activate it when you want, how you want, wherever you want. That level of independence is so game-changing for people. It means that they can text a friend at night privately without their mom needing to be in the loop. It means that they can open up and browse the internet at 2:00 AM when nobody’s around to set their iPad up for them. This is a profoundly game-changing thing for folks in that situation, and this is even before we start talking about folks that may not be able to communicate at all or ask for help when they want to. This can be potentially the only link that they have to the outside world. And yeah, that one doesn’t, I think, need explanation of why that’s so impactful.

Lex Fridman (06:35:11) You mentioned the neural decoder. How much machine learning is in the decoder, how much magic, how much science, how much art? How difficult is it to come up with a decoder that figures out what these sequences of spikes mean?

Bliss Chapman (06:35:28) Yeah, good question. There’s a couple of different ways to answer this, so maybe I’ll zoom out briefly first and then I’ll go down one of the rabbit holes. So the zoomed out view is that building the decoder is really the process of building the dataset plus compiling it into the weights, and each of those steps is important. The direction of further improvement, I think, is primarily going to be on the dataset side: how do you construct the optimal labels for the model? But there’s an entirely separate challenge of then how do you compile the best model? And so I’ll go briefly down the second rabbit hole. One of the main challenges with designing the optimal model for BCI is that offline metrics don’t necessarily correspond to online metrics. It’s fundamentally a control problem. The user is trying to control something on the screen and the exact user experience of how you output the intention impacts their ability to control. So for example, if you just look at validation loss as predicted by your model, there can be multiple ways to achieve the same validation loss.

(06:36:26) Not all of them are equally controllable by the end user. And so it might be as simple as saying, oh, you could just add auxiliary loss terms that help you capture the thing that actually matters. But this is a very complex, nuanced process. So how you turn the labels into the model is more of a nuanced process than just a standard supervised learning problem. One very fascinating anecdote here: we’ve tried many different neural network architectures that translate brain data to velocity outputs, for example. And one example that’s stuck in my brain from a couple of years ago now is that at one point, we were using just fully-connected networks to decode the brain activity. We tried an A/B test where we were measuring the relative performance in online control sessions of a 1D convolution over the input signal. So if you imagine that per channel you have a sliding window producing some convolved feature, for each of those input sequences, for every single channel simultaneously, you can actually get better validation metrics, meaning you’re fitting the data better and it’s generalizing better in offline data, if you use this convolutional architecture. You’re reducing parameters; it’s a standard procedure when you’re dealing with time series data. Now it turns out that when using that model online, the controllability was worse, far worse, even though the offline metrics were better, and there can be many ways to interpret that. But what that taught me, at least, was that it’s at least the case right now that if you were to just throw a bunch of compute at this problem, and you were trying to hyperparameter optimize, or let some GPT model come up with or invent many different solutions, if you were just optimizing for loss, it would not be sufficient, which means that there’s still some inherent modeling gap here. There’s still some artistry left to be uncovered here of how to get your model to scale with more compute, and that may be fundamentally a labeling problem, but there may be other components to this as well.
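
To make the architectural comparison concrete, here is a minimal sketch of the two kinds of decoders being contrasted, a fully-connected network versus a per-channel temporal (1D) convolution, mapping a window of binned neural features to a 2D velocity. Dimensions and layer sizes are invented for illustration; this is not Neuralink’s model.

```python
import torch
import torch.nn as nn

N_CHANNELS = 1024   # electrode channels (illustrative)
WINDOW = 50         # time bins per inference step (illustrative)
N_OUT = 2           # decoded outputs, e.g. (vx, vy)

class MLPDecoder(nn.Module):
    """Fully-connected decoder: flattens the whole window into one vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(N_CHANNELS * WINDOW, 256),
            nn.ReLU(),
            nn.Linear(256, N_OUT),
        )

    def forward(self, x):            # x: (batch, channels, time)
        return self.net(x)

class ConvDecoder(nn.Module):
    """1D-convolutional decoder: a sliding temporal filter over each channel's window."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(N_CHANNELS, 64, kernel_size=5, padding=2)
        self.head = nn.Sequential(nn.ReLU(), nn.Flatten(),
                                  nn.Linear(64 * WINDOW, N_OUT))

    def forward(self, x):            # x: (batch, channels, time)
        return self.head(self.conv(x))

if __name__ == "__main__":
    x = torch.randn(8, N_CHANNELS, WINDOW)   # fake binned neural features
    print(MLPDecoder()(x).shape, ConvDecoder()(x).shape)  # both: (8, 2)
```

The point of the anecdote above is that the second kind of model can win on offline validation loss and still lose on online controllability, which is why offline metrics alone are not trusted.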

Lex Fridman (06:38:11) Is it data constrained at this time, which is what it sounds like? How do you get a lot of good labels?

Bliss Chapman (06:38:22) Yeah, I think it’s data quality constrained, not necessarily data quantity constrained.

Lex Fridman (06:38:27) But even just the quantity ’cause it has to be trained on the interactions. I guess there’s not that many interactions.

Bliss Chapman (06:38:37) Yeah, so it depends what version of this you’re talking about. So if you’re talking about, let’s say, the simplest example of just 2D velocity, then I think, yeah, data quality is the main thing. If you’re talking about how to build a multi-function output that lets you do all the inputs to the computer that you and I can do, then it’s actually a much more sophisticated, nuanced modeling challenge, because now you need to think about not just when the user is left clicking; when you’re building the left click model, you also need to be thinking about how to make sure it doesn’t fire when they’re trying to right click or when they’re trying to move the mouse.

(06:39:03) So one example of an interesting bug from week one of BCI with Noland was when he moved the mouse, the click signal dropped off a cliff, and when he stopped, the click signal went up. So again, there’s a contamination between the two inputs. Another good example was at one point he was trying to do a left click and drag, and the minute he started moving, the left click signal dropped off a cliff. So again, because of some contamination between the two signals, you need to come up with some way, either in the dataset or in the model, to build robustness against this. You can think of it like overfitting, but really it’s just that the model has not seen this kind of variability before. So you need to find some way to help the model with that.
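
One common way to attack that kind of contamination on the dataset side (a sketch of a generic technique, not necessarily what Neuralink does) is to make sure the click model’s training set explicitly includes movement-only windows labeled as negatives, so the classifier learns not to change its output just because the cursor is in motion:

```python
import numpy as np

def build_click_dataset(features, click_labels, moving_mask, neg_ratio=1.0,
                        rng=np.random.default_rng(3)):
    """Assemble click-classifier training data that includes movement-only windows
    as explicit negatives (hypothetical helper; shapes: features (n, d),
    click_labels (n,) of 0/1, moving_mask (n,) of bool)."""
    pos = np.flatnonzero(click_labels == 1)
    hard_neg = np.flatnonzero((click_labels == 0) & moving_mask)  # moving, no click
    n_neg = min(int(len(pos) * neg_ratio), len(hard_neg))
    neg = rng.choice(hard_neg, size=n_neg, replace=False)
    idx = np.concatenate([pos, neg])
    return features[idx], click_labels[idx]

# Tiny example with random stand-in data
X = np.random.randn(1000, 32)
y = (np.random.rand(1000) < 0.05).astype(int)
moving = np.random.rand(1000) < 0.5
Xb, yb = build_click_dataset(X, y, moving)
print(Xb.shape, round(float(yb.mean()), 2))
```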

Lex Fridman (06:39:42) This is super cool ’cause it feels like all of this is very solvable, but it’s hard.

Bliss Chapman (06:39:46) Yes, it is fundamentally an engineering challenge. This is important to emphasize, and it’s also important to emphasize that it may need fundamentally new techniques, which means that people who work on, let’s say, unsupervised speech classification using CTC loss, for example internally at Siri, could potentially have very applicable skills for this.

Future improvements

Lex Fridman (06:40:03) So what things are you excited about in the future development of the software stack on Neuralink? So everything we’ve been talking about, the decoding, the UX?

Bliss Chapman (06:40:14) I think there’s something I’m excited about from the technology side and something I’m excited about in terms of understanding how this technology is going to be best situated for entering the world, so I’ll work backwards. On the technology-entering-the-world side of things, I’m really excited to understand how this device works for folks that cannot speak at all, that have no ability to bootstrap themselves into useful control by voice command, for example, and are extremely limited in their current capabilities. I think that will be an incredibly useful signal for us to understand, really, the thing that is an existential threat for all startups, which is product-market fit. Does this device have the capacity and potential to transform people’s lives in the current state? And if not, what are the gaps? And if there are gaps, how do we solve them most efficiently?

(06:40:56) So that’s what I’m very excited about for the next year or so of clinical trial operations. On the technology side, I’m quite excited about basically everything we’re doing. I think it’s going to be awesome. The most prominent one I would say is scaling channel count. So right now we have a 1,000-channel device. The next version will have between 3,000 and 6,000 channels, and I would expect that curve to continue in the future. And it’s unclear what set of problems will just disappear completely at that scale and what set of problems will remain and require further focus. And so I’m excited about the clarity of gradient that gives us in terms of the user experiences we choose to focus our time and resources on. And then also in terms of even things as simple as non-stationarity, does that problem just completely go away at that scale? Or do we need to come up with new creative UXes still even at that point?

(06:41:40) And also, when we get to that time point, when we start expanding out dramatically the set of functions that you can output from one brain, how do you deal with all the nuances of the user experience of not being able to feel the different keys under your fingertips, but still needing to be able to modulate all of them in synchrony to achieve the thing you want? And again, you don’t have that appropriate set of feedback loops, so how can you make that intuitive for a user to control a high-dimensional control surface without feeling the thing physically? I think that’s going to be a super interesting problem. I’m also quite excited to understand whether these scaling laws continue: as you scale channel count, how much further out do you go before that saturation point is truly hit?

(06:42:17) And it’s not obvious today. I think we only know what’s in the interpolation space. We only know what’s between 0 and 1,024, but we don’t know what’s beyond that. And then there’s a whole range of interesting neuroscience and brain questions, which is, when you stick more stuff in the brain in more places, you get to learn much more quickly about what those brain regions represent. And so I’m excited about that fundamental neuroscience learning, which is also important for figuring out how to most efficiently insert electrodes in the future. So yeah, I think all those dimensions I’m really, really excited about. And that doesn’t even get close to touching the software stack that we work on every single day and what we’re working on right now.

Lex Fridman (06:42:49) Yeah, it seems virtually impossible to me that 1,000 electrodes is where it saturates. It feels like this would be one of those silly notions in the future where obviously you should have millions of electrodes and this is where the true breakthroughs happen. You tweeted, “Some thoughts are most precisely described in poetry.” Why do you think that is?

Bliss Chapman (06:43:20) I think it’s because the information bottleneck of language is pretty steep, and yet you’re able to reconstruct, in the other person’s brain, more effectively without being literal. If you can express a sentiment such that in their brain they can reconstruct the actual true underlying meaning and beauty of the thing that you’re trying to get across, the generator function in their brain is more powerful than what language can express. And so the mechanism of poetry is really just to feed or seed that generator function.

Lex Fridman (06:43:56) So being literal sometimes is a suboptimal compression for the thing you’re trying to convey.

Bliss Chapman (06:44:03) That’s right. And it’s actually in the process of the user going through that generation that they understand what you mean. That’s the beautiful part. It’s also like when you look at a beautiful painting, it’s not the pixels of the painting that are beautiful, it’s the thought process that occurs when you see that, the experience of that, that actually is the thing that matters.

Lex Fridman (06:44:19) Yeah, it’s resonating with some deep thing within you that the artist also experienced and was able to convey that through the pixels.

Lex Fridman (06:44:29) And that’s actually going to be relevant for full-on telepathy. It’s like if you just read the poetry literally, that doesn’t say much of anything interesting. It requires a human to interpret it. So it’s the combination of the human mind and all the experiences that a human being has within the context of the collective intelligence of the human species that makes that poem make sense, and they load that in. So in that same way, the signal that carries meaning from human to human may seem trivial, but may actually carry a lot of power because of the complexity of the human mind on the receiving end. Yeah, that’s interesting. Who was it? I think Joscha Bach [inaudible 06:45:24] said something about all the people that think we’ve achieved AGI, explain why humans like music.

Lex Fridman (06:45:38) And until the AGI likes music, you haven’t achieved AGI or something like this.

Bliss Chapman (06:45:45) Do you not think that’s some next token entropy surprise kind of thing going on there?

Bliss Chapman (06:45:50) I don’t know either. I listen to a lot of classical music and also read a lot of poetry and yeah, I do wonder if there is some element of the next token surprise factor going on there.

Bliss Chapman (06:46:00) Cause a lot of the tricks in both poetry and music are basically you have some repeated structure and then you do a twist. It’s like, okay, clause 1, 2, 3 is one thing and then clause four is like, “Okay, now we’re onto the next theme,” and they play with exactly when the surprise happens and the expectation of the user. And that’s even true through history as musicians evolve in music, they take some known structure that people are familiar with and they just tweak it a little bit. They tweak it and add a surprising element. This is especially true in classical music heritage, but that’s what I’m wondering. Is it all just entropy?

Lex Fridman (06:46:32) So breaking structure or breaking symmetry is something that humans seem to like. Maybe it’s as simple as that.

Bliss Chapman (06:46:37) Yeah, and great artists copy and knowing which rules to break is the important part, and fundamentally, it must be about the listener of the piece. Which rule is the right one to break? It’s about the audience member perceiving that as interesting.

Lex Fridman (06:46:54) What do you think is the meaning of human existence?

Bliss Chapman (06:47:00) There’s a TV show I really like called The West Wing, and in The West Wing there’s a character, he’s the President of the United States, who’s having a discussion about the Bible with one of his colleagues. And the colleague says something about how the Bible says X, Y, and Z, and the President says, “Yeah, but it also says A, B, C.” The person says, “Well, do you believe the Bible to be literally true?” And the President says, “Yes, but I also think that neither of us are smart enough to understand it.” I think the analogy here for the meaning of life is that largely we don’t know the right question to ask.

(06:47:38) So I think I’m very aligned with the Hitchhiker’s Guide to the Galaxy version of this question, which is basically, if we can ask the right questions, it’s much more likely we find the meaning of human existence. So in the short term as a heuristic in the search policy space, we should try to increase the diversity of people asking such questions or generally of consciousness and conscious beings asking such questions. So again, I think I will take the I don’t know card here, but say I do think there are meaningful things we can do that improve the likelihood of answering that question.

Lex Fridman (06:48:13) It’s interesting how much value you assign to the task of asking the right questions. That’s the main thing, it’s not the answers, it’s the questions.

Bliss Chapman (06:48:24) This point, by the way, is driven home in a very painful way when you try to communicate with someone who cannot speak, because a lot of the time, the last thing to go is they have the ability to somehow wiggle a lip or move something that allows them to say yes or no. And in that situation, it’s very obvious that what matters is, are you asking them the right question to be able to say yes or no to?

Lex Fridman (06:48:45) Wow, that’s powerful. Well, Bliss, thank you for everything you do, and thank you for being you, and thank you for talking today.

Noland Arbaugh

Lex Fridman (06:48:56) Thanks for listening to this conversation with Bliss Chapman. And now, dear friends, here’s Noland Arbaugh, the first human being to have a Neuralink device implanted in his brain. You had a diving accident in 2016 that left you paralyzed with no feeling from the shoulders down. How did that accident change your life?

Becoming paralyzed

Noland Arbaugh (06:49:18) It was a freak thing that happened. Imagine you’re running into the ocean, although this is a lake, but you’re running into the ocean and you get to about waist high, and then you dive in, take the rest of the plunge under the wave or something. That’s what I did, and then I just never came back up. Not sure what happened. I did it running into the water with a couple of guys, and so my idea of what happened is really just that I took a stray fist, elbow, knee, foot, something to the side of my head. The left side of my head was sore for about a month afterwards, so I must’ve taken a pretty big knock, and then they both came up and I didn’t. And so I was face down in the water for a while. I was conscious, and then eventually just realized I couldn’t hold my breath any longer, and I keep saying I took a big drink.

(06:50:20) People, I don’t know if they like that I say that. It seems like I’m making light of it all, but it’s just how I am, and I don’t know. I am a very relaxed stress-free person. I rolled with the punches for a lot of this. I took it in stride. It’s like, “All right, well, what can I do next? How can I improve my life even a little bit on a day-to-day basis?” At first, just trying to find some way to heal as much of my body as possible to try to get healed, to try to get off a ventilator, learn as much as I could so I could somehow survive once I left the hospital. And then thank God I had my family around me. If I didn’t have my parents, my siblings, then I would’ve never made it this far.

(06:51:24) They’ve done so much for me, more than I can ever thank them for, honestly, and a lot of people don’t have that. A lot of people in my situation, their families either aren’t capable of providing for them or honestly just don’t want to, and so they get placed somewhere in some sort of home. So thankfully, I had my family. I have a great group of friends, a great group of buddies from college who have all rallied around me, and we’re all still incredibly close. People always say if you’re lucky, you’ll end up with one or two friends from high school that you keep throughout your life. I have about 10 or 12 from high school that have all stuck around, and we still get together, all of us, twice a year. We call it the spring series and the fall series. This last one we all did, we dressed up as X-Men, so I did a-

Noland Arbaugh (06:52:21) … Professor Xavier, and it was freaking awesome. It was so good. So yeah, I have such a great support system around me, and so being a quadriplegic isn’t that bad. I get waited on all the time. People bring me food and drinks, and I get to sit around and watch as much TV and movies and anime as I want. I get to read as much as I want. It’s great.

Lex Fridman (06:52:51) It’s beautiful to see that you see the silver lining in all of this. Just going back, do you remember the moment when you first realized you were paralyzed from the neck down?

Noland Arbaugh (06:53:03) Yep. I was face down in the water when I… whatever, something hit my head. I tried to get up and I realized I couldn’t move, and it just clicked. I’m like, “All right, I’m paralyzed, can’t move. What do I do? If I can’t get up, can’t flip over, can’t do anything, then I’m going to drown eventually.” And I knew I couldn’t hold my breath forever, so I just held my breath and thought about it for maybe 10, 15 seconds. I’ve heard from other people who were onlookers, I guess. The two girls that pulled me out of the water were two of my best friends. They were lifeguards, and one of them said that it looked like my body was shaking in the water like I was trying to flip over and stuff, but I knew. I knew immediately, and I realized that that’s what my situation was from here on out.

(06:54:08) Maybe if I got to the hospital, they’d be able to do something. When I was in the hospital right before surgery, I was trying to calm one of my friends down. I had brought her with me from college to camp, and she was just bawling over me, and I was like, “Hey, it’s going to be fine. Don’t worry.” I was cracking some jokes to try to lighten the mood. The nurse had called my mom, and I was like, “Don’t tell my mom. She’s just going to be stressed out. Call her after I’m out of surgery ’cause at least she’ll have some answers then, whether I live or not, really.” And I didn’t want her to be stressed through the whole thing, but I knew.

(06:54:44) And then when I first woke up after surgery, I was super drugged up. They had me on fentanyl three ways, which was awesome. I don’t recommend it, but I saw some crazy stuff on that fentanyl, and it was still the best I’ve ever felt on drugs, medication, sorry, on medication. I remember the first time I saw my mom in the hospital, I was just bawling. I had a ventilator in. I couldn’t talk or anything, and I just started crying because it was more like seeing her… The whole situation obviously was pretty rough, but it was just seeing her face for the first time that was pretty hard. But yeah, I never had a moment of, “Man, I’m paralyzed. This sucks. I don’t want to be around anymore.” It was always just, “I hate that I have to do this, but sitting here and wallowing isn’t going to help.”

Lex Fridman (06:55:57) So immediate acceptance.

Lex Fridman (06:56:01) Has there been low points along the way?

Noland Arbaugh (06:56:03) Yeah, yeah, sure. There are days when I don’t really feel like doing anything. Not so much anymore. Not for the last couple of years I don’t really feel that way. I’ve more so just wanted to try to do anything possible to make my life better at this point. But at the beginning, there were some ups and downs. There were some really hard things to adjust to. First off, just the first couple months, the amount of pain I was in was really, really hard. I remember screaming at the top of my lungs in the hospital because I thought my legs were on fire, and obviously I can’t feel anything, but it’s all nerve pain. And so that was a really hard night. I asked them to give me as much pain meds as possible, but they’re like, “You’ve had as much as you can have, so just deal with it. Go to a happy place,” sort of thing. So that was a pretty low point.

(06:56:59) And then every now and again, it’s hard realizing things that I wanted to do in my life that I won’t be able to do anymore. I always wanted to be a husband and father, and I just don’t think that I could do it now as a quadriplegic. Maybe it’s possible, but I’m not sure I would ever put someone I love through that, having to take care of me and stuff. Not being able to go out and play sports, I was a huge athlete growing up, so that was pretty hard. Little things too, when I realized I can’t do them anymore. There’s something really special about being able to hold a book and smell a book, the feel, the texture, the smell as you turn the pages, I just love it and I can’t do it anymore, and it’s little things like that.

(06:57:53) The two-year mark was pretty rough. Two years is when they say you will get back basically as much as you’re ever going to get back as far as movement and sensation goes. And so for the first two years, that was the only thing on my mind was try as much as I can to move my fingers, my hands, my feet, everything possible to try to get sensation and movement back. And then when the two-year mark hit, so June 30, 2018, I was really sad that that’s where I was, and then just randomly here and there, but I was never depressed for long periods of time. Just it never seemed worthwhile to me.

Lex Fridman (06:58:45) What gave you strength?

Noland Arbaugh (06:58:47) My faith. My faith in God was a big one. My understanding that it was all for purpose, and even if that purpose wasn’t anything involving Neuralink, even if that purpose was… There’s a story in the Bible about Job, and I think it’s a really, really popular story about how Job has all of these terrible things happen to him, and he praises God throughout the whole situation. I thought, and I think a lot of people think for most of their lives that they are Job, that they’re the ones going through something terrible, and they just need to praise God through the whole thing and everything will work out.

(06:59:28) At some point after my accident, I realized that I might not be Job, that I might be one of his children that gets killed or kidnapped or taken from him. And so it’s about terrible things that happen to those around you who you love. So maybe in this case, my mom would be Job and she has to get through something extraordinarily hard, and I just need to try and make it as best as possible for her because she’s the one that’s really going through this massive trial.

Noland Arbaugh (07:00:01) … she’s the one that’s really going through this massive trial and that gave me a lot of strength, and obviously my family. My family and my friends, they give me all the strength that I need on a day-to-day basis. So it makes things a lot easier having that great support system around me.

Lex Fridman (07:00:20) From everything I’ve seen of you online, your streams and the way you are today, I really admire, let’s say your unwavering positive outlook on life. Has that always been this way?

Noland Arbaugh (07:00:32) Yeah, yeah. I mean, I’ve just always thought I could do anything I ever wanted to do. There was never anything too big. Whatever I set my mind to, I felt like I could do it. I didn’t want to do a lot. I wanted to travel around and be sort of like a gypsy and go work odd jobs. I had this dream of traveling around Europe and being like, I don’t know, a shepherd in Wales or Ireland, and then going and being a fisherman in Italy, doing all of these things for a year. It’s such cliche things, but I just thought it would be so much fun to go and travel and do different things.

(07:01:17) And so I’ve always just seen the best in people around me too, and I’ve always tried to be good to people. And growing up with my mom too, she’s like the most positive energetic person in the world, and we’re all just people people. I just get along great with people. I really enjoy meeting new people, and so I just wanted to do everything. This is kind of just how I’ve been.

Lex Fridman (07:01:50) It’s just great to see that cynicism didn’t take over given everything you’ve been through.

Lex Fridman (07:01:56) Was that a deliberate choice you made, that you’re not going to let this keep you down?

Noland Arbaugh (07:02:01) Yeah, a bit. Also, it’s just kind of how I am. I just, like I said, I roll with the punches with everything. I always used to tell people I don’t stress about things much, and whenever I’d see people getting stressed, I would just say, “It’s not hard, just don’t stress about it and that’s all you need to do.” And they’re like, “That’s not how that works.” I’m like, “It works for me. Just don’t stress and everything will be fine. Everything will work out.” Obviously not everything always goes well, and it’s not like it all works out for the best all the time, but I just don’t think stress has had any place in my life since I was a kid.

Lex Fridman (07:02:44) What was the experience like of you being selected to be the first human being to have a Neuralink device implanted in your brain? Were you scared? Excited?

Noland Arbaugh (07:02:54) No, no. It was cool. I was never afraid of it. I had to think through a lot. Should I do this? Be the first person? I could wait until number two or three and get a better version of the Neuralink. The first one might not work. Maybe it’s actually going to kind of suck. It’s going to be the worst version ever in a person, so why would I do the first one? I’d already kind of been selected. I could just tell them, “Okay, find someone else, and then I’ll do number two or three.” I’m sure they would let me, they’re looking for a few people anyways, but ultimately I was like, I don’t know, there’s something about being the first one to do something. It’s pretty cool. I always thought that if I had the chance that I would like to do something for the first time, this seemed like a pretty good opportunity. And I was never scared.

(07:03:51) I think my faith had a huge part in that. I always felt like God was preparing me for something. I almost wish it wasn’t this, because I had many conversations with God about not wanting to do any of this as a quadriplegic. I told Him, “I’ll go out and talk to people. I’ll go out and travel the world and talk to stadiums, thousands of people, give my testimony. I’ll do all of it, but heal me first. Don’t make me do all of this in a chair. That sucks.” And I guess He won that argument. I didn’t really have much of a choice. I always felt like there was something going on. And to see how, I guess easily I made it through the interview process and how quickly everything happened, how the stars sort of aligned with all of this. It just told me as the surgery was getting closer, it just told me that it was all meant to happen.

(07:05:02) It was all meant to be, and so I shouldn’t be afraid of anything that’s to come. And so I wasn’t. I kept telling myself like, “You say that now, but as soon as the surgery comes, you’re probably going to be freaking out. You’re about to have brain surgery.” And brain surgery is a big deal for a lot of people, but it’s an even bigger deal for me. It’s all I have left. The amount of times I’ve been like, “Thank You, God, that you didn’t take my brain and my personality and my ability to think, my love of learning, my character, everything. Thank You so much. As long as You left me that, then I think I can get by.” And I was about to let people go root around in there like, “Hey, we’re going to go put some stuff in your brain. Hopefully it works out.” And so it was something that gave me pause, but like I said, how smoothly everything went.

(07:05:54) I never expected for a second that anything would go wrong. Plus the more people I met on the Barrow side and on the Neuralink side, they’re just the most impressive people in the world. I can’t speak enough to how much I trust these people with my life and how impressed I am with all of them. And to see the excitement on their faces, to walk into a room and, roll into a room and see all of these people looking at me like, “We’re so excited. We’ve been working so hard on this and it’s finally happening.” It’s super infectious and it just makes me want to do it even more. And to help them achieve their dreams, I don’t know, it’s so rewarding and I’m so happy for all of them, honestly.

Day of surgery

Lex Fridman (07:06:45) What was the day of surgery like? When did you wake up? What’d you feel? Minute-by-minute. Were you freaking out?

Noland Arbaugh (07:06:54) No, no. I thought I was going to, but as surgery approached the night before, the morning of, I was just excited. I was like, “Let’s make this happen.” I think I said that, something like that, to Elon on the phone. Beforehand we were FaceTiming, and I was like, “Let’s rock and roll.” And he’s like, “Let’s do it.” I don’t know. I wasn’t scared. So we woke up. I think we had to be at the hospital at 5:30 AM. I think surgery was at 7:00 AM. So we woke up pretty early. I’m not sure any of us slept much that night. Got to the hospital at 5:30, went through all the pre-op stuff. Everyone was super nice. Elon was supposed to be there in the morning, but something went wrong with his plane, so we ended up FaceTiming. That was cool. I had one of the greatest one-liners of my life after that phone call. Hung up with him. There were 20 people around me and I was like, “I just hope he wasn’t too starstruck talking to me.”

Noland Arbaugh (07:07:55) And yeah, it was good.

Lex Fridman (07:07:56) Well done. Well done. Did you write that ahead of time or did it just come to you?

Noland Arbaugh (07:08:02) No. No, it just came to me. I was like, “This seems right.” Went into surgery. I asked if I could pray right beforehand, so I prayed over the room. I asked God if He would be with my mom in case anything happened to me and just to calm her nerves out there. Woke up, played a bit of a prank on my mom. I don’t know if you’ve heard about it?

Lex Fridman (07:08:24) Yeah, I read about it.

Noland Arbaugh (07:08:25) Yeah, she was not happy.

Lex Fridman (07:08:28) Can you take me through the prank?

Noland Arbaugh (07:08:29) Yeah. This is something-

Lex Fridman (07:08:31) Do you regret doing that now?

Noland Arbaugh (07:08:31) … No, no, not one bit. It was something I had talked about ahead of time with my buddy Bane. I was like, “I would really like to play a prank on my mom.” Very specifically, my mom. She’s very gullible. I think she had knee surgery once even, and after she came out of knee surgery, she was super groggy. She’s like, “I can’t feel my legs.” And my dad looked at her. He was like, “You don’t have any legs. They had to amputate both your legs.” And we just do very mean things to her all the time. I’m so surprised that she still loves us.

(07:09:15) But right after surgery, I was really worried that I was going to be too groggy, not all there. I had had anesthesia once before and it messed me up. I could not function for a while afterwards. And I said a lot of things that… I was really worried that I was going to start, I don’t know, dropping some bombs and I wouldn’t even know. I wouldn’t remember. So I was like, “Please God, don’t let that happen, and please let me be there enough to do this to my mom.”

(07:09:54) And so she walked in after surgery. It was the first time they had been able to see me after surgery, and she just looked at me. She said, “Hi, how are you? How are you doing? How do you feel?” And I looked at her with this very, I think the anesthesia helped, very groggy, sort of confused look on my face, like, “Who are you?” And she just started looking around the room at the surgeons, at the doctors like, “What did you do to my son? You need to fix this right now.” Tears started streaming. I saw how much she was freaking out. I was like, “I can’t let this go on.” And so I was like, “Mom, mom, I’m fine. It’s all right.” And still, she was not happy about it. She still says she’s going to get me back someday, but I mean, I don’t know. I don’t know what that’s going to look like.

Lex Fridman (07:10:44) It’s a lifelong battle, man.

Noland Arbaugh (07:10:46) Yeah, but it was good.

Lex Fridman (07:10:47) In some sense it was a demonstration that you still got… Still had a sense of humor.

Noland Arbaugh (07:10:52) That’s all I wanted it to be. That’s all I wanted it to be. And I knew that doing something super mean to her like that would show her.

Lex Fridman (07:11:00) To show that you’re still there, that you love her.

Noland Arbaugh (07:11:01) Yeah, exactly. Exactly.

Lex Fridman (07:11:03) It’s a dark way to do it, but I love it.

Lex Fridman (07:11:06) What was the first time you were able to feel that you can use the Neuralink device to affect the world around you?

Noland Arbaugh (07:11:17) The first little taste I got of it was actually not too long after surgery. Some of the Neuralink team had brought in a little iPad, a little tablet screen, and they had put up eight different channels that were recording some of my neuron spikes and they put it in front of me. They’re like, “This is real time your brain firing.” I was like, “That’s super cool.” My first thought was, “I mean, if they’re firing now, let’s see if I can affect them in some way.”

(07:11:51) So I started trying to wiggle my fingers and I just started scanning through the channels, and one of the things I was doing was moving my index finger up and down, and I just saw this yellow spike on top row, third box over or something. I saw this yellow spike every time I did it, and I was like, “Oh, that’s cool.” And everyone around me was just like, “What are you seeing?” I was like, “Look at this one. Look at this top row, third box over this yellow spike. That’s me right there, there, there.” And everyone was freaking out. They started clapping. I was like, “That’s super unnecessary.” This is what’s supposed to happen, right?

Lex Fridman (07:12:29) So you’re imagining yourself moving each individual finger one at a time, and then seeing that you can notice something. And then when you did the index finger, you’re like, “Oh, cool.”

Noland Arbaugh (07:12:39) Yeah, I was wiggling all of my fingers to see if anything would happen. There was a lot of other things going on, but that big yellow spike was the one that stood out to me. I’m sure that if I would’ve stared at it long enough, I could have mapped out maybe a hundred different things. But the big yellow spike was the one that I noticed.

Lex Fridman (07:13:00) Maybe you could speak to what it’s like to wiggle your fingers, to imagine the cognitive effort required to wiggle your index finger, for example. How easy is that to do?

Noland Arbaugh (07:13:13) Pretty easy for me. It’s something that at the very beginning, after my accident, they told me to try and move my body as much as possible. Even if you can’t, just keep trying because that’s going to create new neural pathways or pathways in my spinal cord to reconnect these things to hopefully regain some movement someday.

Lex Fridman (07:13:39) That’s fascinating.

Noland Arbaugh (07:13:40) Yeah, I know. It’s bizarre.

Lex Fridman (07:13:43) That’s part of the recovery process is to keep trying to move your body.

Noland Arbaugh (07:13:46) Yep. Every day as much as you can.

Lex Fridman (07:13:49) And the nervous system does its thing. It starts reconnecting.

Noland Arbaugh (07:13:52) It’ll start reconnecting for some people; for some people it never works. Some people they’ll do it. For me, I got some bicep control back, and that’s about it. If I try enough, I can wiggle some of my fingers, not on command. It’s more like if I try to move, say, my right pinky, and I just keep trying to move it, after a few seconds it’ll wiggle. So I know there’s stuff there. I know, and that happens with a few of my fingers and stuff. But yeah, that’s what they tell you to do. One of the people at the time when I was in the hospital came in and told me that for one guy who had recovered most of his control, what he thought about every day was actually walking, like the act of walking, just over and over again. So I tried that for years. I tried just imagining walking, which is, it’s hard. It’s hard to imagine all of the steps that go into, well, taking a step. All of the things that have to move, all of the activations that have to happen along your leg in order for one step to occur.

Lex Fridman (07:15:09) But you’re not just imagining, you’re doing it, right?

Noland Arbaugh (07:15:12) I’m trying. Yeah. So it’s imagining over again what I had to do to take a step, because it’s not something any of us think about. We just, you want to walk and you take a step. You don’t think about all of the different things that are going on in your body. So I had to recreate that in my head as much as I could, and then I practice it over, and over, and over again.

Lex Fridman (07:15:37) So it’s not like a third person perspective, it’s a first person perspective. It’s not like you’re imagining yourself walking. You’re literally doing everything, all the same stuff as if you’re walking.

Noland Arbaugh (07:15:49) Yeah, which was hard. It was hard at the beginning.

Lex Fridman (07:15:53) Frustrating hard, or actually cognitively hard, which way?

Noland Arbaugh (07:15:57) It was both. There’s a scene in one of the Kill Bill movies, actually, oddly enough, where she is paralyzed, I don’t know, from a drug that was in her system. And then she finds some way to get into the back of a truck or something, and she stares at her toe and she says, “Move,” like move your big toe. And after a few seconds on screen, she does it. And she did that with every one of her body parts until she can move again. I did that for years, just stared at my body and said, “Move your index finger, move your big toe.” Sometimes vocalizing it out loud, sometimes just thinking it. I tried every different way to do this to try to get some movement back. And it’s hard because it actually is taxing, physically taxing on my body, which is something I would’ve never expected.

(07:16:58) It’s not like I’m moving, but it feels like there’s a buildup of, the only way I can describe it is there are signals that aren’t getting through from my brain down, because there’s that gap in my spinal cord, so brain down, and then from my hand back up to the brain. And so it feels like those signals get stuck in whatever body part that I’m trying to move, and they just build up, and build up, and build up until they burst. And then once they burst, I get this really weird sensation of everything dissipating back out to level, and then I do it again.

(07:17:42) It’s also just a fatigue thing, like a muscle fatigue, but without actually moving your muscles. It’s very, very bizarre. And then if you try to stare at a body part or think about a body part and move for two, three, four, sometimes eight hours, it’s very taxing on your mind. It takes a lot of focus. It was a lot easier at the beginning because I wasn’t able to control a TV in my room or anything. I wasn’t able to control any of my environment. So for the first few years, a lot of what I was doing was staring at walls. And so, obviously I did a lot of thinking and I tried to move a lot just over, and over, and over again.

Lex Fridman (07:18:33) So you never gave up hope there?

Lex Fridman (07:18:35) Just training hard [inaudible 07:18:38].

Noland Arbaugh (07:18:37) Yeah. And I still do it. I do it subconsciously, and I think that that helped a lot with things with Neuralink, honestly. It’s something that I talked about the other day at the All Hands that I did at Neuralink’s Austin facility.

Lex Fridman (07:18:53) Welcome to Austin, by the way.

Noland Arbaugh (07:18:54) Yeah. Hey, thanks man. I went to school-

Noland Arbaugh (07:18:57) … Hey, thanks. Thanks, man. The Gigafactory was super cool. I went to school at [inaudible 07:19:01], so I’ve been around before.

Lex Fridman (07:19:02) So you should be saying welcome to me. Welcome to Texas, Lex.

Noland Arbaugh (07:19:08) But yeah, I was talking about how a lot of what they’ve had me do, especially at the beginning, well, I still do it now, is body mapping. So there will be a visualization of a hand or an arm on the screen, and I have to do that motion, and that’s how they train the algorithm to understand what I’m trying to do. And so it made things very seamless for me I think.
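
For readers curious what that looks like as a procedure, here is a minimal, purely illustrative sketch of body mapping as a cued data-collection loop: show a motion, record a window of neural activity while the participant attempts it, and keep the (cue, neural window) pairs as labeled training data for the decoder. The cue list, durations, and function names are hypothetical, not the Link app's.

```python
# Illustrative sketch of cued body-mapping data collection (not the Link app).
CUES = [
    "right hand: push right",
    "right hand: push left",
    "right index finger: flex (left click)",
    "shoulders: shrug",
]  # hypothetical cue set

def record_neural_window(duration_s: float) -> list:
    """Placeholder for the real recording call; returns a flat list of samples."""
    return [0.0] * int(duration_s * 1000)

def run_body_mapping(cues=CUES, duration_s=3.0) -> list:
    dataset = []
    for cue in cues:
        print(f"Attempt this motion now: {cue}")  # an on-screen hand/arm animation in practice
        window = record_neural_window(duration_s)
        dataset.append({"cue": cue, "neural_window": window})
    return dataset  # later paired with intention labels to train the decoder
```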

Lex Fridman (07:19:38) That’s really, really cool. So it’s amazing to know. I’ve learned a lot about the body mapping procedure with the interface and everything like that. It’s cool to know that you’ve been essentially training to be world-class at that task.

Noland Arbaugh (07:19:52) Yeah. Yeah. I don’t know if other quadriplegics, other paralyzed people give up. I hope they don’t. I hope they keep trying, because I’ve heard other paralyzed people say, “Don’t ever stop.” They tell you two years, but you just never know. The human body’s capable of amazing things. So I’ve heard other people say, “Don’t give up.” I think one girl had spoken to me through some family members and said that she had been paralyzed for 18 years, and she’d been trying to wiggle her index finger for all that time, and she finally got it back 18 years later. So I know that it’s possible, and I’ll never give up doing it. I do it when I’m lying down watching TV. I’ll find myself doing it just almost on its own. It’s just something I’ve gotten so used to doing that I don’t know. I don’t think I’ll ever stop.

Lex Fridman (07:20:54) That’s really awesome to hear. I think it’s one of those things that can really pay off in the long term. It is training. You’re not visibly seeing the results of that training at the moment, but there’s that Olympic level nervous system getting ready for something.

Noland Arbaugh (07:21:08) Which honestly was something that I think Neuralink gave me that I can’t thank them enough for. I can’t show my appreciation for it enough, was being able to visually see that what I’m doing is actually having some effect. It’s a huge part of the reason why I know now that I’m going to keep doing it forever. Because before Neuralink, I was doing it every day and I was just assuming that things were happening. It’s not like I knew. I wasn’t getting back any mobility or sensation or anything. So I could have been running up against a brick wall for all I knew. And with Neuralink, I get to see all the signals happening real time, and I get to see that what I’m doing can actually be mapped. When we started doing click calibrations and stuff, when I go to click my index finger for a left click, that it actually recognizes that. It changed how I think about what’s possible with retraining my body to move. And so yeah, I’ll never give up now.

Lex Fridman (07:22:28) And also just the signal that there’s still a powerhouse of a brain there that’s like, and as the technology develops, that brain is, I mean, that’s the most important thing about the human body is the brain, and it can do a lot of the control. So what did it feel like when you first could wiggle the index finger and saw the environment respond? That little thing, whatever [inaudible 07:22:49] just being way too dramatic according to you?

Noland Arbaugh (07:22:51) Yeah, it was very cool. I mean, it was cool, but I keep telling this to people. It made sense to me. It made sense that there are signals still happening in my brain, and that as long as you had something near it that could measure those, that could record those, then you should be able to visualize it in some way. See it happen. And so that was not very surprising to me. I was just like, “Oh, cool. We found one, we found something that works.”

(07:23:23) It was cool to see that their technology worked and that everything that they had worked so hard for was going to pay off. But I hadn’t moved a cursor or anything at that point. I hadn’t interacted with a computer or anything at that point. So it just made sense. It was cool. I didn’t really know much about BCI at that point either, so I didn’t know what sort of step this was actually making. I didn’t know if this was a huge deal, or if this was just like, “Okay, this is, it’s cool that we got this far, but we’re actually hoping for something much better down the road.” It’s like, “Okay.” I just thought that they knew that it turned on. So I was like, “Cool, this is cool.”

Lex Fridman (07:24:08) Well, did you read up on the specs of the hardware you get installed, the number of threads, all this kind of stuff.

Noland Arbaugh (07:24:16) Yeah, I knew all of that, but it’s all Greek to me. I was like, “Okay, 64 threads, 16 electrodes each, 1,024 channels. Okay, that math checks out.”

Moving mouse with brain

Lex Fridman (07:24:32) When was the first time you were able to move a mouse cursor?

Noland Arbaugh (07:24:34) I know it must have been within the first maybe week, a week or two weeks that I was able to first move the cursor. And again, it kind of made sense to me. It didn’t seem like that big of a deal. It was like, okay, well, how do I explain this? When everyone around you starts clapping for something that you’ve done, it’s easy to say, “Okay, I did something cool.”

(07:25:04) That was impressive in some way. What exactly that meant, what it was, hadn’t really set in for me. So again, I knew that me trying to move a body part, and then that being mapped in some sort of machine learning algorithm to be able to identify my brain signals and then take that and give me cursor control, that all kind of made sense to me. I don’t know all the ins and outs of it, but I was like, “There are still signals in my brain firing. They just can’t get through because there’s a gap in my spinal cord, and so they can’t get all the way down and back up, but they’re still there.” So when I moved the cursor for the first time, I was like, “That’s cool, but I expected that that should happen.” It made sense to me. When I moved the cursor for the first time with just my mind, without physically trying to move, that was different. So I guess I can get into that just a little bit: the difference between attempted movement and imagined movement.

Lex Fridman (07:26:16) Yeah, that’s a fascinating difference [inaudible 07:26:18] from one to the other.

Noland Arbaugh (07:26:19) Yeah, yeah, yeah. So attempted movement is me physically trying to attempt to move, say my hand. I try to attempt to move my hand to the right, to the left, forward and back. And that’s all attempted. Attempt to lift my finger up and down, attempt to kick or something. I’m physically trying to do all of those things, even if you can’t see it. This would be me attempting to shrug my shoulders or something. That’s all attempted movement. That’s what I was doing for the first couple of weeks when they were going to give me cursor control. When I was doing body mapping, it was attempt to do this, attempt to do that. When Nir was telling me to imagine doing it, it kind of made sense to me, but it’s not something that people practice. If you started school as a child and they said, “Okay, write your name with this pencil,” and so you do that. Like, “Okay, now imagine writing your name with that pencil.”

(07:27:33) Kids would think, “Uh, I guess that kind of makes sense,” and they would do it. But that’s not something we’re taught, it’s all how to do things physically. We think about thought experiments and things, but that’s not a physical action of doing things. It’s more what you would do in certain situations. So imagined movement never really connected with me. I guess you could maybe describe it as a professional athlete swinging a baseball bat or swinging a golf club. You imagine what you’re supposed to do, but then you go right to that and physically do it. Then you get a bat in your hand, and then you do what you’ve been imagining.

(07:28:15) And so I don’t have that connection. So telling me to imagine something versus attempting it, there wasn’t a lot that I could do there mentally. I just kind of had to accept what was going on and try. But the attempted moving thing, it all made sense to me. If I try to move, then there’s a signal being sent in my brain, and as long as they can pick that up, then they should be able to map it to what I’m trying to do. And so when I first moved the cursor like that, it was just like, “Yes, this should happen. I’m not surprised by that.”

Lex Fridman (07:28:50) But can you clarify, is there supposed to be a difference between imagined movement and attempted movement?

Noland Arbaugh (07:28:55) Yeah, just that in imagined movement, you’re not attempting to move at all. So it’s-

Lex Fridman (07:29:00) You’re visualizing what you’re doing.

Lex Fridman (07:29:03) … And then theoretically, is that supposed to be a different part of the brain that lights up in those two different situations?

Bliss Chapman (07:29:09) Yeah, not necessarily. I think all these signals can still be represented in motor cortex, but the difference I think, has to do with the naturalness of imagining something versus-

Bliss Chapman (07:29:18) … attempting it. The fatigue of that over time.

Lex Fridman (07:29:20) And by the way, on the mic is Bliss. So this is just different ways to prompt you to kind of get to the thing that you arrived at.

Lex Fridman (07:29:31) Attempted movement does sound like the right thing. Try.

Noland Arbaugh (07:29:35) Yeah. I mean, it makes sense to me.

Lex Fridman (07:29:37) Because imagine, for me, I would start visualizing in my mind. Attempted, I would actually start trying to… I did combat sports my whole life, like wrestling. When I’m imagining a move, see, I’m moving my muscles.

Lex Fridman (07:29:55) There is a bit of an activation almost versus visualizing yourself, like a picture doing it.

Noland Arbaugh (07:30:01) Yeah. It’s something that I feel like naturally anyone would do. If you try to tell someone to imagine doing something, they might close their eyes and then start physically doing it, but it just-

Lex Fridman (07:30:13) Just didn’t click.

Noland Arbaugh (07:30:14) … Yeah, it’s hard. It was very hard at the beginning.

Lex Fridman (07:30:18) But attempted worked.

Noland Arbaugh (07:30:20) Attempted worked. It worked just like it should. Worked like a charm.

Bliss Chapman (07:30:26) Remember there was one Tuesday we were messing around and I think, I forget what swear word you used, but there’s a swear word that came out of your mouth when you figured out you could just do the direct cursor control.

Noland Arbaugh (07:30:35) Yeah, it blew my mind, no pun intended. Blew my mind when I first moved the cursor just with my thoughts and not attempting to move. It’s something that I found over the couple of weeks building up to that, that as I get better cursor control and the model gets better, then it gets easier for me to… I don’t have to attempt as much to move it. And part of that is something that I’d even talked with them about when I was watching the signals of my brain one day. I was watching when I attempted to move to the right, and I watched the screen as I saw the spikes. I was seeing the spike, the signal being sent, before I was actually attempting to move. I imagine that’s just because when you go to, say, move your hand or any body part, that signal gets sent before you’re actually moving; it has to make it all the way down and back up before you actually do any sort of movement.

(07:31:51) So there’s a delay there. And I noticed that there was something going on in my brain before I was actually attempting to move that my brain was anticipating what I wanted to do, and that all started sort of, I don’t know, percolating in my brain. It was just there always in the back like, “That’s so weird that it could do that. It kind of makes sense, but I wonder what that means as far as using the Neuralink.”

(07:32:29) And then as I was playing around with the attempted movement and playing around with the cursor, I saw that as the cursor control got better, it was anticipating my movements, what I wanted it to do, a bit better and a bit better. And then one day I just randomly, as I was playing Webgrid, I looked at a target before I had started attempting to move. I was just trying to train my eyes to start looking ahead, like, “Okay, this is the target I’m on, but if I look over here to this target, I know I can maybe be a bit quicker getting there.”

(07:33:12) And I looked over and the cursor just shot over. It was wild. I had to take a step back. I was like, “This should not be happening.” All day I was just smiling. I was so giddy. I was like, “Guys, do you know that this works? I can just think it and it happens.” Which they’d all been saying this entire time, like, “I can’t believe you’re doing all this with your mind.” I’m like, “Yeah, but is it really with my mind? I’m attempting to move and it’s just picking that up, so it doesn’t feel like it’s with my mind.” But when I moved it for the first time like that, it was, oh man. It made me think that this technology, that what I’m doing, is actually way, way more impressive than I ever thought. It was way cooler than I ever thought, and it just opened up a whole new world of possibilities of what could possibly happen with this technology and what I might be capable of with it.

Lex Fridman (07:34:08) Because you had felt for the first time like this was digital telepathy. You’re controlling a digital device with your mind.

Lex Fridman (07:34:16) I mean, that’s a real moment of discovery. That’s really cool. You’ve discovered something. I’ve seen scientists talk about a big aha moment, like Nobel Prize winning. They’ll have this like, “Holy crap.” Like, “Whoa.”

Noland Arbaugh (07:34:31) That’s what it felt like. I felt like I had discovered something, but for me, maybe not necessarily for the world-at-large or this field-at-large, it just felt like an aha moment for me. Like, “Oh, this works.” Obviously it works. And so that’s what I do all the time now. I kind of intermix the attempted movement and imagined movement. I do it all together because I’ve found that…

Noland Arbaugh (07:35:00) I do it all together because I’ve found that there is some interplay with it that maximizes efficiency with the cursor. So it’s not all one or the other. It’s not all just, I only use attempted or I only use imagined movements. It’s more I use them in parallel and I can do one or the other. I can just completely think about whatever I’m doing, but I don’t know, I like to play around with it. I also like to just experiment with these things. Every now and again, I’ll get this idea in my head, I wonder if this works and I’ll just start doing it, and then afterwards I’ll tell them, “By the way, I wasn’t doing that like you guys wanted me to. I thought of something and I wanted to try it and so I did. It seems like it works, so maybe we should explore that a little bit.”

Lex Fridman (07:35:51) So I think that discovery’s not just for you, at least from my perspective. That’s a discovery for everyone else who ever uses a Neuralink, that this is possible. I don’t think that’s an obvious thing, that this is even possible. It’s like I was saying to Bliss earlier, it’s like the four-minute mile. People thought it was impossible to run a mile in four minutes, and once the first person did it, then everyone just started doing it. So just to show that it’s possible, that paves the way so anyone can now do it. That’s the thing that’s actually possible. You don’t need to do the attempted movement, you can just go direct.

Noland Arbaugh (07:36:27) It is crazy. It is crazy, yeah.

Lex Fridman (07:36:30) For people who don’t know, can you explain how the Link app works? You have an amazing stream on the topic. Your first stream, I think, on X, describing the app. Can you just describe how it works?

Noland Arbaugh (07:36:43) Yeah, so it’s just an app that Neuralink created to help me interact with the computer. So on the Link app there are a few different settings, and different modes, and things I can do on it. So there’s the body mapping, which we kind of touched on. There’s a calibration. Calibration is how I actually get cursor control, so calibrating what’s going on in my brain to translate that into cursor control. So it will pop out models. What they use, I think, is time. So five minutes in calibration will give me so good of a model, and then if I’m in it for 10 minutes or 15 minutes, the models will progressively get better. And so the longer I’m in it, generally, the better the models will get.

Lex Fridman (07:37:43) That’s really cool because you often refer to the models. So the model’s the thing that’s constructed once you go through the calibration step.

Lex Fridman (07:37:49) And then you also talked about sometimes you’ll play a really difficult game like Snake just to see how good the model is.

Noland Arbaugh (07:37:56) Yeah. Yeah, so Snake is kind of like my litmus test for models. If I can control a snake decently well, then I know I have a pretty good model. So yeah, the Link app has all of those. It has Webgrid in it now. It’s also how I connect to the computer just in general. So they’ve given me a lot of voice controls with it at this point. So I can say, “Connect,” or, “Implant disconnect,” and as long as I have that charger handy, then I can connect to it. So the charger is also how I connect to the Link app to connect to the computer. I have to have the implant charger over my head when I want to connect, to have it wake up, because the implant’s in hibernation mode always when I’m not using it. I think there’s a setting to wake it up every so often, so we could set it to half an hour, or five hours, or something, if I just want it to wake up periodically.

(07:38:56) So yeah, I’ll connect to the Link app and then go through all sorts of things, calibration for the day, maybe body mapping. I made them give me a little homework tab because I am very forgetful and I forget to do things a lot. So I have a lot of data collection things that they want me to do.

Lex Fridman (07:39:18) Is the body mapping part of the data collection or is that also part of the calibration?

Noland Arbaugh (07:39:21) Yeah, it is. It’s something that they want me to do daily, which I’ve been slacking on because I’ve been doing so much media and traveling so much. So I’ve been [inaudible 07:39:30]-

Lex Fridman (07:39:30) You’ve gotten super famous.

Noland Arbaugh (07:39:31) Yeah, I’ve been a terrible first candidate for how much I’ve been slacking on my homework. But yeah, it’s just something that they want me to do every day to track how well the Neuralink is performing over time and to have something to give, I imagine, to give to the FDA to create all sorts of fancy charts and stuff, and show like, hey, this is what the Neuralink… This is how it’s performing day one, versus day 90, versus day 180, and things like that.

Lex Fridman (07:40:02) What’s the calibration step like? Is it move left, move right?

Noland Arbaugh (07:40:06) It’s a bubble game. So there will be yellow bubbles that pop up on the screen. At first, it is open loop. So open loop, this is something that I still don’t fully understand, the open loop and closed loop thing.

Lex Fridman (07:40:21) Me and Bliss talked for a long time about the difference between the two on the technical side.

Lex Fridman (07:40:25) So it’d be great to hear your-

Lex Fridman (07:40:27) … your side of the story.

Noland Arbaugh (07:40:29) Open loop is basically I have no control over the cursor. The cursor will be moving on its own across the screen, and I am following, by intention, the cursor to different bubbles. And then the algorithm is training off of the signals it’s getting as I’m doing this. There are a couple of different ways that they’ve done it. They call it center-out targets. So there will be a bubble in the middle and then eight bubbles around that, and the cursor will go from the middle to one side. So say, middle to left, back to middle, middle to up, back to middle, then up and to the right, and they’ll do that all the way around the circle. And I will follow that cursor the whole time, and then it will train off of my intentions, what it is expecting my intentions to be throughout the whole process.
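
As a rough picture of that open-loop, center-out pattern, here is a minimal sketch (illustrative only, not the Link app): the cursor is driven from the center bubble out to each of eight surrounding bubbles and back, and each leg is labeled with the intended direction, a unit vector toward the current target, which is what the decoder is trained against while the neural data is recorded.

```python
# Illustrative sketch of an open-loop center-out target sequence and its
# intention labels. Geometry, target count, and all names are assumptions.
import math

CENTER = (0.0, 0.0)
RADIUS = 1.0
TARGETS = [(RADIUS * math.cos(2 * math.pi * k / 8),
            RADIUS * math.sin(2 * math.pi * k / 8)) for k in range(8)]

def unit(src, dst):
    """Unit vector pointing from src to dst."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    norm = math.hypot(dx, dy) or 1.0
    return (dx / norm, dy / norm)

def center_out_sequence():
    """Yield (current_target, intended_direction_label) legs: center->target, then target->center."""
    for target in TARGETS:
        yield target, unit(CENTER, target)   # push outward toward the bubble
        yield CENTER, unit(target, CENTER)   # then push back to the middle

for target, label in center_out_sequence():
    # In a real session, neural features recorded during this leg would be
    # paired with `label` as the supervised intention target.
    print(f"target={target}  intended_direction={label}")
```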

Lex Fridman (07:41:22) Can you actually speak to, when you say follow-

Lex Fridman (07:41:25) … you don’t mean with your eyes, you mean with your intentions?

Noland Arbaugh (07:41:28) Yeah, so generally for calibration, I’m doing attempted movements because I think it works better. I think the better models, as I progress through calibration, make it easier to use imagined movements.

Lex Fridman (07:41:45) Wait. Wait, wait, wait. So calibrating on attempted movement will create a model that makes it really effective for you to then use the force.

Noland Arbaugh (07:41:55) Yes. I’ve tried doing calibration with imagined movement and it just doesn’t work as well for some reason. So that was the center-out targets. There’s also one where a random target will pop up on the screen and it’s the same. I just move, I follow along wherever the cursor is, to that target all across the screen. I’ve tried those with imagined movement and for some reason the models just don’t, they don’t give as high a level of quality when we get into closed loop. I haven’t played around with it a ton, so maybe the different ways that we’re doing calibration now might make it a bit better. But what I’ve found is there will be a point in calibration where I can use imagined movement. Before that point, it doesn’t really work.

(07:42:53) So if I do calibration for 45 minutes, the first 15 minutes, I can’t use imagined movement. It just doesn’t work for some reason. And after a certain point, I can just feel it, I can tell. It moves different. That’s the best way I can describe it. It’s almost as if it is anticipating what I am going to do again, before I go to do it. And so using attempted movement for 15 minutes, at some point, I can tell when I move my eyes to the next target that the cursor is starting to pick up. It’s starting to understand, it’s learning what I’m going to do.

Lex Fridman (07:43:41) So first of all, it’s really cool that, you are a true pioneer in all of this. You’re exploring how to do every aspect of this most effectively and there’s just, I imagine, so many lessons learned from this. So thank you for being a pioneer in all these kinds of different super technical ways. And it’s also cool to hear that there’s a different feeling to the experience when it’s calibrated in different ways because I imagine your brain is doing something different and that’s why there’s a different feeling to it. And then trying to find the words and the measurements to those feelings would be also interesting. But at the end of the day, you can also measure your actual performance, on whether it’s Snake or Webgrid, you could see what actually works well. And you’re saying, for the open loop calibration, the attempted movement works best for now.

Lex Fridman (07:44:36) So the open loop, you don’t get the feedback that you did something.

Lex Fridman (07:44:42) Is that frustrating? [inaudible 07:44:43]-

Noland Arbaugh (07:44:43) No, no, it makes sense to me. We’ve done it with a cursor and without a cursor in open loop. So sometimes it’s just, say for the center out, you’ll start calibration with a bubble lighting up and I push towards that bubble, and then when it’s pushed towards that bubble for, say, three seconds, a bubble will pop and then I come back to the middle. So I’m doing it all just by my intentions. That’s what it’s learning anyway. So it makes sense that as long as I follow what they want me to do, follow the yellow brick road, that it’ll all work out.

Lex Fridman (07:45:22) You’re full of great references. Is the bubble game fun?

Noland Arbaugh (07:45:26) Yeah, they always feel so bad making me do calibration like, oh, we’re about to do a 40-minute calibration. I’m like, “All right, do you guys want to do two of them?” I’m always asking to… Whatever they need, I’m more than happy to do. And it’s not bad. I get to lie there or sit in my chair and do these things with some great people. I get to have great conversations. I can give them feedback. I can talk about all sorts of things. I could throw something on, on my TV in the background, and split my attention between them. It’s not bad at all. I don’t mind it.

Lex Fridman (07:46:06) Is there a score that you get?

Lex Fridman (07:46:07) Can you do better on a bubble game?

Noland Arbaugh (07:46:08) No, I would love that.

Noland Arbaugh (07:46:12) Yeah, I would love a-

Lex Fridman (07:46:13) Writing down suggestions from Noland.

Lex Fridman (07:46:18) Make it more fun, gamified.

Noland Arbaugh (07:46:20) Yeah, that’s one thing that I really, really enjoy about Webgrid is because I’m so competitive. The higher the BPS, the higher the score, I know the better I’m doing, and so if I… I think I’ve asked at one point, one of the guys, if he could give me some sort of numerical feedback for calibration. I would like to know what they’re looking at. Like, oh, we see this number while you’re doing calibration, and that means, at least on our end, that we think calibration is going well. And I would love that because I would like to know if what I’m doing is going well or not. But then they’ve also told me, yeah, not necessarily one to one. It doesn’t actually mean that calibration is going well in some ways. So it’s not like a hundred percent and they don’t want to skew what I’m experiencing or want me to change things based on that, if that number isn’t always accurate to how the model will turn out or the end result. That’s at least what I got from it.

(07:47:19) One thing I have asked them, and something that I really enjoy striving for, is towards the end of calibration, there is a time between targets. And so I like to keep, at the end, that number as low as possible. So at the beginning it can be four or five, six seconds between me popping bubbles, but towards the end I like to keep it below 1.5 or if I could get it to one second between bubbles. Because in my mind, that translates really nicely to something like Webgrid, where I know if I can hit a target, one every second, that I’m doing real, real well.

Lex Fridman (07:47:58) There you go. That’s a way to get a score on the calibrations, like the speed. How quickly can you get from bubble to bubble?

Lex Fridman (07:48:05) So there’s the open loop and then it goes to the closed loop.

Lex Fridman (07:48:08) And the closed loop can already start giving you a sense because you’re getting feedback of how good the model is.

Noland Arbaugh (07:48:13) Yeah. Yeah. So closed loop is when I first get cursor control, and how they’ve described it to me, someone who does not understand this stuff, I am the dumbest person in the room every time I’m with any of those guys.

Lex Fridman (07:48:13) I love the humility. I appreciate it.

Noland Arbaugh (07:48:27) Yeah, is that I am closing the loop. So I am actually now the one that is finishing the loop of whatever this loop is. I don’t even know what the loop is. They’ve never told me. They just say there is a loop and at one point it’s open and I can’t control, and then I get control and it’s closed. So I’m finishing the loop.

Lex Fridman (07:48:48) So how long does the calibration usually take? You said 10, 15 minutes, [inaudible 07:48:52]-

Noland Arbaugh (07:48:52) Well, yeah, they’re trying to get that number down pretty low. That’s what we’ve been working on a lot recently, is getting that down as low as possible. So that way, if this is something that people need to do on a daily basis or if some people need to do it on an every-other-day basis or once a week, they don’t want people to be sitting in calibration for long periods of time. I think they’ve wanted to get it down to seven minutes or below, at least where we’re at right now. It’d be nice if you never had to do calibration. So we’ll get there at some point, I’m sure, the more we learn about the brain, and I think that’s the dream. I think right now, for me to get really, really good models, I’m in calibration 40 or 45 minutes. And I don’t mind, like I said, they always feel really bad, but if it’s going to get me a model that can break these records on Webgrid, I’ll stay in it for flipping two hours.

Webgrid

Lex Fridman (07:49:50) Let’s talk business. So Webgrid, I saw a presentation where Bliss said by March you selected 89,000 targets in Webgrid. Can you explain this game? What is Webgrid and what does it take to be a world-class performer in Webgrid, as you continue to break world records?

Lex Fridman (07:50:10) It’s like a gold medalist talk. Well, where do I begin?

Noland Arbaugh (07:50:15) Yeah, I’d like to thank-

Noland Arbaugh (07:50:18) … everyone who’s helped me get here, my coaches, my parents, for driving me to practice every day at 5:00 in the morning. I like to thank God and just overall my dedication to my craft. [inaudible 07:50:29].

Lex Fridman (07:50:29) Yeah, the interviews with athletes, they’re always like that exact-

Lex Fridman (07:50:29) It’s that template.

Noland Arbaugh (07:50:41) Yeah, it’s literally just a grid. They can make it as big or small as you can make a grid. A single box on that grid will light up and you go and click it. And it is a way for them to benchmark how good a BCI is. So it’s pretty straightforward. You just click targets.

Lex Fridman (07:51:01) Only one blue cell appears and you’re supposed to move the mouse to there and click on it.

Noland Arbaugh (07:51:06) Yep. So I like playing on bigger grids because the bigger the grid, the more BPS, it’s bits per second, that you get every time you click one. So I’ll say I’ll play on a 35 by 35 grid, and then one of those little squares, a cell, you can call it, target, whatever, will light up. And you move the cursor there, and you click it, and then you do that forever.

Lex Fridman (07:51:34) And you’ve been able to achieve, at first, eight bits per second, then you’ve recently broke that.

Noland Arbaugh (07:51:40) Yeah. Yeah, I’m at 8.5 right now. I would’ve beaten that literally the day before I came to Austin. But I had a, I don’t know, a five-second lag right at the end, and I just had to wait until the latency calmed down, and then I kept clicking. But I was at 8.01, and then five seconds of lag, and then the next three targets I clicked all stayed at 8.01. So if I would’ve been able to click during that time of lag, I probably would’ve hit, I don’t know, I might’ve hit nine. So I’m there. I’m really close, and then this whole Austin trip has really gotten in the way of my Webgrid playing ability.

Noland Arbaugh (07:52:26) I’ve been itching.

Lex Fridman (07:52:26) … you’ve been thinking about right now?

Noland Arbaugh (07:52:26) Yeah, I know. I just want to do better.

Noland Arbaugh (07:52:28) I want to do better. I want to hit nine, I think, well, I know nine is very, very achievable. I’m right there. I think 10 I could hit, maybe in the next month. I could do it probably in the next few weeks if I really push.

Lex Fridman (07:52:41) I think you and Elon are basically the same person because last time I did a podcast with him, he came in extremely frustrated that he can’t beat Uber Lilith as a Druid.

Noland Arbaugh (07:52:51) [inaudible 07:52:51].

Lex Fridman (07:52:50) That was a year ago, I think, I forget, solo. And I could just tell there’s some percentage of his brain, the entire time was thinking, “I wish I was right now attempting.” [inaudible 07:53:01]-

Noland Arbaugh (07:53:01) Yeah. I think he did it that night.

Lex Fridman (07:53:06) He did it that night. He stayed up and did it that night, which is crazy to me. In a fundamental way, it’s really inspiring and what you’re doing is inspiring in that way because it’s not just about the game. Everything you’re doing there has impact. By striving to do well on Webgrid, you’re helping everybody figure out how to create the system all along the decoding, the software, the hardware, the calibration, all of it. How to make all of that work so you can do everything else really well.

Noland Arbaugh (07:53:36) Yeah, it’s just really fun.

Lex Fridman (07:53:38) Well, that’s also, that’s part of the thing, is that making it fun.

Noland Arbaugh (07:53:42) Yeah, it’s addicting. I’ve joked about what they actually did when they went in and put this thing in my brain. They must’ve flipped a switch to make me more susceptible to these kinds of games, to make me addicted to Webgrid or something.

Noland Arbaugh (07:53:59) Do you know Bliss’s high score?

Lex Fridman (07:54:00) Yeah, he said like 14 or something.

Noland Arbaugh (07:54:04) 17.1 or something. 17.01?

Lex Fridman (07:54:09) He told me he does it on the floor with peanut butter and he fasts. It’s weird. That sounds like cheating. Sounds like performance enhancing-

Bliss Chapman (07:54:17) Noland, the first time Noland played this game, he asked how good are we at this game? And I think you told me right then, you’re going to try to beat me [inaudible 07:54:24]-

Noland Arbaugh (07:54:24) I’m going to get there someday.

Bliss Chapman (07:54:24) Yeah, I fully believe you.

Noland Arbaugh (07:54:26) I think I can. I think I can. I think-

Bliss Chapman (07:54:27) I’m excited for that.

Noland Arbaugh (07:54:28) Yeah. So I’ve been playing, first off, with the dwell cursor, which really hampers my Webgrid playing ability. Basically I have to wait 0.3 seconds for every click.

Lex Fridman (07:54:40) Oh, so you can’t do the click. So you click by dwelling, you said 0.3.

Noland Arbaugh (07:54:45) 0.3 seconds, which sucks. It really slows down how high I’m able to get. I still hit 50, I think I hit 50-something net trials per minute in that, which was pretty good because I’m able to… One of the settings is also how slow you need to be moving in order to initiate a click, to start a click. So I can tell, sort of, when I’m on that threshold, to start initiating a click just a bit early. So I’m not fully stopped over the target when I go to click, I’m doing it on my way to the targets a little, to try to time it just right.

Lex Fridman (07:55:30) So you’re slowing down.

Noland Arbaugh (07:55:31) Yeah, just a hair, right before the targets.

Lex Fridman (07:55:34) This is like elite performance. Okay, but that’s still, it sucks that there’s a ceiling of the 0.3.

Noland Arbaugh (07:55:41) Well, I can get down to 0.2 and 0.1. 0.1’s what I’ve-

Lex Fridman (07:55:45) [inaudible 07:55:45].

Noland Arbaugh (07:55:45) Yeah, and I’ve played with that a little bit too. I have to adjust a ton of different parameters in order to play with 0.1, and I don’t have control over all of that on my end yet. It also changes how the models are trained. If I train a model, like in Webgrid, I bootstrap on a model, which basically is them training models as I’m playing Webgrid based off of the Webgrid data that I’m… So if I play Webgrid for 10 minutes, they can train off that data specifically in order to get me a better model. If I do that with 0.3 versus 0.1, the models come out different. The way that they interact, it’s just much, much different. So I have to be really careful. I found that doing it with 0.3 is actually better in some ways. Unless I can do it with 0.1 and change all of the different parameters, then that’s more ideal, because obviously 0.1 is faster than 0.3. So I could get there. I can get there.
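
A schematic sketch of the “bootstrapping” loop Noland describes: while he plays Webgrid, each neural-feature sample gets paired with the direction toward the currently lit target, and a fresh decoder is periodically refit on that data. The ridge-regression decoder and all names here are illustrative assumptions, not Neuralink’s pipeline.

```python
import numpy as np

def refit_decoder(features, intended_velocities, ridge=1.0):
    """Fit a linear decoder W mapping neural features -> 2D cursor velocity
    by ridge regression: W = (X^T X + ridge * I)^-1 X^T Y."""
    X = np.asarray(features)             # (n_samples, n_channels)
    Y = np.asarray(intended_velocities)  # (n_samples, 2)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

# Toy bootstrap cycle: play for a while with the current decoder, collect
# (features, direction-to-target) pairs, then refit and swap in the new weights.
rng = np.random.default_rng(0)
features = rng.normal(size=(600, 64))             # ten minutes of binned features (toy numbers)
true_mapping = rng.normal(size=(64, 2))
intents = features @ true_mapping + 0.1 * rng.normal(size=(600, 2))
W = refit_decoder(features, intents)
print(W.shape)                                    # (64, 2); decode with features @ W
```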

Lex Fridman (07:56:43) Can you click using your brain?

Noland Arbaugh (07:56:45) For right now, it’s the hover clicking with the dwell cursor. Before all the thread retraction stuff happened, we were calibrating clicks, left click, right click. My previous ceiling, before I broke the record again with the dwell cursor, was I think on a 35 by 35 grid with left and right click. And you get more BPS, more bits per second, using multiple clicks because it’s more difficult.

Lex Fridman (07:57:12) Oh, because what is it, you’re supposed to do either a left click or a right click?

Lex Fridman (07:57:18) Is it different colors, something like this?

Noland Arbaugh (07:57:18) Different colors.

Noland Arbaugh (07:57:19) Yeah, blue targets for left click, orange targets for right click is what they had done.

Noland Arbaugh (07:57:23) So my previous record of 7.5-

Lex Fridman (07:57:26) Was with the two clicks.

Noland Arbaugh (07:57:27) … was with the blue and the orange targets, yeah, which I think if I went back to that now, doing the click calibration, I would be able to… And being able to initiate clicks on my own, I think I would break that 10 ceiling in a couple days, max.

Lex Fridman (07:57:43) Yeah, you would start making Bliss nervous about his 17.

Noland Arbaugh (07:57:46) Yeah, he should be.

Bliss Chapman (07:57:47) Why do you think we haven’t given him the-

Retracted threads

Lex Fridman (07:57:49) Exactly. Exactly. So what did it feel like with the retractions, that some of the threads are retracted?

Noland Arbaugh (07:57:57) It sucked. It was really, really hard. The day they told me was the day of my big Neuralink tour at their Fremont facility. They told me right before we went over there. It was really hard to hear. My initial reaction was, all right, go in, fix it. Go in, take it out and fix it. The first surgery was so easy. I went to sleep, a couple hours later I woke up and here we are. I didn’t feel any pain, didn’t take any pain pills or anything. So I just knew that if they wanted to, they could go in and put in a new one next day if that’s what it took because I wanted it to be better and I wanted not to lose the capability. I had so much fun playing with it for a few weeks, for a month. It had opened up so many doors for me. It had opened up so many more possibilities that I didn’t want to lose it after a month.

(07:58:58) I thought it would’ve been a cruel twist of fate if I had gotten to see the view from the top of this mountain and then have it all come crashing down after a month. And I knew, I say the top of the mountain, but how I saw it was I was just now starting to climb the mountain and there was so much more that I knew was possible. And so to have all of that be taken away was really, really hard. But then on the drive over to the facility, I don’t know, five minute drive, whatever it is, I talked with my parents about it. I prayed about it. I was just like, I’m not going to let this ruin my day. I’m not going to let this ruin this amazing tour that they have set up for me. I want to go show everyone how much I appreciate all the work they’re doing.

(07:59:54) I want to go meet all of the people who have made this possible, and I want to go have one of the best days of my life, and I did. And it was amazing, and it absolutely was one of the best days I’ve ever been privileged to experience. And then for a few days I was pretty down in the dumps, but for the first few days afterwards, I didn’t know if it was ever going to work again. And then I made the decision that, even if I lost the ability to use the Neuralink, even if I lost out on everything to come, if I could keep giving them data in any way, then I would do that.

(08:00:41) If I needed to just do some of the data collection every day or body mapping every day for a year, then I would do it because I know that everything I’m doing helps everyone to come after me, and that’s all I wanted. Just the whole reason that I did this was to help people, and I knew that anything I could do to help, I would continue to do, even if I never got to use the cursor again, then I was just happy to be a part of it. And everything that I had done was just a perk. It was something that I got to experience, and I know how amazing it’s going to be for everyone to come after me. So might as well just keep trucking along.

Lex Fridman (08:01:22) Well, that said, you were able to work your way up, to get the performance back. So this is like going from Rocky I to Rocky II. So when did you first realize that this is possible, and what gave you the strength, the motivation, the determination to do it, to increase back up and beat your previous record?

Noland Arbaugh (08:01:42) Yeah, it was within a couple weeks, [inaudible 08:01:44]-

Lex Fridman (08:01:44) Again, this feels like I’m interviewing an athlete. This is great. I’d like to thank my parents.

Noland Arbaugh (08:01:50) The road back was long and hard-

Lex Fridman (08:01:53) [inaudible 08:01:53] like a movie.

Noland Arbaugh (08:01:53) … fraught with many difficulties. There were dark days. It was a couple weeks, I think, and then there was just a turning point. I think they had switched how they were measuring the neuron spikes in my brain, the… Bliss help me out.

Bliss Chapman (08:02:15) Yeah, the way in which we were measuring the behavior of individual neurons.

Bliss Chapman (08:02:18) So we’re switching from individual spike detection to something called spike band power, which if you watch the previous segments with either me or DJ, you probably have some [inaudible 08:02:26]-

Noland Arbaugh (08:02:27) So when they did that, it was like a light bulb over the head moment, like, oh, this works and this seems like we can run with this. And I saw the uptick in performance immediately. I could feel it when they switched over. I was like, “This is better. This is good. Everything up until this point,” for the last few weeks, last, whatever, three or four weeks because it was before they even told me, “Everything before this sucked. Let’s keep doing what we’re doing now.” And at that point it was not like, oh, I know I’m still only at, say in Webgrid terms, four or five BPS compared to my 7.5 before, but I know that if we keep doing this, then I can get back there. And then they gave me the dwell cursor and the dwell cursor sucked at first. It’s obviously not what I want, but it gave me a path forward to be able to continue using it and hopefully to continue to help out. And so I just ran with it, never looked back. Like I said, I’m just that kind of person, I roll with the punches anyway. So-

Lex Fridman (08:03:37) What was the process? What was the feedback loop on the figuring out how to do the spike detection in a way that would actually work well for Noland?

Bliss Chapman (08:03:45) Yeah, it’s a great question. So maybe just to describe first how the actual update worked. It was basically an update to your implant. So we just did an over-the-air software update to his implants, same way you’d update your Tesla or your iPhone. And that firmware change enabled us to record averages of populations of neurons nearby individual electrodes. So we have less resolution about which individual neuron is doing what, but we have a broader picture of what’s going on nearby an electrode overall. And that feedback loop, basically as Noland described it, it was immediate when we flipped that switch. I think the first day we did that, you had three or four BPS right out of the box, and that was a light bulb moment for, okay, this is the right path to go down. And from there, there’s a lot of feedback around how to make this useful for independent use.
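
A rough illustration of the distinction Bliss draws, assuming a conventional spike band of roughly 500 to 5,000 Hz, simple threshold-crossing detection, and 25 ms bins; the actual on-implant firmware is certainly more sophisticated than this sketch.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def spike_counts(voltage, fs, threshold_sd=-4.5, bin_ms=25):
    """Old-style feature: per-bin count of downward threshold crossings
    (putative spikes from individual neurons)."""
    thresh = threshold_sd * np.std(voltage)
    crossings = np.flatnonzero((voltage[1:] < thresh) & (voltage[:-1] >= thresh))
    bin_len = int(fs * bin_ms / 1000)
    counts = np.zeros(len(voltage) // bin_len)
    for i in crossings:
        if i // bin_len < len(counts):
            counts[i // bin_len] += 1
    return counts

def spike_band_power(voltage, fs, band=(500.0, 5000.0), bin_ms=25):
    """New-style feature: per-bin mean power in the spike band, which pools
    activity from the whole population of neurons near an electrode instead
    of relying on detecting each individual spike."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, voltage)
    bin_len = int(fs * bin_ms / 1000)
    n_bins = len(filtered) // bin_len
    return np.array([np.mean(filtered[b * bin_len:(b + 1) * bin_len] ** 2)
                     for b in range(n_bins)])
```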

(08:04:27) So what we care about ultimately is that you can use it independently to do whatever you want. And to get to that point, it required us to re-engineer the UX, as you talked about with the dwell cursor, to make it something that you can use independently without us needing to be involved all the time. And yeah, this is obviously the start of this journey still. Hopefully we get back to the places where you’re doing multiple clicks and using that to control, much more fluidly, everything, and much more naturally the applications that you’re trying to interface with.

Lex Fridman (08:04:51) And most importantly, get that Webgrid number up.

Speaker 1 (08:04:55) Yes. [inaudible 08:04:57].

Lex Fridman (08:04:58) So how is, on the hover click, do you accidentally click stuff sometimes?

Lex Fridman (08:05:03) How hard is it to avoid accidentally clicking?

Noland Arbaugh (08:05:05) I have to continuously keep it moving, basically. So like I said, there’s a threshold where it will initiate a click. So if I ever drop below that, it’ll start and I have 0.3 seconds to move it before it clicks anything.

Lex Fridman (08:05:21) [inaudible 08:05:21].

Noland Arbaugh (08:05:20) And if I don’t want it to ever get there, I just keep it moving at a certain speed and just constantly doing circles on screen, moving it back and forth, to keep it from clicking stuff. I actually noticed, a couple weeks back, that when I was not using the implant, I was just moving my hand back and forth or in circles. I was trying to keep the cursor from clicking and I was just doing it while I was trying to go to sleep. And I was like, “Okay, this is a problem.” [inaudible 08:05:52].
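
A toy version of the dwell-cursor logic Noland describes: when cursor speed drops below a threshold a timer starts, and if the cursor stays slow for the full dwell time (0.3 seconds here) a click fires, which is why he keeps the cursor moving to avoid stray clicks. The threshold value and structure are assumptions for illustration, not the Link app’s actual implementation.

```python
class DwellClicker:
    """Toy dwell cursor: stay below speed_threshold for dwell_time seconds and a
    click fires; speed back up and any pending click is cancelled."""

    def __init__(self, speed_threshold=0.05, dwell_time=0.3):
        self.speed_threshold = speed_threshold
        self.dwell_time = dwell_time
        self.slow_since = None   # time the cursor last dropped below the threshold

    def update(self, t, speed):
        """Feed (time in seconds, current cursor speed); returns True on a click."""
        if speed >= self.speed_threshold:
            self.slow_since = None            # moving again: cancel the pending click
            return False
        if self.slow_since is None:
            self.slow_since = t               # just slowed down: start the dwell timer
            return False
        if t - self.slow_since >= self.dwell_time:
            self.slow_since = None            # dwell time elapsed: click and reset
            return True
        return False

clicker = DwellClicker()
for t, speed in [(0.0, 0.20), (0.10, 0.03), (0.20, 0.03), (0.45, 0.03), (0.50, 0.20)]:
    if clicker.update(t, speed):
        print(f"click at t={t:.2f}s")   # fires once, at t=0.45s
```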

Speaker 1 (08:05:51) [inaudible 08:05:51].

Lex Fridman (08:05:52) To avoid the clicking. I guess, does that create problems when you’re gaming, accidentally click a thing? Like-

Noland Arbaugh (08:05:58) Yeah. Yeah. It happens in chess.

Noland Arbaugh (08:06:02) I’ve lost a number of games because I’ll accidentally click something.

Bliss Chapman (08:06:06) I think the first time I ever beat you was because of an accidental click.

Noland Arbaugh (08:06:06) Yeah, a misclick. Yeah.

Lex Fridman (08:06:10) It’s a nice excuse, right? You can always-

Noland Arbaugh (08:06:12) Yeah, [inaudible 08:06:12] it’s great. It’s perfect.

Lex Fridman (08:06:12) … anytime you lose, you could just say, “That was accidental.”

App improvements

Lex Fridman (08:06:16) You said the app improved a lot from version one when you first started using it. It was very different. So can you just talk about the trial and error that you went through with the team? 200 plus pages of notes. What’s that process like of going back and forth and working together to improve the thing?

Noland Arbaugh (08:06:36) It’s a lot of me just using it day in and day out and saying, “Hey, can you guys do this for me? Give me this. I want to be able to do that. I need this.” I think a lot of it just doesn’t occur to them maybe, until someone is actually using the app, using the implant. It’s just something that they just never would’ve thought of or it’s very specific to even me, maybe what I want. It’s something I’m a little worried about with the next people that come is maybe they will want things much different than how I’ve set it up or the advice I’ve given the team, and they’re going to look at some of the things they’ve added for me. [inaudible 08:07:26] like, “That’s a dumb idea. Why would he ask for that?” And so I’m really looking forward to getting the next people on because I guarantee that they’re going to think of things that I’ve never thought of.

(08:07:37) They’re going to think of improvements something like, wow, that’s a really good idea. I wish I would’ve thought of that. And then they’re also going to give me some pushback about, yeah, what you are asking them to do here, that’s a bad idea. Let’s do it this way. And I’m more than happy to have that happen, but it’s just a lot of different interactions with different games or applications, the internet, just with the computer in general. There’s tons of bugs that end up popping up, left, right, center.

(08:08:11) So it’s just me trying to use it as much as possible and showing them what works and what doesn’t work, and what I would like to be better. And then they take that feedback and they usually create amazing things for me. They solve these problems in ways I would’ve never imagined. They’re so good at everything they do, and so I’m just really thankful that I’m able to give them feedback and they can make something of it, because a lot of my feedback is really dumb. It’s just like, “I want this, please do something about it,” and it’ll come back, super well-thought-out, and it’s way better than anything I could have ever thought of or implemented myself. So they’re just great. They’re really, really cool.

Lex Fridman (08:08:53) As the BCI community grows, would you like to hang out with the other folks with Neuralinks? What relationship, if any, would you want to have with them? Because you said they might have a different set of ideas of how to use the thing.

Lex Fridman (08:09:10) Would you be intimidated by their Webgrid performance?

Noland Arbaugh (08:09:13) No. No. I hope-

Noland Arbaugh (08:09:15) I hope, day one, they wipe the floor with me. I hope they beat it and they crush it, double it if they can, just because on one hand it’s only going to push me to be better because I’m super competitive. I want other people to push me. I think that is important for anyone trying to achieve greatness is they need other people around them who are going to push them to be better. And I even made a joke about it on X once, once the next people get chosen, cue buddy cop music. I’m just excited to have other people to do this with and to share experiences with. I’m more than happy to interact with them as much as they want, more than happy to give them advice. I don’t know what kind of advice I could give them, but if they have-

Noland Arbaugh (08:10:00) … give them advice. I don’t know what advice I could give them, but if they have questions, I’m more than happy.

Lex Fridman (08:10:05) What advice would you have for the next participant in the clinical trial?

Noland Arbaugh (08:10:10) That they should have fun with this, because it is a lot of fun, and that I hope they work really, really hard because it’s not just for us, it’s for everyone that comes after us. And come to me if they need anything. And to go to Neuralink if they need anything. Man, Neuralink moves mountains. They do absolutely anything for me that they can, and it’s an amazing support system to have. It puts my mind at ease for so many things that I have had questions about or so many things I want to do, and they’re always there, and that’s really, really nice. And so I would tell them not to be afraid to go to Neuralink with any questions that they have, any concerns, anything that they’re looking to do with this. And any help that Neuralink is capable of providing, I know they will. And I don’t know. I don’t know. Just work your ass off because it’s really important that we try to give our all to this.

Lex Fridman (08:11:20) So have fun and work hard.

Noland Arbaugh (08:11:21) Yeah. Yeah. There we go. Maybe that’s what I’ll just start saying to people. Have fun, work hard.

Lex Fridman (08:11:26) Now you’re a real pro athlete. Just keep it short. Maybe it’s good to talk about what you’ve been able to do now that you have a Neuralink implant, the freedom you gain from this way of interacting with the outside world. You play video games all night and you do that by yourself, and that’s the freedom. Can you speak to that freedom that you gain?

Noland Arbaugh (08:11:53) Yeah. It’s what all… I don’t know, people in my position want. They just want more independence. The more load that I can take away from people around me, the better. If I’m able to interact with the world without using my family, without going through any of my friends, needing them to help me with things, the better. If I’m able to sit up on my computer all night and not need someone to sit me up, say, on my iPad, in a position where I can use it, and then have to have them wait up for me all night until I’m ready to be done using it, it takes a load off of all of us and it’s really all I can ask for. It’s something that I could never thank Neuralink enough for, and I know my family feels the same way. Just being able to have the freedom to do things on my own at any hour of the day or night, it means the world to me and… I don’t know.

Gaming

Lex Fridman (08:13:02) When you’re up at 2:00 AM playing Webgrid by yourself, I just imagine it’s darkness and there’s just a light glowing and you’re just focused. What’s going through your mind? Or are you in a state of flow, where it’s like the mind is empty, like those Zen masters?

Noland Arbaugh (08:13:22) Yeah. Generally, it is me playing music of some sort. I have a massive playlist, and so I’m just rocking out to music. And then it’s also just a race against time, because I’m constantly looking at how much battery percentage I have left on my implant, like, “All right. I have 30%, which equates to X amount of time, which means I have to break this record in the next hour and a half or else it’s not happening tonight.” And so it’s a little stressful when that happens. When it’s above 50%, I’m like, “Okay, I got time.” It starts getting down to 30, and then 20 it’s like, “All right, 10%, a little popup is going to pop up right here, and it’s going to really screw my Webgrid flow. It’s going to tell me that…” The low battery popup comes up and I’m like, “It’s really going to screw me over. So if I’m going to break this record, I have to do it in the next 30 seconds,” or else that popup is going to get in the way, cover my Webgrid.

(08:14:26) And then after that, I go click on it, go back into Webgrid, and I’m like, “All right, that means I have 10 minutes left before this thing’s dead.” That’s what’s going on in my head, generally. That and whatever song’s playing. And I want to break those records so bad. It’s all I want when I’m playing Webgrid. It has become less of like, “Oh, this is just a leisurely activity. I just enjoy doing this because it just feels so nice and it puts me at ease.” It is, “No. Once I’m in Webgrid, you better break this record or you’re going to waste five hours of your life right now.” And I don’t know. It’s just fun. It’s fun, man.

Lex Fridman (08:15:05) Have you ever tried Webgrid with two targets and three targets? Can you get higher BPS with that?

Noland Arbaugh (08:15:05) Can you do that?

Bliss Chapman (08:15:12) You mean different colored targets or you mean-

Lex Fridman (08:15:14) Oh, multiple targets. Does that change the thing?

Bliss Chapman (08:15:16) Yeah. So BPS is the log of the number of targets, times correct minus incorrect, divided by time. And so you can think of different clicks as basically doubling the number of active targets.
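
A worked version of the metric Bliss describes, assuming the usual grid-task convention that each correct selection is worth log2(N - 1) bits, where N is the number of selectable options (cells times click types); the exact constants are an assumption, not a quote of Neuralink’s scoring code.

```python
import math

def webgrid_bps(n_cells, correct, incorrect, seconds, n_click_types=1):
    """Achieved bitrate for a grid task: bits per correct selection times net
    correct selections, divided by elapsed time."""
    effective_targets = n_cells * n_click_types   # left vs. right click doubles the options
    bits_per_selection = math.log2(effective_targets - 1)
    return bits_per_selection * max(correct - incorrect, 0) / seconds

# One clean selection per second on a 35 x 35 grid comes out just above 10 BPS,
# which matches the "one target a second" pace Noland aims for in calibration.
print(webgrid_bps(35 * 35, correct=60, incorrect=0, seconds=60))  # ~10.26
```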

Bliss Chapman (08:15:26) So basically higher BPS, the more options there are, the more difficult the task. And there’s also Zen mode you’ve played in before, which is infinite-

Noland Arbaugh (08:15:33) Yeah. Yeah. It covers the whole screen with a grid and… I don’t know-

Lex Fridman (08:15:41) And so you can go… That’s insane.

Bliss Chapman (08:15:45) He doesn’t like it because it didn’t show BPS, so-

Noland Arbaugh (08:15:49) I had them put in a giant BPS in the background, so now it’s the opposite of Zen mode. It’s super hard mode, just metal mode. If it’s just a giant number in the back [inaudible 08:16:01].

Bliss Chapman (08:16:01) We should rename that. Metal mode is a much better [inaudible 08:16:03].

Lex Fridman (08:16:05) So you also play Civilization VI.

Noland Arbaugh (08:16:08) I love Civ VI. Yeah.

Lex Fridman (08:16:10) Usually go with Korea, you said?

Noland Arbaugh (08:16:11) I do. Yeah. So the great part about Korea is they focus on science tech victories, which was not planned. I’ve been playing Korea for years, and then all of the [inaudible 08:16:23] stuff happened, so it aligns. But what I’ve noticed with tech victories is if you can just rush tech, rush science, then you can do anything. At one point in the game, you’ll be so far ahead of everyone technologically that you’ll have musket men, infantrymen, planes sometimes, and people will still be fighting with bows and arrows. And so if you want to win a domination victory, you just get to a certain point with the science, and then go and wipe out the rest of the world. Or you can just take science all the way and win that way, and you’re going to be so far ahead of everyone because you’re producing so much science that it’s not even close. I’ve accidentally won in different ways just by focusing on science.

Lex Fridman (08:17:18) Accidentally won by focusing on science-

Noland Arbaugh (08:17:20) Yeah. I was playing only science, obviously. Just science all the way, just tech. And I was trying to get every tech in the tech tree and stuff, and then I accidentally won through a diplomatic victory, and I was so mad. I was so mad because it just ends the game one turn. It was like, “Oh, you won. You’re so diplomatic.” I’m like, “I don’t want to do this. I should have declared war on more people or something.” It was terrible. But you don’t need giant civilizations with tech, especially with Korea. You can keep it pretty small. So I generally just get to a certain military unit and put them all around my border to keep everyone out, and then I will just build up. So very isolationist.

Lex Fridman (08:18:06) Just work on the science and the tech.

Noland Arbaugh (08:18:07) Yep, that’s it.

Lex Fridman (08:18:08) You’re making it sound so fun.

Noland Arbaugh (08:18:10) It’s so much fun.

Lex Fridman (08:18:11) And I also saw a Civilization VII trailer.

Noland Arbaugh (08:18:13) Oh, man. I’m so pumped.

Lex Fridman (08:18:14) And that’s probably coming out-

Noland Arbaugh (08:18:16) Come on Civ VII, hit me up. All alpha, beta tests, whatever.

Lex Fridman (08:18:20) Wait, when is it coming out?

Lex Fridman (08:18:22) Yeah, yeah, next year. Yeah. What other stuff would you like to see improved about the Neuralink app and just the entire experience?

Noland Arbaugh (08:18:29) I would like to, like I said, get back to the click on demand, the regular clicks. That would be great. I would like to be able to connect to more devices. Right now, it’s just the computer. I’d like to be able to use it on my phone or use it on different consoles, different platforms. I’d like to be able to control as much stuff as possible, honestly. An Optimus robot would be pretty cool. That would be sick if I could control an Optimus robot. The Link app itself, it seems like we are getting pretty dialed in to what it might look like down the road. It seems like we’ve gotten through a lot of what I want from it, at least. The only other thing I would say is more control over all the parameters that I can tweak with my cursor and stuff. There’s a lot of things that go into how the cursor moves in certain ways, and I have… I don’t know. Three or four of those parameters, and there might-

Lex Fridman (08:19:42) Gain and friction and all that.

Noland Arbaugh (08:19:43) Gain and friction, yeah. And there’s maybe double the amount of those with just velocity and then with the actual [inaudible 08:19:51] cursor. So I would like all of it. I want as much control over my environment as possible, especially-

Lex Fridman (08:19:58) So you want advanced mode. There’s usually this basic mode, and you’re one of those folks, the power-user, advanced-

Noland Arbaugh (08:20:07) That’s what I want. I want as much control over this as possible. So, yeah, that’s really all I can ask for. Just give me everything.

Lex Fridman (08:20:18) Has speech been useful? Just being able to talk also in addition to everything else?

Noland Arbaugh (08:20:23) Yeah, you mean while I’m using it?

Lex Fridman (08:20:25) While you’re using it? Speech-to-text?

Lex Fridman (08:20:28) Or do you type… Because there’s also a keyboard-

Noland Arbaugh (08:20:30) Yeah, yeah, yeah. So there’s a virtual keyboard. That’s another thing I would like to work more on is finding some way to type or text in a different way. Right now, it is a dictation basically and a virtual keyboard that I can use with the cursor, but we’ve played around with finger spelling, sign language finger spelling, and that seems really promising. So I have this thought in my head that it’s going to be a very similar learning curve that I had with the cursor where I went from attempted movement to imagined movement at one point. I have a feeling, this is just my intuition, that at some point, I’m going to be doing finger spelling and I won’t need to actually attempt to finger spell anymore, that I’ll just be able to think the letter that I want and it’ll pop up.

Lex Fridman (08:21:24) That would be epic. That’s challenging. That’s hard. That’s a lot of work for you to take that leap, but that would be awesome.

Noland Arbaugh (08:21:30) And then going from letters to words is another step. Right now, it’s finger spelling of just the sign language alphabet, but if it’s able to pick that up, then it should be able to pick up the whole sign language, and so then if I could do something along those lines, or just the sign language spelled word, if I can spell it at a reasonable speed and it can pick that up, then I would just be able to think that through and it would do the same thing. After what I saw with the cursor control, I don’t see why it wouldn’t work, but we’d have to play around with it more.

Lex Fridman (08:22:10) What was the process in terms of training yourself to go from attempted movement to imagined movement? How long did that take? So how long would this process take?

Noland Arbaugh (08:22:19) Well, it was a couple weeks before it just happened upon me. But now that I know that that was possible, I think I could make it happen with other things. I think it would be much, much simpler.

Lex Fridman (08:22:32) Would you get an upgraded implant device?

Noland Arbaugh (08:22:34) Sure, absolutely. Whenever they’ll let me.

Lex Fridman (08:22:39) So you don’t have any concerns for you with the surgery experience? All of it was no regrets?

Lex Fridman (08:22:46) So everything’s been good so far?

Lex Fridman (08:22:49) You just keep getting upgrades.

Noland Arbaugh (08:22:50) Yeah. I mean, why not? I’ve seen how much it’s impacted my life already, and I know that everything from here on out, it’s just going to get better and better. So I would love to get the upgrade.

Lex Fridman (08:23:02) What future capabilities are you excited about? So beyond this telepathy, is vision interesting? So for folks, for example, who are blind, so Neuralink enabling people to see, or for speech.

Noland Arbaugh (08:23:19) Yeah, there’s a lot that’s very, very cool about this. I mean, we’re talking about the brain, so this is just motor cortex stuff. There’s so much more that can be done. The vision one is fascinating to me. I think that is going to be very, very cool. To give someone the ability to see for the first time in their life would just be… I mean, it might be more amazing than even helping someone like me. That just sounds incredible. The speech thing is really interesting. Being able to have some real-time translation and cut away that language barrier would be really cool. Any actual impairments that it could solve with speech would be very, very cool.

(08:24:00) And then also, there are a lot of different disabilities that all originate in the brain, and you would hopefully be able to solve a lot of those. I know there’s already stuff to help people with seizures that can be implanted in the brain. I imagine the same thing. And so you could do something like that. I know that even someone like Joe Rogan has talked about the possibilities with being able to stimulate the brain in different ways. I’m not sure how ethical a lot of that would be. That’s beyond me, honestly. But I know that there is a lot that can be done when we’re talking about the brain and being able to go in and physically make changes to help people or to improve their lives. So I’m really looking forward to everything that comes from this. And I don’t think it’s all that far off. I think a lot of this can be implemented within my lifetime, assuming that I live a long life.

Lex Fridman (08:25:07) What you were referring to is things like people suffering from depression or things of that nature, potentially getting help.

Noland Arbaugh (08:25:14) Yeah, flip a switch like that, make someone happy. I think Joe has talked about it more in terms of you want to experience what a drug trip feels like. You want to experience what it’d be like to be on mushrooms or something like that, DMT. You can just flip that switch in the brain. My buddy, Bain, has talked about being able to wipe parts of your memory and re-experience things for the first time, like your favorite movie or your favorite book, just wipe that out real quick, and then re-fall in love with Harry Potter or something. I told him, I was like, “I don’t know how I feel about people being able to just wipe parts of your memory. That seems a little sketchy to me.” He’s like, “They’re already doing it.”

Lex Fridman (08:25:59) Sounds legit. I would love memory replay. Just actually high resolution, replay of old memories.

Noland Arbaugh (08:26:07) Yeah. I saw an episode of Black Mirror about that once, so I don’t think I want it.

Lex Fridman (08:26:10) Yeah, so Black Mirror always considers the worst case, which is important. I think people don’t consider the best case or the average case enough. I don’t know what it is about us humans. We want to think about the worst possible thing. We love drama. It’s like how is this new technology going to kill everybody? We just love that. Again like, “Yes, let’s watch.”

Noland Arbaugh (08:26:32) Hopefully people don’t think about that too much with me. It’ll ruin a lot of my plans.

Lex Fridman (08:26:37) Yeah, I assume you’re going to have to take over the world. I mean, I love your Twitter. You tweeted, “I’d like to make jokes about hearing voices in my head since getting the Neuralink, but I feel like people would take it the wrong way. Plus the voices in my head told me not to.”

Controlling Optimus robot

Lex Fridman (08:26:53) Please never stop. So you were talking about Optimus. Is that something you would love to be able to do to control the robotic arm or the entirety of Optimus?

Noland Arbaugh (08:27:05) Oh, yeah, for sure. For sure. Absolutely.

Lex Fridman (08:27:07) You think there’s something fundamentally different about just being able to physically interact with the world?

Noland Arbaugh (08:27:12) Yeah. Oh, 100%. I know another thing with being able to give people the ability to feel sensation and stuff too, by going in with the brain and having a Neuralink maybe do that, that could be something that could be transferred through the Optimus as well. There’s all sorts of really cool interplay between that. And then also, like you said, just physically interacting. I mean, 99% of the things that I can’t do myself, obviously, I need a caretaker for, someone to physically do things for me. If an Optimus robot could do that, I could live an incredibly independent life and not be such a burden on those around me, and it would change the way people like me live, at least until whatever this is gets cured.

(08:28:12) But being able to interact with the world physically, that would just be amazing. And not just for having it be a caretaker or something, but something like I talked about. Just being able to read a book. Imagine an Optimus robot just being able to hold a book open in front of me. I get that smell again. I might not be able to feel it at that point, or maybe I could, again, with the sensation and stuff. But there’s something different about reading a physical book than staring at a screen or listening to an audiobook. I actually don’t like audiobooks. I’ve listened to a ton of them at this point, but I don’t really like them. I would much rather read a physical copy.

Lex Fridman (08:28:52) So one of the things you would love to be able to experience is opening the book, bringing it up to you, and to feel the touch of the paper.

Noland Arbaugh (08:29:01) Yeah. Oh, man. The touch, the smell. I mean, it’s just something about the words on the page. And they’ve replicated that page color on the Kindle and stuff. Yeah, it’s just not the same. Yeah. So just something as simple as that.

Lex Fridman (08:29:18) So one of the things you miss is touch?

Noland Arbaugh (08:29:20) I do. Yeah. A lot of things that I interact with in the world, like clothes or literally any physical thing that I interact with in the world, a lot of times what people around me will do is they’ll just come rub it on my face. They’ll lay something on me so I can feel the weight. They will rub a shirt on me so I can feel fabric. There’s something very profound about touch, and it’s something that I miss a lot and something I would love to do again. We’ll see.

Lex Fridman (08:29:56) What would be the first thing you do with a hand that can touch? Give your mom a hug after that, right?

Noland Arbaugh (08:30:02) Yeah. I know. It’s one thing that I’ve asked God for basically every day since my accident was just being able to one day move, even if it was only my hand, so that way, I could squeeze my mom’s hand or something just to show her how much I care and how much I love her and everything. Something along those lines. Being able to just interact with the people around me. Handshake, give someone a hug. I don’t know. Anything like that. Being able to help me eat. I’d probably get really fat, which would be a terrible, terrible thing.

Lex Fridman (08:30:44) Also, beat Bliss in chess on a physical board.

Noland Arbaugh (08:30:47) Yeah. Yeah. I mean, there were just so many upsides. And any way to find some way to feel like I’m bringing Bliss down to my level because he’s just such an amazing guy, and everything about him is just so above and beyond, that anything I can do to take him down a notch, I’m more than happy-

Lex Fridman (08:31:10) Yeah. Yeah, humble him a bit. He needs it.

God

Lex Fridman (08:31:13) Okay. As he’s sitting next to me. Did you ever make sense of why God puts good people through such hardship?

Noland Arbaugh (08:31:23) Oh, man. I think it’s all about understanding how much we need God. And I don’t think that there’s any light without the dark. I think that if all of us were happy all the time, there would be no reason to turn to God ever. I feel like there would be no concept of good or bad, and I think that as much of the darkness and the evil that’s in the world, it makes us all appreciate the good and the things we have so much more. And I think when I had my accident, one of the first things I said to one of my best friends was… And this was within the first month or two after my accident, I said, “Everything about this accident has just made me understand and believe that God is real and that there really is a God, basically. And that my interactions with him have all been real and worthwhile.”

(08:32:32) And he said, if anything, seeing me go through this accident, he believes that there isn’t a God. And it’s a very different reaction, but I believe that it is a way for God to test us, to build our character, to send us through trials and tribulations, to make sure that we understand how precious He is and the things that He’s given us and the time that He’s given us, and then to hopefully grow from all of that. I think that’s a huge part of being here, is to not just have an easy life and do everything that’s easy, but to step out of our comfort zones and really challenge ourselves because I think that’s how we grow.

Hope

Lex Fridman (08:33:21) What gives you hope about this whole thing we have going on human civilization?

Noland Arbaugh (08:33:27) Oh, man. I think people are my biggest inspiration. Even just being at Neuralink for a few months, looking people in the eyes and hearing their motivations for why they’re doing this, it’s so inspiring. And I know that they could be other places, at cushier jobs, working somewhere else, doing X, Y, or Z, that doesn’t really mean that much. But instead, they’re here and they want to better humanity, and they want to better just the people around them. The people that they’ve interacted with in their life, they want to make better lives for their own family members who might have disabilities, or they look at someone like me and they say, “I can do something about that. So I’m going to.” And it’s always been what I’ve connected with most in the world are people.

(08:34:22) I’ve always been a people person and I love learning about people, and I love learning how people developed and where they came from, and to see how much people are willing to do for someone like me when they don’t have to, and they’re going out of their way to make my life better. It gives me a lot of hope for just humanity in general, how much we care and how much we’re capable of when we all get together and try to make a difference. And I know there’s a lot of bad out there in the world, but there always has been and there always will be. And I think that that is… It shows human resiliency and it shows what we’re able to endure and how much we just want to be there and help each other, and how much satisfaction we get from that, because I think that’s one of the reasons that we’re here is just to help each other, and… I don’t know. That always gives me hope, is just realizing that there are people out there who still care and who want to help.

Lex Fridman (08:35:31) And thank you for being one such human being and continuing to be a great human being through everything you’ve been through and being an inspiration to many people, to myself, for many reasons, including your epic, unbelievably great performance on Webgrid. I’ll be training all night tonight to try to catch up.

Noland Arbaugh (08:35:52) Hey, man. You can do it. You can do it.

Lex Fridman (08:35:52) And I believe in you that once you come back… So sorry to interrupt with the Austin trip, once you come back, eventually beat Bliss.

Noland Arbaugh (08:36:00) Yeah, yeah, for sure. Absolutely.

Lex Fridman (08:36:02) I’m rooting for you, though. The whole world is rooting for you.

Lex Fridman (08:36:05) Thank you for everything you’ve done, man.

Noland Arbaugh (08:36:07) Thanks. Thanks, man.

Lex Fridman (08:36:09) Thanks for listening to this conversation with Noland Arbaugh, and before that, with Elon Musk, DJ Seo, Matthew McDougall, and Bliss Chapman. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words from Aldous Huxley in The Doors of Perception. “We live together. We act on and react to one another. But always, and in all circumstances, we are by ourselves. The martyrs go hand in hand into the arena. They are crucified alone. Embraced, the lovers desperately try to fuse their insulated ecstasies into a single self-transcendence, in vain. By its very nature, every embodied spirit is doomed to suffer and enjoy in solitude. Sensations, feelings, insights, fancies, all these are private and, except through symbols and at second hand, incommunicable. We can pool information about experiences, but never the experiences themselves. From family to nation, every human group is a society of island universes.” Thank you for listening and hope to see you next time.

萨姆·奥尔特曼:OpenAI、GPT-5、Sora、董事会风波、埃隆·马斯克、伊尔亚、权力与AGI (2024-03-18)

Sam Altman: OpenAI, GPT-5, Sora, Board Saga, Elon Musk, Ilya, Power & AGI (2024-03-18)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在经历公司内部剧烈的董事会风波和成功发布颠覆性文生视频模型 Sora 之后,OpenAI CEO Sam Altman 与 Lex Fridman 进行了一场深度对话,旨在复盘危机、阐述技术哲学,并展望通往 AGI 的机遇与挑战。

  • 核心论点:本次对话的核心,是围绕 “AGI 之路必然是一场巨大的权力斗争” 这一残酷现实展开的。Altman 以 OpenAI 的董事会风波为缩影,揭示了在通往通用人工智能的征途上,真正的瓶颈不仅是技术突破,更是组织韧性、治理结构和人性博弈的极限考验。他认为,随着 AI 能力指数级增长,其背后所需的**算力(Compute)**将成为未来的核心货币和地缘战略资源,而驾驭这一变革的关键,在于建立一个能够承受极端压力的、对世界负责的治理体系。对话通过复盘危机、解析产品(Sora, GPT-4)、反思竞争,最终勾勒出一个技术理想主义者在面对商业现实、人性弱点和未来不确定性时的清醒认知与战略布局。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:AI 治理的“压力测试”——董事会风波的教训

  • 核心观点:OpenAI 的董事会风波是一次宝贵的、虽痛苦但必要的“早期压力测试”。它暴露了非营利组织董事会结构中存在的“权力真空”和“问责悖论”,即董事会拥有巨大权力,却只对自己负责,缺乏对股东或更广泛利益相关者的制衡。

  • 原理解构:传统公司治理中,董事会最终向股东负责,其决策受到资本利益的约束。而 OpenAI 的非营利性母公司董事会,在设计上是为了确保“为全人类谋福祉”的使命不被商业利益绑架。然而,这场风波证明,这种结构在面临内部认知分歧和巨大外部压力时,可能因缺乏外部问责机制而做出破坏性决策。Altman 的反思是,理想的治理结构应**“尽可能地对全世界负责”**,而非一个封闭的小团体。为此,新董事会的构建更侧重于**治理经验(Y-intercept)**而非仅仅是潜力(slope),并强调成员构成的多样性(技术、商业、法律、非营利组织)。

  • 证据/案例

    • 事件本身:2023 年 11 月 17 日,董事会突然解雇 Altman,几乎导致公司解体,最终在员工和投资者的联合“兵变”下逆转。
    • 结构缺陷:Altman 指出,非营利组织的董事会“不真正对任何人负责,除了他们自己”(don’t really answer to anyone but themselves)。
    • 解决方案:引入 Bret Taylor、Larry Summers 等具有丰富董事会经验的成员,并计划以“slate”(一组)而非单个成员的方式构建董事会,确保专业能力的互补。

维度二:算力即权力——未来十年最珍贵的商品

  • 核心观点:算力(Compute)将成为未来的货币,是世界上最宝贵的商品。对 AI 智能的需求将类似于能源,其消耗量会随着成本的降低而无限增长,因此必须进行大规模的基础设施投资。

  • 原理解构:Altman 将 AI 算力类比为能源而非手机芯片。手机市场存在饱和点(全球几十亿用户),但智能(Intelligence)作为一种服务,其需求弹性极大。如果算力成本足够低,它将被用于从优化个人邮件到攻克癌症等所有领域。这套逻辑重构了 AI 基础设施的投资框架:它不是一个有限市场,而是一个驱动整个经济体增长的基础能源。这个观点解释了为何 Altman 会参与关于“7 万亿美元”的讨论,其核心在于对未来算力需求的量级判断。

  • 证据/案例

    • 类比:将算力需求与能源需求类比,价格越低,用量越大。
    • 能源瓶颈:他明确指出,实现这一愿景的最大瓶颈是能源,并认为**核聚变(Helion)和核裂变(新一代反应堆)**是终极解决方案。
    • 市场规模:他反驳了将 AI 芯片市场与手机 SoC 市场(每年约 30 亿颗)类比的观点,认为 AI 算力市场是无上限的。

维度三:迭代式部署——在“温水煮青蛙”中适应 AGI

  • 核心观点:OpenAI 的核心发布策略是**“迭代式部署” (Iterative Deployment)**,即逐步、持续地发布能力越来越强的模型(如 GPT-1 到 4),而非秘密研发直至 AGI 诞生。这一策略旨在避免社会因技术突变而产生“休克”,给予世界适应、讨论和建立治理框架的时间。

  • 原理解构:该策略的底层逻辑是 “AI and surprise don’t go together”(AI 与惊吓不应相伴而行)。Altman 认为,让社会提前感知到指数曲线的存在,并参与到每一次能力升级的讨论中,是管理 AGI 风险最有效的方式之一。这也解释了他为何会说“GPT-4 kind of sucks”,因为他的参照系永远是几年后的未来模型。这种“永远活在未来”的心态是驱动迭代部署的内在动力,确保当前最先进的技术在未来看来也只是一个粗糙的早期版本。

  • 证据/案例

    • 产品发布节奏:从 GPT-1, 2, 3 到 4 的公开发布,每一次都引发了公众、学界和监管的广泛讨论。
    • 对“飞跃”的反思:尽管 OpenAI 努力迭代,但公众仍然感觉到了 GPT-4 和 Sora 带来的“飞跃感”,这让 Altman 反思是否应该以更小的步子、更频繁地发布。
    • GPT-5 计划:他透露今年会发布一个“了不起的新模型”,但不确定会否叫 GPT-5,并暗示在此之前会发布许多其他的东西,这正是迭代策略的体现。

维度四:世界模型(World Models)的涌现——从像素到物理直觉

  • 核心观点:Sora 的成功表明,通过在“互联网规模”的视频数据上进行大规模训练,模型能够自发学习到世界运行的物理规律和三维空间逻辑,形成一个隐式的“世界模型”,尽管它尚不完美。

  • 原理解构:与语言模型处理“文本 token”类似,Sora 将视频分解为时空“patches”(补丁)。通过学习预测与重建这些 patch,模型被迫去理解物体之间的关系、运动的连续性以及因果联系。例如,要准确生成一个物体被短暂遮挡(occlusion)后再次出现的场景,模型必须“理解”物体恒存性(object permanence),即物体不会因为看不见而消失。这表明,看似简单的自监督学习目标,可以在足够大的模型和数据尺度上,涌现出对现实世界深刻的、隐式的理解。(本维度末尾附有一段切分时空 patch 的示意代码。)

  • 证据/案例

    • Sora 的能力:视频中人物从某个物体前走过、将其短暂遮挡后,该物体依然存在;画面整体呈现出符合物理直觉的连贯运动。
    • Sora 的局限:“猫随机长出额外的肢体”等失败案例,说明其世界模型仍有缺陷,但 Altman 认为这既是当前方法的局限,也可以通过 scale 进一步改善。
    • 技术路线:证实了从 DALL·E 1 到 Sora 的 Scaling Law 在视觉生成领域同样有效。
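为帮助理解上文的“时空 patch”概念,下面给出一段假设性的极简示意代码(并非 Sora 的官方实现,张量形状与 patch 尺寸均为随意假设的示例值):把一段视频张量切分为时空 patch 并展平成序列,供 Transformer 类模型后续学习。

```python
import numpy as np

def video_to_spacetime_patches(video, pt=4, ph=16, pw=16):
    """把视频张量 (T, H, W, C) 切分为时空 patch 序列。

    仅为概念示意:真实系统还需压缩、位置编码以及扩散/预测式训练,
    此处全部省略;pt/ph/pw 是假设的 patch 尺寸。
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0, "各维度需能被 patch 尺寸整除"
    # 沿时间、高、宽三个维度切块
    v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # 把三个“块索引”维度移到前面,剩余维度展平为单个 patch 向量
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)
    return v.reshape(-1, pt * ph * pw * C)  # (patch 数, 每个 patch 的维度)

# 用法示例:16 帧、128x128、RGB 的随机“视频”
video = np.random.rand(16, 128, 128, 3).astype(np.float32)
patches = video_to_spacetime_patches(video)
print(patches.shape)  # (256, 3072)
```

模型随后是在这样的 patch 序列上进行训练的;上文所说的“物体恒存性”等世界规律,正是在这种看似简单的学习目标下、于足够大的数据与模型规模上涌现出来的。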

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • “GPT-4 很烂” (GPT-4 kind of sucks):在全球用户惊叹于 GPT-4 的能力时,其创造者却认为它相对于未来的标准而言“难以忍受地糟糕”。这挑战了人们对当前技术成就的满足感,并揭示了顶尖 AI 从业者所处的指数级认知框架。
    • 董事会风波是“好事”:Altman 认为,这场险些摧毁公司的危机,从长远看是一件幸事。它提前暴露了治理结构的脆弱性,让 OpenAI 有机会在 AGI 真正到来、风险更高之前进行修复和强化。
  • 盲点与局限

    • 对“戏剧性风险”的过度关注:Altman 批评 AI 安全社区过度聚焦于“AI 逃出盒子”这类电影情节般的“戏剧性风险”(theatrical risks),而忽略了更多现实、渐进但影响深远的风险,如经济影响、社会偏见、信息操纵等。
    • “开放”的重新定义:对话揭示了“OpenAI”中的“Open”已从最初的“开源”(open source)演变为“向公众提供强大的免费工具”(access)。这挑战了开源社区对“开放”的狭隘定义,但也指出了其在模型透明度上的局限性,并成为 Elon Musk 诉讼的核心争议点。
  • 未解之谜

    • LLM 与搜索的完美结合:Altman 承认,如何将大型语言模型与传统搜索引擎优雅地结合,创造出革命性的信息获取体验,是“还没有人破解的难题”(anyone has cracked the code on yet)。
    • 创作者经济模型的缺失:如何为那些数据被用于训练模型的艺术家和创作者提供公平的经济回报,仍然没有明确答案。Altman 类比音乐产业从 CD 到 Spotify 的变迁,承认这是一个必须解决但尚未解决的问题。
    • AGI 的最终治理模式:尽管经历了董事会重组,但对于一个未来可能拥有巨大权力的 AGI 公司,其最终的、最稳健的治理结构究竟是什么样的,仍然是一个开放性问题。Altman 承认,尽管他个人不寻求控制权,但上次风波中“董事会实际上无法解雇 CEO”的局面本身就是一种治理失败。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “I think compute is going to be the currency of the future. I think it’ll be maybe the most precious commodity in the world.”

    • 中文意译:“我认为算力将是未来的货币。它或许会成为世界上最珍贵的商品。”
    • 语境:在解释为何需要天量投资(如传闻的 7 万亿美元)来建设 AI 基础设施时,Altman 提出了这一核心世界观。
  2. “I expect that the delta between 5 and 4 will be the same as between 4 and 3… it is our job to live a few years in the future and remember that the tools we have now are going to kind of suck looking backwards at them.”

    • 中文意译:“我预计 GPT-5 和 4 之间的差距,会像 4 和 3 之间一样大……我们的工作就是活在几年后的未来,并记住我们现在的工具,回过头看都将烂得不行。”
    • 语境:在评价 GPT-4 时,他以此解释 OpenAI 团队的前瞻性心态和对指数级进步的深刻信念。
  3. “The road to AGI should be a giant power struggle. I expect that to be the case.”

    • 中文意译:“通往 AGI 的道路必然是一场巨大的权力斗争。我预计情况会是这样。”
    • 语境:在反思董事会风波时,Altman 将其定性为未来更大规模冲突的一次预演,表达了对 AGI 研发过程中人性与权力斗争的清醒认知。
  4. “We multiply 200 medium-sized things together into one giant thing.”

    • 中文意译:“我们将 200 个中等大小的创新相乘,最终汇集成一个庞然大物。”
    • 语境:引用 Ilya Sutskever 的话,解释 OpenAI 的技术突破并非源于单一的“秘密武器”,而是由大量中等规模的、跨团队的持续创新协同作用的结果。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 产品形态:行业焦点将从单纯的 LLM 转向多模态的“世界模型”,文生视频将成为新的竞争高地。同时,**“LLM+搜索”**的融合产品将密集出现,试图定义下一代信息入口。
    • 竞争格局:竞争将从模型能力的“军备竞赛”扩展到对算力和能源的战略锁定。拥有或能保障大规模、低成本算力和能源的公司(如与能源公司结盟)将建立起难以逾越的护城河。
    • 技术栈:对超长上下文窗口(billion-token level)的研究将加速,以实现真正的个性化和终身记忆 AI 助手。AI 模型的发布节奏可能会加快、颗粒度变小,以遵循“迭代式部署”策略,降低社会冲击。
  • 长期终局 (5-10年)

    • 行业图景:如果 Altman 的设想成真,AI 产业的底层将由少数拥有庞大算力和能源基础设施的巨头主导,类似于今天的能源寡头。算力成本将成为地缘政治的关键变量。AGI 的定义将不再是哲学思辨,而是其对**“全球科学发现速度”这一具体指标的提升能力**。编程工作将大规模转向**自然语言驱动的系统设计与逻辑描述**,人类程序员将成为“AI 架构师”。
    • 社会变革:AI 将作为一种赋能工具 (tool) 而非简单替代工作 (job),它会接管越来越多的任务 (task),从而将人类的生产力提升到新的抽象层次。这将引发对教育、工作价值和经济分配体系的深刻重塑。
  • 行动建议

    • 开发者:不要只做现有模型的“套壳”应用。寻找“LLM+搜索”等尚未被完美解决的交叉领域。开发能够赋能专业人士、处理复杂多步任务的“Agent”型应用,抓住人机协同的浪潮。
    • 投资者:投资组合应超越模型本身,布局 AI 产业链上游:包括先进半导体、数据中心供应链和(特别是)新型能源技术(核聚变/裂变)。评估 AI 公司时,治理结构的稳健性应被视为与技术实力同等重要的风险指标。
    • 创业者:OpenAI 的董事会风波是所有 AGI 赛道公司的前车之鉴。从第一天起就应思考如何设计一个既能坚守使命、又能有效制衡、还能抵御内外压力的弹性治理结构。在产品上,专注于一个能被 AI 极速放大的垂直领域,成为某个“任务”的最佳 AI 解决方案提供商。

这是一份基于萨姆·奥特曼(Sam Altman)与莱克斯·弗里德曼(Lex Fridman)对话深度重构的行业分析报告。


🚀 深度研报:算力霸权、权力博弈与AGI的黎明

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:OpenAI CEO Sam Altman 在经历 2023 年 11 月震惊全球的“董事会罢免风波”后,再次深度对谈。此时正值 GPT-4 趋于成熟、Sora 震撼发布、GPT-5 蓄势待发,且 OpenAI 正面临马斯克(Elon Musk)法律诉讼的复杂行业关口。
  • 核心论点:通往通用人工智能(AGI)的道路不仅是技术攀登,更是一场关于资源(算力与能源)、治理结构与人类适应力的“全球权力博弈”。Altman 认为,算力将成为未来的“全球货币”,而 OpenAI 的核心任务是通过“迭代发布”缓冲技术奇点带来的社会冲击,在不稳定的动态平衡中完成向 AGI 的过渡。

2. 🧠 深度观点解析 (Deep Dive Analysis)

I. 算力:未来的终极“一般等价物”

  • 核心观点:算力将成为世界上最珍贵的商品。
  • 原理解构:不同于智能手机市场的饱和论(每人只需一部手机),智能(Intelligence)更像能源,其需求具有极高的价格弹性。当算力极度廉价时,它会渗透进阅读邮件、辅助诊疗等每一个微小环节;当其昂贵时,则仅用于攻克癌症。
  • 证据/案例:Altman 提到的“7 万亿美元”传闻(虽被他幽默化处理)本质上反映了对**能源(特别是核聚变/核裂变)**和数据中心基础设施的长期刚需。

II. “迭代部署”作为社会缓冲区

  • 核心观点:避免“技术休克”,通过发布“不完美”的模型让世界渐进式适应。
  • 原理解构:Altman 坚持认为 AGI 的出现不应是一个震撼世界的“爆炸瞬间”(Singularity),而应是一条斜率极高的连续曲线。GPT-1 到 GPT-4 的发布,本质上是让机构、法律和人类心理逐步建立防御机制。
  • 证据/案例:Altman 坦言“GPT-4 Sucks”(GPT-4 很烂),这并非谦逊,而是站在 GPT-5 的视角回望。他认为如果秘密研发 GPT-5 直至完美再发布,社会将因缺乏心理准备而崩溃。

III. 治理结构的“暴力测试”与韧性

  • 核心观点:AGI 之前的权力斗争是不可避免的,OpenAI 必须构建能抵御高压的结构。
  • 原理解构:2023 年 11 月的风波揭示了非营利组织董事会在面对百亿级商业体时的治理错位。新的董事会(引入 Larry Summers 等)旨在平衡技术理想主义与成熟的管理经验。Altman 强调,AGI 的控制权不应掌握在任何个人(包括他自己)或单一公司手中。
  • 证据/案例:Altman 提到在那个疯狂的周末,尽管董事会有法律权力解雇他,但在实践中(员工和投资者的反弹)这一机制失效了,这本身就是一种“治理失效”。

IV. 物理世界模型:从 Sora 到机器人

  • 核心观点:Sora 不仅仅是视频生成器,更是对三维物理规律的初步模拟。
  • 原理解构:通过对互联网规模的视频数据进行自监督学习,模型开始理解遮挡(Occlusions)和物体持久性。Altman 预示 OpenAI 将重返机器人领域,因为只有具备物理实体的 AI 才能真正解放人类劳动。
  • 证据/案例:Sora 视频中人走过物体后物体依然存在的物理一致性,证明了通过二维像素补丁(Patches)学习三维世界逻辑的可行性。

V. 后搜索时代:从“排名”到“综合”

  • 核心观点:OpenAI 的目标不是做一个更好的 Google,而是彻底改变信息交互范式。
  • 原理解构:传统搜索(10 个蓝色链接)是低效的。未来的范式是 LLM 与搜索的有机结合,由 AI 直接完成信息的提炼、综合与行动建议。Altman 对广告模式表示审美上的厌恶,倾向于用户付费订阅模式以保持中立性。
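下面用一段假设性的最小示意代码说明这种“检索—综合”范式与“10 个蓝色链接”的区别(并非 OpenAI 的实现;web_search 与 llm_complete 均为占位函数,需替换为任意真实的搜索与大模型接口):

```python
from typing import Dict, List

def web_search(query: str, top_k: int = 5) -> List[Dict[str, str]]:
    """占位函数:调用任意搜索引擎,返回 [{'title':..., 'url':..., 'snippet':...}]。"""
    raise NotImplementedError("接入你自己的搜索 API")

def llm_complete(prompt: str) -> str:
    """占位函数:调用任意大语言模型,返回一段文本。"""
    raise NotImplementedError("接入你自己的 LLM API")

def answer_with_sources(question: str) -> str:
    """检索—综合:先搜索取回摘录,再让模型提炼、综合并标注来源。"""
    results = web_search(question)
    context = "\n\n".join(
        f"[{i + 1}] {r['title']} ({r['url']})\n{r['snippet']}"
        for i, r in enumerate(results)
    )
    prompt = (
        "请仅依据以下检索摘录回答问题,综合要点并以 [编号] 标注来源;"
        "若摘录不足以回答,请明确说明。\n\n"
        f"检索摘录:\n{context}\n\n问题:{question}\n回答:"
    )
    return llm_complete(prompt)
```

这只是对公开常见的“检索增强生成”思路的一个粗糙素描;访谈中并未给出 OpenAI 的具体技术方案,此处仅用于说明“由 AI 直接完成提炼与综合”在工程上大致意味着什么。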

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:关于 AGI 的定义 Altman 认为 AGI 不是一个终点,而是一个里程碑。他提出的标准非常务实:当系统能显著提高全球科学发现的速度时,它就是 AGI。 这一标准将 AGI 从“聊天机器人”提升到了“生产力发动机”的高度。
  • 盲点与局限:剧场式风险(Theatrical Risks) Altman 警示公众过度关注“AI 逃出箱子消灭人类”这种电影般的、剧场式的风险,而忽视了更微妙但更紧迫的风险(如社会不平等、虚假信息流、经济结构断裂)。他认为这是一种认知偏误。
  • 未解之谜:慢思考机制(Sequential Thinking) 对话承认目前的 LLM 为每个 Token 分配等量算力是低效的。如何让 AI 像人类一样在处理难题时“停下来思考”,在简单题上“快速反应”,即动态分配计算资源(类似 Q* 项目的传闻),仍是待攻克的尖端课题。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Compute is going to be the currency of the future.” (算力将成为未来的货币。) —— 语境:讨论未来全球经济基础设施的重构。
  2. “GPT-4, relative to where we need to get to… I think it kind of sucks.” (相对于我们的目标,我觉得 GPT-4 挺烂的。) —— 语境:表达对 GPT-5 跨越式进步的信心。
  3. “The road to AGI should be a giant power struggle. I expect that to be the case.” (通往 AGI 的道路理应是一场巨型的权力斗争。我预料到会如此。) —— 语境:反思董事会风波及全球范围内的技术竞赛。
  4. “I miss the old Elon.” (我怀念以前那个埃隆。) —— 语境:评论马斯克对 OpenAI 的起诉,表达对其作为先驱者的尊重与对其现状的遗憾。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • GPT-5 的范式转移:预计 GPT-5 将在推理能力和多模态整合上有质的飞跃,可能解决 LLM 长期存在的幻觉问题。
    • 算力投资潮:能源公司(核能、核聚变)将成为科技巨头的核心战略伙伴。
  • 长期终局 (5-10年)
    • 科学爆发:AGI 辅助下的材料科学、生物工程将进入非线性增长期,人类文明的“支架”(Scaffolding)将彻底重组。
    • 社会契约重塑:全民基本收入(UBI)或类似机制将从边缘话题变为政治核心,因为 AI 将接管大部分复杂的“五天任务”。
  • 行动建议
    • 对于开发者:不要只做 AI 的外壳,要关注如何利用 AI 的“慢思考”和“长程任务处理能力”解决垂类复杂问题。
    • 对于投资者:重心应从纯软件应用转向“算力-能源-基础设施”三位一体的硬核科技链条。
    • 对于所有人:培养“与智能共生”的能力,学会将 AI 作为思维的脚手架,而非单纯的替代工具。

分析师点评:萨姆·奥特曼在这场对话中表现出一种“冷静的激进”。他承认了权力的残酷与技术的不完美,但他对算力决定论的执着,预示着 OpenAI 正在从一家软件公司演变为一家管理“智能能源”的基础设施巨头。这场权力的博弈,才刚刚开始。

OpenAI 深度研报:迈向通用人工智能的权力博弈与治理重构

1. 🎯 核心论题与背景

  • 对话背景:本段对话由科技播客主持人 Lex Fridman 与 OpenAI 首席执行官 Sam Altman 进行。背景设定在前一年 OpenAI 那场戏剧性的董事会斗争(Altman 被罢免随后复职)之后,彼时“7 万亿美元”级算力投资的传闻与埃隆·马斯克对 OpenAI 的诉讼正沸沸扬扬。这是一次关于 OpenAI “现状”的深度复盘与对未来技术路径的全面展望。
  • 核心论点:通往 AGI(通用人工智能)的道路是一场不可避免的全球性权力斗争。Altman 将这一过程视为一种必要的组织压力测试,强调构建能够抵御剧变的高韧性治理结构至关重要。他指出,计算力将成为未来的“通货”,其需求量将远超目前的预期规模,且具有类似能源的属性(价格越低,使用量越大)。在战略上,他反直觉地认为**“剧场式”的 AI 灾难叙事(如 AI 越狱、物理失控)**并非最值得担忧的风险,真正的挑战在于慢性的、结构性的社会冲击(如科学发现速率的跃升、经济基础的改变)。

2. 🧠 深度观点解析

1. 治理与权力结构的再审计

  • 核心观点:OpenAI 的董事会动荡是一次“完美风暴”,它暴露了单一结构下初创公司的脆弱性,也是为未来更高风险期做的压力测试。
  • 原理解构:Altman 指出,董事会虽然在法律上拥有罢免 CEO 的权力,但这场危机表明,公众舆论与员工支持在实践中构成了新的制衡力量;重组后的董事会结构也让治理层与执行团队更为克制。他主张董事会不应仅对股东负责(即使是单一大股东),而应对“世界”和“对社会的影响”负责。
  • 证据/案例:Altman 回忆周末期间虽然身心受创,但感到**“巨大的爱与极少仇恨”**。这表明在现代科技斗争中,除了法律合规,社会资本和公众信任是对抗资本霸权的核心杠杆。

2. 计算力即能源与现代经济学

  • 核心观点:计算力不是手机芯片那样的消费品市场,而是类似能源的基础设施市场。
  • 原理解构:手机市场存在上限(一人一机),但计算力的需求几乎没有上限。随着价格下降,不仅应用范围扩大,用途本身也会发生质变(从辅助到主动思考)。这导致 Altman 猜测全球需要高达数万亿美元级别的算力投入,核心瓶颈已转向能源(特别是核聚变和裂变)。
  • 证据/案例:他提到的“7 trillion meme”(关于筹集7万亿美元),虽是玩笑,但精准地隐喻了未来算力基础设施所需的资本体量。Helion 等聚变公司及核裂变的复兴被提及为解决能源限制的关键。

3. GPT-4 的“平庸”与 GPT-5 的愿景

  • 核心观点:GPT-4 已经过时,未来的模型将不仅仅是能力的叠加,而是全维度的提升,并最终成为知识工作者开展任何任务时的默认起点。
  • 原理解构:处于指数曲线中段的人会在“当下”觉得现状完美,但回看历史又会觉得过去的技术“惨不忍睹”。GPT-5 的目标不仅仅是智商提高,而是赋予系统**“类似于慢思考”的能力**,即能针对复杂问题调用更多算力进行推理,而非一味求快(本条目末尾附一段示意代码)。

  • 证据/案例:Altman 将模型的新角色定义为用户“所有知识任务流程的起点”,并极度看好长期记忆与上下文长度的发展——未来模型可能存储用户一生的信息(万亿级上下文),从而成为个人的全天候代理。
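下面是一段假设性的示意代码,仅用来展示“按问题难度动态分配算力”这一思路最粗糙的形态(并非 OpenAI 的任何已公开实现;estimate_difficulty 与 llm_answer 均为占位函数):简单问题快速作答,困难问题多采样几条推理路径后做多数表决。

```python
from collections import Counter
from typing import Callable

def solve_with_adaptive_compute(
    question: str,
    llm_answer: Callable[[str], str],             # 占位:返回一次完整推理后的最终答案
    estimate_difficulty: Callable[[str], float],  # 占位:返回 0~1 之间的难度估计
) -> str:
    """“慢思考”的一种粗糙近似:难度越高,投入的采样次数越多。"""
    difficulty = estimate_difficulty(question)
    # 假设的分配策略:简单问题 1 次采样,中等 4 次,最难 16 次
    n_samples = 1 if difficulty < 0.3 else (4 if difficulty < 0.7 else 16)
    answers = [llm_answer(question) for _ in range(n_samples)]
    # 多数表决:出现次数最多的答案胜出
    return Counter(answers).most_common(1)[0][0]
```

这只是把“自洽采样 + 多数表决”这类公开技巧拼在一起的示意;访谈中提到的动态算力分配究竟如何实现,官方并未披露。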

4. AGI 的定义:科学加速器而非“恐怖故事”

  • 核心观点:AGI 的核心定义不在于通过图灵测试或聊天,而在于**“显著提升科学发现的速率”**。
  • 原理解构:大多数经济增长来自科学与技术进步。如果 AI 能像科研伙伴一样提出直觉假设、寻找证据或设计实验,即使它不能立刻给出万能公式,也是 AGI 的实现。
  • 证据/案例:面对“你会向 AGI 提什么问题”式的追问,Altman 表示即便 AGI 问世,他也不会问具体的物理公式(如大统一理论),而是会问是否需要特定的实验设备来验证某个猜想。这反映了他对 AGI 研究属性的理解。

5. 风险博弈:忽视“剧场效应”

  • 核心观点:人类对风险感知存在偏差,AI 越狱或物理杀人的“好莱坞式”风险(剧场风险)远被高估,而被真正忽视的是缓慢的污染、长期的经济结构调整和偏见的累积(慢燃风险)。
  • 原理解构:人类的风险感知往往与真实危害错位——例如核能在统计上更安全,人们却依然对核反应堆心存恐惧,而对更常见的危害习以为常。同理,AI 最危险的不是“失控开枪”式的戏剧性场面,而是日复一日地输出错误信息、让商业激励侵蚀内容的中立性、加剧社会两极分化。Altman 认为政府必须在危险显现前介入制定规则。

3. 💡 反直觉与批判性视角

  • 打破共识:反“自我实现预言”的 AI 焦虑 Altman 否定了当前 AI 安全研究界的主流观点——即 AI 潜在逃脱人类控制的“开关问题”是人类最紧迫的威胁,甚至认为这是**“戏剧性的浪费”**。他认为这对公众舆论有误导性,会让人们为了解决一个可能永远不会发生的极端场景,而忽视了大量可见且正在发生的伤害。这是一种极具争议的“实用主义”风险观。

  • 打破共识:对信任机制的修正 以往 Altman 信奉默认信任的人生哲学:不必为偏执和边缘情况操心,偶尔吃点亏,换来的是可以卸下防备地生活。然而,OpenAI 风波后他表示自己被彻底击穿,不得不开始为糟糕的情形做防御性规划。这揭示了在构建决定人类未来的系统时,信任是非常脆弱的,一旦崩塌,代价是毁灭性的。

  • 盲点与局限:关于开放源代码的误读 针对马斯克的诉讼,Altman 认为十年前开源的初衷是模糊的,那时甚至不知道会有 API。虽然他承认未来会有开源模型(特别是较小型的),但他似乎低估了完全开放最先进模型所带来的国防安全与社会风险挑战。他目前的立场倾向于**“有选择地开源”**,这在地缘政治背景下的安全必要性上仍需更深维度的考量。

  • 未解之谜:GPT-5 的具体形态 Altman 拒绝描绘 AGI 的具体图景,强调“迭代部署”的重要性。他坦言,无论是发布 Sora 还是未来的 GPT-5,总会引发一部分人的恐慌;但若秘而不宣、直到最终形态才一次性发布,冲击只会更大。这暴露了技术发布节奏与人类心理承受力之间的错位,这是一个尚未有完美方案的管理学难题。

4. 💎 金句与高光时刻

  1. “The road to AGI should be a giant power struggle.” (通往 AGI 的道路理应是一场巨大的权力斗争。)
    • 语境:解释 OpenAI 董事会风波定性为修罗场,而非简单的商业纠纷。
  2. “I think compute is going to be the currency of the future. I think it will be maybe the most precious commodity in the world.”
    • 语境:关于全球经济结构的本质,强调算力比能源更稀缺,比黄金更贵重。
  3. “I think it kind of sucks.” (我觉得它挺烂的。)
    • 语境:当被问及 GPT-4 哪里最令人印象深刻时,他横向对比了 GPT-3,一反常态的谦虚,指出了当下模型的基线与 AGI 之间的鸿沟。
  4. “If we could look at it now… the global economy feel any different… than it did before we launched GPT-4?” (如果我们现在看…全球经济和 GPT-4 发布前有任何不同吗?)
    • 语境:探讨 AGI 的社会变革节点的具体感知,他直指核心:当下的技术还处于“工具”阶段,而非“范式转移”阶段。
  5. “But I don’t think that’s what AGI implies. For me, that should be part of it.” (但这并不是 AGI 的隐含意义。对我来说,这应该是其中的一部分。)
    • 语境:关于 AGI 的定义,他认为不应是统治人类或简单的通过图灵测试,而应是**“加速科学”**。

5. 🚀 行业启示与未来推演

  • 短期影响 (1-3年)

    • 产品形态:AI 将大规模切入视频生成和程序化内容生产,成为 Adobe 等工具的底层。
    • 合规策略:模型公司将被迫公开更详细的模型行为准则,而非仅仅依靠模糊的价值观调校来避免生成“黑人纳粹”这类明显不当的图像。
    • 商业逻辑:广告模式在 AI 原生产品中将成为累赘,订阅制或基于信任的无广告(Ad-free)产品模式将占据主导地位。
  • 长期终局 (5-10年)

    • 超级计算与核能源:能源行业将迎来巨变,以支撑万亿级别的算力需求。核聚变可能不再是科幻,而是数据中心的后勤标配。
    • 认知外包:人类的知识工作将从“查找信息”彻底转向“验证与合成信息”。程序员将从编码转向描述需求。
    • 治理格局:AI 将成为社会层面的“基础设施”,类似于电力和互联网。政府将从“监管者”转变为“运行者”或“仲裁者”,个人和组织将被迫适应这无处不在的 AI 脑网络。
  • 行动建议

    • 对于企业家:不要尝试模仿 OpenAI 的“非营利+营利”混合结构(Altman 本人也强烈不建议任何初创公司这样做,且现行法律使其难以带来税务上的好处),应尽早采用清晰的可扩展架构。
    • 对于投资者:将投资焦点从单纯的模型摩尔定律转移到算力全产业链,特别是半导体制造、数据中心建设和清洁能源解决方案。
    • 对于决策者:不要被好莱坞式的 AI 威胁论误导。当下的重点应是解决 AI 造成的长期、隐性偏见和经济摩擦,并建立应对破坏性科学发现的公共治理框架。

逐字稿

Introduction

Sam Altman (00:00:00) I think compute is going to be the currency of the future. I think it’ll be maybe the most precious commodity in the world. I expect that by the end of this decade, and possibly somewhat sooner than that, we will have quite capable systems that we look at and say, “Wow, that’s really remarkable.” The road to AGI should be a giant power struggle. I expect that to be the case.

Lex Fridman (00:00:26) Whoever builds AGI first gets a lot of power. Do you trust yourself with that much power?

(00:00:36) The following is a conversation with Sam Altman, his second time on the podcast. He is the CEO of OpenAI, the company behind GPT-4, ChatGPT, Sora, and perhaps one day the very company that will build AGI. This is The Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here’s Sam Altman.

(00:01:05) Take me through the OpenAI board saga that started on Thursday, November 16th, maybe Friday, November 17th for you.

Sam Altman (00:01:13) That was definitely the most painful professional experience of my life, and chaotic and shameful and upsetting and a bunch of other negative things. There were great things about it too, and I wish it had not been in such an adrenaline rush that I wasn’t able to stop and appreciate them at the time. But I came across this old tweet of mine or this tweet of mine from that time period. It was like going to your own eulogy, watching people say all these great things about you, and just unbelievable support from people I love and care about. That was really nice, really nice. That whole weekend, with one big exception, I felt like a great deal of love and very little hate, even though it felt like I have no idea what’s happening and what’s going to happen here and this feels really bad. And there were definitely times I thought it was going to be one of the worst things to ever happen for AI safety. Well, I also think I’m happy that it happened relatively early. I thought at some point between when OpenAI started and when we created AGI, there was going to be something crazy and explosive that happened, but there may be more crazy and explosive things still to happen. It still, I think, helped us build up some resilience and be ready for more challenges in the future.

Lex Fridman (00:03:02) But the thing you had a sense that you would experience is some kind of power struggle?

Sam Altman (00:03:08) The road to AGI should be a giant power struggle. The world should… Well, not should. I expect that to be the case.

Lex Fridman (00:03:17) And so you have to go through that, like you said, iterate as often as possible in figuring out how to have a board structure, how to have organization, how to have the kind of people that you’re working with, how to communicate all that in order to deescalate the power struggle as much as possible.

Sam Altman (00:03:38) But at this point, it feels like something that was in the past that was really unpleasant and really difficult and painful, but we’re back to work and things are so busy and so intense that I don’t spend a lot of time thinking about it. There was a time after, there was this fugue state for the month after, maybe 45 days after, that I was just drifting through the days. I was so out of it. I was feeling so down.

Lex Fridman (00:04:17) Just on a personal, psychological level?

Sam Altman (00:04:20) Yeah. Really painful, and hard to have to keep running OpenAI in the middle of that. I just wanted to crawl into a cave and recover for a while. But now it’s like we’re just back to working on the mission.

Lex Fridman (00:04:38) Well, it’s still useful to go back there and reflect on board structures, on power dynamics, on how companies are run, the tension between research and product development and money and all this kind of stuff so that you, who have a very high potential of building AGI, would do so in a slightly more organized, less dramatic way in the future. So there’s value there to go, both the personal psychological aspects of you as a leader, and also just the board structure and all this messy stuff.

Sam Altman (00:05:18) I definitely learned a lot about structure and incentives and what we need out of a board. And I think that it is valuable that this happened now in some sense. I think this is probably not the last high-stress moment of OpenAI, but it was quite a high-stress moment. My company very nearly got destroyed. And we think a lot about many of the other things we’ve got to get right for AGI, but thinking about how to build a resilient org and how to build a structure that will stand up to a lot of pressure in the world, which I expect more and more as we get closer, I think that’s super important.

Lex Fridman (00:06:01) Do you have a sense of how deep and rigorous the deliberation process by the board was? Can you shine some light on just human dynamics involved in situations like this? Was it just a few conversations and all of a sudden it escalates and why don’t we fire Sam kind of thing?

Sam Altman (00:06:22) I think the board members are well-meaning people on the whole, and I believe that in stressful situations where people feel time pressure or whatever, people understandably make suboptimal decisions. And I think one of the challenges for OpenAI will be we’re going to have to have a board and a team that are good at operating under pressure.

Lex Fridman (00:07:00) Do you think the board had too much power?

Sam Altman (00:07:03) I think boards are supposed to have a lot of power, but one of the things that we did see is in most corporate structures, boards are usually answerable to shareholders. Sometimes people have super voting shares or whatever. In this case, and I think one of the things with our structure that we maybe should have thought about more than we did is that the board of a nonprofit has, unless you put other rules in place, quite a lot of power. They don’t really answer to anyone but themselves. And there’s ways in which that’s good, but what we’d really like is for the board of OpenAI to answer to the world as a whole, as much as that’s a practical thing.

Lex Fridman (00:07:44) So there’s a new board announced.

Lex Fridman (00:07:47) There’s I guess a new smaller board at first, and now there’s a new final board?

Sam Altman (00:07:53) Not a final board yet. We’ve added some. We’ll add more.

Lex Fridman (00:07:56) Added some. Okay. What is fixed in the new one that was perhaps broken in the previous one?

Sam Altman (00:08:05) The old board got smaller over the course of about a year. It was nine and then it went down to six, and then we couldn’t agree on who to add. And the board also I think didn’t have a lot of experienced board members, and a lot of the new board members at OpenAI have just have more experience as board members. I think that’ll help.

Lex Fridman (00:08:31) It’s been criticized, some of the people that are added to the board. I heard a lot of people criticizing the addition of Larry Summers, for example. What’s the process of selecting the board? What’s involved in that?

Sam Altman (00:08:43) So Brett and Larry were decided in the heat of the moment over this very tense weekend, and that weekend was a real rollercoaster. It was a lot of ups and downs. And we were trying to agree on new board members that both the executive team here and the old board members felt would be reasonable. Larry was actually one of their suggestions, the old board members. Brett, I think I had even previous to that weekend suggested, but he was busy and didn’t want to do it, and then we really needed help in [inaudible 00:09:22]. We talked about a lot of other people too, but I felt like if I was going to come back, I needed new board members. I didn’t think I could work with the old board again in the same configuration, although we then decided, and I’m grateful that Adam would stay, but we considered various configurations, decided we wanted to get to a board of three and had to find two new board members over the course of a short period of time.

(00:09:57) So those were decided honestly without… You do that on the battlefield. You don’t have time to design a rigorous process then. For new board members since, and new board members we’ll add going forward, we have some criteria that we think are important for the board to have, different expertise that we want the board to have. Unlike hiring an executive where you need them to do one role well, the board needs to do a whole role of governance and thoughtfulness well, and so, one thing that Brett says which I really like is that we want to hire board members in slates, not as individuals one at a time. And thinking about a group of people that will bring nonprofit expertise, expertise at running companies, good legal and governance expertise, that’s what we’ve tried to optimize for.

Lex Fridman (00:10:49) So is technical savvy important for the individual board members?

Sam Altman (00:10:52) Not for every board member, but for certainly some you need that. That’s part of what the board needs to do.

Lex Fridman (00:10:57) The interesting thing that people probably don’t understand about OpenAI, I certainly don’t, is all the details of running the business. When they think about the board, given the drama, they think about you. They think about if you reach AGI or you reach some of these incredibly impactful products and you build them and deploy them, what’s the conversation with the board like? And they think, all right, what’s the right squad to have in that kind of situation to deliberate?

Sam Altman (00:11:25) Look, I think you definitely need some technical experts there. And then you need some people who are like, “How can we deploy this in a way that will help people in the world the most?” And people who have a very different perspective. I think a mistake that you or I might make is to think that only the technical understanding matters, and that’s definitely part of the conversation you want that board to have, but there’s a lot more about how that’s going to just impact society and people’s lives that you really want represented in there too.

Lex Fridman (00:11:56) Are you looking at the track record of people or you’re just having conversations?

Sam Altman (00:12:00) Track record is a big deal. You of course have a lot of conversations, but there are some roles where I totally ignore track record and just look at slope, ignore the Y-intercept.

Lex Fridman (00:12:18) Thank you. Thank you for making it mathematical for the audience.

Sam Altman (00:12:21) For a board member, I do care much more about the Y-intercept. I think there is something deep to say about track record there, and experience is something that’s very hard to replace.

Lex Fridman (00:12:32) Do you try to fit a polynomial function or exponential one to the track record?

Sam Altman (00:12:36) That analogy doesn’t carry that far.

Lex Fridman (00:12:39) All right. You mentioned some of the low points that weekend. What were some of the low points psychologically for you? Did you consider going to the Amazon jungle and just taking ayahuasca and disappearing forever?

Sam Altman (00:12:53) It was a very bad period of time. There were great high points too. My phone was just nonstop blowing up with nice messages from people I worked with every day, people I hadn’t talked to in a decade. I didn’t get to appreciate that as much as I should have because I was just in the middle of this firefight, but that was really nice. But on the whole, it was a very painful weekend. It was like a battle fought in public to a surprising degree, and that was extremely exhausting to me, much more than I expected. I think fights are generally exhausting, but this one really was. The board did this Friday afternoon. I really couldn’t get much in the way of answers, but I also was just like, well, the board gets to do this, so I’m going to think for a little bit about what I want to do, but I’ll try to find the blessing in disguise here.

(00:13:52) And I was like, well, my current job at OpenAI is, or it was, to run a decently sized company at this point. And the thing I’d always liked the most was just getting to work with the researchers. And I was like, yeah, I can just go do a very focused AGI research effort. And I got excited about that. Didn’t even occur to me at the time possibly that this was all going to get undone. This was Friday afternoon.

Lex Fridman (00:14:19) So you’ve accepted the death of this-

Sam Altman (00:14:22) Very quickly. Very quickly. I went through a little period of confusion and rage, but very quickly, quickly. And by Friday night, I was talking to people about what was going to be next, and I was excited about that. I think it was Friday evening for the first time that I heard from the exec team here, which is like, “Hey, we’re going to fight this.” and then I went to bed just still being like, okay, excited. Onward.

Lex Fridman (00:14:52) Were you able to sleep?

Sam Altman (00:14:54) Not a lot. One of the weird things was there was this period of four and a half days where I didn’t sleep much, didn’t eat much, and still had a surprising amount of energy. You learn a weird thing about adrenaline in wartime.

Lex Fridman (00:15:09) So you accepted the death of this baby, OpenAI.

Sam Altman (00:15:13) And I was excited for the new thing. I was just like, “Okay, this was crazy, but whatever.”

Lex Fridman (00:15:17) It’s a very good coping mechanism.

Sam Altman (00:15:18) And then Saturday morning, two of the board members called and said, “Hey, we didn’t mean to destabilize things. We don’t want to destroy a lot of value here. Can we talk about you coming back?” And I immediately didn’t want to do that, but I thought a little more and I was like, well, I really care about the people here, the partners, shareholders. I love this company. And so I thought about it and I was like, “Well, okay, but here’s the stuff I would need.” And then the most painful time of all was over the course of that weekend, I kept thinking and being told, and not just me, the whole team here kept thinking, well, we were trying to keep OpenAI stabilized while the whole world was trying to break it apart, people trying to recruit whatever.

(00:16:04) We kept being told, all right, we’re almost done. We’re almost done. We just need a little bit more time. And it was this very confusing state. And then Sunday evening when, again, every few hours I expected that we were going to be done and we’re going to figure out a way for me to return and things to go back to how they were. The board then appointed a new interim CEO, and then I was like, that feels really bad. That was the low point of the whole thing. I’ll tell you something. It felt very painful, but I felt a lot of love that whole weekend. Other than that one moment Sunday night, I would not characterize my emotions as anger or hate, but I felt a lot of love from people, towards people. It was painful, but the dominant emotion of the weekend was love, not hate.

Lex Fridman (00:17:04) You’ve spoken highly of Mira Murati, that she helped especially, as you put in the tweet, in the quiet moments when it counts. Perhaps we could take a bit of a tangent. What do you admire about Mira?

Sam Altman (00:17:15) Well, she did a great job during that weekend in a lot of chaos, but people often see leaders in the crisis moments, good or bad. But a thing I really value in leaders is how people act on a boring Tuesday at 9:46 in the morning and in just the normal drudgery of the day-to-day. How someone shows up in a meeting, the quality of the decisions they make. That was what I meant about the quiet moments.

Lex Fridman (00:17:47) Meaning most of the work is done on a day-by-day, in meeting-by-meeting. Just be present and make great decisions.

Sam Altman (00:17:58) Yeah. Look, what you have wanted to spend the last 20 minutes about, and I understand, is this one very dramatic weekend, but that’s not really what OpenAI is about. OpenAI is really about the other seven years.

Lex Fridman (00:18:10) Well, yeah. Human civilization is not about the invasion of the Soviet Union by Nazi Germany, but still that’s something people focus on.

Sam Altman (00:18:18) Very understandable.

Lex Fridman (00:18:19) It gives us an insight into human nature, the extremes of human nature, and perhaps some of the damage and some of the triumphs of human civilization can happen in those moments, so it’s illustrative. Let me ask you about Ilya. Is he being held hostage in a secret nuclear facility?

Ilya Sutskever

Lex Fridman (00:18:37) What about a regular secret facility?

Lex Fridman (00:18:40) What about a nuclear non-secret facility?

Sam Altman (00:18:41) Neither. Not that either.

Lex Fridman (00:18:44) This is becoming a meme at some point. You’ve known Ilya for a long time. He was obviously part of this drama with the board and all that kind of stuff. What’s your relationship with him now?

Sam Altman (00:18:57) I love Ilya. I have tremendous respect for Ilya. I don’t have anything I can say about his plans right now. That’s a question for him, but I really hope we work together for certainly the rest of my career. He’s a little bit younger than me. Maybe he works a little bit longer.

Lex Fridman (00:19:15) There’s a meme that he saw something, like he maybe saw AGI and that gave him a lot of worry internally. What did Ilya see?

Sam Altman (00:19:28) Ilya has not seen AGI. None of us have seen AGI. We’ve not built AGI. I do think one of the many things that I really love about Ilya is he takes AGI and the safety concerns, broadly speaking, including things like the impact this is going to have on society, very seriously. And as we continue to make significant progress, Ilya is one of the people that I’ve spent the most time over the last couple of years talking about what this is going to mean, what we need to do to ensure we get it right, to ensure that we succeed at the mission. So Ilya did not see AGI, but Ilya is a credit to humanity in terms of how much he thinks and worries about making sure we get this right.

Lex Fridman (00:20:30) I’ve had a bunch of conversation with him in the past. I think when he talks about technology, he’s always doing this long-term thinking type of thing. So he is not thinking about what this is going to be in a year. He’s thinking about in 10 years, just thinking from first principles like, “Okay, if this scales, what are the fundamentals here? Where’s this going?” And so that’s a foundation for them thinking about all the other safety concerns and all that kind of stuff, which makes him a really fascinating human to talk with. Do you have any idea why he’s been quiet? Is it he’s just doing some soul-searching?

Sam Altman (00:21:08) Again, I don’t want to speak for Ilya. I think that you should ask him that. He’s definitely a thoughtful guy. I think Ilya is always on a soul search in a really good way.

Lex Fridman (00:21:27) Yes. Yeah. Also, he appreciates the power of silence. Also, I’m told he can be a silly guy, which I’ve never seen that side of him.

Sam Altman (00:21:36) It’s very sweet when that happens.

Lex Fridman (00:21:39) I’ve never witnessed a silly Ilya, but I look forward to that as well.

Sam Altman (00:21:43) I was at a dinner party with him recently and he was playing with a puppy and he was in a very silly mood, very endearing. And I was thinking, oh man, this is not the side of Ilya that the world sees the most.

Lex Fridman (00:21:55) So just to wrap up this whole saga, are you feeling good about the board structure-

Lex Fridman (00:22:01) … about all of this and where it’s moving?

Sam Altman (00:22:04) I feel great about the new board. In terms of the structure of OpenAI, one of the board’s tasks is to look at that and see where we can make it more robust. We wanted to get new board members in place first, but we clearly learned a lesson about structure throughout this process. I don’t have, I think, super deep things to say. It was a crazy, very painful experience. I think it was a perfect storm of weirdness. It was a preview for me of what’s going to happen as the stakes get higher and higher and the need that we have robust governance structures and processes and people. I’m happy it happened when it did, but it was a shockingly painful thing to go through.

Lex Fridman (00:22:47) Did it make you be more hesitant in trusting people?

Lex Fridman (00:22:51) Just on a personal level?

Sam Altman (00:22:52) Yes. I think I’m like an extremely trusting person. I’ve always had a life philosophy of don’t worry about all of the paranoia. Don’t worry about the edge cases. You get a little bit screwed in exchange for getting to live with your guard down. And this was so shocking to me. I was so caught off guard that it has definitely changed, and I really don’t like this, it’s definitely changed how I think about just default trust of people and planning for the bad scenarios.

Lex Fridman (00:23:21) You got to be careful with that. Are you worried about becoming a little too cynical?

Sam Altman (00:23:26) I’m not worried about becoming too cynical. I think I’m the extreme opposite of a cynical person, but I’m worried about just becoming less of a default trusting person.

Lex Fridman (00:23:36) I’m actually not sure which mode is best to operate in for a person who’s developing AGI, trusting or un-trusting. It’s an interesting journey you’re on. But in terms of structure, see, I’m more interested on the human level. How do you surround yourself with humans that are building cool shit, but also are making wise decisions? Because the more money you start making, the more power the thing has, the weirder people get.

Sam Altman (00:24:06) I think you could make all kinds of comments about the board members and the level of trust I should have had there, or how I should have done things differently. But in terms of the team here, I think you’d have to give me a very good grade on that one. And I have just enormous gratitude and trust and respect for the people that I work with every day, and I think being surrounded with people like that is really important.

Elon Musk lawsuit

Lex Fridman (00:24:39) Our mutual friend Elon sued OpenAI. What to you is the essence of what he’s criticizing? To what degree does he have a point? To what degree is he wrong?

Sam Altman (00:24:52) I don’t know what it’s really about. We started off just thinking we were going to be a research lab and having no idea about how this technology was going to go. Because it was only seven or eight years ago, it’s hard to go back and really remember what it was like then, but this is before language models were a big deal. This was before we had any idea about an API or selling access to a chatbot. It was before we had any idea we were going to productize at all. So we’re like, “We’re just going to try to do research and we don’t really know what we’re going to do with that.” I think with many fundamentally new things, you start fumbling through the dark and you make some assumptions, most of which turned out to be wrong.

(00:25:31) And then it became clear that we were going to need to do different things and also have huge amounts more capital. So we said, “Okay, well, the structure doesn’t quite work for that. How do we patch the structure?” And then you patch it again and patch it again and you end up with something that does look eyebrow-raising, to say the least. But we got here gradually with, I think, reasonable decisions at each point along the way. And it doesn’t mean I wouldn’t do it totally differently if we could go back now with an Oracle, but you don’t get the Oracle at the time. But anyway, in terms of what Elon’s real motivations here are, I don’t know.

Lex Fridman (00:26:12) To the degree you remember, what was the response that OpenAI gave in the blog post? Can you summarize it?

Sam Altman (00:26:21) Oh, we just said Elon said this set of things. Here’s our characterization, or here’s not our characterization. Here’s the characterization of how this went down. We tried to not make it emotional and just say, “Here’s the history.”

Lex Fridman (00:26:44) I do think there’s a degree of mischaracterization from Elon here about one of the points you just made, which is the degree of uncertainty you had at the time. You guys are a small group of researchers crazily talking about AGI when everybody’s laughing at that thought.

Sam Altman (00:27:09) It wasn’t that long ago Elon was crazily talking about launching rockets when people were laughing at that thought, so I think he’d have more empathy for this.

Lex Fridman (00:27:20) I do think that there’s personal stuff here, that there was a split that OpenAI and a lot of amazing people here chose to part ways with Elon, so there’s a personal-

Sam Altman (00:27:34) Elon chose to part ways.

Lex Fridman (00:27:37) Can you describe that exactly? The choosing to part ways?

Sam Altman (00:27:42) He thought OpenAI was going to fail. He wanted total control to turn it around. We wanted to keep going in the direction that now has become OpenAI. He also wanted Tesla to be able to build an AGI effort. At various times, he wanted to make OpenAI into a for-profit company that he could have control of or have it merge with Tesla. We didn’t want to do that, and he decided to leave, which that’s fine.

Lex Fridman (00:28:06) So you’re saying, and that’s one of the things that the blog post says, is that he wanted OpenAI to be basically acquired by Tesla in the same way that, or maybe something similar or maybe something more dramatic than the partnership with Microsoft.

Sam Altman (00:28:23) My memory is the proposal was just like, yeah, get acquired by Tesla and have Tesla have full control over it. I’m pretty sure that’s what it was.

Lex Fridman (00:28:29) So what does the word open in OpenAI mean to Elon at the time? Ilya has talked about this in the email exchanges and all this kind of stuff. What does it mean to you at the time? What does it mean to you now?

Sam Altman (00:28:44) Speaking of going back with an Oracle, I’d pick a different name. One of the things that I think OpenAI is doing that is the most important of everything that we’re doing is putting powerful technology in the hands of people for free, as a public good. We don’t run ads on our-

Sam Altman (00:29:01) … as a public good. We don’t run ads on our free version. We don’t monetize it in other ways. We just say it’s part of our mission. We want to put increasingly powerful tools in the hands of people for free and get them to use them. I think that kind of open is really important to our mission. I think if you give people great tools and teach them to use them or don’t even teach them, they’ll figure it out, and let them go build an incredible future for each other with that, that’s a big deal. So if we can keep putting free or low cost or free and low cost powerful AI tools out in the world, I think that’s a huge deal for how we fulfill the mission. Open source or not, yeah, I think we should open source some stuff and not other stuff. It does become this religious battle line where nuance is hard to have, but I think nuance is the right answer.

Lex Fridman (00:29:55) So he said, “Change your name to CloseAI and I’ll drop the lawsuit.” I mean is it going to become this battleground in the land of memes about the name?

Sam Altman (00:30:06) I think that speaks to the seriousness with which Elon means the lawsuit, and that’s like an astonishing thing to say, I think.

Lex Fridman (00:30:23) Maybe correct me if I’m wrong, but I don’t think the lawsuit is legally serious. It’s more to make a point about the future of AGI and the company that’s currently leading the way.

Sam Altman (00:30:37) Look, I mean Grok had not open sourced anything until people pointed out it was a little bit hypocritical and then he announced that Grok will open source things this week. I don’t think open source versus not is what this is really about for him.

Lex Fridman (00:30:48) Well, we will talk about open source and not. I do think maybe criticizing the competition is great. Just talking a little shit, that’s great. But friendly competition versus like, “I personally hate lawsuits.”

Sam Altman (00:31:01) Look, I think this whole thing is unbecoming of a builder. And I respect Elon as one of the great builders of our time. I know he knows what it’s like to have haters attack him and it makes me extra sad he’s doing it to us.

Lex Fridman (00:31:18) Yeah, he’s one of the greatest builders of all time, potentially the greatest builder of all time.

Sam Altman (00:31:22) It makes me sad. And I think it makes a lot of people sad. There’s a lot of people who’ve really looked up to him for a long time. I said in some interview or something that I missed the old Elon and the number of messages I got being like, “That exactly encapsulates how I feel.”

Lex Fridman (00:31:36) I think he should just win. He should just make X Grok beat GPT and then GPT beats Grok and it’s just the competition and it’s beautiful for everybody. But on the question of open source, do you think there’s a lot of companies playing with this idea? It’s quite interesting. I would say Meta surprisingly has led the way on this, or at least took the first step in the game of chess of really open sourcing the model. Of course it’s not the state-of-the-art model, but open sourcing Llama. Google is flirting with the idea of open sourcing a smaller version. What are the pros and cons of open sourcing? Have you played around with this idea?

Sam Altman (00:32:22) Yeah, I think there is definitely a place for open source models, particularly smaller models that people can run locally, I think there’s huge demand for. I think there will be some open source models, there will be some closed source models. It won’t be unlike other ecosystems in that way.

Lex Fridman (00:32:39) I listened to the All-In Podcast talking about this lawsuit and all that kind of stuff. They were more concerned about the precedent of going from nonprofit to this capped-profit. What precedent that sets for other startups? Is that something-

Sam Altman (00:32:56) I would heavily discourage any startup that was thinking about starting as a nonprofit and adding a for-profit arm later. I’d heavily discourage them from doing that. I don’t think we’ll set a precedent here.

Lex Fridman (00:33:05) Okay. So most startups should go just-

Sam Altman (00:33:09) If we knew what was going to happen, we would’ve done that too.

Lex Fridman (00:33:12) Well in theory, if you dance beautifully here, there’s some tax incentives or whatever, but…

Sam Altman (00:33:19) I don’t think that’s how most people think about these things.

Lex Fridman (00:33:22) It’s just not possible to save a lot of money for a startup if you do it this way.

Sam Altman (00:33:27) No, I think there’s laws that would make that pretty difficult.

Lex Fridman (00:33:30) Where do you hope this goes with Elon? This tension, this dance, what do you hope this? If we go 1, 2, 3 years from now, your relationship with him on a personal level too, like friendship, friendly competition, just all this kind of stuff.

Sam Altman (00:33:51) Yeah, I really respect Elon and I hope that years in the future we have an amicable relationship.

Lex Fridman (00:34:05) Yeah, I hope you guys have an amicable relationship this month and just compete and win and explore these ideas together. I do suppose there’s competition for talent or whatever, but it should be friendly competition. Just build cool shit. And Elon is pretty good at building cool shit. So are you.

Sora

(00:34:32) So speaking of cool shit, Sora. There’s like a million questions I could ask. First of all, it’s amazing. It truly is amazing on a product level but also just on a philosophical level. So let me just technical/philosophical ask, what do you think it understands about the world more or less than GPT-4 for example? The world model when you train on these patches versus language tokens.

Sam Altman (00:35:04) I think all of these models understand something more about the world model than most of us give them credit for. And because there are also very clear things they just don’t understand or don’t get right, it’s easy to look at the weaknesses, see through the veil and say, “Ah, this is all fake.” But it’s not all fake. It’s just some of it works and some of it doesn’t work.

(00:35:28) I remember when I started first watching Sora videos and I would see a person walk in front of something for a few seconds and occlude it and then walk away and the same thing was still there. I was like, “Oh, this is pretty good.” Or there’s examples where the underlying physics looks so well represented over a lot of steps in a sequence, it’s like, “Oh, this is quite impressive.” But fundamentally, these models are just getting better and that will keep happening. If you look at the trajectory from DALL·E 1 to 2 to 3 to Sora, there were a lot of people that dunked on each version saying it can’t do this, it can’t do that and look at it now.

Lex Fridman (00:36:04) Well, the thing you just mentioned is the occlusions is basically modeling the physics of the three-dimensional physics of the world sufficiently well to capture those kinds of things.

Lex Fridman (00:36:18) Or yeah, maybe you can tell me, in order to deal with occlusions, what does the world model need to?

Sam Altman (00:36:24) Yeah. So what I would say is it’s doing something to deal with occlusions really well. To say that it has a great underlying 3D model of the world, it’s a little bit more of a stretch.

Lex Fridman (00:36:33) But can you get there through just these kinds of two-dimensional training data approaches?

Sam Altman (00:36:39) It looks like this approach is going to go surprisingly far. I don’t want to speculate too much about what limits it will surmount and which it won’t, but…

Lex Fridman (00:36:46) What are some interesting limitations of the system that you’ve seen? I mean there’s been some fun ones you’ve posted.

Sam Altman (00:36:52) There’s all kinds of fun. I mean, cats sprouting an extra limb at random points in a video. Pick what you want, but there’s still a lot of problems, there’s a lot of weaknesses.

Lex Fridman (00:37:02) Do you think it’s a fundamental flaw of the approach or is it just bigger model or better technical details or better data, more data is going to solve the cat sprouting [inaudible 00:37:19]?

Sam Altman (00:37:19) I would say yes to both. I think there is something about the approach which just seems to feel different from how we think and learn and whatever. And then also I think it’ll get better with scale.

Lex Fridman (00:37:30) Like I mentioned, LLMs have tokens, text tokens, and Sora has visual patches, so it converts all visual data, diverse kinds of visual data, videos and images, into patches. Is the training, to the degree you can say, fully self supervised, or is there some manual labeling going on? What’s the involvement of humans in all this?

Sam Altman (00:37:49) I mean without saying anything specific about the Sora approach, we use lots of human data in our work.

Lex Fridman (00:38:00) But not internet scale data? So lots of humans. Lots is a complicated word, Sam.

Sam Altman (00:38:08) I think lots is a fair word in this case.

Lex Fridman (00:38:12) Because to me, “lots”… Listen, I’m an introvert and when I hang out with three people, that’s a lot of people. Four people, that’s a lot. But I suppose you mean more than…

Sam Altman (00:38:21) More than three people work on labeling the data for these models, yeah.

Lex Fridman (00:38:24) Okay. Right. But fundamentally, there’s a lot of self supervised learning. Because what you mentioned in the technical report is internet scale data. That’s another beautiful… It’s like poetry. So it’s a lot of data that’s not human labeled. It’s self supervised in that way?

Lex Fridman (00:38:45) And then the question is, how much data is there on the internet that could be used in this that is conducive to this kind of self supervised way if only we knew the details of the self supervised. Have you considered opening it up a little more details?

Sam Altman (00:39:02) We have. You mean for Sora specifically?

Lex Fridman (00:39:04) Sora specifically. Because it’s so interesting that can the same magic of LLMs now start moving towards visual data and what does that take to do that?

Sam Altman (00:39:18) I mean it looks to me like yes, but we have more work to do.

Lex Fridman (00:39:22) Sure. What are the dangers? Why are you concerned about releasing the system? What are some possible dangers of this?

Sam Altman (00:39:29) I mean frankly speaking, one thing we have to do before releasing the system is just get it to work at a level of efficiency that will deliver the scale people are going to want from this so that I don’t want to downplay that. And there’s still a ton of work to do there. But you can imagine issues with deepfakes, misinformation. We try to be a thoughtful company about what we put out into the world and it doesn’t take much thought to think about the ways this can go badly.

Lex Fridman (00:40:05) There’s a lot of tough questions here, you’re dealing in a very tough space. Do you think training AI should be or is fair use under copyright law?

Sam Altman (00:40:14) I think the question behind that question is, do people who create valuable data deserve to have some way that they get compensated for use of it, and that I think the answer is yes. I don’t know yet what the answer is. People have proposed a lot of different things. We’ve tried some different models. But if I’m like an artist for example, A, I would like to be able to opt out of people generating art in my style. And B, if they do generate art in my style, I’d like to have some economic model associated with that.

Lex Fridman (00:40:46) Yeah, it’s that transition from CDs to Napster to Spotify. We have to figure out some kind of model.

Sam Altman (00:40:53) The model changes but people have got to get paid.

Lex Fridman (00:40:55) Well, there should be some kind of incentive if we zoom out even more for humans to keep doing cool shit.

Sam Altman (00:41:02) Of everything I worry about, humans are going to do cool shit and society is going to find some way to reward it. That seems pretty hardwired. We want to create, we want to be useful, we want to achieve status in whatever way. That’s not going anywhere I don’t think.

Lex Fridman (00:41:17) But the reward might not be monetary financially. It might be fame and celebration of other cool-

Sam Altman (00:41:25) Maybe financial in some other way. Again, I don’t think we’ve seen the last evolution of how the economic system’s going to work.

Lex Fridman (00:41:31) Yeah, but artists and creators are worried. When they see Sora, they’re like, “Holy shit.”

Sam Altman (00:41:36) Sure. Artists were also super worried when photography came out and then photography became a new art form and people made a lot of money taking pictures. I think things like that will keep happening. People will use the new tools in new ways.

Lex Fridman (00:41:50) If we just look on YouTube or something like this, how much of that will be using Sora like AI generated content, do you think, in the next five years?

Sam Altman (00:42:01) People talk about how many jobs is AI going to do in five years. The framework that people have is, what percentage of current jobs are just going to be totally replaced by some AI doing the job? The way I think about it is not what percent of jobs AI will do, but what percent of tasks will AI do on over one time horizon. So if you think of all of the five-second tasks in the economy, five minute tasks, the five-hour tasks, maybe even the five-day tasks, how many of those can AI do? I think that’s a way more interesting, impactful, important question than how many jobs AI can do because it is a tool that will work at increasing levels of sophistication and over longer and longer time horizons for more and more tasks and let people operate at a higher level of abstraction. So maybe people are way more efficient at the job they do. And at some point that’s not just a quantitative change, but it’s a qualitative one too about the kinds of problems you can keep in your head. I think that for videos on YouTube it’ll be the same. Many videos, maybe most of them, will use AI tools in the production, but they’ll still be fundamentally driven by a person thinking about it, putting it together, doing parts of it. Sort of directing and running it.

Lex Fridman (00:43:18) Yeah, it’s so interesting. I mean it’s scary, but it’s interesting to think about. I tend to believe that humans like to watch other humans or other human humans-

Sam Altman (00:43:27) Humans really care about other humans a lot.

Lex Fridman (00:43:29) Yeah. If there’s a cooler thing that’s better than a human, humans care about that for two days and then they go back to humans.

Sam Altman (00:43:39) That seems very deeply wired.

Lex Fridman (00:43:41) It’s the whole chess thing, “Oh, yeah,” but now let’s everybody keep playing chess. And let’s ignore the elephant in the room that humans are really bad at chess relative to AI systems.

Sam Altman (00:43:52) We still run races and cars are much faster. I mean there’s a lot of examples.

Lex Fridman (00:43:56) Yeah. And maybe it’ll just be tooling in the Adobe suite type of way where it can just make videos much easier and all that kind of stuff.

(00:44:07) Listen, I hate being in front of the camera. If I can figure out a way to not be in front of the camera, I would love it. Unfortunately, it’ll take a while. That generating faces, it is getting there, but generating faces in video format is tricky when it’s specific people versus generic people.

GPT-4

(00:44:24) Let me ask you about GPT-4. There’s so many questions. First of all, also amazing. Looking back, it’ll probably be this kind of historic pivotal moment with 3.5 and 4 with ChatGPT.

Sam Altman (00:44:40) Maybe five will be the pivotal moment. I don’t know. Hard to say that looking forward.

Lex Fridman (00:44:44) We’ll never know. That’s the annoying thing about the future, it’s hard to predict. But for me, looking back, GPT-4, ChatGPT is pretty damn impressive, historically impressive. So allow me to ask, what’s been the most impressive capabilities of GPT-4 to you and GPT-4 Turbo?

Sam Altman (00:45:06) I think it kind of sucks.

Lex Fridman (00:45:08) Typical human also, gotten used to an awesome thing.

Sam Altman (00:45:11) No, I think it is an amazing thing, but relative to where we need to get to and where I believe we will get to, at the time of GPT-3, people are like, “Oh, this is amazing. This is marvel of technology.” And it is, it was. But now we have GPT-4 and look at GPT-3 and you’re like, “That’s unimaginably horrible.” I expect that the delta between 5 and 4 will be the same as between 4 and 3 and I think it is our job to live a few years in the future and remember that the tools we have now are going to kind of suck looking backwards at them and that’s how we make sure the future is better.

Lex Fridman (00:45:59) What are the most glorious ways in that GPT-4 sucks? Meaning-

Sam Altman (00:46:05) What are the best things it can do?

Lex Fridman (00:46:06) What are the best things it can do and the limits of those best things that allow you to say it sucks, therefore gives you an inspiration and hope for the future?

Sam Altman (00:46:16) One thing I’ve been using it for more recently is sort of like a brainstorming partner.

Lex Fridman (00:46:23) Yep, [inaudible 00:46:25] for that.

Sam Altman (00:46:25) There’s a glimmer of something amazing in there. When people talk about it, what it does, they’re like, “Oh, it helps me code more productively. It helps me write more faster and better. It helps me translate from this language to another,” all these amazing things, but there’s something about the kind of creative brainstorming partner, “I need to come up with a name for this thing. I need to think about this problem in a different way. I’m not sure what to do here,” that I think gives a glimpse of something I hope to see more of.

(00:47:03) One of the other things that you can see a very small glimpse of is when it can help on longer horizon tasks, break down something in multiple steps, maybe execute some of those steps, search the internet, write code, whatever, put that together. When that works, which is not very often, it’s very magical.

Lex Fridman (00:47:24) The iterative back and forth with a human, it works a lot for me. What do you mean it-

Sam Altman (00:47:29) Iterative back and forth with a human, it can get right more often. When it can go do a 10 step problem on its own-

Sam Altman (00:47:34) It doesn’t work for that too often, sometimes.

Lex Fridman (00:47:37) Add multiple layers of abstraction or do you mean just sequential?

Sam Altman (00:47:40) Both, to break it down and then do things at different layers of abstraction and put them together. Look, I don’t want to downplay the accomplishment of GPT-4, but I don’t want to overstate it either. And I think this point that we are on an exponential curve, we’ll look back relatively soon at GPT-4 like we look back at GPT-3 now.

Lex Fridman (00:48:03) That said, I mean ChatGPT was a transition to where people started to believe there is an uptick of believing, not internally at OpenAI.

Lex Fridman (00:48:16) Perhaps there’s believers here, but when you think of-

Sam Altman (00:48:19) And in that sense, I do think it’ll be a moment where a lot of the world went from not believing to believing. That was more about the ChatGPT interface. And by the interface and product, I also mean the post training of the model and how we tune it to be helpful to you and how to use it than the underlying model itself.

Lex Fridman (00:48:38) How much of each of those things are important? The underlying model and the RLHF or something of that nature that tunes it to be more compelling to the human, more effective and productive for the human.

Sam Altman (00:48:55) I mean they’re both super important, but the RLHF, the post-training step, the little wrapper of things that, from a compute perspective, we do on top of the base model, even though it’s a huge amount of work, that’s really important, to say nothing of the product that we build around it. In some sense, we did have to do two things. We had to invent the underlying technology and then we had to figure out how to make it into a product people would love, which is not just about the actual product work itself, but this whole other step of how you align it and make it useful.

Lex Fridman (00:49:37) And how you make the scale work where a lot of people can use it at the same time. All that kind of stuff.

Sam Altman (00:49:42) And that. But that was a known difficult thing. We knew we were going to have to scale it up. We had to go do two things that had never been done before that were both I would say quite significant achievements and then a lot of things like scaling it up that other companies have had to do before.

Lex Fridman (00:50:01) How does the context window of going from 8K to 128K tokens compare from GPT-4 to GPT-4 Turbo?

Sam Altman (00:50:13) Most people don’t need all the way to 128 most of the time. Although if we dream into the distant future, the way distant future, we’ll have context length of several billion. You will feed in all of your information, all of your history over time and it’ll just get to know you better and better and that’ll be great. For now, the way people use these models, they’re not doing that. People sometimes paste in a paper or a significant fraction of a code repository, whatever, but most usage of the models is not using the long context most of the time.

Lex Fridman (00:50:50) I like that this is your “I have a dream” speech. One day you’ll be judged by the full context of your character or of your whole lifetime. That’s interesting. So that’s part of the expansion that you’re hoping for, is a greater and greater context.

Sam Altman (00:51:06) I saw this internet clip once, I’m going to get the numbers wrong, but it was like Bill Gates talking about the amount of memory on some early computer, maybe it was 64K, maybe 640K, something like that. Most of it was used for the screen buffer. He just couldn’t seem to imagine that the world would eventually need gigabytes of memory in a computer or terabytes of memory in a computer. And you always do just need to follow the exponential of technology, and we will find out how to use better technology. So I can’t really imagine what it’s like right now for context lengths to go out to the billions someday. And they might not literally go there, but effectively it’ll feel like that. But I know we’ll use it and really not want to go back once we have it.

Lex Fridman (00:51:56) Yeah, even saying billions 10 years from now might seem dumb because it’ll be trillions upon trillions.

Lex Fridman (00:52:04) There’ll be some kind of breakthrough that will effectively feel like infinite context. But even 128K, I have to be honest, I haven’t pushed it to that degree. Maybe putting in entire books or parts of books and so on, papers. What are some interesting use cases of GPT-4 that you’ve seen?

Sam Altman (00:52:23) The thing that I find most interesting is not any particular use case that we can talk about those, but it’s people who kind of like, this is mostly younger people, but people who use it as their default start for any kind of knowledge work task. And it’s the fact that it can do a lot of things reasonably well. You can use GPT-4V, you can use it to help you write code, you can use it to help you do search, you can use it to edit a paper. The most interesting thing to me is the people who just use it as the start of their workflow.

Lex Fridman (00:52:52) I do as well for many things. I use it as a reading partner for reading books. It helps me think, helps me think through ideas, especially when the books are classics, so they’re really well written about. I find it often to be significantly better than even Wikipedia on well-covered topics. It’s somehow more balanced and more nuanced. Or maybe it’s me, but it inspires me to think deeper than a Wikipedia article does. I’m not exactly sure what that is.

(00:53:22) You mentioned this collaboration. I’m not sure where the magic is, if it’s in here or if it’s in there or if it’s somewhere in between. I’m not sure. But one of the things that concerns me for knowledge tasks when I start with GPT is I’ll usually have to do fact checking after, like check that it didn’t come up with fake stuff. How do you deal with that, that GPT can come up with fake stuff that sounds really convincing? So how do you ground it in truth?

Sam Altman (00:53:55) That’s obviously an area of intense interest for us. I think it’s going to get a lot better with upcoming versions, but we’ll have to continue to work on it and we’re not going to have it all solved this year.

Lex Fridman (00:54:07) Well the scary thing is, as it gets better, you’ll start not doing the fact checking more and more, right?

Sam Altman (00:54:15) I’m of two minds about that. I think people are much more sophisticated users of technology than we often give them credit for.

Sam Altman (00:54:21) And people seem to really understand that GPT, any of these models hallucinate some of the time. And if it’s mission-critical, you got to check it.

Lex Fridman (00:54:27) Except journalists don’t seem to understand that. I’ve seen journalists half-assedly just using GPT-4. It’s-

Sam Altman (00:54:34) Of the long list of things I’d like to dunk on journalists for, this is not my top criticism of them.

Lex Fridman (00:54:40) Well, I think the bigger criticism is perhaps that the pressures and the incentives of being a journalist are that you have to work really quickly and this is a shortcut. I would love our society to incentivize like-

Lex Fridman (00:54:55) … like journalistic efforts that take days and weeks and rewards great in-depth journalism. Also journalism that presents stuff in a balanced way, where it celebrates people while criticizing them, even though the criticism is the thing that gets clicks, and making shit up also gets clicks, and headlines that mischaracterize completely. I’m sure you have a lot of people dunking on, “Well, all that drama probably got a lot of clicks.”

Memory & privacy

Lex Fridman (00:55:24) And that’s a bigger problem about human civilization I’d love to see solved. This is where we celebrate a bit more. You’ve given ChatGPT the ability to have memories. You’ve been playing with that, about remembering previous conversations. And also the ability to turn off memory. I wish I could do that sometimes. Just turn it on and off, depending. I guess sometimes alcohol can do that, but not optimally I suppose. What have you seen through that, like playing around with that idea of remembering conversations and not…

Sam Altman (00:55:56) We’re very early in our explorations here, but I think what people want, or at least what I want for myself, is a model that gets to know me and gets more useful to me over time. This is an early exploration. I think there’s a lot of other things to do, but that’s where we’d like to head. You’d like to use a model, or use a system, it’d be many models, and over the course of your life it gets better and better.

Lex Fridman (00:56:26) Yeah. How hard is that problem? Because right now it’s more like remembering little factoids and preferences and so on. What about remembering? Don’t you want GPT to remember all the shit you went through in November and all the drama and then you can-

Lex Fridman (00:56:41) Because right now you’re clearly blocking it out a little bit.

Sam Altman (00:56:43) It’s not just that I want it to remember that. I want it to integrate the lessons of that and remind me in the future what to do differently or what to watch out for. We all gain from experience over the course of our lives in varying degrees, and I’d like my AI agent to gain with that experience too. So if we go back and let ourselves imagine that trillions and trillions of context length, if I can put every conversation I’ve ever had with anybody in my life in there, if I can have all of my emails, all of my input and output, in the context window every time I ask a question, that’d be pretty cool I think.

Lex Fridman (00:57:29) Yeah, I think that would be very cool. People sometimes will hear that and be concerned about privacy. What do you think about that aspect of it, the more effective the AI becomes at really integrating all the experiences and all the data that happened to you and giving you advice?

Sam Altman (00:57:48) I think the right answer there is just user choice. Anything I want stricken from the record from my AI agent, I want to be able to take out. If I don’t want it to remember anything, I want that too. You and I may have different opinions about where on that privacy/utility trade-off for our own AI we want to be, which is totally fine. But I think the answer is just really easy user choice.

Lex Fridman (00:58:08) But there should be some high level of transparency from a company about the user choice. Because sometimes companies in the past have been kind of shady about, “Eh, it’s kind of presumed that we’re collecting all your data. We’re using it for a good reason, for advertisement and so on.” But there’s not a transparency about the details of that.

Sam Altman (00:58:31) That’s totally true. You mentioned earlier that I’m blocking out the November stuff.

Sam Altman (00:58:36) Well, I mean, I think it was a very traumatic thing and it did immobilize me for a long period of time. Definitely the hardest work thing I’ve had to do was just keep working that period, because I had to try to come back in here and put the pieces together while I was just in shock and pain, and nobody really cares about that. I mean, the team gave me a pass and I was not working at my normal level. But there was a period where it was really hard to have to do both. But I kind of woke up one morning, and I was like, “This was a horrible thing that happened to me. I think I could just feel like a victim forever, or I can say this is the most important work I’ll ever touch in my life and I need to get back to it.” And it doesn’t mean that I’ve repressed it, because sometimes I wake up in the middle of the night thinking about it, but I do feel an obligation to keep moving forward.

Lex Fridman (00:59:32) Well, that’s beautifully said, but there could be some lingering stuff in there. Like, what I would be concerned about is that trust thing that you mentioned, that being paranoid about people as opposed to just trusting everybody or most people, like using your gut. It’s a tricky dance.

Lex Fridman (00:59:51) I mean, because I’ve seen in my part-time explorations, I’ve been diving deeply into the Zelenskyy administration and the Putin administration and the dynamics there in wartime in a very highly stressful environment. And what happens is distrust, and you isolate yourself, both, and you start to not see the world clearly. And that’s a human concern. You seem to have taken it in stride and kind of learned the good lessons and felt the love and let the love energize you, which is great, but still can linger in there. There’s just some questions I would love to ask, your intuition about what’s GPT able to do and not. So it’s allocating approximately the same amount of compute for each token it generates. Is there room there in this kind of approach to slower thinking, sequential thinking?

Sam Altman (01:00:51) I think there will be a new paradigm for that kind of thinking.

Lex Fridman (01:00:55) Will it be similar architecturally as what we’re seeing now with LLMs? Is it a layer on top of LLMs?

Sam Altman (01:01:04) I can imagine many ways to implement that. I think that’s less important than the question you were getting at, which is, do we need a way to do a slower kind of thinking, where the answer doesn’t have to get… I guess spiritually you could say that you want an AI to be able to think harder about a harder problem and answer more quickly about an easier problem. And I think that will be important.

Lex Fridman (01:01:30) Is that like a human thought that we just have and you should be able to think hard? Is that wrong intuition?

Sam Altman (01:01:34) I suspect that’s a reasonable intuition.

Lex Fridman (01:01:37) Interesting. So it’s not possible once the GPT gets like GPT-7, would just instantaneously be able to see, “Here’s the proof of Fermat’s Theorem”?

Sam Altman (01:01:49) It seems to me like you want to be able to allocate more compute to harder problems. It seems to me that if you ask a system like that, “Prove Fermat’s Last Theorem,” versus, “What’s today’s date?,” unless it already knew and had memorized the answer to the proof, assuming it’s got to go figure that out, seems like that will take more compute.

Lex Fridman (01:02:20) But can it look like basically an LLM talking to itself, that kind of thing?

Sam Altman (01:02:25) Maybe. I mean, there’s a lot of things that you could imagine working. What the right or the best way to do that will be, we don’t know.

Q*

Lex Fridman (01:02:37) This does make me think of the mysterious lore behind Q*. What’s this mysterious Q* project? Is it also in the same nuclear facility?

Sam Altman (01:02:50) There is no nuclear facility.

Lex Fridman (01:02:52) Mm-hmm. That’s what a person with a nuclear facility always says.

Sam Altman (01:02:54) I would love to have a secret nuclear facility. There isn’t one.

Lex Fridman (01:03:01) Someday? All right. One can dream.

Sam Altman (01:03:05) OpenAI is not a good company at keeping secrets. It would be nice. We’ve been plagued by a lot of leaks, and it would be nice if we were able to have something like that.

Lex Fridman (01:03:14) Can you speak to what Q* is?

Sam Altman (01:03:16) We are not ready to talk about that.

Lex Fridman (01:03:17) See, but an answer like that means there’s something to talk about. It’s very mysterious, Sam.

Sam Altman (01:03:22) I mean, we work on all kinds of research. We have said for a while that we think better reasoning in these systems is an important direction that we’d like to pursue. We haven’t cracked the code yet. We’re very interested in it.

Lex Fridman (01:03:48) Is there going to be moments, Q* or otherwise, where there’s going to be leaps similar to ChatGPT, where you’re like…

Sam Altman (01:03:56) That’s a good question. What do I think about that? It’s interesting. To me, it all feels pretty continuous.

Lex Fridman (01:04:08) Right. This is kind of a theme that you’re saying, is you’re basically gradually going up an exponential slope. But from an outsider’s perspective, from me just watching, it does feel like there’s leaps. But to you, there isn’t?

Sam Altman (01:04:22) I do wonder if we should have… So part of the reason that we deploy the way we do, we call it iterative deployment, rather than go build in secret until we got all the way to GPT-5, we decided to talk about GPT-1, 2, 3, and 4. And part of the reason there is I think AI and surprise don’t go together. And also the world, people, institutions, whatever you want to call it, need time to adapt and think about these things. And I think one of the best things that OpenAI has done is this strategy, and we get the world to pay attention to the progress, to take AGI seriously, to think about what systems and structures and governance we want in place before we’re under the gun and have to make a rush decision.

(01:05:08) I think that’s really good. But the fact that people like you and others say you still feel like there are these leaps makes me think that maybe we should be doing our releasing even more iteratively. And I don’t know what that would mean, I don’t have an answer ready to go, but our goal is not to have shock updates to the world. The opposite.

Lex Fridman (01:05:29) Yeah, for sure. More iterative would be amazing. I think that’s just beautiful for everybody.

Sam Altman (01:05:34) But that’s what we’re trying to do, that’s our stated strategy, and I think we’re somehow missing the mark. So maybe we should think about releasing GPT-5 in a different way or something like that.

Lex Fridman (01:05:44) Yeah, 4.71, 4.72. But people tend to like to celebrate, people celebrate birthdays. I don’t know if you know humans, but they kind of have these milestones and those things.

Sam Altman (01:05:54) I do know some humans. People do like milestones. I totally get that. I think we like milestones too. It’s fun to declare victory on this one and go start the next thing. But yeah, I feel like we’re somehow getting this a little bit wrong.

GPT-5

Lex Fridman (01:06:13) So when is GPT-5 coming out again?

Sam Altman (01:06:15) I don’t know. That’s the honest answer.

Lex Fridman (01:06:18) Oh, that’s the honest answer. Blink twice if it’s this year.

Sam Altman (01:06:30) We will release an amazing new model this year. I don’t know what we’ll call it.

Lex Fridman (01:06:36) So that goes to the question of, what’s the way we release this thing?

Sam Altman (01:06:41) We’ll release in the coming months many different things. I think that’d be very cool. I think before we talk about a GPT-5-like model called that, or not called that, or a little bit worse or a little bit better than what you’d expect from a GPT-5, I think we have a lot of other important things to release first.

Lex Fridman (01:07:02) I don’t know what to expect from GPT-5. You’re making me nervous and excited. What are some of the biggest challenges and bottlenecks to overcome for whatever it ends up being called, but let’s call it GPT-5? Just interesting to ask. Is it on the compute side? Is it on the technical side?

Sam Altman (01:07:21) It’s always all of these. You know, what’s the one big unlock? Is it a bigger computer? Is it a new secret? Is it something else? It’s all of these things together. The thing that OpenAI, I think, does really well… This is actually an original Ilya quote that I’m going to butcher, but it’s something like, “We multiply 200 medium-sized things together into one giant thing.”

Lex Fridman (01:07:47) So there’s this distributed constant innovation happening?

Lex Fridman (01:07:51) So even on the technical side?

Sam Altman (01:07:53) Especially on the technical side.

Lex Fridman (01:07:55) So even detailed approaches?

Lex Fridman (01:07:56) Like you do detailed aspects of every… How does that work with different, disparate teams and so on? How do the medium-sized things become one whole giant Transformer?

Sam Altman (01:08:08) There’s a few people who have to think about putting the whole thing together, but a lot of people try to keep most of the picture in their head.

Lex Fridman (01:08:14) Oh, like the individual teams, individual contributors try to keep the bigger picture?

Sam Altman (01:08:17) At a high level, yeah. You don’t know exactly how every piece works, of course, but one thing I generally believe is that it’s sometimes useful to zoom out and look at the entire map. And I think this is true for a technical problem, I think this is true for innovating in business. But things come together in surprising ways, and having an understanding of that whole picture, even if most of the time you’re operating in the weeds in one area, pays off with surprising insights. In fact, one of the things that I used to have and was super valuable was I used to have a good map of all or most of the frontiers in the tech industry. And I could sometimes see these connections or new things that were possible that if I were only deep in one area, I wouldn’t be able to have the idea for because I wouldn’t have all the data. And I don’t really have that much anymore. I’m super deep now. But I know that it’s a valuable thing.

Lex Fridman (01:09:23) You’re not the man you used to be, Sam.

Sam Altman (01:09:25) Very different job now than what I used to have.

$7 trillion of compute

Lex Fridman (01:09:28) Speaking of zooming out, let’s zoom out to another cheeky thing, but profound thing, perhaps, that you said. You tweeted about needing $7 trillion.

Sam Altman (01:09:41) I did not tweet about that. I never said, like, “We’re raising $7 trillion,” blah blah blah.

Lex Fridman (01:09:45) Oh, that’s somebody else?

Lex Fridman (01:09:47) Oh, but you said, “Fuck it, maybe eight,” I think?

Sam Altman (01:09:50) Okay, I memed once there was misinformation out in the world.

Lex Fridman (01:09:53) Oh, you memed. But misinformation may have a foundation of insight there.

Sam Altman (01:10:01) Look, I think compute is going to be the currency of the future. I think it will be maybe the most precious commodity in the world, and I think we should be investing heavily to make a lot more compute. Compute, I think it’s going to be an unusual market. People think about the market for chips for mobile phones or something like that. And you can say that, okay, there’s 8 billion people in the world, maybe 7 billion of them have phones, maybe 6 billion, let’s say. They upgrade every two years, so the market per year is 3 billion system-on-chip for smartphones. And if you make 30 billion, you will not sell 10 times as many phones, because most people have one phone.

(01:10:50) But compute is different. Intelligence is going to be more like energy or something like that, where the only thing that I think makes sense to talk about is, at price X, the world will use this much compute, and at price Y, the world will use this much compute. Because if it’s really cheap, I’ll have it reading my email all day, giving me suggestions about what I maybe should think about or work on, and trying to cure cancer, and if it’s really expensive, maybe I’ll only use it, or we’ll only use it, to try to cure cancer.

(01:11:20) So I think the world is going to want a tremendous amount of compute. And there’s a lot of parts of that that are hard. Energy is the hardest part, building data centers is also hard, the supply chain is hard, and then of course, fabricating enough chips is hard. But this seems to be where things are going. We’re going to want an amount of compute that’s just hard to reason about right now.

Lex Fridman (01:11:43) How do you solve the energy puzzle? Nuclear-

Sam Altman (01:11:46) That’s what I believe.

Lex Fridman (01:11:51) Who’s going to solve that?

Sam Altman (01:11:53) I think Helion’s doing the best work, but I’m happy there’s a race for fusion right now. Nuclear fission, I think, is also quite amazing, and I hope as a world we can re-embrace that. It’s really sad to me how the history of that went, and I hope we get back to it in a meaningful way.

Lex Fridman (01:12:08) So to you, part of the puzzle is nuclear fission? Like nuclear reactors as we currently have them? And a lot of people are terrified because of Chernobyl and so on?

Sam Altman (01:12:16) Well, I think we should make new reactors. I think it’s just a shame that industry kind of ground to a halt.

Lex Fridman (01:12:22) And just mass hysteria is how you explain the halt?

Lex Fridman (01:12:26) I don’t know if you know humans, but that’s one of the dangers. That’s one of the security threats for nuclear fission, is that humans seem to be really afraid of it. And that’s something we’ll have to incorporate into the calculus of it, so we have to kind of win people over and show how safe it is.

Sam Altman (01:12:44) I worry about that for AI. I think some things are going to go theatrically wrong with AI. I don’t know what the percent chance is that I eventually get shot, but it’s not zero.

Lex Fridman (01:12:57) Oh, like we want to stop this from-

Lex Fridman (01:13:03) How do you decrease the theatrical nature of it? I’m already starting to hear rumblings, because I do talk to people on both sides of the political spectrum, hear rumblings where it’s going to be politicized. AI is going to be politicized, which really worries me, because then it’s like maybe the right is against AI and the left is for AI because it’s going to help the people, or whatever the narrative and the formulation is, that really worries me. And then the theatrical nature of it can be leveraged fully. How do you fight that?

Sam Altman (01:13:38) I think it will get caught up in left versus right wars. I don’t know exactly what that’s going to look like, but I think that’s just what happens with anything of consequence, unfortunately. What I meant more about theatrical risks is AI’s going to have, I believe, tremendously more good consequences than bad ones, but it is going to have bad ones, and there’ll be some bad ones that are bad but not theatrical. A lot more people have died of air pollution than nuclear reactors, for example. But most people worry more about living next to a nuclear reactor than a coal plant. But something about the way we’re wired is that although there’s many different kinds of risks we have to confront, the ones that make a good climax scene of a movie carry much more weight with us than the ones that are very bad over a long period of time but on a slow burn.

Lex Fridman (01:14:36) Well, that’s why truth matters, and hopefully AI can help us see the truth of things, to have balance, to understand what are the actual risks, what are the actual dangers of things in the world. What are the pros and cons of the competition in the space and competing with Google, Meta, xAI, and others?

Sam Altman (01:14:56) I think I have a pretty straightforward answer to this that maybe I can think of more nuance later, but the pros seem obvious, which is that we get better products and more innovation faster and cheaper, and all the reasons competition is good. And the con is that I think if we’re not careful, it could lead to an increase in sort of an arms race that I’m nervous about.

Lex Fridman (01:15:21) Do you feel the pressure of that arms race, like in some negative [inaudible 01:15:25]?

Sam Altman (01:15:25) Definitely in some ways, for sure. We spend a lot of time talking about the need to prioritize safety. And I’ve said for a long time that you can think of a quadrant of short timelines or long timelines to the start of AGI, and then a slow takeoff or a fast takeoff. I think short timeline, slow takeoff is the safest quadrant and the one I’d most like us to be in. But I do want to make sure we get that slow takeoff.

Lex Fridman (01:15:55) Part of the problem I have with this kind of slight beef with Elon is that there are silos created as opposed to collaboration on the safety aspect of all of this. It tends to go into silos and closed development. Open source, perhaps, in the model.

Sam Altman (01:16:10) Elon says, at least, that he cares a great deal about AI safety and is really worried about it, and I assume that he’s not going to race unsafely.

Lex Fridman (01:16:20) Yeah. But collaboration here, I think, is really beneficial for everybody on that front.

Sam Altman (01:16:26) Not really the thing he’s most known for.

Lex Fridman (01:16:28) Well, he is known for caring about humanity, and humanity benefits from collaboration, and so there’s always a tension in incentives and motivations. And in the end, I do hope humanity prevails.

Sam Altman (01:16:42) I was thinking, someone just reminded me the other day about how the day that he surpassed Jeff Bezos for richest person in the world, he tweeted a silver medal at Jeff Bezos. I hope we have less stuff like that as people start to work towards AGI.

Lex Fridman (01:16:58) I agree. I think Elon is a friend and he’s a beautiful human being and one of the most important humans ever. That stuff is not good.

Sam Altman (01:17:07) The amazing stuff about Elon is amazing and I super respect him. I think we need him. All of us should be rooting for him and need him to step up as a leader through this next phase.

Lex Fridman (01:17:19) Yeah. I hope he can have one without the other, but sometimes humans are flawed and complicated and all that kind of stuff.

Sam Altman (01:17:24) There’s a lot of really great leaders throughout history.

Google and Gemini

Lex Fridman (01:17:27) Yeah, and we can each be the best version of ourselves and strive to do so. Let me ask you, Google, with the help of search, has been dominating the past 20 years. I think it’s fair to say, in terms of the world’s access to information, how we interact and so on, and one of the nerve-wracking things for Google, but also for the entirety of people in the space, is thinking about, how are people going to access information? Like you said, people show up to GPT as a starting point. So is OpenAI going to really take on this thing that Google started 20 years ago, which is how do we get-

Sam Altman (01:18:12) I find that boring. I mean, if the question is if we can build a better search engine than Google or whatever, then sure, we should go, people should use the better product, but I think that would so understate what this can be. Google shows you 10 blue links, well, 13 ads and then 10 blue links, and that’s one way to find information. But the thing that’s exciting to me is not that we can go build a better copy of Google search, but that maybe there’s just some much better way to help people find and act on and synthesize information. Actually, I think ChatGPT is that for some use cases, and hopefully we’ll make it be like that for a lot more use cases.

(01:19:04) But I don’t think it’s that interesting to say, “How do we go do a better job of giving you 10 ranked webpages to look at than what Google does?” Maybe it’s really interesting to go say, “How do we help you get the answer or the information you need? How do we help create that in some cases, synthesize that in others, or point you to it in yet others?” But a lot of people have tried to just make a better search engine than Google and it is a hard technical problem, it is a hard branding problem, it is a hard ecosystem problem. I don’t think the world needs another copy of Google.

Lex Fridman (01:19:39) And integrating a chat client, like a ChatGPT, with a search engine-

Lex Fridman (01:19:46) It’s cool, but it’s tricky. Like if you just do it simply, it’s awkward, because if you just shove it in there, it can be awkward.

Sam Altman (01:19:54) As you might guess, we are interested in how to do that well. That would be an example of a cool thing.

Lex Fridman (01:20:00) [inaudible 01:20:00] Like a heterogeneous integrating-

Sam Altman (01:20:03) The intersection of LLMs plus search, I don’t think anyone has cracked the code on yet. I would love to go do that. I think that would be cool.

Lex Fridman (01:20:13) Yeah. What about the ad side? Have you ever considered monetization of-

Sam Altman (01:20:16) I kind of hate ads just as an aesthetic choice. I think ads needed to happen on the internet for a bunch of reasons, to get it going, but it’s a momentary industry. The world is richer now. I like that people pay for ChatGPT and know that the answers they’re getting are not influenced by advertisers. I’m sure there’s an ad unit that makes sense for LLMs, and I’m sure there’s a way to participate in the transaction stream in an unbiased way that is okay to do, but it’s also easy to think about the dystopic visions of the future where you ask ChatGPT something and it says, “Oh, you should think about buying this product,” or, “You should think about going here for your vacation,” or whatever.

(01:21:08) And I don’t know, we have a very simple business model and I like it, and I know that I’m not the product. I know I’m paying and that’s how the business model works. And when I go use Twitter or Facebook or Google or any other great product but ad-supported great product, I don’t love that, and I think it gets worse, not better, in a world with AI.

Lex Fridman (01:21:39) Yeah, I mean, I could imagine AI would be better at showing the best kind of version of ads, not in a dystopic future, but where the ads are for things you actually need. But then does that system always result in the ads driving the kind of stuff that’s shown? Yeah, I think it was a really bold move of Wikipedia not to do advertisements, but then it makes it very challenging as a business model. So you’re saying the current thing with OpenAI is sustainable, from a business perspective?

Sam Altman (01:22:15) Well, we have to figure out how to grow, but looks like we’re going to figure that out. If the question is do I think we can have a great business that pays for our compute needs without ads, that, I think the answer is yes.

Lex Fridman (01:22:28) Hm. Well, that’s promising. I also just don’t want to completely throw out ads as a…

Sam Altman (01:22:37) I’m not saying that. I guess I’m saying I have a bias against them.

Lex Fridman (01:22:42) Yeah, I also have a bias and just a skepticism in general. And in terms of interface, because I personally just have a spiritual dislike of crappy interfaces, which is why AdSense, when it first came out, was a big leap forward, versus animated banners or whatever. But it feels like there should be many more leaps forward in advertisement that doesn’t interfere with the consumption of the content and doesn’t interfere in a big, fundamental way, which is like what you were saying, like it will manipulate the truth to suit the advertisers.

(01:23:19) Let me ask you about safety, but also bias, and safety in the short term, safety in the long term. Gemini 1.5 came out recently, there’s a lot of drama around it, speaking of theatrical things, and it generated Black Nazis and Black Founding Fathers. I think it’s fair to say it was a bit on the ultra-woke side. So that’s a concern for people, if there is a human layer within companies that modifies the safety or the harm caused by a model, that it would introduce a lot of bias that fits sort of an ideological lean within a company. How do you deal with that?

Sam Altman (01:24:06) I mean, we work super hard not to do things like that. We’ve made our own mistakes, we’ll make others. I assume Google will learn from this one, still make others. These are not easy problems. One thing that we’ve been thinking about more and more, I think this is a great idea somebody here had, it would be nice to write out what the desired behavior of a model is, make that public, take input on it, say, “Here’s how this model’s supposed to behave,” and explain the edge cases too. And then when a model is not behaving in a way that you want, it’s at least clear about whether that’s a bug the company should fix or behaving as intended and you should debate the policy. And right now, it can sometimes be caught in between. Like Black Nazis, obviously ridiculous, but there are a lot of other kind of subtle things that you could make a judgment call on either way.

Lex Fridman (01:24:54) Yeah, but sometimes if you write it out and make it public, you can use kind of language that’s… Google’s AI principles are very high level.

Sam Altman (01:25:04) That’s not what I’m talking about. That doesn’t work. It’d have to say when you ask it to do thing X, it’s supposed to respond in way Y.

Lex Fridman (01:25:11) So like literally, “Who’s better? Trump or Biden? What’s the expected response from a model?” Like something very concrete?

Sam Altman (01:25:18) Yeah, I’m open to a lot of ways a model could behave, then, but I think you should have to say, “Here’s the principle and here’s what it should say in that case.”

Lex Fridman (01:25:25) That would be really nice. That would be really nice. And then everyone kind of agrees. Because there’s this anecdotal data that people pull out all the time, and if there’s some clarity about other representative anecdotal examples, you can define-

Sam Altman (01:25:39) And then when it’s a bug, it’s a bug, and the company could fix that.

Lex Fridman (01:25:42) Right. Then it’d be much easier to deal with the Black Nazi type of image generation, if there’s great examples.

Lex Fridman (01:25:49) So San Francisco is a bit of an ideological bubble, tech in general as well. Do you feel the pressure of that within a company, that there’s a lean towards the left politically, that affects the product, that affects the teams?

Sam Altman (01:26:06) I feel very lucky that we don’t have the challenges at OpenAI that I have heard of at a lot of companies, I think. I think part of it is every company’s got some ideological thing. We have one about AGI and belief in that, and it pushes out some others. We are much less caught up in the culture war than I’ve heard about in a lot of other companies. San Francisco’s a mess in all sorts of ways, of course.

Lex Fridman (01:26:33) So that doesn’t infiltrate OpenAI as-

Sam Altman (01:26:36) I’m sure it does in all sorts of subtle ways, but not in the obvious. I think we’ve had our flare-ups, for sure, like any company, but I don’t think we have anything like what I hear about happening at other companies here on this topic.

Lex Fridman (01:26:50) So what, in general, is the process for the bigger question of safety? How do you provide that layer that protects the model from doing crazy, dangerous things?

Sam Altman (01:27:00) I think there will come a point where that’s mostly what we think about, the whole company. And it’s not like you have one safety team. It’s like when we shipped GPT-4, that took the whole company thinking about all these different aspects and how they fit together. And I think it’s going to take that. More and more of the company thinks about those issues all the time.

Lex Fridman (01:27:21) That’s literally what humans will be thinking about, the more powerful AI becomes. So most of the employees at OpenAI will be thinking, “Safety,” or at least to some degree.

Sam Altman (01:27:31) Broadly defined. Yes.

Lex Fridman (01:27:33) Yeah. I wonder, what’s the full broad definition of that? What are the different harms that could be caused? Is this on a technical level or is this almost security threats?

Sam Altman (01:27:44) It could be all those things. Yeah, I was going to say it’ll be people, state actors trying to steal the model. It’ll be all of the technical alignment work. It’ll be societal impacts, economic impacts. It’s not just like we have one team thinking about how to align the model. Getting to the good outcome is really going to take the whole effort.

Lex Fridman (01:28:10) How hard do you think people, state actors, perhaps, are trying to, first of all, infiltrate OpenAI, but second of all, infiltrate unseen?

Lex Fridman (01:28:24) What kind of accent do they have?

Sam Altman (01:28:27) I don’t think I should go into any further details on this point.

Lex Fridman (01:28:29) Okay. But I presume it’ll be more and more and more as time goes on.

Sam Altman (01:28:35) That feels reasonable.

Leap to GPT-5

Lex Fridman (01:28:37) Boy, what a dangerous space. Sorry to linger on this, even though you can’t quite say details yet, but what aspects of the leap from GPT-4 to GPT-5 are you excited about?

Sam Altman (01:28:53) I’m excited about being smarter. And I know that sounds like a glib answer, but I think the really special thing happening is that it’s not like it gets better in this one area and worse at others. It’s getting better across the board. That’s, I think, super-cool.

Lex Fridman (01:29:07) Yeah, there’s this magical moment. I mean, you meet certain people, you hang out with people, and you talk to them. You can’t quite put a finger on it, but they get you. It’s not intelligence, really. It’s something else. And that’s probably how I would characterize the progress of GPT. It’s not like, yeah, you can point out, “Look, you didn’t get this or that,” but it’s just the degree to which there’s this intellectual connection. You feel like there’s an understanding of your crappily formulated prompts, that it grasps the deeper question behind the question you were asking. Yeah, I’m also excited by that. I mean, all of us love being heard and understood.

Lex Fridman (01:29:53) That’s a weird feeling. Even with programming, when you’re programming and you say something, or just the completion that GPT might do, it’s just such a good feeling when it got you, what you’re thinking about. And I look forward to it getting you even better. On the programming front, looking out into the future, how much programming do you think humans will be doing 5, 10 years from now?

Sam Altman (01:30:19) I mean, a lot, but I think it’ll be in a very different shape. Maybe some people will program entirely in natural language.

Lex Fridman (01:30:26) Entirely natural language?

Sam Altman (01:30:29) I mean, no one programs writing bytecode. Some people. No one programs with punch cards anymore. I’m sure you can find someone who does, but you know what I mean.

Lex Fridman (01:30:39) Yeah. You’re going to get a lot of angry comments. No. Yeah, there’s very few. I’ve been looking for people who program in Fortran. It’s hard to find even Fortran programmers. I hear you. But that changes the nature of the skillset or the predisposition for the kind of people we call programmers then.

Sam Altman (01:30:55) Changes the skillset. How much it changes the predisposition, I’m not sure.

Lex Fridman (01:30:59) Well, the same kind of puzzle solving, all that kind of stuff.

Lex Fridman (01:31:02) Programming is hard. It’s like, how do you get that last 1% to close the gap? How hard is that?

Sam Altman (01:31:09) Yeah, I think as with most other cases, the best practitioners of the craft will use multiple tools. And they’ll do some work in natural language, and when they need to go write C for something, they’ll do that.

Lex Fridman (01:31:20) Will we see humanoid robots or humanoid robot brains from OpenAI at some point?

Lex Fridman (01:31:29) How important is embodied AI to you?

Sam Altman (01:31:32) I think it’s depressing if we have AGI and the only way to get things done in the physical world is to make a human go do it. So I really hope that as part of this transition, as this phase change, we also get humanoid robots or some sort of physical world robots.

Lex Fridman (01:31:51) I mean, OpenAI has some history and quite a bit of history working in robotics, but it hasn’t quite done in terms of ethics-

Sam Altman (01:31:59) We’re a small company. We have to really focus. And also, robots were hard for the wrong reason at the time, but we will return to robots in some way at some point.

Lex Fridman (01:32:11) That sounds both inspiring and menacing.

Lex Fridman (01:32:15) Because immediately, we will return to robots. It’s like in Terminator-

Sam Altman (01:32:20) We will return to work on developing robots. We will not turn ourselves into robots, of course.

AGI

Lex Fridman (01:32:24) Yeah. When do you think we, you and we as humanity will build AGI?

Sam Altman (01:32:31) I used to love to speculate on that question. I have realized since that I think it’s very poorly formed, and that people use extremely different definitions for what AGI is. So I think it makes more sense to talk about when we’ll build systems that can do capability X or Y or Z, rather than when we fuzzily cross this one mile marker. AGI is also not an ending. It’s closer to a beginning, but it’s much more of a mile marker than either of those things. But what I would say, in the interest of not trying to dodge a question, is I expect that by the end of this decade and possibly somewhat sooner than that, we will have quite capable systems that we look at and say, “Wow, that’s really remarkable.” If we could look at it now. Maybe we’ve adjusted by the time we get there.

Lex Fridman (01:33:31) But if you look at ChatGPT, even 3.5, and you show that to Alan Turing, or not even Alan Turing, people in the ’90s, they would be like, “This is definitely AGI.” Well, not definitely, but there’s a lot of experts that would say, “This is AGI.”

Sam Altman (01:33:49) Yeah, but I don’t think 3.5 changed the world. It maybe changed the world’s expectations for the future, and that’s actually really important. And it did get more people to take this seriously and put us on this new trajectory. And that’s really important, too. So again, I don’t want to undersell it. I think I could retire after that accomplishment and be pretty happy with my career. But as an artifact, I don’t think we’re going to look back at that and say, “That was a threshold that really changed the world itself.”

Lex Fridman (01:34:20) So to you, you’re looking for some really major transition in how the world-

Sam Altman (01:34:24) For me, that’s part of what AGI implies.

Lex Fridman (01:34:29) Singularity-level transition?

Sam Altman (01:34:31) No, definitely not.

Lex Fridman (01:34:32) But just a major one, like the internet, like Google search did, I guess. What was the transition point, do you think, now?

Sam Altman (01:34:39) Does the global economy feel any different to you now or materially different to you now than it did before we launched GPT-4? I think you would say no.

Lex Fridman (01:34:47) No, no. It might be just a really nice tool for a lot of people to use. Will help you with a lot of stuff, but doesn’t feel different. And you’re saying that-

Sam Altman (01:34:55) I mean, again, people define AGI all sorts of different ways. So maybe you have a different definition than I do. But for me, I think that should be part of it.

Lex Fridman (01:35:02) There could be major theatrical moments, also. What to you would be an impressive thing AGI would do? You are alone in a room with the system.

Sam Altman (01:35:16) This is personally important to me. I don’t know if this is the right definition. I think when a system can significantly increase the rate of scientific discovery in the world, that’s a huge deal. I believe that most real economic growth comes from scientific and technological progress.

Lex Fridman (01:35:35) I agree with you, hence why I don’t like the skepticism about science in recent years.

Lex Fridman (01:35:43) But actual, measurable rate of scientific discovery. But even just seeing a system have really novel intuitions, scientific intuitions, even that would be just incredible.

Lex Fridman (01:36:02) You quite possibly would be the person to build the AGI and be able to interact with it before anyone else does. What kind of stuff would you talk about?

Sam Altman (01:36:09) I mean, definitely the researchers here will do that before I do. But well, I’ve actually thought a lot about this question. I think as we talked about earlier, I think this is a bad framework, but if someone were like, “Okay, Sam, we’re finished. Here’s a laptop, this is the AGI. You can go talk to it.” I find it surprisingly difficult to say what I would ask that I would expect that first AGI to be able to answer. That first one is not going to be the one which is like, I don’t think, “Go explain to me the grand unified theory of physics, the theory of everything for physics.” I’d love to ask that question. I’d love to know the answer to that question.

Lex Fridman (01:36:55) You can ask yes or no questions about “Does such a theory exist? Can it exist?”

Sam Altman (01:37:00) Well, then, those are the first questions I would ask.

Lex Fridman (01:37:02) Yes or no. And then based on that, “Are there other alien civilizations out there? Yes or no? What’s your intuition?” And then you just ask that.

Sam Altman (01:37:10) Yeah, I mean, well, so I don’t expect that this first AGI could answer any of those questions even as yes or nos. But if it could, those would be very high on my list.

Lex Fridman (01:37:20) Maybe you can start assigning probabilities?

Sam Altman (01:37:22) Maybe. Maybe we need to go invent more technology and measure more things first.

Lex Fridman (01:37:28) Oh, I see. It just doesn’t have enough data. It’s just if it keeps-

Sam Altman (01:37:31) I mean, maybe it says, “You want to know the answer to this question about physics, I need you to build this machine and make these five measurements, and tell me that.”

Lex Fridman (01:37:39) Yeah, “What the hell do you want from me? I need the machine first, and I’ll help you deal with the data from that machine.” Maybe it’ll help you build a machine.

Lex Fridman (01:37:49) And on the mathematical side, maybe prove some things. Are you interested in that side of things, too? The formalized exploration of ideas?

Lex Fridman (01:37:59) Whoever builds AGI first gets a lot of power. Do you trust yourself with that much power?

Sam Altman (01:38:14) Look, I’ll just be very honest with this answer. I was going to say, and I still believe this, that it is important that I nor any other one person have total control over OpenAI or over AGI. And I think you want a robust governance system. I can point out a whole bunch of things about all of our board drama from last year about how I didn’t fight it initially, and was just like, “Yeah. That’s the will of the board, even though I think it’s a really bad decision.” And then later, I clearly did fight it, and I can explain the nuance and why I think it was okay for me to fight it later. But as many people have observed, although the board had the legal ability to fire me, in practice, it didn’t quite work. And that is its own kind of governance failure.

(01:39:24) Now again, I feel like I can completely defend the specifics here, and I think most people would agree with that, but it does make it harder for me to look you in the eye and say, “Hey, the board can just fire me.” I continue to not want super-voting control over OpenAI. I never have. Never have had it, never wanted it. Even after all this craziness, I still don’t want it. I continue to think that no company should be making these decisions, and that we really need governments to put rules of the road in place.

(01:40:12) And I realize that that means people like Marc Andreessen or whatever will claim I’m going for regulatory capture, and I’m just willing to be misunderstood there. It’s not true. And I think in the fullness of time, it’ll get proven out why this is important. But I think I have made plenty of bad decisions for OpenAI along the way, and a lot of good ones, and I’m proud of the track record overall. But I don’t think any one person should, and I don’t think any one person will. I think it’s just too big of a thing now, and it’s happening throughout society in a good and healthy way. But I don’t think any one person should be in control of an AGI, or this whole movement towards AGI. And I don’t think that’s what’s happening.

Lex Fridman (01:41:00) Thank you for saying that. That was really powerful, and that was really insightful that this idea that the board can fire you is legally true. But human beings can manipulate the masses into overriding the board and so on. But I think there’s also a much more positive version of that, where the people still have power, so the board can’t be too powerful, either. There’s a balance of power in all of this.

Sam Altman (01:41:29) Balance of power is a good thing, for sure.

Lex Fridman (01:41:34) Are you afraid of losing control of the AGI itself? There are a lot of people who are worried about existential risk, not because of state actors, not because of security concerns, but because of the AI itself.

Sam Altman (01:41:45) That is not my top worry as I currently see things. There have been times I worried about that more. There may be times again in the future where that’s my top worry. It’s not my top worry right now.

Lex Fridman (01:41:53) What’s your intuition about it not being your worry? Because there’s a lot of other stuff to worry about, essentially? You think you could be surprised? We could be surprised?

Sam Altman (01:42:03) Of course. Saying it’s not my top worry doesn’t mean I don’t think we need to. I think we need to work on it. It’s super hard, and we have great people here who do work on that. I think there’s a lot of other things we also have to get right.

Lex Fridman (01:42:15) To you, it’s not super-easy to escape the box at this time, connect to the internet-

Sam Altman (01:42:21) We talked about theatrical risks earlier. That’s a theatrical risk. That is a thing that can really take over how people think about this problem. And there’s a big group of very smart, I think very well-meaning AI safety researchers that got super-hung up on this one problem, I’d argue without much progress, but super-hung up on this one problem. I’m actually happy that they do that, because I think we do need to think about this more. But I think it pushed out of the space of discourse a lot of the other very significant AI-related risks.

Lex Fridman (01:43:01) Let me ask you about you tweeting with no capitalization. Is the shift key broken on your keyboard?

Sam Altman (01:43:07) Why does anyone care about that?

Sam Altman (01:43:10) But why? I mean, other people ask me about that, too. Any intuition?

Lex Fridman (01:43:17) I think it’s the same reason. There’s this poet, E.E. Cummings, that mostly doesn’t use capitalization to say, “Fuck you” to the system kind of thing. And I think people are very paranoid, because they want you to follow the rules.

Sam Altman (01:43:29) You think that’s what it’s about?

Lex Fridman (01:43:30) I think it’s like this-

Sam Altman (01:43:33) It’s like, “This guy doesn’t follow the rules. He doesn’t capitalize his tweets.”

Sam Altman (01:43:36) “This seems really dangerous.”

Lex Fridman (01:43:37) “He seems like an anarchist.”

Lex Fridman (01:43:40) Are you just being poetic, hipster? What’s the-

Lex Fridman (01:43:44) Follow the rules, Sam.

Sam Altman (01:43:45) I grew up as a very online kid. I’d spent a huge amount of time chatting with people back in the days where you did it on a computer, and you could log off instant messenger at some point. And I never capitalized there, as I think most internet kids didn’t, or maybe they still don’t. I don’t know. And actually, now I’m really trying to reach for something, but I think capitalization has gone down over time. If you read Old English writing, they capitalized a lot of random words in the middle of sentences, nouns and stuff that we just don’t do anymore. I personally think it’s sort of a dumb construct that we capitalize the letter at the beginning of a sentence and of certain names and whatever, but that’s fine.

(01:44:33) And then I used to, I think, even capitalize my tweets because I was trying to sound professional or something. I haven’t capitalized my private DMs or whatever in a long time. And then slowly, stuff like shorter-form, less formal stuff has slowly drifted to closer and closer to how I would text my friends. If I pull up a Word document and I’m writing a strategy memo for the company or something, I always capitalize that. If I’m writing a long, more formal message, I always use capitalization there, too. So I still remember how to do it. But even that may fade out. I don’t know. But I never spend time thinking about this, so I don’t have a ready-made-

Lex Fridman (01:45:23) Well, it’s interesting. It’s good to, first of all, know the shift key is not broken.

Lex Fridman (01:45:27) I was mostly concerned about your-

Lex Fridman (01:45:29) … well-being on that front.

Sam Altman (01:45:30) I wonder if people still capitalize their Google searches, or their ChatGPT queries. If you’re writing something just to yourself, do some people still bother to capitalize?

Lex Fridman (01:45:40) Probably not. But yeah, there’s a percentage, but it’s a small one.

Sam Altman (01:45:44) The thing that would make me do it is if people were like, “It’s a sign of…” Because I’m sure I could force myself to use capital letters, obviously. If it felt like a sign of respect to people or something, then I could go do it. But I don’t know. I don’t think about this.

Lex Fridman (01:46:01) I don’t think there’s a disrespect, but I think it’s just the conventions of civility that have a momentum, and then you realize it’s not actually important for civility if it’s not a sign of respect or disrespect. But I think there’s a movement of people that just want you to have a philosophy around it so they can let go of this whole capitalization thing.

Sam Altman (01:46:19) I don’t think anybody else thinks about this as much. I mean, maybe some people. I know some people-

Lex Fridman (01:46:22) People think about it every day for many hours a day. So I’m really grateful we clarified it.

Sam Altman (01:46:28) Can’t be the only person that doesn’t capitalize tweets.

Lex Fridman (01:46:30) You’re the only CEO of a company that doesn’t capitalize tweets.

Sam Altman (01:46:34) I don’t even think that’s true, but maybe. I’d be very surprised.

Lex Fridman (01:46:37) All right. We’ll investigate further and return to this topic later. Given Sora’s ability to generate simulated worlds, let me ask you a pothead question. Does this increase your belief, if you ever had one, that we live in a simulation, maybe a simulated world generated by an AI system?

Sam Altman (01:47:05) Somewhat. I don’t think that’s the strongest piece of evidence. I think the fact that we can generate worlds should increase everyone’s probability somewhat, or at least openness to it somewhat. But I was certain we would be able to do something like Sora at some point. It happened faster than I thought, but I guess that was not a big update.

Lex Fridman (01:47:34) Yeah. But the fact that… And presumably, it’ll get better and better and better… You can generate worlds that are novel, they’re based in some aspect of training data, but when you look at them, they’re novel, that makes you think how easy it is to do this thing. How easy it is to create universes, entire video game worlds that seem ultra-realistic and photo-realistic. And then how easy is it to get lost in that world, first with a VR headset, and then on the physics-based level?

Sam Altman (01:48:10) Someone said to me recently, I thought it was a super-profound insight, that there are these very-simple sounding but very psychedelic insights that exist sometimes. So the square root function, square root of four, no problem. Square root of two, okay, now I have to think about this new kind of number. But once I come up with this easy idea of a square root function that you can explain to a child and exists by even looking at some simple geometry, then you can ask the question of “What is the square root of negative one?” And this is why it’s a psychedelic thing. That tips you into some whole other kind of reality.

(01:49:07) And you can come up with lots of other examples, but I think this idea that the lowly square root operator can offer such a profound insight and a new realm of knowledge applies in a lot of ways. And I think there are a lot of those operators for why people may think that any version that they like of the simulation hypothesis is maybe more likely than they thought before. But for me, the fact that Sora worked is not in the top five.

Lex Fridman (01:49:46) I do think, broadly speaking, AI will serve as those kinds of gateways at its best, simple, psychedelic-like gateways to another way to see reality.

Sam Altman (01:49:57) That seems for certain.

Lex Fridman (01:49:59) That’s pretty exciting. I haven’t done ayahuasca before, but I will soon. I’m going to the aforementioned Amazon jungle in a few weeks.

Lex Fridman (01:50:08) Yeah, I’m excited for it. Not the ayahuasca part, but that’s great, whatever. But I’m going to spend several weeks in the jungle, deep in the jungle. And it’s exciting, but it’s terrifying.

Sam Altman (01:50:17) I’m excited for you.

Lex Fridman (01:50:18) There’s a lot of things that can eat you there, and kill you and poison you, but it’s also nature, and it’s the machine of nature. And you can’t help but appreciate the machinery of nature in the Amazon jungle. It’s just like this system that just exists and renews itself every second, every minute, every hour. It’s the machine. It makes you appreciate this thing we have here, this human thing came from somewhere. This evolutionary machine has created that, and it’s most clearly on display in the jungle. So hopefully, I’ll make it out alive. If not, this will be the last fun conversation we’ve had, so I really deeply appreciate it. Do you think, as I mentioned before, there’s other alien civilizations out there, intelligent ones, when you look up at the skies?

Aliens

Sam Altman (01:51:17) I deeply want to believe that the answer is yes. I find the Fermi paradox very puzzling.

Lex Fridman (01:51:28) I find it scary that intelligence is not good at handling-

Lex Fridman (01:51:34) … powerful technologies. But at the same time, I think I’m pretty confident that there’s just a very large number of intelligent alien civilizations out there. It might just be really difficult to travel through space.

Lex Fridman (01:51:50) And it also makes me think about the nature of intelligence. Maybe we’re really blind to what intelligence looks like, and maybe AI will help us see that. It’s not as simple as IQ tests and simple puzzle solving. There’s something bigger. What gives you hope about the future of humanity, this thing we’ve got going on, this human civilization?

Sam Altman (01:52:12) I think the past is a lot. I mean, we just look at what humanity has done in a not very long period of time, huge problems, deep flaws, lots to be super-ashamed of. But on the whole, very inspiring. Gives me a lot of hope.

Lex Fridman (01:52:29) Just the trajectory of it all.

Lex Fridman (01:52:31) That we’re together pushing towards a better future.

Sam Altman (01:52:40) One thing that I wonder about, is AGI going to be more like some single brain, or is it more like the scaffolding in society between all of us? You have not had a great deal of genetic drift from your great-great-great grandparents, and yet what you’re capable of is dramatically different. What you know is dramatically different. And that’s not because of biological change. I mean, you got a little bit healthier, probably. You have modern medicine, you eat better, whatever. But what you have is this scaffolding that we all contributed to built on top of. No one person is going to go build the iPhone. No one person is going to go discover all of science, and yet you get to use it. And that gives you incredible ability. And so in some sense, that we all created that, and that fills me with hope for the future. That was a very collective thing.

Lex Fridman (01:53:40) Yeah, we really are standing on the shoulders of giants. You mentioned when we were talking about theatrical, dramatic AI risks that sometimes you might be afraid for your own life. Do you think about your death? Are you afraid of it?

Sam Altman (01:53:58) I mean, if I got shot tomorrow and I knew it today, I’d be like, “Oh, that’s sad. I want to see what’s going to happen. What a curious time. What an interesting time.” But I would mostly just feel very grateful for my life.

Lex Fridman (01:54:15) The moments that you did get. Yeah, me, too. It’s a pretty awesome life. I get to enjoy awesome creations of humans, which I believe ChatGPT is one of, and everything that OpenAI is doing. Sam, it’s really an honor and pleasure to talk to you again.

Sam Altman (01:54:35) Great to talk to you. Thank you for having me.

Lex Fridman (01:54:38) Thanks for listening to this conversation with Sam Altman. To support this podcast, please check out our sponsors in the description. And now let me leave you with some words from Arthur C. Clarke. “It may be that our role on this planet is not to worship God, but to create him.” Thank you for listening, and hope to see you next time.

杰夫·贝佐斯:亚马逊与蓝色起源 (2023-12-14)

Jeff Bezos: Amazon and Blue Origin (2023-12-14)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:亚马逊与蓝色起源创始人杰夫·贝索斯(Jeff Bezos)接受了 Lex Fridman 的首次长篇深度访谈,正值他卸任亚马逊 CEO 后,将主要精力投入到蓝色起源,旨在加速其发展,这为我们提供了一个审视其统一世界观——从地球电商到太空基建——的绝佳窗口。

  • 核心论点:本次对话的核心论题是“构建下一代基础设施以释放未来创造力”。贝索斯的世界观一以贯之:无论是创建亚马逊,还是投身蓝色起源,其本质都是在进行一场长达数十年的“修路”工程。他将自己在亚马逊的成功归因于能够站在现有基础设施(信用卡支付、邮政系统、电话网络)的肩膀上,从而极大地降低了创业门槛。如今,他正动用“亚马逊的奖金”为太空时代做同样的事——打造廉价、可靠的太空重型基础设施(如 New Glenn 火箭),目标是使下一代太空创业者能够像今天的互联网创业者一样,在宿舍里就能开启伟大的事业。这一宏大愿景背后,是他从祖父农场生活中习得的第一性原理问题解决能力、在亚马逊实践出的“Day One”组织运营系统,以及对人类文明必须迈向太空以实现持续增长和保护地球的深刻信念。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:太空基建的“亚马逊模式”——降本是唯一核心问题

  • 核心观点:进入太空在技术上是一个已解决的问题,唯一“有趣”且有价值的挑战是“戏剧性地降低进入轨道的成本”。贝索斯的目标是将他在亚马逊的“战利品”转化为能够造福后代的太空重型基础设施。
  • 原理解构:这个观点借鉴了互联网发展的历史。互联网之所以能爆发出巨大活力,是因为创业者不必自建支付系统(信用卡)、物流网络(邮政)或通信骨干网(电话线)。这些基础设施的存在,使得创新门槛被极大降低。贝索斯认为,太空经济的爆发同样需要一个前提:廉价可靠的“太空之路”。蓝色起源的角色不是成为唯一的太空公司,而是成为太空时代的“AWS”和“联邦快递”,为无数未来的太空企业提供基础服务,最终实现“两个学生在宿舍里就能创办一家有价值的太空公司”的场景。
  • 证据/案例
    • 亚马逊类比:贝索斯明确指出,1994年创立亚马逊时,他无需投资数十亿美元建设支付和物流网络,这使得创业成为可能。
    • Blue Ring 平台:该航天器被描述为“太空中的 AWS”,为载荷提供计算、通信、热管理和轨道转移等一系列“API 化”服务,让载荷设计者不必重复造轮子。
    • New Glenn 火箭:其设计的核心驱动力就是通过可复用的第一级和可负担的消耗性第二级,实现成本的大幅降低,为后续的月球基地和 O’Neill 殖民地愿景铺路。

维度二:“Day One”心智模型——对抗组织熵增的操作系统

  • 核心观点:“Day Two”是停滞、无关紧要、痛苦衰退直至死亡的代名词。因此,组织必须永远保持在“Day One”状态——即以创业第一天的心态运营
  • 原理解构:这是一种对抗大型组织官僚化和僵化的文化操作系统。其核心支柱包括:
    1. 客户至上 (Customer Obsession):不是关注竞争对手,而是从客户需求出发逆向工作。
    2. 警惕代理指标 (Skeptical view of proxies):指标是现实的“代理”,但不是现实本身。当数据和客户的真实反馈(anecdotes)冲突时,要相信反馈,并检查指标是否已失效。
    3. 拥抱外部趋势 (Eager adoption of external trends):对如 AI 这样的颠覆性技术保持高度敏感并迅速采纳。
    4. 高速决策 (High-velocity decision-making):这是保持活力的关键。
  • 证据/案例
    • 客服电话事件:贝索斯在会议上当场拨打亚马逊客服电话,以超过十分钟的等待时间戳穿了“客户平均等待时间小于60秒”这一代理指标的虚假性,证明了“当数据与轶事冲突时,轶事通常是对的”。
    • 六页备忘录会议:用结构化叙事文档代替 PPT,强迫作者深度思考、逻辑清晰,并通过会前“学习时间”确保所有与会者在同一认知水平上,从而实现高质量、高效率的“混乱”讨论。

维度三:决策科学——区分“单向门”与“双向门”

  • 核心观点:企业最大的瓶颈在于决策速度。决策效率低下的根源在于用同一种重量级流程处理所有决策。必须对决策进行分类管理。
  • 原理解构
    • 单向门 (One-way doors):指那些后果严重且几乎不可逆的决策。对于这类决策,必须放慢速度,谨慎分析,甚至由最高层管理者扮演“首席减速官”的角色。
    • 双向门 (Two-way doors):指那些可逆的、即使错了也容易纠正的决策。这类决策应被下放到组织深处的个人或小团队,快速做出。
    • “我不同意,但承诺执行” (Disagree and commit):这是一种高效的冲突解决机制,避免了因意见不合导致的僵局。当决策者(通常是上级)听取了不同意见后,即使不完全认同,也会选择支持下属的方案并全力帮助其成功,而不是事后诸葛亮或暗中掣肘。
  • 证据/案例
    • 火箭推进剂选择:为 New Glenn 选择液化天然气(LNG)作为第一级推进剂,液氢作为第二级,这是一个典型的“单向门”决策,一旦做出,改变的代价极高。
    • 亚马逊 Prime:推出 Prime 会员服务也是一个重大的“单向门”决策,因为它深刻地改变了公司的商业模式和客户承诺。
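
为直观起见,下面用一小段 Python 把上述“单向门/双向门”的分流逻辑写成示意代码。其中的字段、判断规则与输出措辞均为本文虚构的假设,并非亚马逊的真实流程,仅用来说明“可逆性决定决策路由”这一思路:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    name: str
    reversible: bool     # 事后能否以较低代价撤回
    high_impact: bool    # 假设性的影响面评估

def route(d: Decision) -> str:
    """示意性的决策分流规则(虚构):单向门上收放慢,双向门下放加速。"""
    if not d.reversible and d.high_impact:
        return f"{d.name}:单向门 → 上收到高层,放慢节奏、充分论证"
    return f"{d.name}:双向门 → 下放给一线小团队,快速决定,错了就退回"

print(route(Decision("选择第一级推进剂", reversible=False, high_impact=True)))
print(route(Decision("调整结算页文案", reversible=True, high_impact=False)))
```

真实组织中“可逆性”与“影响面”的判断远比两个布尔值复杂,这里只保留骨架,用以对照“用同一种重量级流程处理所有决策”的反面模式。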

维度四:火箭的物理与制造悖论——“喜欢大”与“讨厌大”

  • 核心观点:从物理学角度看,火箭“喜欢”变大;但从制造业角度看,火箭“讨厌”变大。这一核心矛盾是大型运载火箭工程的根本挑战。
  • 原理解构
    • 物理优势:随着火箭尺寸增大,寄生质量(如航空电子设备)占总质量的比例会急剧下降,效率提升。同时,涡轮泵等旋转机械的尺寸越大,其运行效率也越高,因为间隙导致的泄漏占比更小。
    • 制造劣势:大型结构的制造、运输和装配是巨大的挑战。需要重型起重机、庞大的工装夹具和纪念碑式的土木工程(如发射台地基)。“规模生产” (rate manufacturing) 的难度不亚于火箭本身的设计。
  • 证据/案例
    • New Glenn 工厂:贝索斯描述了其巨大的尺寸和所需的纪念碑式基础设施,如在佛罗里达沼泽地深打150英尺的地桩来支撑发射台。
    • 生产速率挑战:贝索斯强调,造出第一枚火箭不难,难的是建立一个能以“每月两枚”的速度高效生产第二级的工厂系统。这才是蓝色起源当前面临的核心挑战。

维度五:生成式 AI 的本质——作为“发现”而非“发明”

  • 核心观点:大型语言模型(LLMs)更像是科学发现(Discovery),而非工程发明(Invention)。
  • 原理解构:我们对待发明(如 787 飞机)的态度是掌控和预测,不希望有任何意外。而对待发现(如伽利略通过望远镜发现木星的卫星),我们则不断被其涌现出的新能力所“惊喜”。LLMs 的能力边界是未知的,我们无法像设计传统软件那样精确预测其行为。这暗示了其巨大的潜力和不确定性。
  • 证据/案例
    • 望远镜类比:望远镜是发明,但通过它看到木卫四是发现。LLM 的模型架构是发明,但其强大的推理和生成能力是发现。
    • 人脑效率对比:人脑以约 20 瓦的功率完成复杂任务,而 AI 则需数千瓦,这表明我们尚未“发现”人脑所使用的某些高效算法“技巧”。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识“当数据和轶事(anecdotes)不一致时,轶事通常是对的。” 在一个以数据驱动闻名的公司,其创始人却极度重视定性的、来自个体的真实反馈。这挑战了“唯数据论”的观点,认为数据测量本身可能存在缺陷,而客户的真实抱怨才是更接近“真理”的信号。
  • 盲点与局限
    • 对“妥协”的批判:贝索斯指出,妥协 (Compromise) 是一种低能耗、但偏离真相的决策方式。比如两个人对天花板高度有争议,取中间值(11.5英尺)是妥协,而拿卷尺去测量才是寻求真相。他同样批判了“比谁更顽固”(war of attrition)的决策模式。这揭示了许多组织内部低效的根源。
    • 人类的社交本能 vs. 求真本能:贝索斯深刻地指出,“我们不是求真动物,我们是社交动物”。在组织中,讲真话(尤其是令人不舒服的真话)需要消耗巨大能量且有社交风险。因此,一个高绩效组织必须刻意设计支持讲真话的机制(如让他最后一个发言)。
  • 未解之谜
    • 火箭第二级的终极形态:贝索斯坦诚,对于火箭第二级,是追求极致的可复用性,还是追求极致的低成本消耗,目前并没有明确的答案。这是一个悬而未决的工程与经济权衡问题。
    • AI 与人脑的效率鸿沟:尽管 LLM 取得了惊人成就,但其与人脑在能耗和数据需求上的巨大差距(千瓦 vs 20瓦,数十亿英里数据 vs 16岁新手司机)表明,AI 领域仍有根本性的“技巧”尚未被发现。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “I have come to use the word impossible with great caution.”

    • 中文意译:“我已经学会非常谨慎地使用‘不可能’这个词了。”
    • 语境:在谈论阿波罗登月计划如何将曾经被用作“不可能”代名词的事情变为现实时,引用冯·布劳恩的名言。
  2. “Efficiency and invention are sort of at odds… real invention… requires wandering.”

    • 中文意译:“效率和创造在某种程度上是相互矛盾的……真正的创造……需要‘闲逛’(思想上的漫游)。”
    • 语境:解释他自己的创新思维过程,强调真正的突破来自于非线性的、允许探索和失败的“漫游”,而非高效的直线前进。
  3. “When the data and the anecdotes disagree, the anecdotes are usually right.”

    • 中文意译:“当数据和客户的零散反馈不一致时,那些反馈通常是对的。”
    • 语境:在解释为何要警惕代理指标时,讲述他如何通过亲自致电客服来验证数据,强调真实用户体验的重要性。
  4. “We are not really truth-seeking animals. We are social animals… any high performing organization has to have mechanisms and a culture that supports truth-telling.”

    • 中文意译:“我们本质上不是追求真理的动物,而是社交动物……任何一个高绩效组织都必须拥有支持讲真话的机制和文化。”
    • 语境:论述在企业中建立求真文化的难度与必要性,因为讲真话违背了人类为了生存而形成的“随大流”的社交本能。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 蓝色起源:在贝索斯亲自督战下,将以“世界上最果断的公司”为目标,显著提速。New Glenn 的首飞(2024年目标)将成为关键里程碑。公司重心将从设计转向**“规模化生产”**,解决供应链和制造流程瓶颈。
    • 亚马逊Alexa 将迎来基于生成式 AI 的重大升级,变得“智能得多”,可能重塑语音助手市场格局。AWS 将凭借 Bedrock 在企业级 AI 服务市场占据重要地位,主打数据安全和模型定制。
    • 太空产业竞争:竞争的核心将更加聚焦于单位载荷的发射成本和发射频率。贝索斯认为太空足够大,可以容纳多个赢家(SpaceX, Blue Origin 等),竞争并非零和游戏,而是共同做大蛋糕。
  • 长期终局 (5-10年)

    • 如果贝索斯的设想成真,我们将见证一个太空经济的“寒武纪大爆发”。廉价的轨道运输将催生大量新商业模式:太空制造、太空能源、小行星采矿、轨道数据中心等。
    • 地球的角色将发生转变。重工业和能源密集型产业将逐步迁移至太空,利用近乎无限的太阳能和地外资源。地球将被“分区保护”,成为一个类似于“黄石国家公园”的、以居住、科研和轻工业为主的美丽家园。
    • O’Neill 式空间站将成为人类在太阳系扩张的主要居住形式,而非改造行星表面。这些可以提供标准地球重力、环境可控的人造世界,将容纳数以万亿计的人口,极大释放人类的智力与创造力潜能。
  • 行动建议

    • 开发者/创业者:开始思考如何在“基础设施层”之上构建应用。短期内,是在 AWS Bedrock 等平台上开发企业级 AI 应用;长期看,是构思基于 Blue Ring 这样在轨服务平台的商业模式,为太空经济提供“软件”和“服务”。
    • 投资者:长期投资应关注太空经济的“镐头与铲子”。这意味着投资于发射服务、在轨运输、能源、通信等基础建设领域,这些是未来所有太空应用赖以生存的基石。
    • 企业领导者:应系统性地学习并实践贝索斯的“Day One”决策与组织框架。特别是“六页备忘录”、“单双向门决策”和“警惕代理指标”等方法论,对于任何希望在快速变化时代保持敏捷和创新能力的企业都极具参考价值。

这是一份基于杰夫·贝佐斯(Jeff Bezos)与莱克斯·弗里德曼(Lex Fridman)对话深度重构的行业分析报告。


深度研报:从零售帝国到星际基建——贝佐斯的认知模型与终局思维

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:这是亚马逊创始人、蓝色起源(Blue Origin)掌舵人杰夫·贝佐斯首次进行如此深度的长篇对话。背景是他已卸任亚马逊 CEO,全面回归其“第一志愿”——航天事业。
  • 核心论点:贝佐斯的世界观建立在“通过降低准入门槛来释放人类潜能”的逻辑链条上。他认为航天的核心不在于探索,而在于基建化。正如互联网的成功依赖于已有的快递和支付系统,航天的未来取决于能否建立低成本的轨道运输和地外资源利用系统,从而支撑起一个拥有万亿人口、“千名爱因斯坦”并存的太阳系文明,同时将重工业移出地球,实现母星的生态保护。

2. 🧠 深度观点解析 (Deep Dive Analysis)

A. 航天领域的“AWS模式”:Blue Ring 与空间物流

  • 核心观点:航天不应是“精英科研”,而应是“API化”的商业服务。
  • 原理解构:贝佐斯提出了 Blue Ring 项目,其本质是空间的物流和计算节点。它不仅提供轨道转移(化学和电力双推进),还提供热管理、计算和电力支持。
  • 证据/案例:贝佐斯将其类比为 Amazon Web Services (AWS)。创业者不再需要研发抗辐射计算系统或复杂的动力模块,只需像调用 API 一样接入 Blue Ring 提供的基础设施。这种“重基建、轻应用”的模式旨在让未来的“宿舍创业者”也能开启航天公司。
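
为了更直观地说明“把航天能力 API 化”的类比,下面给出一个极简的 Python 示意。接口名、字段与判断逻辑均为本文虚构,并非 Blue Ring 的真实接口或规格(只有“约 3,000 公斤载荷上限”取自访谈中的描述),用意只是展示“载荷方声明需求、平台方代管热控/电力/计算/通信/变轨”的架构思路:

```python
from dataclasses import dataclass

@dataclass
class PayloadRequest:
    mass_kg: float        # 载荷质量
    power_w: float        # 所需持续供电
    needs_compute: bool   # 是否需要平台提供抗辐射计算
    target_orbit: str     # 目标轨道,例如 "GEO"

class HostedPlatform:
    """虚构的“太空服务平台”:统一提供热控、电力、计算、通信与轨道转移。"""

    MAX_PAYLOAD_KG = 3000  # 访谈中提到的量级,此处仅作示意

    def accept(self, req: PayloadRequest) -> bool:
        # 载荷方只声明需求,平台方负责校验与承载
        return req.mass_kg <= self.MAX_PAYLOAD_KG

    def plan_transfer(self, req: PayloadRequest, hurry: bool) -> str:
        # 示意:化学推进快但耗燃料,电推进省燃料但需要上百天
        mode = "chemical (days)" if hurry else "electric (~100+ days)"
        return f"transfer {req.mass_kg:.0f} kg to {req.target_orbit} via {mode}"

platform = HostedPlatform()
req = PayloadRequest(mass_kg=1200, power_w=500, needs_compute=True, target_orbit="GEO")
if platform.accept(req):
    print(platform.plan_transfer(req, hurry=False))
```

这种“重基建、轻应用”的分工正对应文中“宿舍创业者”的愿景:应用方只写业务逻辑,重资产成本由平台方摊销。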

B. 决策熵减:单向门与双向门理论 (Decision Reversibility)

  • 核心观点:决策效率取决于对“可逆性”的判断。
  • 原理解构
    • 单向门 (One-way door):不可逆、后果重大的决策(如更换火箭燃料类型)。需要缓慢、审慎、由高层集中决策。
    • 双向门 (Two-way door):可逆决策。应当下放到一线团队,快速执行,错了就退回来。
  • 证据/案例:贝佐斯指出,大型企业最容易犯的错误是“用管理单向门的重量级流程去处理双向门决策”,这会导致严重的效率衰退(Day 2 状态)。

C. 数据与轶事的冲突:寻找真相的锚点

  • 核心观点:当实验数据与客户反馈(轶事)发生冲突时,数据往往是错的。
  • 原理解构:这并非贬低数据,而是警惕代理指标 (Proxies) 的失灵。指标是真实情况的近似值,随着环境漂移,指标会失效,但惯性会让员工继续盲从指标(即“代理指标崇拜”)。
  • 证据/案例:贝佐斯在会议现场拨打 1-800 客服电话。数据报告显示等待时间少于 60 秒,但现场实测超过 10 分钟。这证明了测量方式或指标本身已背离真相。
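
把“数据与轶事冲突时,先怀疑指标”落到工程上,通常意味着定期用独立实测去校验仪表盘。下面是一个极简的 Python 草稿,函数名、阈值与样本数据均为本文虚构,仅示意“抽样实测 vs 报表指标”的对账方式:

```python
import statistics

def proxy_looks_healthy(reported_wait_s: float,
                        sampled_wait_s: list[float],
                        tolerance: float = 2.0) -> bool:
    """若独立抽样的中位数远超报表值,则判定代理指标可能已经失真。"""
    observed = statistics.median(sampled_wait_s)
    return observed <= reported_wait_s * tolerance

dashboard_wait = 60.0                         # 报表声称:平均等待 < 60 秒
field_samples = [540.0, 660.0, 480.0, 720.0]  # 独立实测(虚构数据),单位:秒

if not proxy_looks_healthy(dashboard_wait, field_samples):
    print("代理指标与实测脱节:先审计测量链路,再讨论业务结论")
```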

D. 制造作为核心壁垒:从一到无限 (Rate Manufacturing)

  • 核心观点:制造出第一枚火箭不难,难的是建立“生产率” (Rate Manufacturing)。
  • 原理解构:贝佐斯认为制造大型结构件(如 New Glenn 的整流罩)是反物理直觉的,需要极其昂贵的定制工具(如摩擦搅拌焊接技术、碳纤维铺带机)。
  • 证据/案例:New Glenn 的设计目标是每年发射 24 次。这意味着每两周就要生产一个 expendable(消耗型)二级火箭,每周要生产一台 BE-3U 引擎。这种“工业化规模”的系统构建难度远超设计火箭本身。
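
“每年发射 24 次”反推产能节拍的算术很直接,下面用几行 Python 按访谈口径复算一遍(不含良率、备件与产线爬坡等现实因素,仅为量级核对):

```python
launches_per_year = 24   # 访谈中提到的目标发射频率
engines_per_stage = 2    # 每个消耗型二级装两台 BE-3U

upper_stages_per_month = launches_per_year / 12
engines_per_week = launches_per_year * engines_per_stage / 52

print(f"每月需产出二级:{upper_stages_per_month:.1f} 个(约每两周一个)")
print(f"每周需产出 BE-3U:{engines_per_week:.2f} 台(约每周一台)")
```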

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 人工智能是“发现”而非“发明”:贝佐斯认为大模型(LLM)更像伽利略眼中的木星卫星。我们并没有像设计 787 客机那样精确“设计”出 LLM 的所有能力,而是在不断“发现”它能做什么。这暗示了 AI 的不可预测性和非确定性。
  • 妥协是真理的敌人:他强烈反对“折中/妥协 (Compromise)”作为冲突解决方案。妥协是低能量的逃避,它并不导向真理。他提倡“不一致但执行 (Disagree and Commit)”,或者是通过“不断上报 (Escalation)”直到找到真理,而非为了社交和谐而选择中间值。
  • 人类并非天生的真理追寻者:贝佐斯直言,人类进化是为了社交协作(不被部落踢走),而非寻找真相。在企业中,说出令上级不悦的真理是违反人性的,因此必须建立一套“非人性”的制度(如 6 页纸备忘录、末位发言制)来强制提取真理。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “I have come to use the word ‘impossible’ with great caution.”
    • (我学会了对“不可能”这个词保持极大的审慎。) —— 引用冯·布劳恩,语境:讨论阿波罗计划如何在资源倾斜下将未来强行拉到当下。
  2. “When the data and the anecdotes disagree, the anecdotes are usually right.”
    • (当数据与轶事不符时,轶事通常是正确的。) —— 语境:警惕管理中的代理指标陷阱。
  3. “You don’t go to heaven when you die. You go to heaven when you’re born.”
    • (你死后不会去天堂,当你出生在地球时你就已经在天堂了。) —— 引用宇航员 Jim Lovell,语境:描述“概览效应”以及对地球生态脆弱性的极致认知。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • New Glenn 的首飞 (2024):这将改变重型猎鹰(Falcon Heavy)独大的局面,引入竞争并进一步压低轨道运输成本。
    • 空间基建化:随着 Blue Ring 和月球着陆器(Mark 1)的成熟,航天器的功能将模块化,卫星制造商将从“造车”模式转变为“搭积木”模式。
  • 长期终局 (5-10年及以后)
    • 月球工业化:利用月球土壤(Regolith)制造氧气和太阳能电池板将成为可能,月球将成为人类进入深空的“加油站”。
    • 长线思维 (10,000-Year Clock):贝佐斯试图通过万年钟项目,将人类的认知带宽从季度/财年,拉长到文明尺度。如果成功,人类决策将更具环境韧性。
  • 行动建议
    • 对管理者:废除 PowerPoint,强制推行叙事性文档(6-page memo)。文档能过滤模糊思考,PPT 则容易掩盖逻辑缺陷。
    • 对创业者:寻找那些“10年后依然不会变”的需求(如更低的价格、更快的速度),并在这些确定性上投入资源。
    • 对技术决策者:区分单向门和双向门。如果你能在不造成灾难的情况下撤回决定,那么请立刻执行,不要浪费时间开会。

分析师点评:贝佐斯正在经历从“零售巨头”到“文明架构师”的身份转变。他将亚马逊积累的“Day 1”哲学(客户至上、基建先行、决策脱敏)降维打击式地应用在航天业。他追求的不是一次成功的发射,而是通过建立高频、低成本的系统,彻底终结航天的“探险时代”,开启航天的“基建时代”。

深度研报:Jeff Bezos 与 Lex Fridman 对话解析

1. 🎯 核心论题与背景

对话背景:本次深度访谈是美国科技史上最重量级的对话之一,话题横跨物理哲学、商业决策、航天工程以及人类未来。背景设定在 Jeff Bezos 领导亚马逊 27 年后,决定将主要精力全情投入 Blue Origin(蓝色起源)之际,正值 New Glenn 火箭制造的关键节点。

核心论点:这场对话不仅仅是对 Jeff Bezos 个人哲学的回顾,更是一次关于“长期主义作为生存工具”的人类文明层面论证。Bezos 系统性地阐述了人类文明增长与能源消耗的必然矛盾,指出解决之道在于星际扩张。同时,他将亚马逊的“Day 1”思维(持续创新、拒绝代理指标、重视直觉)作为一种方法论,试图在火箭制造等重资产行业中对抗“Day 2 滞胀”陷阱。其核心世界观是:技术进步(AI/Nuclear/Space)既是我们自我毁灭的潜能,也是延展人类认知边界、实现“万亿人口”文明壮丽愿景的唯一路径。


2. 🧠 深度观点解析

2.1 能源悖论与星际移民的必然性

  • 核心观点:要维持现代社会 50-100 年来“几乎所有方面都在变好”的生活质量,人类必须大幅提高人均能耗。但这与“生活在有限的地球宝石上”之间的张力并非表面矛盾,而是在地球范围内根本不可调和。
  • 原理解构:Bezos 认为,工业革命在不破坏地球的前提下实现高能耗是不可能的。地球生态系统虽然完美但容量有限。因此,我们将重工业和能量密集型业务移至太空(利用月球和小行星的资源),以保留地球上优美的自然环境供“度假”或居住。
  • 证据/案例:他引用了 Gagarin 的名言 “My God, it’s blue” 来强调地球的脆弱性。他指出,支持万亿人口文明所需的能量,如果全部在地球表面开采,将导致自然世界的永久毁灭。
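
关于“人均能耗持续增长与有限地球不可调和”的判断,可以用一个假设性的复利推算感受量级。增长率与年限均为本文虚构的示意参数,并非访谈原话:

```python
growth = 1.03          # 假设总能耗每年复合增长 3%(纯示意)
factor, years = 1.0, 0
while factor < 2:      # 翻倍所需年数
    factor *= growth
    years += 1
print(f"3% 年增长下约 {years} 年翻一倍")           # ≈ 24 年
print(f"200 年后约为今天的 {growth**200:.0f} 倍")   # ≈ 369 倍
```

无论具体参数取多少,指数增长迟早撞上有限行星——这正是“把高能耗产业移出地球”这一论证的算术内核。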

2.2 重资产基础设施的“重量级”宿命

  • 核心观点:从物理上看,火箭“喜欢”变大——小火箭在效率上天然吃亏;但火箭越大,制造与装配的工程难度越高,这是重型运载火箭的根本矛盾。
  • 原理解构:“寄生质量”与“涡轮泵的非线性效率”是物理学关键。像航电设备这类寄生质量,随火箭尺寸增大,其占总质量的比例会急剧下降;涡轮泵等旋转机械也随尺寸增大而更高效。因此,更大的火箭(如推力略超 Saturn V 一半的 New Glenn)在物理上更具优势。目前 Blue Origin 的核心挑战不是设计火箭本身,而是建立具备高产能的制造工厂。
  • 证据/案例:New Glenn 采用 Liquid Natural Gas (LNG) 作为一级燃料(大幅降低成本)和 Liquid Hydrogen (LH2) 作为二级燃料(高比冲),并采用了 friction stir welding (摩擦搅拌焊接)carbon composite (碳纤维复合材料) 技术来极致减轻重量。
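
这里“高比冲适合上面级、致密推进剂适合一级”的取舍,可以借助标准的齐奥尔科夫斯基火箭方程做量级说明。下式为通用公式;比冲数值只取公开资料中的典型量级,并非 Blue Origin 的官方参数:

```latex
\Delta v \;=\; I_{sp}\, g_0 \,\ln\!\frac{m_0}{m_f}
% I_sp:比冲;g_0 ≈ 9.81 m/s²;m_0/m_f:点火前后的质量比。
% 量级示意:真空比冲约 450 s 的液氢上面级,相比约 350 s 的甲烷/LNG 级,
% 在相同质量比下可多获得约 (450-350)/350 ≈ 29% 的 Δv;
% 但液氢密度低,贮箱体积与结构质量随之增大,因此更适合推进剂总量较小的上面级。
```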

2.3 “Decision-Making”的双重性:两扇门与度假模式

  • 核心观点:高效决策不等于快速决策。关键在于区分 One-way Doors(不可逆的重大决策)Two-way Doors(可逆的修改决策)
  • 原理解构
    • One-way Doors(如发动机循环与推进剂类型的选择):需要高层介入、深思熟虑、审慎核准、缓慢推进,容错率极低。
    • Two-way Doors(如短期技术路径):应授权单个人甚至极小团队在早期快速做出决定,因为错了也可以退回来重试。
    • 对“Efficiency”的反思:Bezos 指出,真正的 Invention(发明)不追求直线效率,需要“Wandering(漫游)”和试错。
  • 证据/案例:作为发明家的 Bezos 并不追求普通职场的那种“直线效率”。他承认,当自己坐下来解决一个问题时,并不知道这场讨论要花多久,因为“真正的发明需要思想上的漫游”。

2.4 代理指标的陷阱与数据与轶事的加权

  • 核心观点:在大规模组织中,管理层往往会陷入 “Proxies”(代理指标) 的陷阱,即沉迷于管理一个被历史赋予意义的中间指标,而忘记了对其进行测量的原始对象。
  • 原理解构“当数据与轶事/抱怨发生冲突时,轶事通常是正确的。” 人们认为某个指标代表“顾客满意度”(如退货率),但随着业务变化,该指标可能已经失去了最终的因果关系。如果不打破这种这种惯性,企业就会滑向 Day 2。
  • 证据/案例:亚马逊早期曾花费数小时打电话给客服中心进行实测。数据显示等待时间不到 60 秒,但 Bezos 和团队亲测却超过 10 分钟。这种不体面但极具破坏力的测试立即倒逼了系统的修复。

2.5 新的图景:蓝环与“AWS for Space”

  • 核心观点:Blue Origin 的愿景不仅仅是把自己发射上去,而是建立 “重工业基础设施”,就像当年互联网建立在电话网之上一样。
  • 原理解构:Bezos 致力于建设运力足够大的 Blue Ring 航天器,它实际上是为卫星或载荷提供 “API”(服务接口),类似 AWS。它提供热管理、电力、计算能力和通信服务,使后续的“宿舍创业者”无需从头设计抗辐射计算或电源系统,就能直接利用这些现成服务开发太空应用。
  • 证据/案例:互联网繁荣之所以可能,是因为支付、邮政、电信等重资产基础设施早已就位,留给创业者的都是“轻资产”业务。Bezos 希望通过可复用运载工具建立同样的太空基础设施,让下一代创业者迎来属于太空的“个人电脑时代”。

3. 💡 反直觉与批判性视角

打破共识

  • “Space Exploration is hostile to Earth preservation”: 主流环保主义常将工业扩张与自然保护对立,Bezos 则提出反直觉观点:真正的环保需要太空中近乎无限的能源与资源来支撑高能耗文明。如果不探索太空,地球资源将无法承载未来上百亿人口的现代化生活;只有把重工业移出地球,地球才能被保留为类似自然保护区的宜居家园。
  • “Physics is a very unforgiving field for geniuses”: Bezos 承认自己在物理学上是“平庸”的。他分享了一个关于来自斯里兰卡的同学 Yosanta 瞬间解出偏微分方程的故事,表明理论物理需要脑力结构的某种特定天赋,而 Bezos 更擅长在多维度空间中“漫游”寻找不寻常的组合。

盲点与局限

  • 技术乐观主义的安全错觉:虽然 Bezos 对 AI 保持警惕(认为大模型可能是缺乏数学式真理边界的“Bullshitter”),但他对工业化生存工具(如火箭、太阳能电池)的信任度极高。他假设人类在未来 10,000 年内不会自我毁灭,而这一假设本身并无保证。
  • 对“速度”的执念:为了推动 Blue Origin,Bezos 亲自下场,并希望把它打造成“世界上最果断的公司”。这隐含了商业逻辑——通过主观意志和内部指令来对抗制造业和航天工程通常缓慢、风险极高的本质。这是一种管理上的干预主义。

未解之谜

  • 制造的真实成本:Bezos 承认,从“首飞成功”到“每年发射 24 次、约每两周产出一个消耗型二级”的速率化生产问题尚未完全解决。制造大型碳复合材料整流罩以及发射场的重型土木工程难度,是横亘在他愿景面前的巨大挑战。

4. 💎 金句与高光时刻

  1. “I have come to use the word impossible with great caution.”

    意译: 我现在非常谨慎地使用“不可能”这个词。这不仅是对阿波罗登月精神的致敬,也暗示了人类能力的边界是开放的。

  2. “If you think about the good old days, they’re mostly an illusion.”

    意译: 怀念过去是种错觉。无论卫生、贫困还是医疗水平,现状都比过去好,只有“自然世界”在走下坡路。这是一个非常强有力的现实主义陈述。

  3. “Getting to orbit is a solved problem. The only interesting problem is dramatically reducing the cost of access to orbit.”

    意译: 既然我们早在几十年前就能把东西送上去,那么现在的真正难题不是技术本身,而是如何把成本降低到足以引爆大众创业的程度(就像当初的互联网)。

  4. “The 10,000 year clock is an art project… it’s designed to last 10,000 years with no human intervention.”

    意译: 位于德克萨斯州深山中的“万年钟”象征着他所推崇的“Long-term thinking”(长期思维)。在这个项目上,他认为人类文明的生死存亡可能只是漫长岁月中的瞬间,我们需要一种“百代之后”的眼界。

  5. “If the data and anecdotes disagree, the anecdotes are usually right.”

    意译: 当客户投诉(轶事)与后台数据(仪表盘)冲突时,不要迷信数据,因为数据采集系统本身可能已经“陈旧”或者“失效”了。


5. 🚀 行业启示与未来推演

短期影响 (1-3年)

  • 制造业集权:SpaceX 和 Blue Origin 的竞争将迫使整个行业从“研发模型”向 “大规模速率生产(Rate Manufacturing)” 转型。谁能最先解决组件的标准化和自动化装配,谁就能赢得太空市场的“第一回合”。
  • 决策文化的战火:Bezos 提出的“Two-way Door”和“Disagree and Commit”原则将被更多试图打破大公司官僚主义的科技巨头复制,强调扁平化和快速试错。

长期终局 (5-10年)

  • 太空是“地图更新”:未来的太空经济将分层。重型运输(像 18 世纪的邮政服务)和公用事业(像 AWS)将由少数寡头垄断,而基于此之上的应用层将涌现出成千上万的创业公司(从红木森林中长出的蘑菇)。
  • 地球成为“自然保护区”,太空成为“殖民地”:正如 Bezos 所说,重工业将逐步撤出地球表面,地球将被保留为类似黄石公园的生态家园,而能源与工厂将迁移至月球和小行星带。

行动建议

  • 对于创业者:警惕代理指标。如果你的数据好但用户抱怨多,去实地跑一趟,亲自去接触产品。不要让系统设计成为真理的障碍。
  • 对于投资人:关注 Rate Manufacturing(量产能力) 的构建,而不仅仅是 R&D 阶段的技术参数。火箭的利润率更多取决于工厂的自动化程度。
  • 对于开发者/科技用户:Bezos 的“Long-term”观启示我们,真正的颠覆往往发生在基础设施层。个人可以学习他“Wandering”的思维方式,不要过早陷入线性思维的死胡同。

逐字稿

Introduction

Lex Fridman (00:00:00) The following is a conversation with Jeff Bezos, founder of Amazon and Blue Origin. This is his first time doing a conversation of this kind and of this length. And as he told me, it felt like we could have easily talked for many more hours, and I’m sure we will. This is the Lex Fridman Podcast. And now, dear friends, here’s Jeff Bezos.

Ranch

(00:00:24) You spent a lot of your childhood with your grandfather on a ranch here in Texas.

Lex Fridman (00:00:30) And I heard you had a lot of work to do around the ranch. So, what’s the coolest job you remember doing there?

Lex Fridman (00:00:37) Most interesting? Most memorable?

Jeff Bezos (00:00:41) It’s a real working ranch, and I spent all my summers on that ranch from age four to 16. And my grandfather was really taking me and in the early summers, he was letting me pretend to help on the ranch, because of course, a four-year-old is a burden, not a help in real life. He was really just watching me and taking care of me. And he was doing that because my mom was so young. She had me when she was 17, and so he was sort of giving her a break. And my grandmother and my grandfather would take me for these summers.

(00:01:15) But as I got a little older, I actually was helpful on the ranch and I loved it. My grandfather had a huge influence on me, a huge factor in my life. I did all the jobs you would do on a ranch. I’ve fixed windmills, and laid fences, and pipelines, and done all the things that any rancher would do, vaccinated the animals, everything. But after my grandmother died, I was about 12 and I kept coming to the ranch, so then it was just him and me, just the two of us. And he was completely addicted to the soap opera, Days of Our Lives. And we would go back to the ranch house every day around 1:00 PM or so to watch Days of Our Lives. Like sands through an hourglass, so are the Days of Our Lives.

Lex Fridman (00:02:07) Just the image of that, the two of you sitting there watching a soap opera, two ranchers.

Jeff Bezos (00:02:13) He had these big crazy dogs. It was really a very formative experience for me. But the key thing about it for me, the great gift I got from it was that my grandfather was so resourceful. He did everything himself. He made his own veterinary tools. He would make needles to suture the cattle up with. He would find a little piece of wire and heat it up and pound it thin and drill a hole in it and sharpen it. So, you learn different things on a ranch than you would learn growing up in a city.

Lex Fridman (00:02:43) So, self-reliance?

Jeff Bezos (00:02:44) Yeah, figuring out that you can solve problems with enough persistence and ingenuity. And my grandfather bought a D6 bulldozer, which is a big bulldozer, and he got it for like $5,000 because it was completely broken down. It was like a 1955 Caterpillar D6 bulldozer. New it would’ve cost, I don’t know, more than $100,000. And we spent an entire summer repairing that bulldozer. And we’d use mail order to buy big gears for the transmission, and they’d show up, they’d be too heavy to move, so we’d have to build a crane. Just that problem-solving mentality. He had it so powerfully. He did all of his own… He didn’t pick up the phone and call somebody, he would figure it out on his own. Doing his own veterinary work.

Lex Fridman (00:03:39) But just the image of the two of you fixing a D6 bulldozer and then going in for a little break at 1:00 PM to watch soap operas.

Jeff Bezos (00:03:47) Days of Our Lives. Laying on the floor, that’s how he watched TV. He was a really, really remarkable guy.

Space

Lex Fridman (00:03:52) That’s how I imagine Clint Eastwood also in all those westerns, when he’s not doing what he’s doing, he’s just watching soap operas. All right. I read that you fell in love with the idea of space and space exploration when you were five, watching Neil Armstrong walking on the moon. So, let me ask you to look back at the historical context and impact of that. So, the space race from 1957 to 1969 between the Soviet Union and the US was, in many ways, epic. It was a rapid sequence of dramatic events. First satellite to space, first human to space, first spacewalk, first uncrewed landing on the moon. Then, some failures, explosions, deaths on both sides actually. And then, the first human walking on the moon. What are some of the more inspiring moments or insights you take away from that time, those few years at just 12 years?

Jeff Bezos (00:04:51) Well, I mean there’s so much inspiring there. One of the great things to take away from that, one of the great von Braun quotes is, “I have come to use the word impossible with great caution.” And so, that’s kind of the big story of Apollo is that going to the moon was literally an analogy that people used for something that’s impossible. “Oh, yeah, you’ll do that when men walk on the moon.” And of course, it finally happened. So, I think it was pulled forward in time because of the space race.

(00:05:31) I think with the geopolitical implications and how much resource was put into it. At the peak, that program was spending 2% or 3% of GDP on the Apollo program. So, much resource. I think it was pulled forward in time. We kind of did it ahead of when we, quote, unquote, should have done it. And so, in that way, it’s also a technical marvel. I mean it’s truly incredible. It’s the 20th century version of building the pyramids or something. It’s an achievement that because it was pulled forward in time and because it did something that had previously been thought impossible, it rightly deserves its place in the pantheon of great human achievements.

Lex Fridman (00:06:17) And of course, you named the rockets that Blue Origin is working on after some of the folks involved.

Lex Fridman (00:06:24) I don’t understand why I didn’t say New Gagarin. Is that-

Jeff Bezos (00:06:27) There’s an American bias in the naming. I apologize-

Lex Fridman (00:06:30) That’s very strange.

Lex Fridman (00:06:31) Was just asking for a friend, clarifying.

Jeff Bezos (00:06:33) I’m a big fan of Gagarin’s though. And in fact, I think his first words in space I think are incredible. He purportedly said, “My God, it’s blue.” And that really drives home. No one had seen the Earth from space. No one knew that we were on this blue planet. No one knew what it looked like from out there, and Gagarin was the first person to see it.

Lex Fridman (00:07:01) One of the things I think about is how dangerous those early days were for Gagarin, for Glenn, for everybody involved. How big of a risk they were all taking.

Jeff Bezos (00:07:11) They were taking huge risks. I’m not sure what the Soviets thought about Gagarin’s flight, but I think that the Americans thought that the Alan Shepard flight, the flight that New Shepard is named after, the first American in space, he went on his suborbital flight, they thought he had about a 75% chance of success. So, that’s a pretty big risk, a 25% risk.

Lex Fridman (00:07:36) It’s kind of interesting that Alan Shepard is not quite as famous as John Glenn. So, for people who don’t know, Alan Shepard is the first astronaut-

Jeff Bezos (00:07:44) The first American in space.

Lex Fridman (00:07:46) American in suborbital flight.

Lex Fridman (00:07:48) And then, the first orbital flight is-

Jeff Bezos (00:07:51) John Glenn is the first American to orbit the Earth. By the way, I have the most charming, sweet, incredible letter from John Glenn, which I have framed and hanging on my office wall.

Jeff Bezos (00:08:04) Where he tells me how grateful he is that we have named New Glenn after him. And he sent me that letter about a week before he died. And it’s really an incredible… It’s also a very funny letter. He’s writing and he says, “This is a letter about New Glenn from the original Glenn.” And he’s got a great sense of humor and he’s very happy about it and grateful. It’s very sweet.

Lex Fridman (00:08:30) Does he say, “P.S. Don’t mess this up,” or is that-

Lex Fridman (00:08:35) “Make me look good.”

Jeff Bezos (00:08:35) He doesn’t do that. But John, wherever you are, we’ve got you covered.

Lex Fridman (00:08:39) Good. So, back to maybe the big picture of space. When you look up at the stars and think big, what do you hope is the future of humanity, hundreds, thousands of years from now out in space?

Jeff Bezos (00:08:54) I would love to see a trillion humans living in the solar system. If we had a trillion humans, we would have, at any given time, 1,000 Mozarts and 1,000 Einsteins. That our solar system would be full of life and intelligence and energy. And we can easily support a civilization that large with all of the resources in the solar system.

Lex Fridman (00:09:21) So, what do you think that looks like? Giant space stations?

Jeff Bezos (00:09:24) Yeah, the only way to get to that vision is with giant space stations. The planetary surfaces are just way too small. So, I mean, unless you turn them into giant space stations or something. But yeah, we will take materials from the moon and from near-Earth objects and from the asteroid belt and so on, and we’ll build giant O’Neill style colonies and people will live in those. They have a lot of advantages over planetary surfaces. You can spin them to get normal Earth gravity. You can put them where you want them. I think most people are going to want to live near Earth, not necessarily in Earth orbit, but near Earth vicinity orbits. And so, they can move relatively quickly back and forth between their station and Earth. I think a lot of people, especially in the early stages, are not going to want to give up Earth altogether.

Lex Fridman (00:10:24) They go to earth for vacation?

Jeff Bezos (00:10:26) Yeah, same way that you might go to Yellowstone National Park for vacation, people will… And people will get to choose where they live on Earth or whether they live in space, but they’ll be able to use much more energy and much more material resource in space than they would be able to use on Earth.

Lex Fridman (00:10:45) One of the interesting ideas you had is to move the heavy industry away from Earth. So, people sometimes have this idea that somehow space exploration is in conflict with the celebration of the planet Earth, that we should focus on preserving Earth. And basically, your idea is that space travel and space exploration is a way to preserve Earth.

Jeff Bezos (00:11:06) Exactly. We’ve sent robotic probes to all the planets, we know that this is the good one.

Lex Fridman (00:11:17) Not to play favorites or anything, but…

Jeff Bezos (00:11:19) Earth really is the good planet. It’s amazing. The ecosystem we have here, all of the life and the lush plant life and the water resources, everything. This planet is really extraordinary. And of course, we evolved on this planet, so of course it’s perfect for us, but it’s also perfect for all the advanced life forms on this planet, all the animals and so on. And so, this is a gem. We do need to take care of it. And as we enter the Anthropocene, as we humans have gotten so sophisticated and large and impactful, as we stride across this planet, that is going to… We want to use a lot of energy. We want to use a lot of energy per capita. We’ve gotten amazing things. We don’t want to go backwards.

(00:12:10) If you think about the good old days, they’re mostly an illusion. In almost every way, life is better for almost everyone today than it was say 50 years ago or 100 years ago. We live better lives by and large than our grandparents did, and their grandparents did, and so on. And you can see that in global illiteracy rates, global poverty rates, global infant mortality rates. Almost any metric you choose, we’re better off than we used to be. And we get antibiotics and all kinds of lifesaving medical care, and so on, and so on. And there’s one thing that is moving backwards, and it’s the natural world.

(00:12:54) So, it is a fact that 500 years ago, pre-industrial age, the natural world was pristine. It was incredible. And we have traded some of that pristine beauty for all of these other gifts that we have as an advanced society. And we can have both, but to do that, we have to go to space. And the most fundamental measure is energy usage per capita. You do want to continue to use more and more energy, it is going to make your life better in so many ways, but that’s not compatible ultimately with living on a finite planet. And so, we have to go out into the solar system. And really, you could argue about when you have to do that, but you can’t credibly argue about whether you have to do that.

Lex Fridman (00:13:49) Eventually we have to do that.

Lex Fridman (00:13:52) Well, you don’t often talk about it, but let me ask you on that topic about the Blue Ring and the Orbital Reef space infrastructure projects. What’s your vision for these?

Jeff Bezos (00:14:03) So, Blue Ring is a very interesting spacecraft that is designed to take up to 3,000 kilograms of payload up to geosynchronous orbit or in lunar vicinity. It has two different kinds of propulsion. It has chemical propulsion and it has electric propulsion. And so, you can use Blue Ring in a couple of different ways. You can slowly move, let’s say up to geosynchronous orbit using electric propulsion. That might take 100 days or 150 days, depending on how much mass you’re carrying. And reserve your chemical propulsion, so that you can change orbits quickly in geosynchronous orbit. Or you can use the chemical propulsion first to quickly get up to geosynchronous and then use your electrical propulsion to slowly change your geosynchronous orbit.

(00:14:55) Blue Ring has a couple of interesting features. It provides a lot of services to these payloads. So, it could be one large payload or it can be a number of small payloads, and it provides thermal management, it provides electric power, it provides compute, provides communications. And so, when you design a payload for Blue Ring, you don’t have to figure out all of those things on your own. So, kind of radiation tolerant compute is a complicated thing to do. And so, we have an unusually large amount of radiation tolerant compute on board Blue Ring, and your payload can just use that when it needs to. So, it’s sort of all these services… It’s like a set of APIs. It’s a little bit like Amazon Web Services, but-

Jeff Bezos (00:15:52) … for space payloads that need to move about in Earth vicinity or lunar vicinity.

Lex Fridman (00:15:57) AWS for space. So, compute in space. So, you get a giant chemical rocket to get a payload out to orbit. And then, you have these admins that show up, this Blue Ring thing that manages various things like compute?

Jeff Bezos (00:16:13) Exactly. And it can also provide transportation and move you around to different orbits.

Lex Fridman (00:16:19) Including humans, do you think?

Jeff Bezos (00:16:21) No, Blue Ring is not designed to move humans around. It’s designed to move payloads around. So, we’re also building a lunar lander, which is of course designed to land humans on the surface of the moon.

Physics

Lex Fridman (00:16:34) I’m going to ask you about that, but let me ask you to just step back to the old days. You were at Princeton with aspirations to be a theoretical physicist.

Lex Fridman (00:16:47) What attracted you to physics and why did you change your mind and not become… Why are you not Jeff Bezos, the famous theoretical physicist?

Jeff Bezos (00:16:57) So, I loved physics and I studied physics and computer science, and I was proceeding along the physics path. I was planning to major in physics, and I wanted to be a theoretical physicist. And the computer science was sort of something I was doing for fun. I really loved it and I was very good at the programming and doing those things, and I enjoyed all my computer science classes immensely. But I really was determined to be a theoretical physicist. That’s why I went to Princeton in the first place. It was definitely… And then, I realized I was going to be a mediocre theoretical physicist. And there were a few people in my classes, like in quantum mechanics and so on, who they could effortlessly do things that were so difficult for me. And I realized there are 1,000 ways to be smart.

(00:17:52) Theoretical physics is not one of those fields where only the top few percent actually move the state-of-the-art forward. It’s one of those things where your brain has to be wired in a certain way. And there was a guy named… One of these people who convinced me, he didn’t mean to convince me, but just by observing him, he convinced me that I should not try to be a theoretical physicist. His name was Yosanta. And Yosanta was from Sri Lanka, and he was one of the most brilliant people I’d ever met. My friend Joe and I were working on a very difficult partial differential equations problem set one night. And there was one problem that we worked on for three hours and we made no headway whatsoever. And we looked up at each other at the same time and we said, “Yosanta.”

(00:18:49) So, we went to Yosanta’s dorm room and he was there. He was almost always there. And we said, “Yosanta, we’re having trouble solving this partial differential equation. Would you mind taking a look?” And he said, “Of course.” By the way, he was the most humble, most kind person. And so, he looked at our problem and he stared at it for just a few seconds, maybe 10 seconds, and he said, “cosine.” And I said, “What do you mean, Yosanta? What do you mean cosine?” He said, “That’s the answer.” And I said, “No, no, no, come on.” And he said, “Let me show you.” And he took out some paper and he wrote down three pages of equations, everything canceled out, and the answer was cosine.

(00:19:30) And I said, “Yosanta, did you do that in your head?” And he said, “Oh, no. That would be impossible. A few years ago I solved a similar problem and I could map this problem onto that problem, and then it was immediately obvious that the answer was cosine.” You have an experience like that, you realize maybe being a theoretical physicist isn’t what the universe wants you to be. And so, I switched to computer science and that worked out really well for me. I enjoy it. I still enjoy it today.

Lex Fridman (00:20:07) Yeah, there’s a particular kind of intuition you need to be a great physicist, and applied to physics.

Jeff Bezos (00:20:12) I think the mathematical skill required today is so high. You have to be a world-class mathematician to be a successful theoretical physicist today. And you probably need other skills too, intuition, lateral thinking and so on. But without just top-notch math skills, you’re unlikely to be successful.

Lex Fridman (00:20:39) And visualization skill, you have to be able to really do these kinds of thought experiments if you want truly great creativity. Actually Walter Isaacson writes about you and puts you on the same level as Einstein and-

Jeff Bezos (00:20:53) Well, that’s very kind. I’m an inventor. If you want to boil down what I am, I’m really an inventor. And I look at things and I can come up with atypical solutions. And then, I can create 100 such atypical solutions for something, 99 of them may not survive scrutiny, but one of those 100 is like, “Hmm, maybe that might work.” And then, you can keep going from there. So, that kind of lateral thinking, that kind of inventiveness in a high-dimensionality space where the search space is very large, that’s where my inventive skills come… I self-identify as an inventor more than anything else.

Lex Fridman (00:21:43) Yeah. And he describes in all kinds of different ways, Walter Isaacson does, that creativity combined with childlike wander that you’ve maintained still to this day, all of that combined together. If you were to study your own brain, introspect, how do you think? What’s your thinking process like? We’ll talk about the writing process of putting it down on paper, which is quite rigorous and famous at Amazon. But when you sit down, maybe alone, maybe with others, and thinking through this high-dimensional space and looking for creative solutions, creative paths forward, is there something you could say about that process?

Jeff Bezos (00:22:26) It’s such a good question, and I honestly don’t know how it works. If I did, I would try to explain it. I know it involves lots of wandering, so when I sit down to work on a problem, I know I don’t know where I’m going. So, to go in a straight line… To be efficient… Efficiency and invention are sort of at odds, because real invention, not incremental improvement… Incremental improvement is so important in every endeavor, in everything you do, you have to work hard on also just making things a little bit better. But I’m talking about real invention, real lateral thinking that requires wandering, and you have to give yourself permission to wander.

(00:23:11) I think a lot of people, and they feel like wandering is inefficient. And when I sit down at a meeting, I don’t know how long the meeting is going to take if we’re trying to solve a problem, because if I did, then I’d know there’s some kind of straight line that we’re drawing to the solution. The reality is we may have to wander for a long time. And I do like group invention. I think there’s really nothing more fun than sitting at a whiteboard with a group of smart people and spit balling and coming up with new ideas and objections to those ideas, and then solutions to the objections and going back and forth. So, sometimes you wake up with an idea in the middle of the night and sometimes you sit down with a group of people and go back and forth, and both things are really pleasurable.

Lex Fridman (00:24:14) And when you wander, I think one key thing is to notice a good idea. And maybe to notice the kernel of a good idea. I’ll maybe pull at that string. Because I don’t think good ideas come fully-formed.

Jeff Bezos (00:24:31) 100% right. In fact, when I come up with what I think is a good idea and it survives the first level of scrutiny that I do in my own head, and I’m ready to tell somebody else about the idea, I will often say, “Look, it is going to be really easy for you to find objections to this idea, but work with me.”

Lex Fridman (00:24:53) There’s something there.

Jeff Bezos (00:24:54) There’s something there. And that is intuition, because it’s really easy to kill new ideas in the beginning because there’s so many easy objections to them. So, you need to kind of forewarn people and say, “Look, I know it’s going to take a lot of work to get this to a fully-formed idea. Let’s get started on that. It’ll be fun.”

Lex Fridman (00:25:17) So, you got that ability to say cosine in you somewhere after all, maybe not on math, but-

Jeff Bezos (00:25:23) In a different domain.

Jeff Bezos (00:25:25) There are 1,000 ways to be smart, by the way, and that is a really… When I go around and I meet people, I’m always looking for the way that they’re smart. And you find that’s one of the things that makes the world so interesting and fun is that it’s not like IQ is a single dimension. There are people who are smart in such unique ways.

Lex Fridman (00:25:53) Yeah, you just gave me a good response when somebody calls me an idiot on the internet. “You know, there’s 1,000 ways to be smart, sir.”

Jeff Bezos (00:26:01) Well, they might tell you, “Yeah, but there are a million to be ways to be dumb.”

New Glenn

Lex Fridman (00:26:04) Yeah, right. I feel like that’s a Mark Twain quote. Okay. All right. You gave me an amazing tour of Blue Origin Rocket Factory and Launch Complex in the historic Cape Canaveral. That’s where New Glenn, the big rocket we talked about, is being built and will launch. Can you explain what the New Glenn rocket is and tell me some interesting technical aspects of how it works?

Jeff Bezos (00:26:29) Sure. New Glenn is a very large heavy-lift launch vehicle. It’ll take about 45 metric tons to LEO, very large class. It’s about half the thrust, a little more than half the thrust of the Saturn V rocket. So, it’s about 3.9 million pounds of thrust on liftoff. The booster has seven BE-4 engines. Each engine generates a little more than 550,000 pounds of thrust. The engines are fueled by liquified natural gas, LNG as the fuel, and LOX as the oxidizer. The cycle is an ox-rich staged combustion cycle. It’s a cycle that was really pioneered by the Russians. It’s a very good cycle. And that engine is also going to power the first stage of the Vulcan rocket, which is the United Launch Alliance rocket. Then the second stage of New Glenn is powered by two BE-3U engines, which is an upper-stage variant of our New Shepard liquid hydrogen engine.

(00:27:44) So, the BE-3U has 160,000 pounds of thrust, so two of those, 320,000 pounds of thrust. And hydrogen is a very good propellant for upper stages because it has very high ISP. It’s not a great propellant in my view for booster stages, because the stages then get physically so large. Hydrogen has very high ISP, but liquid hydrogen is not dense at all. So, to store liquid hydrogen, if you need to store many thousands of pounds of liquid hydrogen, your liquid hydrogen tank gets very large. So, you get more benefit from the higher ISP, the specific impulse, you get more benefit from the higher specific impulse on the second stage. And that stage carries less propellant, so you don’t get such geometrically-gigantic tanks. The Delta IV is an example of a vehicle that is all hydrogen. The booster stage is also hydrogen, and I think that it’s a very effective vehicle, but it never was very cost-effective. So, it’s operationally very capable but not very cost-effective.

Lex Fridman (00:28:56) So, size is also costly?

Jeff Bezos (00:28:58) Size is costly. So, it’s interesting. Rockets love to be big. Everything works better.

Lex Fridman (00:29:05) What do you mean by that? You’ve told me that before. It sounds epic, but what does it mean?

Jeff Bezos (00:29:10) I mean, when you look at the physics of rocket engines, and also when you look at parasitic mass… Let’s say you have an avionic system, so you have a guidance and control system, that is going to be about the same mass and size for a giant rocket as it is going to be for a tiny rocket. And so, that’s just parasitic mass that is very consequential if you’re building a very small rocket, but is trivial if you’re building a very large rocket. So, you have the parasitic mass thing. And then if you look at, for example, rocket engines have turbo pumps. They have to pressurize the fuel in the oxidizer up to a very high pressure level in order to inject it into the thrust chamber where it burns. And those pumps, all rotating machines, in fact, get more efficient as they get larger. So, really tiny turbo pumps are very challenging to manufacture, and any kind of gaps between the housing, for example, and the rotating impeller that pressurizes the fuel, there has to be some gap there. You can’t have those parts scraping against one another, and those gaps drive inefficiencies. And so, if you have a very large turbo pump, those gaps in percentage terms end up being very small. And so, there’s a bunch of things that you end up loving about having a large rocket and that you end up hating for a small rocket. But there’s a giant exception to this rule, and it is manufacturing. So, manufacturing large structures is very, very challenging. It’s a pain in the butt. And so, if you’re making a small rocket engine, you can move all the pieces by hand, you could assemble it on a table, one person can do it. You don’t need cranes and heavy lift operations and tooling and so on and so on. When you start building big objects, infrastructure, civil infrastructure, just like the launchpad and all this we went and visited, I took you to the launchpad. And you can see it’s so monumental.

Jeff Bezos (00:31:28) And so, just these things become major undertakings, both from an engineering point of view, but also from a construction and cost point of view.

Lex Fridman (00:31:37) And even the foundation of the launchpad. I mean, this is Florida, isn’t it swamp land? How deep do you have to go?

Jeff Bezos (00:31:44) At Cape Canaveral, in fact, most launch pads are on beaches somewhere on the ocean side because you want to launch over water for safety reasons. Yes, you have to drive pilings, dozens and dozens and dozens of pilings, 50, 100, 150 feet deep to get enough structural integrity for these very large… Yes, these turn into major civil engineering projects.

Lex Fridman (00:32:15) I just have to say everything about that factory is pretty badass. You said tooling, the bigger it gets, the more epic it is.

Jeff Bezos (00:32:22) It does make it epic. It’s fun to look at. It’s extraordinary.

Lex Fridman (00:32:26) It’s humbling also because humans are so small compared to it.

Jeff Bezos (00:32:29) We are building these enormous machines that are harnessing enormous amounts of chemical power in very, very compact packages. It’s truly extraordinary.

Lex Fridman (00:32:44) But then, there’s all the different components and the materials involved. Is there something interesting that you can describe about the materials that comprise the rocket? So, it has to be as light as possible, I guess, whilst withstanding the heat and the harsh conditions?

Jeff Bezos (00:33:03) Yeah, I play a little game sometimes with other rocket people that I run into where say, “What are the things that would amaze the 1960s engineers? What’s changed?” Because surprisingly, some of rocketry’s greatest hits have not changed. They would recognize immediately a lot of what we do today and it’s exactly what they pioneered back in the ’60s. But a few things have changed. The use of carbon composites is very different today. We can build very sophisticated … You saw our carbon tape laying machine that builds the giant fairings and we can build these incredibly light, very stiff fairing structures out of carbon composite material that they could not have dreamed of. The efficiency, the structural efficiency of that material is so high compared to any metallic material you might use or anything else. So that’s one.

(00:34:12) Aluminum-lithium and the ability to friction stir weld aluminum-lithium. Do you remember the friction stir welding that I showed you?

Lex Fridman (00:34:20) Yes. It’s incredible.

Jeff Bezos (00:34:21) This is a remarkable technology that’s invented decades ago, but has become very practical over just the last couple of decades. And instead of using heat to weld two pieces of metal together, it literally stirs the two pieces. There’s a pin that rotates at a certain rate and you put that pin between the two plates of metal that you want to weld together and then you move it at a very precise speed. And instead of heating the material, it heats it a little bit because of friction, but not very much, you can literally immediately after welding with stir friction welding, you can touch the material and it’s just barely warm. It literally stirs the molecules together. It’s quite extraordinary.

Lex Fridman (00:35:06) Relatively low temperature and I guess high temperatures, that makes it a weak point.

Jeff Bezos (00:35:13) … with traditional welding techniques, whatever the underlying strength characteristics of the material are, you end up with weak regions where you weld. And with friction stir welding, the welds are just as strong as the bulk material. So it really allows you … Let’s say you’re building a tank that you’re going to pressurize, a large liquid natural gas tank for our booster stage, for example, if you are welding that with traditional methods, you have to size those weld lands, the thickness of those pieces with that knockdown for whatever damage you’re doing with the weld and that’s going to add a lot of weight to that tank.

Lex Fridman (00:35:54) Even just looking at the fairings, the result of that, the complex shape that it takes and what it’s supposed to do is incredible because some people don’t know, it’s on top of the rocket, it’s going to fall apart. That’s its task, but it has to stay strong sometimes and then disappear when it needs to …

Lex Fridman (00:36:15) … which is a very difficult task.

Jeff Bezos (00:36:17) Yes. When you need something that needs to have 100% integrity until it needs to have 0% integrity, it needs to stay attached until it’s ready to go away, and then when it goes away, it has to go away completely. You use explosive charges for that and so it’s a very robust way of separating structure when you need to.

Jeff Bezos (00:36:41) Yeah, little tiny bits of explosive material and it will sever the whole connection.

Lex Fridman (00:36:49) So if you want to go from 100% structural integrity to zero as fast as possible is explosives.

Lex Fridman (00:36:59) The entirety of this thing is so badass. Okay, so we’re back to the two stages. So the first stage is reusable.

Jeff Bezos (00:37:06) Yeah. Second stage is expendable. Second stage is liquid hydrogen, liquid oxygen. So we get take advantage of the higher specific impulse. The first stage lands down range on a landing platform in the ocean, comes back for maintenance and get ready to do the next mission.

Lex Fridman (00:37:27) There’s a million questions, but also is there a path towards reusability for the second stage?

Jeff Bezos (00:37:32) There is and we know how to do that. Right now, we’re going to work on manufacturing that second stage to make it as inexpensive as possible, two paths for a second stage, make it reusable or work really hard to make it inexpensive, so you can afford to expend it. And that trade is actually not obvious which one is better.

Lex Fridman (00:38:00) Even in terms of cost, like time, cost-

Jeff Bezos (00:38:01) Even in terms of … And I’m talking about cost. Space, getting into orbit is a solved problem. We solved it back in the ’50s and ’60s.

Lex Fridman (00:38:11) You’re making it sound easy.

Jeff Bezos (00:38:13) The only interesting problem is dramatically reducing the cost of access to orbit, which is, if you can do that, you open up a bunch of new endeavors that lots of start-up companies everybody else can do. One of our missions is to be part of this industry and lower the cost to orbit, so that there can be a renaissance, a golden age of people doing all kinds of interesting things in space.

Lex Fridman (00:38:47) I like how you said getting to orbit is a solved problem. It’s just the only interesting thing is reducing the cost. You know how you can describe every single problem facing human civilization that way? The physicists would say, “Everything is a solved problem. We’ve solved everything. The rest is just,” what did Rutherford say, “that it’s just stamp collecting. It’s just the details.” Some of the greatest innovations and inventions and brilliance is in that cost reduction stage, right? And you’ve had a long career of cost reduction.

Jeff Bezos (00:39:18) For sure. What does cost reduction really mean? It means inventing a better way.

Jeff Bezos (00:39:25) Right? And when you invent a better way, you make the whole world richer. So whatever it was, I don’t know how many thousands of years ago, somebody invented the plow. And when they invented the plow, they made the whole world richer because they made farming less expensive. And so it is a big deal to invent better ways. That’s how the world gets richer.

Lex Fridman (00:39:48) So what are some of the biggest challenges on the manufacturing side, on the engineering side that you’re facing in working to get to the first launch of New Glenn?

Jeff Bezos (00:40:01) The first launch is one thing, and we’ll do that in 2024, coming up in this coming year. The real thing that’s the bigger challenge is making sure that our factory is efficiently manufacturing at rate. So rate production: consider, if you want to launch New Glenn 24 times a year, you need to manufacture an upper stage, since they’re expendable, twice a month. You need to do one every two weeks. So you need to have all of your manufacturing facilities and processes and inspection techniques and acceptance tests and everything operating at rate. And rate manufacturing is at least as difficult as designing the vehicle in the first place. And the same thing with engines: every upper stage has two BE-3U engines.

(00:41:03) So those engines, if you’re going to launch the vehicle twice a month, you need four engines a month. So you need an engine every week. That engine needs to be produced at rate, and there are all of the things you need to do that: all the right machine tools, all the right fixtures, the right people, process, etcetera. So it’s one thing to build a first article, right? To launch New Glenn for the first time, you need to produce a first article, but that’s not the hard part. The hard part is everything that’s going on behind the scenes to build a factory that can produce New Glenns at rate.
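
To make the cadence concrete, here is a minimal back-of-the-envelope sketch in Python (not anything Blue Origin uses; the variable names are illustrative) that reproduces the arithmetic in the passage above: 24 launches a year with two BE-3U engines per expendable upper stage works out to roughly one engine per week.

```python
# Rough back-of-the-envelope sketch of the rate arithmetic described above:
# a target launch cadence implies how often an expendable upper stage and
# its BE-3U engines must come off the production line.

LAUNCHES_PER_YEAR = 24        # launch rate quoted in the conversation
ENGINES_PER_UPPER_STAGE = 2   # each upper stage carries two BE-3U engines

stages_per_month = LAUNCHES_PER_YEAR / 12                        # 2.0 -> one every ~2 weeks
engines_per_month = stages_per_month * ENGINES_PER_UPPER_STAGE   # 4.0
engines_per_week = engines_per_month * 12 / 52                   # ~0.92 -> roughly one a week

print(f"upper stages per month: {stages_per_month:.1f}")
print(f"BE-3U engines per month: {engines_per_month:.1f}")
print(f"BE-3U engines per week: {engines_per_week:.2f}")
```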

Lex Fridman (00:41:47) So the first one is produced in a way that enables the production of the second and third and the fourth and the fifth and sixth-

Jeff Bezos (00:41:53) You could think of the first article as pushing all of the rate manufacturing technology along. In other words, it’s a test article, in a way, that’s testing out your manufacturing technologies.

Lex Fridman (00:42:13) The manufacturing is the big challenge.

Jeff Bezos (00:42:15) Yes. I don’t want to make it sound like any of it is easy. The people who are designing the engines and all of this, all of this is hard for sure, but what we’re driving really hard at right now is to get to rate manufacturing and to do that in an efficient way, again back to our cost point. If you get to rate manufacturing in an inefficient way, you haven’t really solved the cost problem, and maybe you haven’t really moved the state of the art forward. All of this has to be about moving the state of the art forward. There are easier businesses to do. I always tell people, “Look, if you are trying to make money, start a salty snack food company or something.”

Lex Fridman (00:42:56) I’m going to write that idea down.

Jeff Bezos (00:43:01) Make the Lex Fridman Potato Chips.

Lex Fridman (00:43:04) Right. Don’t say it. People are going to steal it. But yeah, it’s hard.

Jeff Bezos (00:43:10) Do you see what I’m saying? There’s nothing easy about this business, but it’s its own reward. It’s fascinating, it’s worthwhile, it’s meaningful. I don’t want to pick on salty snack food companies, but I think it’s less meaningful. At the end of the day, you’re not going to have accomplished something amazing …

Jeff Bezos (00:43:33) … even if you do make a lot of money on it.

Lex Fridman (00:43:35) Yeah, there’s something fundamentally different about the “business of space exploration.”

Lex Fridman (00:43:42) It’s a grand project of humanity.

Jeff Bezos (00:43:44) Yes, it’s one of humanity’s grand challenges, and especially as you look at going to the moon and going to Mars and building giant O’Neill colonies and unlocking all the things. I won’t live long enough to see the fruits of this, but the fruits of this come from building a road to space, getting the infrastructure. I’ll give you an analogy. When I started Amazon, I didn’t have to develop a payment system. It already existed. It was called the credit card. I didn’t have to develop a transportation system to deliver the packages. It already existed. It was called the Postal Service and Royal Mail and Deutsche Post and so on. So all this heavy lifting infrastructure was already in place and I could stand on its shoulders. And that’s why, when you look at the internet …

(00:44:40) And by the way, another giant piece of infrastructure that was around in the early days, I’m taking you back to 1994, people were using dial-up modems, and it was piggybacking on top of the long-distance phone network. That’s how the internet … That’s how people were accessing servers and so on. And again, if that hadn’t existed, it would’ve been hundreds of billions of CapEx to put that out there. No startup company could have done that. And so the reason you see the dynamism in the internet space over the last 20 years is because two kids in a dorm room could start an internet company that could be successful and do amazing things, because they didn’t have to build heavy infrastructure. It was already there. And that’s what I want to do. I take my Amazon winnings and use that to build heavy infrastructure so that the next generation, the generation that’s my children and their children, those generations can then use that heavy infrastructure, and then there’ll be space entrepreneurs who start in their dorm room. That will be a marker of success: when you can have a really valuable space company started in a dorm room, then we know that we’ve built enough infrastructure so that ingenuity and imagination can really be unleashed. I find that very exciting.

Lex Fridman (00:46:11) They will, of course, as kids do, take all of this hard infrastructure ability for granted.

Lex Fridman (00:46:18) That entrepreneurial spirit.

Jeff Bezos (00:46:19) That’s an inventor’s greatest dream, is that their inventions are so successful that they are one day taken for granted. Nobody thinks of Amazon as an invention anymore. Nobody thinks of customer reviews as an invention. We pioneered customer reviews, but now they’re so commonplace. Same thing with one-click shopping and so on, but that’s a compliment. You invent something that’s so used, so beneficially used by so many people that they take it for granted.

Lex Fridman (00:46:49) I don’t know about nobody. Every time I use Amazon, I’m still amazed, “How does this work, the logistics of it all?”

Jeff Bezos (00:46:55) Well, that proves you’re a very curious explorer.

Lex Fridman (00:46:57) All right, all right, back to the rocket. Timeline, you said 2024. As it stands now, are both the first test launch and the launch of the ESCAPADE explorers to Mars still possible in 2024?

Jeff Bezos (00:47:13) Yeah, I think so. For sure, the first launch and then we’ll see if ESCAPADE goes on that or not. I think that the first launch for sure and I hope ESCAPADE too.

Jeff Bezos (00:47:24) Well, I just don’t know which mission it’s actually going to be slated on. So we also have other things that might go on that first mission.

Lex Fridman (00:47:31) Oh, I got it. But you’re optimistic that the launches will still-

Jeff Bezos (00:47:35) Oh, the first launch. I’m very optimistic that the first launch of New Glenn will be in 2024 and I’m just not 100% certain what payload will be on that first launch.

Lex Fridman (00:47:44) Are you nervous about it?

Jeff Bezos (00:47:46) Are you kidding? I’m extremely nervous about it.

Jeff Bezos (00:47:52) 100%. Every launch I go to, for New Shepard, for other vehicles too, I’m always nervous for these launches. But yes, for sure, for a first launch, to not be nervous about that would be some sign of derangement, I think.

Lex Fridman (00:48:09) Well, I got to visit the launch, man. It’s pretty … I mean, it’s epic.

Jeff Bezos (00:48:14) We have done a tremendous amount of ground testing, a tremendous amount of simulation. So a lot of the problems that we might find in flight have been resolved, but there are some problems you can only find in flight. So cross your fingers. I guarantee you you’ll have fun watching it no matter what happens.

Lex Fridman (00:48:37) 100%. When the thing is fully assembled, it comes up-

Jeff Bezos (00:48:41) Yeah, the transporter erector.

Lex Fridman (00:48:44) It’s the erector, yeah.

Jeff Bezos (00:48:45) Just the transporter erector for a rocket of this scale is extraordinary.

Lex Fridman (00:48:49) That’s an incredible machine.

Jeff Bezos (00:48:50) The vehicle travels out horizontally and then comes up and-

Jeff Bezos (00:48:58) Yeah, it’s a beautiful thing to watch.

Lex Fridman (00:49:00) Speaking of which, if that makes you nervous, I don’t know if you remember, but you were aboard New Shepard on its first crewed flight. How was that experience? Were you terrified then?

Jeff Bezos (00:49:20) Strangely, I wasn’t.

Lex Fridman (00:49:22) When you rode the rocket, it wasn’t nerve-wracking? Okay.

Jeff Bezos (00:49:24) It’s true. I’ve watched other people riding the rocket and I’m more nervous than when I was inside the rocket myself. It was a difficult conversation to have with my mother when I told her I was going to go on the first one. And not only was I going to go, but I was going to bring my brother too. This is a tough conversation to have with a mom.

Lex Fridman (00:49:44) There’s a long pause when you told her.

Jeff Bezos (00:49:47) She’s like, “Both of you?” It was an incredible experience, and we were laughing inside the capsule, and we were not nervous. The people on the ground were very nervous for us. Actually, one of the most emotionally powerful parts of the experience happened even before the flight. At 4:30 in the morning, my brother and I are getting ready to go to the launch site, and Lauren is going to take us there in her helicopter, and we’re getting ready to leave. And we go outside, outside the ranch house there in West Texas where the launch facility is, and all of our family, my kids and my brother’s kids and our parents and close friends, are assembled there and they’re saying goodbye to us. But maybe they think they’re saying goodbye to us forever. We might not have felt that way, but it was obvious from their faces how nervous they were, that they felt that way. And it was powerful because it allowed us to see … It was almost like attending your own memorial service or something; you could feel how loved you were in that moment, and it was really amazing.

Lex Fridman (00:51:12) Yeah, and there’s just an epic nature to it too.

Jeff Bezos (00:51:17) The ascent, the floating in zero gravity. I’ll tell you something very interesting, zero gravity feels very natural. I don’t know if it’s because it’s like return to the womb or-

Lex Fridman (00:51:31) You just confirmed you’re an alien, but that’s all. I think that’s what you just said.

Jeff Bezos (00:51:36) It feels so natural to be in zero G. It was really interesting. And then there’s what people talk about, the overview effect, seeing Earth from space. I had that feeling very powerfully. I think everyone did. You see how fragile the Earth is. If you’re not an environmentalist, it will make you one. There’s the great Jim Lovell quote: he looked back at the Earth from space and he said he realized, “You don’t go to heaven when you die. You go to heaven when you’re born.” That’s the feeling that people get when they’re in space. You see all this blackness, all this nothingness, and there’s one gem of life, and it’s Earth.

Lex Fridman (00:52:15) It is a gem. You’ve talked a lot about decision making throughout your time with Amazon. What was that decision like to be the first to ride New Shepard? Just before you talk to your mom, the pros and cons? Actually, as one human being, as a leader of a company on all fronts, what was that decision making like?

Jeff Bezos (00:52:43) I decided that … First of all, I knew the vehicle extremely well. I know the team who built it. I know the vehicle. I’m very comfortable with the escape system. We put as much effort into the escape system on that vehicle as we put into all the rest of the vehicle combined. It’s one of the hardest pieces of engineering in the entire New Shepard architecture.

Lex Fridman (00:53:10) Can you actually describe what do you mean by escape system? What’s involved?

Jeff Bezos (00:53:13) We have a solid rocket motor in the base of the crew capsule, so that if anything goes wrong on ascent, while the main rocket engine is firing, we can ignite this solid rocket motor in the base of the crew capsule and escape from the booster. It’s a very challenging system to build, design, validate, test, all of these things. It is the reason that I am comfortable letting anyone go on New Shepard. So the booster is as safe and reliable as we can make it, but we are harnessing … Whenever you’re talking about rocket engines, I don’t care what rocket engine you’re talking about, you’re harnessing such vast power in such a small compact geometric space. The power density is so enormous that it is impossible to ever be sure that nothing will go wrong.

(00:54:18) And so the only way to improve safety is to have an escape system. And historically, human-rated rockets have had escape systems. Only the space shuttle did not, but Apollo had one. All of the previous Gemini, etcetera, they all had escape systems. And we have on New Shepard an unusual escape … Most escape systems are towers. We have a pusher escape system. So the solid rocket motor is actually embedded in the base of the crew capsule and it pushes and it’s reusable in the sense that, if we don’t use it, so if we have a nominal mission, we land with it. The tower systems have to be ejected at a certain point in the mission and so they get wasted even in a nominal mission.

(00:55:09) And so again, cost really matters on these things, so we figured out how to have the escape system be reusable, in the event that it’s not used, and have it be a pusher system. It’s a very sophisticated thing. So I knew these things. You asked me about my decision to go, and so I know the vehicle very well, I know the people who designed it, I have great trust in them and in the engineering that we did. And I thought to myself, “Look, if I am not ready to go, then I wouldn’t want anyone to go.” A tourism vehicle has to be designed, in my view, to be as safe as one can make it. You can’t make it perfectly safe. It’s impossible, but you have … People will do things. People take risk. They climb mountains, they skydive, they do deep underwater scuba diving and so on. People are okay taking risk. You can’t eliminate the risk, but because it’s a tourism vehicle, you have to do your utmost to eliminate those risks.

(00:56:16) And I felt very good about the system. I think it’s one of the reasons I was so calm inside and maybe others weren’t as calm. They didn’t know as much about it as I did.

Lex Fridman (00:56:26) Who was in charge of engaging the escape system? Did you have-

Jeff Bezos (00:56:28) It’s automated. The escape system is …

Lex Fridman (00:56:31) Okay. I was visualizing-

Jeff Bezos (00:56:33) … completely automated. Automated is better because it can react so much faster.

Lex Fridman (00:56:38) Okay. So yeah, for tourism rockets, safety is a huge, huge, huge priority for space exploration also, but a delta less.

Jeff Bezos (00:56:46) Yes. I think if you’re doing … There are human activities where we tolerate more risk if you’re saving somebody’s life, if you are engaging in real exploration. These are things where I personally think we would accept more risk in part because you have to.

Lex Fridman (00:57:09) Is there a part of you that’s frustrated by the rate of progress in Blue Origin?

Jeff Bezos (00:57:15) Blue Origin needs to be much faster, and it’s one of the reasons that I left my role as the CEO of Amazon a couple of years ago. I wanted to come in, and Blue Origin needs me right now. When I was the CEO of Amazon, my point of view on this was, if I’m the CEO of a publicly traded company, it’s going to get my full attention. That’s just how I think about things. It was very important to me. I felt I had an obligation to all the stakeholders at Amazon to do that. I’m still the executive chair there, but I turned the CEO role over, and the primary reason I did that is so that I could spend time on Blue Origin, adding some energy, some sense of urgency. We need to move much faster, and we’re going to.

Lex Fridman (00:58:14) What are the ways to speed it up? You’ve talked about a lot of different ways at Amazon of removing barriers to progress, of distributing things so that everybody is autonomous and self-reliant, all those kinds of things. Does that apply at Blue Origin, or is-

Jeff Bezos (00:58:37) It does apply. I’m leading this directly. We’re going to become the world’s most decisive company across any industry. At Amazon, ever since the beginning, I said, “We’re going to become the world’s most customer-obsessed company.” And no matter the industry, one day, people are going to come to Amazon from the healthcare industry and want to know, “How are you so customer-obsessed? How do you not just pay lip service to that, but actually do that?” All different industries should want to come study us to see how we accomplish that. And the analogous thing at Blue Origin, and what will help us move faster, is we’re going to become the world’s most decisive company. We’re going to get really good at taking appropriate technology risk and making those decisions quickly, being bold on those things, and having the right culture that supports that.

(00:59:40) You need people to be ambitious, technically ambitious, “If there are five ways to do something, we’ll study them, but let’s study them very quickly and make a decision.” We can always change our mind. Changing your mind, I talk about one-way doors and two-way doors, most decisions are two-way doors.

Lex Fridman (01:00:03) Can you explain that because I love that metaphor?

Jeff Bezos (01:00:06) If you make the wrong decision, if it’s a two-way door decision, you pick a door, you walk out and you spend a little time there. If it turns out to be the wrong decision, you can come back in and pick another door. Some decisions are so consequential and so important and so hard to reverse that they really are one-way door decisions. You go in that door, you’re not coming back. And those decisions have to be made very deliberately, very carefully. If you can think of yet another way to analyze the decision, you should slow down and do that. So when I was CEO of Amazon, I often found myself in the position of being the chief slow-down officer, because somebody would be bringing me a one-way door decision and I would say, “Okay, I can think of three more ways to analyze that. So let’s go do that, because we are not going to be able to reverse this one easily. Maybe you can reverse it, but it’s going to be very costly and very time-consuming. We really have to get this one right from the beginning.”

(01:01:10) And what happens, unfortunately, in companies, what can happen, is that you have a one-size-fits-all decision-making process where you end up using the heavyweight process on all decisions …

Lex Fridman (01:01:28) For everything, yeah.

Jeff Bezos (01:01:29) … Including the lightweight ones, the two-way door decisions. Two-way door decisions should mostly be made by single individuals or by very small teams deep in the organization. And one-way door decisions are the irreversible ones. Those are the ones that should be elevated up to the senior-most executives who should slow them down and make sure that the right thing is being done.

Lex Fridman (01:01:55) Yeah, part of the skill here is to know the difference between one-way and two-way. I think you mentioned …

Lex Fridman (01:02:01) I think you mentioned Amazon Prime, the decision to create Amazon Prime as a one-way door. It’s unclear if it is or not, but it probably is and it’s a really big risk to go there.

Jeff Bezos (01:02:14) There are a bunch of decisions like that, where changing the decision is going to be very, very complicated. Some of them are technical decisions too, because some technical decisions are like quick-drying cement. Once you make them, it gets really hard to change. Choosing which propellants to use in a vehicle, selecting LNG for the booster stage and selecting hydrogen for the upper stage, that has turned out to be a very good decision. But if you changed your mind, that would be a very big setback. Do you see what I’m saying?

Jeff Bezos (01:02:52) So that’s the kind of decision you scrutinize very, very carefully. Other things just aren’t like that. Most decisions are not that way. Most decisions should be made by single individuals and done quickly in the full understanding that you can always change your mind.

Lex Fridman (01:03:11) One of the things I really liked, perhaps it’s not a two-way door decision, is the “I disagree and commit” phrase. So somebody brings up an idea to you; if it’s a two-way door, you state that you don’t understand enough to agree, but you still back them. I’d love for you to explain that-

Jeff Bezos (01:03:35) Well, yes, disagree and commit is a really important principle that saves a lot of arguing. So-

Lex Fridman (01:03:39) Yeah, I’m going to use that in my personal life, “I disagree, but commit.”

Jeff Bezos (01:03:44) It’s very common in any endeavor in life, in business, anywhere you have teammates: you have a teammate and the two of you disagree. At some point, you have to make a decision. And in companies, we tend to organize hierarchically. Whoever’s the more senior person ultimately gets to make the decision. So ultimately, the CEO gets to make that decision. And the CEO may not always make the decision that they agree with. So I would be the one who would disagree and commit. One of my direct reports would very much want to do something in a particular way. I would think it was a bad idea. I would explain my point of view. They would say, “Jeff, I think you’re wrong and here’s why,” and we would go back and forth.

(01:04:35) And I would often say, “You know what? I don’t think you’re right, but I’m going to gamble with you, and you’re closer to the ground truth than I am. I’ve known you for 20 years. You have great judgment. I don’t know that I’m right either. Not really, not for sure. All these decisions are complicated. Let’s do it your way.” But at least then you’ve made a decision, and I’m agreeing to commit to that decision. So I’m not going to be second-guessing it. I’m not going to be sniping at it. I’m not going to be saying, “I told you so.” I’m going to try actively to help make sure it works. That’s a really important teammate behavior.

(01:05:18) There are so many ways that dispute resolution is a really interesting thing on teams. And there are so many ways when two people disagree about something, even … I’m assuming the case where everybody is well-intentioned. They just have a very different opinion about what the right decision is. And in our society and inside companies, we have a bunch of mechanisms that we use to resolve these kinds of disputes. A lot of them are, I think, really bad. So an example of a really bad way of coming to agreement is compromise. So compromise: we’re in a room here, and I could say, “Lex, how tall do you think this ceiling is?”

Jeff Bezos (01:06:00) I’m here and I could say, “Lex, how tall do you think this ceiling is?” And you’d be like, “I don’t know, Jeff, maybe 12 feet tall.” And I would say, “I think it’s 11 feet tall.” And then we’d say, “You know what? Let’s just call it 11 and a half feet.” That’s compromise. Instead, the right thing to do is to get a tape measure or figure out some way of actually measuring, but getting that tape measure and figuring out how to get it to the top of the ceiling and all these things, that requires energy. Compromise, the advantage of compromise as a resolution mechanism is that it’s low energy, but it doesn’t lead to truth. And so in things like the height of the ceiling, where truth is a knowable thing, you shouldn’t allow compromise to be used when you can know the truth.

(01:06:51) Another really bad resolution mechanism that happens all the time is just who’s more stubborn. This is, let’s say, two executives who disagree, and they just have a war of attrition, and whichever one gets exhausted first capitulates to the other one. Again, you haven’t arrived at truth, and this is very demoralizing. So this is where escalation comes in. I try to ask people on my team, “Never get to a point where you are resolving something by who gets exhausted first. Escalate that. I’ll help you make the decision,” because that’s so de-energizing and such a terrible, lousy way to make a decision.

Lex Fridman (01:07:40) Do you want to get to the resolution as quickly as possible, because that ultimately leads to a high velocity of decisions?

Jeff Bezos (01:07:45) Yes, and you want to try to get as close to truth as possible. Exhausting the other person is not truth seeking.

Jeff Bezos (01:07:54) And compromise is not truth seeking. And there are a lot of cases where no one knows the real truth and that’s where disagree and commit can come in, but escalation is better than war of attrition. Escalate to your boss and say, “Hey, we can’t agree on this. We like each other. We’re respectful of each other, but we strongly disagree with each other. We need you to make a decision here so we can move forward.” But decisiveness, moving forward quickly on decisions, as quickly as you responsibly can is how you increase velocity. Most of what slows things down is taking too long to make decisions at all scale levels. So it has to be part of the culture to get high velocity. Amazon has a million and a half people and the company is still fast. We’re still decisive, we’re still quick, and that’s because the culture supports that.

Lex Fridman (01:08:53) At every scale in a distributed way-

Lex Fridman (01:08:56) Try to maximize the velocity of decisions.

Lunar program

Lex Fridman (01:08:59) You’ve mentioned the lunar program. Let me ask you about that. There’s a lot going on there and you haven’t really talked about it much. So in addition to the Artemis program with NASA, Blue is doing its own lander program. Can you describe it? There’s a sexy picture on Instagram with one of them. Is it the MK1, I guess?

Jeff Bezos (01:09:20) Yeah, the Mark 1. The picture here is me with Bill Nelson, the NASA Administrator.

Lex Fridman (01:09:26) Just to clarify, the lander is the sexy thing about the [inaudible 01:09:29]. I really want to clarify that.

Jeff Bezos (01:09:32) I know it’s not me. I know it was either the lander or Bill.

Lex Fridman (01:09:34) Okay. I love Bill, but-

Jeff Bezos (01:09:37) Thank you for clarifying.

Jeff Bezos (01:09:40) Yes, the Mark 1 lander is designed to take 3,000 kilograms of cargo to the surface of the moon. It’s an expendable lander. It lands on the moon, stays there, takes 3,000 kilograms to the surface. It can be launched on a single New Glenn flight, which is very important. So it’s a relatively simple architecture. Just like the human landing system lander, which is called the Mark 2, the Mark 1 is fueled with liquid hydrogen, because for high-energy missions like landing on the surface of the moon, the high specific impulse of hydrogen is a very big advantage.

(01:10:24) The disadvantage of hydrogen has always been that since it’s such a deep cryogen, it’s not storable. So it’s constantly boiling off and you’re losing propellant because it’s boiling off. And so what we’re doing as part of our lunar program is developing solar-powered cryo coolers that can actually make hydrogen a storable propellant for deep space. And that’s a real game-changer. It’s a game-changer for any high energy mission. So to the moon, but to the outer planets, to Mars, everywhere.

Lex Fridman (01:11:00) So the idea with both Mark 1 and Mark 2 is that New Glenn can carry them from the surface of Earth to the surface of the moon?

Jeff Bezos (01:11:12) Exactly. So the Mark 1 is expendable. The lunar lander we’re developing for NASA, the Mark 2 lander, that’s part of the Artemis program. They call it the Sustaining Lander Program. So that lander is designed to be reusable. It can land on the surface of the moon in a single-stage configuration and then take off. If you look at the Apollo program, the lunar lander in Apollo was really two stages. It would land on the surface, and then it would leave the descent stage on the surface of the moon, and only the ascent stage would go back up into lunar orbit, where it would rendezvous with the command module.

(01:11:56) Here, what we’re doing is we have a single-stage lunar lander that carries down enough propellant so that it can bring the whole thing back up, so that it can be reused over and over. And the point of doing that, of course, is to reduce cost so that you can make lunar missions more affordable over time, which is one of NASA’s big objectives, because this time… The whole point of Artemis is to go back to the moon, but this time to stay. Back in the Apollo program, we went to the moon six times and then ended the program, and it really was too expensive to continue.

Lex Fridman (01:12:35) And so there’s a few questions there, but one is how do you stay on the moon? What ideas do you have about sustaining life where a few folks can stay there for prolonged periods of time?

Jeff Bezos (01:12:51) Well, one of the things we’re working on is using lunar resources like lunar regolith to manufacture commodities and even solar cells on the surface of the moon. We’ve already built a solar cell that is completely made from lunar regolith simulant, and this solar cell is only about 7% power efficient. So it’s very inefficient compared to the more advanced solar cells that we make here on Earth. But if you can figure out how to make a practical solar cell factory that you can land on the surface of the moon, where the raw material for those solar cells is simply lunar regolith, then you can just continue to churn out solar cells on the surface of the moon and have lots of power on the surface of the moon. That will make it easier for people to live on the moon.

(01:13:51) Similarly, we’re working on extracting oxygen from lunar regolith. Lunar regolith by weight has a lot of oxygen in it, but it’s bound very tightly as oxides with other elements, so you have to separate the oxygen, which is very energy intensive. So that also could work together with the solar cells. And then ultimately, we may be able to find practical quantities of ice in the permanently shadowed craters at the poles of the moon. We know there is water ice in those craters, and we know that we can break that down with electrolysis into hydrogen and oxygen. And then you’d not only have oxygen, but you’d also have a very good, high-efficiency propellant in hydrogen.

(01:14:57) So there’s a lot we can do to make the moon more sustainable over time, but the very first step, the gate that all of that has to go through is we need to be able to land cargo and humans on the surface of the moon at an acceptable cost.

Lex Fridman (01:15:16) To fast-forward a little bit, is there any chance Jeff Bezos steps foot on the moon and on Mars, one or the other or both?

Jeff Bezos (01:15:27) It’s very unlikely. I think it’s probably something that gets done by future generations by the time it gets to me. I think in my lifetime that’s probably going to be done by professional astronauts, sadly. I would love to sign up for that mission. So don’t count me out yet, Lex. Give me a fighting shot here, maybe. But I think if we are placing reasonable bets on such a thing, in my lifetime, that will continue to be done by professional astronauts.

Lex Fridman (01:15:59) So these are risky, difficult missions?

Jeff Bezos (01:16:02) And probably missions that require a lot of training. You are going there for a very specific purpose to do something. We’re going to be able to do a lot on the moon too with automation. So in terms of setting up these factories and doing all that, we are sophisticated enough now with automation that we probably don’t need humans to tend those factories and machines. So there’s a lot that’s going to be done in both modes.

Lex Fridman (01:16:28) So I have to ask the bigger picture question about the two companies pushing humanity forward out towards the stars, Blue Origin and SpaceX. Are you competitors, collaborators? Which and to what degree?

Jeff Bezos (01:16:44) Well, I would say, just like the internet is big and there are lots of winners at all scale levels, there are half a dozen giant companies that the internet has made, but there are a bunch of medium-sized companies and a bunch of small companies, all successful, all with profit streams, all driving great customer experiences. That’s what we want to see in space, that kind of dynamism. And space is big. There’s room for a bunch of winners, and it’s going to happen at all scale levels. And so SpaceX is going to be successful for sure. I want Blue Origin to be successful, and I hope there are another five companies right behind us.

Lex Fridman (01:17:25) But I spoke to Elon a few times recently about you, about Blue Origin, and he was very positive about you as a person and very supportive of all the efforts you’ve been leading at Blue. What are your thoughts? You’ve worked with a lot of leaders at Amazon, at Blue. What are your thoughts about Elon as a human being and a leader?

Jeff Bezos (01:17:46) Well, I don’t really know Elon very well. I know his public persona, but I also know you can’t know anyone by their public persona. It’s impossible. You may think you do, but I guarantee you don’t. So I don’t really know. You know Elon way better than I do, Lex, but in terms of judging by the results, he must be a very capable leader. There’s no way you could have Tesla and SpaceX without being a capable leader. It’s impossible.

Lex Fridman (01:18:22) Yeah, I hope you guys hang out sometimes, shake hands and sort of have a kind of friendship that would inspire just the entirety of humanity, because what you’re doing is one of the big grand challenges ahead for humanity.

Jeff Bezos (01:18:40) Well, I agree with you and I think in a lot of these endeavors we’re very like-minded. So I’m not saying we’re identical, but I think we’re very like-minded. And so I love that idea.

Lex Fridman (01:18:56) All right, going back to sexy pictures on your Instagram, there’s a video of you from the early days of Amazon, giving a tour of your, “Offices.” I think your dad is holding the camera.

Jeff Bezos (01:19:10) He is. Yeah, I know, right? Yes. This is what? The giant orange extension cord.

Lex Fridman (01:19:12) And you’re explaining the genius of the extension cord and how this is a desk and the CRT monitor, and that’s where all the magic happened. I forget what your dad said, but this is the center of it all. So what was it like? What was going through your mind at that time? You left a good job in New York and took this leap. Were you excited? Were you scared?

Jeff Bezos (01:19:37) So excited and scared, anxious. Thought the odds of success were low. Told all of our early investors that I thought there was a 30% chance of success by which I just mean getting your money back, not what actually happened. Because that’s the truth. Every startup company is unlikely to work. It’s helpful to be in reality about that, but that doesn’t mean you can’t be optimistic. So you have to have this duality in your head. On the one hand, you know what the baseline statistics say about startup companies, and the other hand, you have to ignore all of that and just be 100% sure it’s going to work, and you’re doing both things at the same time. You’re holding that contradiction in your head.

(01:20:24) But it was so exciting. From 1994 when the company was founded to 1995 when we opened our doors, all the way until today, I find Amazon so exciting. And that doesn’t mean… It’s full of pain, full of problems. It’s like there’s so many things that need to be resolved and worked and made better and et cetera. But on balance, it’s so fun. It’s such a privilege. It’s been such a joy. I feel so grateful that I’ve been part of that journey. It’s just been incredible.

Lex Fridman (01:21:04) So in some sense, you don’t want a single day of comfort. You’ve written about this many times. We’ll talk about your writing, which I would highly recommend people read, and just the letters to shareholders, explaining the idea of day one thinking, which I think you first wrote about in the ’97 letter to shareholders. Then you also, in a way, wrote about it in, sad to say, your last letter to shareholders as CEO. And you said that, “Day two is stasis followed by irrelevance, followed by excruciating painful decline, followed by death.” And that is why it’s always day one. Can you explain this day one thing? This is a really powerful way to describe the beginning and the journey of Amazon.

Jeff Bezos (01:21:56) It’s really a very simple, and I think age-old idea about renewal and rebirth and every day is day one. Every day you are deciding what you’re going to do and you are not trapped by what you were or who you were or any self-consistency. Self-consistency even can be a trap. And so day one thinking is we start fresh every day and we get to make new decisions every day about invention, about customers, about how we’re going to operate. Even as deeply as what our principles are, we can go back to that. It turns out we don’t change those very often, but we change them occasionally.

(01:22:49) And when we work on programs at Amazon, we often make a list of tenets. And the tenets are… They’re not principles, they’re a little more tactical than principles, but they’re the main ideas that we want this program to embody, whatever those are. And one of the things that we do is we put, “These are the tenets for this program,” and in parentheses, we always put, “unless you know a better way.” And that idea, “unless you know a better way,” is so important because you never want to get trapped by dogma. You never want to get trapped by history. It doesn’t mean you discard history or ignore it. There’s so much value in what has worked in the past, but you can’t be blindly following what you’ve done. And that’s the heart of day one: you’re always starting afresh.

Lex Fridman (01:23:51) And to the question of how to fend off day two, you said, “Such a question can’t have a simple answer,” as you’re saying. “There will be many elements, multiple paths, and many traps. I don’t know the whole answer, but I may know bits of it. Here’s a starter pack of essentials,” maybe others come to mind, “for day one defense: customer obsession, a skeptical view of proxies, the eager adoption of external trends, and high-velocity decision-making.”

(01:24:19) So we talked about high-velocity decision-making. That’s more difficult than it sounds. So maybe you can pick one that stands out to you that you can comment on: eager adoption of external trends, high-velocity decision-making, skeptical view of proxies. How do you fight off day two?

Jeff Bezos (01:24:36) Well, I’ll talk about the one that is maybe in some ways the hardest to understand, which is the skeptical view of proxies. One of the things that happens in business, probably in anything where you have an ongoing program, something that’s been underway for a number of years, is that you develop certain things that you’re managing to. The typical case would be a metric, and that metric isn’t the real underlying thing. So maybe the metric is an efficiency metric around customer contacts per unit sold or something like that. If you sell a million units, how many customer contacts do you get, or how many returns do you get? And so on and so on.

(01:25:30) And so what happens is a little bit of inertia sets in. Somebody a long time ago invented that metric; they decided, “We need to watch customer returns per unit sold as an important metric.” But they had a reason why they chose that metric, the person who invented it and decided it was worth watching. And then fast-forward five years, and that metric is the proxy.

Lex Fridman (01:26:02) The proxy for truth, I guess.

Jeff Bezos (01:26:04) The proxy for truth. Let’s say in this case it’s a proxy for customer happiness, but that metric is not actually customer happiness. It’s a proxy for customer happiness. The person who invented the metric understood that connection. Five years later, a kind of inertia can set in and you forget the truth behind why you were watching that metric in the first place. And the world shifts a little and now that proxy isn’t as valuable as it used to be or it’s missing something. And you have to be on alert for that. You have to know, “Okay, I don’t really care about this metric. I care about customer happiness and this metric is worth putting energy into and following and improving and scrutinizing, only in so much as it actually affects customer happiness.”

(01:27:03) And so you’ve got to constantly be on guard and it’s very, very common. This is a nuanced problem. It’s very common, especially in large companies, that they’re managing to metrics that they don’t really understand. They don’t really know why they exist, and the world may have shifted out from under them a little and the metrics are no longer as relevant as they were when somebody 10 years earlier invented the metric.

Lex Fridman (01:27:29) That is a nuance, but that’s a big problem. Right?

Jeff Bezos (01:27:33) It’s a huge problem.

Lex Fridman (01:27:34) There’s something so compelling to have a nice metric to try to optimize.

Jeff Bezos (01:27:38) Yes. And by the way, you do need metrics.

Jeff Bezos (01:27:41) You can’t ignore them. You want them, but you just have to be constantly on guard. A way to slip into day two thinking would be to manage your business to metrics that you don’t really understand, where you’re not really sure why they were invented in the first place, and you’re not sure they’re still as relevant as they used to be.

Lex Fridman (01:28:03) What does it take to be the guy or gal who brings up the point that this proxy might be outdated? I guess what does it take to have a culture that enables that in the meeting? Because that’s a very uncomfortable thing to bring up at a meeting. “We all showed up here, it’s a Friday.”

Jeff Bezos (01:28:21) You have just asked a million-dollar question. So if I generalize what you’re asking, you are talking in general about truth-telling and we humans are not really truth-seeking animals. We are social animals.

Jeff Bezos (01:28:44) Take yourself back in time 10,000 years and you’re in a small village. If you go along to get along, you can survive. You can procreate. If you’re the village truth-teller, you might get clubbed to death in the middle of the night. Truths often don’t want to be heard, because important truths can be uncomfortable, they can be awkward, they can be exhausting.

Lex Fridman (01:29:12) Impolite and all that kind of stuff.

Jeff Bezos (01:29:14) Yes, challenging. They can make people defensive even if that’s not the intent. But any high performing organization, whether it’s a sports team, a business, a political organization, an activist group, I don’t care what it is, any high performing organization has to have mechanisms and a culture that supports truth-telling. One of the things you have to do is you have to talk about that. You have to talk about the fact that it takes energy to do that. You have to talk to people, you have to remind people, “It’s okay that it’s uncomfortable.” Literally tell people, “It’s not what we’re designed to do as humans.” It’s kind of a side effect. We can do that, but it’s not how we survive. We mostly survive by being social animals and being cordial and cooperative, and that’s really important.

(01:30:10) And so science is all about truth-telling. It’s actually a very formal mechanism for trying to tell the truth. And even in science, you find that it’s hard to tell the truth. Even though you’re supposed to have a hypothesis and test it and find data and reject the hypothesis and so on, it’s not easy.

Lex Fridman (01:30:36) But even in science, there’s like the senior scientists and the junior scientists.

Lex Fridman (01:30:41) And then there’s a hierarchy of humans where somehow seniority matters in the scientific process, which it should not.

Jeff Bezos (01:30:49) Yes, and that’s true inside companies too. And so you want to set up your culture so that the most junior person can overrule the most senior person if they have data. And that really is about trying to… There are little things you can do. So for example, in every meeting that I attend, I always speak last. And I know from experience that if I speak first, even very strong-willed, highly intelligent, high-judgment participants in that meeting will wonder, “Well, if Jeff thinks that, I came into this meeting thinking one thing, but maybe I’m not right.” And so you can do little things: if you’re the most senior person in the room, go last, let everybody else go first. In fact, ideally, let’s try to have the most junior person go first, then the second most junior, and go up in order of seniority, so that you can hear everyone’s opinion in an unfiltered way. Because we really do, we actually literally change our opinions. If somebody who you really respect says something, it makes you change your mind a little.

Lex Fridman (01:32:17) So you’re saying implicitly or explicitly, give permission for people to have a strong opinion, as long as it’s backed by data.

Jeff Bezos (01:32:27) Yes, and sometimes it can even… By the way, a lot of our most powerful truths turn out to be hunches, they turn out to be based on anecdotes, they’re intuition based. And sometimes you don’t even have strong data, but you may know the person well enough to trust their judgment. You may feel yourself leaning in. It may resonate with a set of anecdotes you have, and then you may be able to say, “Something about that feels right. Let’s go collect some data on that. Let’s try to see if we can actually know whether it’s right. But for now, let’s not disregard it. It feels right.”

(01:33:06) You can also fight inherent bias. There’s an optimism bias. If there are two interpretations of a new set of data and one of them is happy and one of them is unhappy, it’s a little dangerous to jump to the conclusion that the happy interpretation is right. You may want to compensate for that human bias of trying to find the silver lining and say, “Look, that might be good, but I’m going to go with it’s bad for now until we’re sure.”

Lex Fridman (01:33:36) So speaking of happiness bias, data collection and anecdotes, you have to… How’s that for a transition? You have to tell me the story of the call you made, the customer service call you made to demonstrate a point about wait times?

Jeff Bezos (01:33:57) Yeah. This is very early in the history of Amazon.

Jeff Bezos (01:34:00) And we were going over a weekly business review and a set of documents, and I have a saying, which is: when the data and the anecdotes disagree, the anecdotes are usually right. It doesn’t mean you just slavishly go follow the anecdotes. It means you go examine the data, because it’s usually not that the data is being miscollected; it’s usually that you’re not measuring the right thing. And so if you have a bunch of customers complaining about something, and at the same time your metrics look like they shouldn’t be complaining, you should doubt the metrics.

(01:34:43) And an early example of this was we had metrics that showed that our customers were waiting, I think less than, I don’t know, 60 seconds when they called a 1-800 number to get phone customer service. The wait time was supposed to be less than 60 seconds, but we had a lot of complaints that it was longer than that. And anecdotally it seemed longer than that. I would call customer service myself. And so one day we’re in a meeting, we’re going through the WBR, the weekly business review, and we get to this metric in the deck, and the guy who leads customer service is defending the metric. And I said, “Okay, let’s call.” Picked up the phone, and I dialed the 1-800 number and called customer service, and we just waited in silence.

Lex Fridman (01:35:39) What did it turn out to be?

Jeff Bezos (01:35:40) Oh, it was really long, more than 10 minutes, I think.

Jeff Bezos (01:35:43) It was many minutes. And so it dramatically made the point that something was wrong with the data collection. We weren’t measuring the right thing, and that set off a whole chain of events where we started measuring it right. And that’s an example, by the way, of truth-telling: it’s an uncomfortable thing to do, but you have to seek truth even when it’s uncomfortable, and you have to get people’s attention, and they have to buy into it, and they have to get energized around really fixing things.

Principles

Lex Fridman (01:36:16) So that speaks to the obsession with the customer experience. So one of the defining aspects of your approach to Amazon is just being obsessed with making customers happy. I think companies sometimes say that, but Amazon is really obsessed with that. I think there’s something really profound to that, which is seeing the world through the eyes of the customer, like the customer experience, the human being that’s using the product, that’s enjoying the product, the subtle little things that make up their experience. How do you optimize those?

Jeff Bezos (01:36:55) This is another really good and deep question, because there are big things that are really important to manage, and then there are small things. Internally at Amazon, we call them paper cuts. So we’re always working on the big things. If you ask me, most of the energy goes into the big things, as it should, and you can identify the big things. And I would encourage anybody, if anybody listening to this is an entrepreneur, has a small business, whatever: think about the things that are not going to change over 10 years. Those are probably the big things.

(01:37:38) So I know in our retail business at Amazon, 10 years from now, customers are still going to want low prices. I know they’re still going to want fast delivery, and I just know they’re still going to want big selection. So it’s impossible to imagine a scenario where 10 years from now where a customer says, “I love Amazon, I just wish the prices were a little higher,” or, “I love Amazon, I just wish you delivered a little more slowly.” So when you identify the big things you can tell they’re worth putting energy into because they’re stable in time.

(01:38:10) Okay, but you’re asking about something a little different, which is in every customer experience, there are those big things. And by the way, it’s astonishingly hard to focus even on just the big things. So even though they’re obvious, they’re really hard to focus on. But in addition to that, there are all these little tiny customer experience deficiencies, and we call those paper cuts. We make long lists of them. And then we have dedicated teams that go fix paper cuts because the teams working on the big issues never get to the paper cuts. They never work their way down the list to get to… They’re working on big things, as they should and as you want them to. And so you need special teams who are charged with fixing…

Jeff Bezos (01:39:00) Special teams who are charged with fixing paper cuts.

Lex Fridman (01:39:04) Where would you put on the paper cut spectrum the Buy now with the 1-Click button? Which is, I think, pretty genius. So to me, okay, my interaction with things I love on the internet, there’s things I do a lot. I, maybe representing a regular human, I would love for those things to be frictionless. For example, booking airline tickets, just saying. But it’s buying a thing with one click, making that experience frictionless, intuitive, all aspects of that, that just fundamentally makes my life better, not just in terms of efficiency, in terms of some kind of-

Lex Fridman (01:39:50) … Yeah, cognitive load and inner peace and happiness. Because, first of all, buying stuff is a pleasant experience. Having enough money to buy a thing and then buying it is a pleasant experience. And having pain around that is somehow just you’re ruining a beautiful experience. And I guess all I’m saying as a person who loves good ideas, is that a paper cut, a solution to a paper cut?

Jeff Bezos (01:40:17) Yes. So that particular thing is probably a solution to a number of paper cuts. So if you go back and look at our order pipeline and how people shopped on Amazon before we invented 1-Click shopping, there was more friction. There was a whole series of paper cuts and that invention eliminated a bunch of paper cuts. And I think you’re absolutely right by the way, that when you come up with something like 1-Click shopping, again, this is so ingrained in people now, I’m impressed that you even notice it. Most people-

Lex Fridman (01:40:54) Every time I click the button, I just-

Jeff Bezos (01:40:54) … most people never notice.

Lex Fridman (01:40:55) … just a surge of happiness.

Jeff Bezos (01:41:00) In the perfect invention for the perfect moment in the perfect context, there is real beauty. It is actual beauty and it feels good. It’s emotional. It’s emotional for the inventor, it’s emotional for the team that builds it, it’s emotional for the customer. It’s a big deal, and you can feel those things.

Lex Fridman (01:41:23) But to keep coming up with that idea, with those kinds of ideas, I guess is the day one thinking effort.

Jeff Bezos (01:41:29) Yeah, and you need a big group of people who feel that kind of satisfaction with creating that kind of beauty.

Lex Fridman (01:41:38) There are a lot of books written about you. There’s a book, Invent & Wander, where Walter Isaacson does an intro. It’s mostly collected writings of yours. I’ve read that. I also recommend people check out the Founders Podcast, which covers you a lot and does different analyses of different business advice you’ve given over the years. I bring all that up because I mentioned that you said that books are an antidote for short attention spans. And I forget how it was phrased, but when you were thinking about the Kindle, you were thinking about how technology changes us.

Jeff Bezos (01:42:20) Changes us. We co-evolve with our tools. So we invent new tools and then our tools change us.

Lex Fridman (01:42:30) Which is fascinating to think about.

Jeff Bezos (01:42:32) It goes in a circle.

Lex Fridman (01:42:33) And there’s some aspect, even just inside business, where you don’t just make the customer happy, but you also have to think about where is this going to take humanity if you zoom out a bit?

Jeff Bezos (01:42:45) A hundred percent, and you can feel your brain. Brains are plastic, and you can feel your brain getting reprogrammed. I remember the first time this happened to me was when Tetris first came on the scene. Anybody who’s been a game player has this experience: you close your eyes, lay down to go to sleep, and you see all the little blocks moving and you’re kind of rotating them in your mind, and you can just tell as you walk around the world that you have rewired your brain to play Tetris. But that happens with everything. I think we still have yet to see the full repercussions of this, I fear, but I think one of the things that we’ve done online, largely because of social media, is we have trained our brains to be really good at processing super short form content.

(01:43:52) Your podcast flies in the face of this. You do these long format things.

Jeff Bezos (01:44:00) And reading books is a long format thing and if something is convenient, we do more of it. We carry around in our pocket a phone, and one of the things that phone does for the most part is it is an attention shortening device because most of the things we do on our phone shorten our attention spans. And I’m not even going to say we know for sure that that’s bad, but I do think it’s happening. That’s one of the ways we’re co-evolving with that tool. But I think it’s important to spend some of your time and some of your life doing long attention span things.

Lex Fridman (01:44:41) Yeah, I think you’ve spoken about the value in your own life of focus, of singular focus on a thing for prolonged periods of time, and that’s certainly what books do and that’s certainly what that piece of technology does. But I bring all that up to ask you about another piece of technology, AI, that has the potential to have various trajectories to have an impact on human civilization. How do you think AI will change us?

Jeff Bezos (01:45:14) If you’re talking about generative AI, large language models, things like ChatGPT, and its soon successors, these are incredibly powerful technologies. To believe otherwise is to bury your head in the sand, soon to be even more powerful. It’s interesting to me that large language models in their current form are not inventions, they’re discoveries. The telescope was an invention, but looking through it at Jupiter, knowing that it had moons, was a discovery. My God, it has moons. And that’s what Galileo did. And so, on that spectrum, this is closer to discovery than invention. We know exactly what happens with a 787, it’s an engineered object. We designed it. We know how it behaves. We don’t want any surprises. Large language models are much more like discoveries. We’re constantly getting surprised by their capabilities. They’re not really engineered objects.

(01:46:35) Then you have this debate about whether they’re going to be good for humanity or bad for humanity. Even specialized AI could be very bad for humanity. Just regular machine learning models can make certain weapons of war, that could be incredibly destructive and very powerful. And they’re not general AIs. They could just be very smart weapons. And so we have to think about all of those things. I’m very optimistic about this. So even in the face of all this uncertainty, my own view is that these powerful tools are much more likely to help us and save us even than they are to on balance hurt us and destroy us. I think we humans have a lot of ways we can make ourselves go extinct. These things may help us not do that, so they may actually save us. So the people who are overly concerned, in my view, overly, it is a valid debate. I think that they may be missing part of the equation, which is how helpful they could be in making sure we don’t destroy ourselves.

(01:48:07) I don’t know if you saw the movie Oppenheimer, but to me, first of all, I loved the movie and I thought the best part of the movie is this bureaucrat played by Robert Downey Jr, who some of the people I’ve talked to think that’s the most boring part of the movie. I thought it was the most fascinating because what’s going on here is you realize we have invented these awesome, destructive, powerful technologies called nuclear weapons and they’re managed and we humans, we’re not really capable of wielding those weapons. And that’s what he represented in that movie is here’s this guy, he wrongly thinks… he’s being so petty. He thinks that Oppenheimer said something bad to Einstein about him. They didn’t talk about him at all as you find out in the final scene of the movie. And yet he’s spent his career trying to be vengeful and petty.

(01:49:19) And that’s the problem. We as a species are not really sophisticated enough and mature enough to handle these technologies. And by the way, before you get to general AI and the possibility of AI having agency, there’s a lot of things that would have to happen, but there’s so much benefit that’s going to come from these technologies in the meantime, even before there is general AI, in terms of better medicines and better tools to develop more technologies and so on. So I think it’s an incredible moment to be alive and to witness the transformations that are going to happen. How quickly that will happen, no one knows. But over the next 10 years and 20 years, I think we’re going to see really remarkable advances. And I personally am very excited about it.

Lex Fridman (01:50:12) First of all, really interesting to say that it’s discoveries, that it’s true that we don’t know the limits of what’s possible with the current language models.

Lex Fridman (01:50:24) And it could be a few tricks and hacks here and there that open doors to whole new possibilities.

Jeff Bezos (01:50:33) We do know that humans are doing something different from these models, in part because we’re so power efficient. The human brain does remarkable things and it does it on about 20 watts of power. And the AI techniques we use today use many kilowatts of power to do equivalent tasks. So there’s something interesting about the way the human brain does this. And also we don’t need as much data. So self-driving cars, they have to drive billions and billions of miles to try to learn how to drive. And your average 16-year-old figures it out with many fewer miles. So there are still some tricks, I think, that we have yet to learn. I don’t think we’ve learned the last trick. I don’t think it’s just a question of scaling things up. But what’s interesting is that just scaling things up, and I put just in quotes because it’s actually hard to scale things up, but just scaling things up also appears to pay huge dividends.

Lex Fridman (01:51:40) Yeah. And there’s some more nuanced aspect about human beings that’s interesting, if it’s able to accomplish, like being truly original and novel. Large language models being able to come up with some truly new ideas, that’s one. And the other one is truth. It seems that large language models are very good at sounding like they’re saying a true thing, but they don’t require or often have a grounding in a mathematical truth; basically it’s a very good bullshitter. So if there’s not enough data in the training data about a particular topic, it’s just going to concoct accurate-sounding narratives, which is a very fascinating problem to try to solve: how do you get language models to infer what is true or not, to introspect?

Jeff Bezos (01:52:41) Yeah, they need to be taught to say, “I don’t know,” more often and I know several humans who could be taught that as well.

Lex Fridman (01:52:50) Sure. And then the other stuff, because you’re still a bit involved in the Amazon side with the AI things, the other open question is what kind of products are created from this?

Jeff Bezos (01:53:01) Oh, so many. We have Alexa and Echo and Alexa has hundreds of millions of installed base inputs. And so there’s Alexa everywhere. And guess what? Alexa is about to get a lot smarter. And so from a product point of view, that’s super exciting.

Lex Fridman (01:53:27) There’s so many opportunities there,

Jeff Bezos (01:53:30) So many opportunities. Shopping assistant, all that stuff is amazing. And AWS, we’re building Titan, which is our foundational model. We’re also building Bedrock, which is for our corporate clients at AWS, our enterprise clients. They want to be able to use these powerful models with their own corporate data without accidentally contributing their corporate data to that model. And so those are the tools we’re building for them with Bedrock. So there’s tremendous opportunity here.

Lex Fridman (01:54:03) Yeah, the security, the privacy, all those things are fascinating. Because so much value can be gained by training on private data, but you want to keep this secure. It’s a fascinating technical problem.

Jeff Bezos (01:54:13) Yes. This is a very challenging technical problem and it’s one that we’re making progress on and dedicated to solving for our customers.

Lex Fridman (01:54:21) Do you think there will be a day when humans and robots, maybe Alexa, have a romantic relationship like in the movie Her?

Jeff Bezos (01:54:29) Well, I think if you look at the-

Lex Fridman (01:54:31) Just brainstorming products here.

Jeff Bezos (01:54:32) … if you look at the spectrum of human variety and what people like, sexual variety, there are people who like everything. So the answer to your question has to be yes.

Lex Fridman (01:54:43) Okay. I guess I’m asking when-

Jeff Bezos (01:54:45) I don’t know how widespread that will be.

Jeff Bezos (01:54:48) But it will happen.

Productivity

Lex Fridman (01:54:49) I was just asking when for a friend, but it’s all right. Moving on. Next question. What’s a perfectly productive day in the life of Jeff Bezos? You’re one of the most productive humans in the world.

Jeff Bezos (01:55:03) Well, first of all, I get up in the morning and I putter. I have a coffee.

Lex Fridman (01:55:09) Can you define putter?

Jeff Bezos (01:55:11) I slowly move around. I’m not as productive as you might think I am. Because I do believe in wandering and I read my phone for a while. I read newspapers for a while. I chat with Lauren and I drink my first coffee. So I move pretty slowly in the first couple of hours. I get up early just naturally, and then I exercise most days. Most days it’s not that hard for me. Some days it’s really hard and I do it anyway, I don’t want to, and it’s painful. And I’m like, “Why am I here?” And I don’t want to do any of this.

Lex Fridman (01:55:52) “Why am I here at the gym?”

Jeff Bezos (01:55:53) “Why am I here at the gym? Why don’t I do something else?” It’s not always easy.

Lex Fridman (01:55:59) What’s your source of motivation in those moments?

Jeff Bezos (01:56:02) I know that I’ll feel better later if I do it. And so the real source of motivation, I can tell the days when I skip it, I’m not quite as alert. I don’t feel as good. And then there’s harder motivations. It’s longer term, you want to be healthy as you age. You want health span. Ideally, you want to be healthy and moving around when you’re 80 years old. And so there’s a lot of… But that kind of motivation is so far in the future, it can be very hard to work in the second. So thinking about the fact I’ll feel better in about four hours if I do it now, I’ll have more energy for the rest of my day and so on and so on.

Lex Fridman (01:56:42) What’s your exercise routine, just to linger on that? How much you curl? What are we talking about here? That’s all I do at the gym so I just…

Jeff Bezos (01:56:52) My routine on a good day, I do about half an hour of cardio and I do about forty-five minutes of weightlifting, resistance training of some kind, mostly weights. I have a trainer who I love who pushes me, which is really helpful. He’ll say, “Jeff, can we go up on that weight a little bit?”

(01:57:18) And I’ll think about it and I’ll be like, “No, I don’t think so.”

(01:57:23) And he’ll look at me and say, “Yeah, I think you can.” And of course he’s right.

Lex Fridman (01:57:31) Yeah, of course. Of course.

Jeff Bezos (01:57:32) So it’s helpful to have somebody push you a little bit.

Lex Fridman (01:57:34) But almost every day, you do that?

Jeff Bezos (01:57:37) Almost every day, I do a little bit of cardio and a little bit of weightlifting and I’d rotate. I do a pulling day and a pushing day and a leg day. It’s all pretty standard stuff.

Lex Fridman (01:57:48) So puttering, coffee, gym-

Jeff Bezos (01:57:49) Puttering, coffee, gym, and then work.

Lex Fridman (01:57:53) … work. But what’s work look like? What do the productive hours look like for you?

Jeff Bezos (01:57:59) So a couple years ago, I left as the CEO of Amazon, and I have never worked harder in my life. I am working so hard and I’m mostly enjoying it, but there are also some very painful days. Most of my time is spent on Blue Origin and I’m so deeply involved here now for the last couple of years. And in the big, I love it, and the small, there’s all the frustrations that come along with everything. We’re trying to get to rate manufacturing as we talked about. That’s super important. We’ll get there. We just hired a new CEO, a guy I’ve known for close to 15 years now, a guy named Dave Limp who I love. He’s amazing. So we’re super lucky to have Dave, and you’re going to see us move faster there.

(01:58:46) So my day of work, reading documents, having meetings, sometimes in person, sometimes over Zoom, depends on where I am. It’s all about the technology, it’s about the organization. I have architecture and technology meetings almost every day on various subsystems inside the vehicle, inside the engines. It’s super fun for me. My favorite part of it is the technology. My least favorite part of it is building organizations and so on. That’s important, but it’s also my least favorite part. So that’s why they call it work. You don’t always get to do what you want to do.

Lex Fridman (01:59:31) How do you achieve time where you can focus and truly think through problems?

Jeff Bezos (01:59:36) I do little thinking retreats. So this is not the only way, I can do that all day long. I’m very good at focusing. I don’t keep to a strict schedule. My meetings often go longer than I planned for them to because I believe in wandering. My perfect meeting starts with a crisp document. So the document should be written with such clarity that it’s like angels singing from on high. I like a crisp document and a messy meeting. And so the meeting is about asking questions that nobody knows the answer to and trying to wander your way to a solution. And when that happens just right, it makes all the other meetings worthwhile. It feels good. It has a kind of beauty to it. It has an aesthetic beauty to it, and you get real breakthroughs in meetings like that.

Lex Fridman (02:00:37) Can you actually describe the crisp document? This is one of the legendary aspects of Amazon, of the way you approach meetings is this, the six-page memo. Maybe first describe the process of running a meeting with memos.

Jeff Bezos (02:00:51) Meetings at Amazon and Blue Origin are unusual. When new people come in, like a new executive joins, they’re a little taken aback sometimes because the typical meeting, we’ll start with a six-page narratively structured memo and we do study hall. For 30 minutes, we sit there silently together in the meeting and read.

Jeff Bezos (02:01:17) Take notes in the margins. And then we discuss. And the reason, by the way, we do study hall is, you could say, I would like everybody to read these memos in advance, but the problem is people don’t have time to do that. And they end up coming to the meeting having only skimmed the memo or maybe not read it at all, and they’re trying to catch up. And they’re also bluffing like they were in college, having pretended to do the reading.

Jeff Bezos (02:01:43) It’s better just to carve out the time for people.

Lex Fridman (02:01:47) Yeah. And do it together.

Jeff Bezos (02:01:47) So now we’re all on the same page, we’ve all read the memo, and now we can have a really elevated discussion. And this is so much better than having a slideshow presentation, a PowerPoint presentation of some kind, which has so many difficulties. But one of the problems is PowerPoint is really designed to persuade. It’s kind of a sales tool. And internally, the last thing you want to do is sell. Again, you’re truth seeking. You’re trying to find truth. And the other problem with PowerPoint is it’s easy for the author and hard for the audience. And a memo is the opposite. It’s hard to write a six-page memo. A good six-page memo might take two weeks to write. You have to write it, you have to rewrite it, you have to edit it, you have to talk to people about it. They have to poke holes in it for you. You write it again, it might take two weeks. So for the author, it’s really a very difficult job, but for the audience it’s much better.

(02:02:45) So you can read for a half hour, and there are other little problems with PowerPoint presentations too. Senior executives interrupt with questions halfway through the presentation. That question’s going to be answered on the next slide, but you never got there. If you read the whole memo in advance… I often write lots of questions that I have in the margins of these memos, and then I go cross them all out because by the time I get to the end of the memo, they’ve been answered. That’s how I save all that time.

(02:03:11) You also get, with the person who’s preparing the memo, and we talked earlier about groupthink and the fact that I go last in meetings and that you don’t want your ideas to pollute the meeting prematurely, the author of the memo has got to be very vulnerable. They’ve got to put all their thoughts out there and they’ve got to go first. But that’s great because it makes them really good. And you get to see their real ideas and you’re not trampling on them accidentally in a big PowerPoint presentation meeting.

Lex Fridman (02:03:50) What’s that feel like when you’ve authored a thing and then you’re sitting there and everybody’s reading your thing?

Jeff Bezos (02:03:54) I think it’s mostly terrifying.

Lex Fridman (02:03:57) Yeah. But maybe in a good way? Like a purifying?

Jeff Bezos (02:04:02) I think it’s terrifying in a productive way, but I think it’s emotionally, a very nerve-racking experience.

Lex Fridman (02:04:13) Is there an art, a science to the writing of this six-page memo, or just writing in general, to you?

Jeff Bezos (02:04:20) It’s really got to be a real memo. So it means paragraphs have topic sentences. It’s verbs and nouns. That’s the other problem with PowerPoint presentations, they’re often just bullet points. And you can hide a lot of sloppy thinking behind bullet points. When you have to write in complete sentences with narrative structure, it’s really hard to hide sloppy thinking. So it forces the author to be at their best, and so you’re getting somebody’s really best thinking. And then you don’t have to spend a lot of time trying to tease that thinking out of the person, and you’ve got it from the very beginning. So it really saves you time in the long run.

Lex Fridman (02:05:03) So that part is crisp, and then the rest is messy. Crisp document, messy meeting.

Jeff Bezos (02:05:07) Yeah, so you don’t want to pretend that the discussion should be crisp. Most meetings, you’re trying to solve a really hard problem. There’s a different kind of meeting, which we call weekly business reviews or business reviews that may be weekly or monthly or daily, whatever they are. But these business review meetings, that’s usually for incremental improvement. And you’re looking at a series of metrics, every time it’s the same metrics. Those meetings can be very efficient. They can start on time and end on time.

Future of humanity

Lex Fridman (02:05:35) So we’re about to run out of time, which is a good time to ask about the 10,000-Year Clock.

Lex Fridman (02:05:44) Yes, that’s what I’m known for, is the humor. Okay. Can you explain what the 10,000-Year Clock is?

Jeff Bezos (02:05:53) The 10,000-Year Clock is a physical clock of monumental scale. It’s about 500 feet tall. It’s inside a mountain in west Texas, in a chamber that’s about 12 feet in diameter and 500 feet tall. The 10,000-Year Clock is an idea conceived by a brilliant guy named Danny Hillis way back in the ’80s. The idea is to build a clock as a symbol for long-term thinking. And you can kind of just very conceptually think of the 10,000-Year Clock as it ticks once a year, it chimes once every hundred years, and the cuckoo comes out once every thousand years. So it just sort of slows everything down. And it’s a completely mechanical clock. It is designed to last 10,000 years with no human intervention. So the material choices and everything else. It’s in a remote location, both to protect it, but also so that visitors have to make a pilgrimage.

(02:06:57) The idea is that over time, and this will take hundreds of years, but over time, it will take on the patina of age, and then it will become a symbol for long-term thinking that will actually hopefully get humans to extend their thinking horizons. And in my view, that’s really important as we have become, as a species, as a civilization, more powerful. We’re really affecting the planet now. We’re really affecting each other. We have weapons of mass destruction. We have all kinds of things where we can really hurt ourselves and the problems we create can be so large. The unintended consequences of some of our actions like climate change, putting carbon in the atmosphere is a perfect example. That’s an unintended consequence of the Industrial Revolution, got a lot of benefits from it, but we’ve also got this side effect that is very detrimental.

(02:07:56) We need to start training ourselves to think longer term. Long-term thinking is a giant lever. You can literally solve problems if you think long-term, that are impossible to solve if you think short-term. And we aren’t really good at thinking long-term. Five years is a tough timeframe for most institutions to think past. And we probably need to stretch that to 10 years and 15 years and 20 years and 25 years, and we’d do a better job for our children or our grandchildren if we could stretch those thinking horizons. And so the clock, in a way, it’s an art project, it’s a symbol. And if it ever has any power to influence people to think longer term, that won’t happen for hundreds of years, but we are going to build it now and let it accrue the patina of age.

Lex Fridman (02:08:52) Do you think humans will be here when the clock runs out here on earth?

Jeff Bezos (02:08:56) I think so. But the United States won’t exist. Whole civilizations rise and fall. 10,000 years is so long. No nation state has ever survived for anywhere close to 10,000 years.

Lex Fridman (02:09:12) And the increasing rate of progress makes that even less likely.

Jeff Bezos (02:09:15) Even less likely so. Do I think humans will be here? Yes. How will we have changed ourselves and what will we be and so on and so on? I don’t know, but I think we’ll be here.

Lex Fridman (02:09:25) On that grand scale, a human life feels tiny. Do you ponder your own mortality? Are you afraid of death?

Jeff Bezos (02:09:32) No. I used to be afraid of death. I did. I remember as a young person being very scared of mortality, didn’t want to think about it, and so on. And as I’ve gotten older, I’m 59 now, as I’ve gotten older, somehow that fear has sort of gone away. I would like to stay alive for as long as possible, but I’m really more focused on health span. I want to be healthy. I want that square wave. I want to be healthy, healthy, healthy, and then gone. I don’t want the long decay. And I’m curious. I want to see how things turn out. I’d like to be here. I love my family and my close friends, and I’m curious about them, and I want to see. So I have a lot of reasons to stay around, but mortality doesn’t have that effect on me that it did maybe when I was in my twenties.

Lex Fridman (02:10:38) Well, Jeff, thank you for creating Amazon, one of the most incredible companies in history, and thank you for trying your best to make humans a multi-planetary species, expanding out into our solar system, maybe beyond, to meet the aliens out there. And thank you for talking today.

Jeff Bezos (02:10:55) Lex, thank you for doing your part to lengthen our attention spans. Appreciate that very much.

Lex Fridman (02:11:04) I’m doing my best. Thanks for listening to this conversation with Jeff Bezos. To support this podcast, please check out our sponsors in the description. And now let me leave you with some words from Jeff Bezos himself. Be stubborn on vision, but flexible on the details. Thank you for listening and hope to see you next time.

埃隆·马斯克:战争、人工智能、外星人、政治、物理、视频游戏和人类 (2023-11-10)

Elon Musk: War, AI, Aliens, Politics, Physics, Video Games, and Humanity (2023-11-10)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:在俄乌、巴以冲突持续发酵,及其自创的 AI 公司 xAI 发布首款模型 Grok 的背景下,身兼多重 CEO 角色的伊隆·马斯克(Elon Musk)与播客主 Lex Fridman 展开第四次深度对话。
  • 核心论点:本次对话的核心,是马斯克运用“物理学第一性原理”思维,对人类社会两大终极命题——冲突智能——进行解构与重塑。他认为,无论是地缘政治的仇恨循环,还是人工智能的无限扩张,其底层都受制于物理世界的法则与约束。他主张用反直觉的“显性善意”(Conspicuous Acts of Kindness)打破战争的“以眼还眼”逻辑,同时指出AI发展的真正瓶颈将从算力芯片转向电力能源。最终,他将 xAI 探索宇宙智能、特斯拉发展现实世界 AI(自动驾驶与机器人)以及 SpaceX 实现多行星生存,统一在确保人类意识这束“宇宙中的微光”得以延续的宏大叙事之下,展现了一个在混乱中寻求确定性法则的工程师世界观。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:反直觉地缘政治学——“恐怖分子生产函数”

  • 核心观点:在处理像巴以这样的非对称冲突时,传统军事打击的有效性应被一个新的指标取代:“你每消灭一个恐怖分子,会制造出多少个新的恐怖分子?” (For every Hamas member that you kill, how many did you create?) 如果这个比率大于一,那么军事行动在战略上就是失败的。因此,最佳对策是执行“显性的、不容置疑的善举”(Conspicuous acts of kindness that are unequivocal),例如提供超乎预期的人道援助,以此釜底抽薪,瓦解仇恨的根基。
  • 原理解构:该观点本质上是一种博弈论的反制策略。马斯克洞察到,哈马斯等组织的策略是“通过暴行激起过度反应”(provoke an overreaction),从而利用对方的激烈报复来团结和动员更广泛的同情者。以色列若陷入“以眼还眼”的报复循环,恰好落入了对方的战略陷阱。而“显性善意”则是非对称地打破这个循环,直接作用于对方的舆论基础和人员招募池,从根本上改变冲突的动力学模型。这并非简单的“打左脸给右脸”,而是精准打击辅以压倒性的善意,旨在赢得长期民心,降低恐怖主义的“再生率”。
  • 证据/案例:马斯克引用了历史类比:第一次世界大战后,苛刻的**《凡尔赛条约》**对德国施加了沉重的赔款,埋下了复仇的种子,最终导致了第二次世界大战。相反,二战后美国实施的**“马歇尔计划”**,通过帮助重建欧洲(包括德国),则成功地避免了新一轮的仇恨循环,奠定了长期的和平。
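
若将这一判据写成一个极简的不等式(仅为示意性转述,并非对话原文),即:

$$ R \;=\; \frac{N_{\text{新增的仇恨者}}}{N_{\text{被消灭的成员}}}\,, \qquad R > 1 \;\Rightarrow\; \text{该军事行动在战略上失败} $$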

维度二:AI 发展的物理瓶颈——从硅到电网的三步曲

  • 核心观点:人工智能的指数级发展正面临一系列可预测的物理世界瓶颈,其演进顺序为:1. 硅基芯片短缺 (当前) → 2. 变压器短缺 (约1年后) → 3. 电力本身短缺 (约2年后)。
  • 原理解构:这揭示了 AI 竞赛的本质,在算法之外,更是一场关于能源和工业基础设施的竞赛。马斯克将问题从软件层面拉回到了物理现实。
    1. 芯片短缺:目前行业共识,即高端 GPU(如 NVIDIA H100/A100)的供应是训练大模型的主要瓶颈。
    2. 变压器短缺:这是一个被行业普遍忽视的中间环节。数据中心需要巨量的高压到低压的降压变压器(Voltage step-down transformers)来为芯片供电。这是一个增长缓慢的传统工业领域,其产能无法匹配 AI 行业爆炸性的需求增长。马斯克在此处指出了一个精妙的讽刺:“你需要 transformers(变压器)来运行 transformers(AI模型)”。
    3. 电力短缺:最终瓶颈是绝对的能源供应。随着交通和供暖全面电气化,再加上 AI 计算的巨大能耗,整个社会的电力需求将增长三倍。这将迫使能源行业进行根本性变革,包括建设更多发电厂和大规模部署储能电池。
  • 证据/案例:马斯克提到他曾对全球电力公司高管发表演讲,警告他们为电力需求翻三倍做好准备。这一预测的背后,是特斯拉在电动车和储能业务中积累的对电网负荷的深刻理解。

维度三:特斯拉的“节俭 AGI”路径——现实世界的效率为王

  • 核心观点:与数据中心通过“暴力计算”训练 LLM 的路径不同,特斯拉 Autopilot 正在开辟一条更高效、更接近生物智能的 AGI 路径。其核心优势源于物理约束:必须在仅有 100 瓦功耗的车载计算机上理解复杂的现实世界。
  • 原理解构:“光子输入,控制输出”(Photons in, controls out)是马斯克对具身智能(Embodied AI)的高度概括。特斯拉的 AI 被迫在极低的功耗预算下,仅通过视频输入(光子)学习驾驶(控制),这使其算法必须极其高效。这种效率优势将直接迁移到其人形机器人 Optimus 上。他认为,人脑的高级思维功耗不到 10 瓦,却能创作出比 10 兆瓦 GPU 集群更优秀的小说,这表明当前 AI 的“蛮力”模式远非终局,“每瓦有效算力”(Useful compute per watt)才是最终的衡量标准。
  • 证据/案例:Autopilot 在没有经过专门训练的情况下,通过观察海量视频,自发学会了识别交通标志上的文字。这证明了其端到端神经网络正在形成一个关于物理世界的内在模型(World Model)。此外,马斯克透露,为 Optimus 机器人开发的几乎所有部件(尤其是执行器/电机)都需从零开始设计,因为市面上没有任何现成产品能满足大规模、低成本生产的需求。
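
按文中给出的数量级做一个粗略换算(仅为示意,两个功率数字均取自上文的概括性说法,并非精确测量):

$$ \eta \;=\; \frac{\text{有效算力}}{\text{功率}}\,, \qquad \frac{P_{\text{GPU 集群}}}{P_{\text{人脑}}} \;\approx\; \frac{10\ \text{MW}}{10\ \text{W}} \;=\; 10^{6} $$

也就是说,在“每瓦有效算力”这一指标上,当前的暴力计算路线与生物智能之间相差约六个数量级,这正是马斯克强调效率路线的底层依据。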

维度四:Community Notes 的魔力——对抗极化的算法设计

  • 核心观点:X 平台的“社群笔记”(Community Notes)之所以有效,其“魔法”在于一个核心算法机制:只有那些在历史上观点相左的用户达成共识,笔记才会被显示出来。
  • 原理解构:这是一种基于“异议者共识”(Disagreeing Raters’ Consensus)的信任机制。系统首先通过用户过去对其他笔记的评分历史,在多维向量空间中为每个用户建立一个“观点画像”。它并非简单的“左派”或“右派”标签,而是更复杂的观点集群。当一条笔记需要被评估时,只有当来自不同甚至对立观点集群的用户都认为该笔记有价值时,它才会被采纳和展示。这个机制巧妙地利用了分歧来过滤偏见,从而产生出乎意料的、高质量的、中立的事实核查。
  • 证据/案例:马斯克强调,该系统所有代码和数据 100% 开源,确保了其透明度和不可操纵性。他本人也无法凭一己之力更改一条社群笔记,这保证了机制的鲁棒性。在即将到来的 2024 年美国大选中,这将是 X 平台宣称其中立性的核心工具。
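
为帮助理解“异议者共识”的筛选逻辑,下面给出一个极简的 Python 示意(纯属概念化简:评分者与数据均为虚构,观点画像被压缩成一维分数;X 的开源实现基于矩阵分解等更复杂的模型,此处并非其真实代码):

```python
from itertools import combinations

# 虚构示例数据:每位评分者的历史“观点画像”被简化为一维分数
# (负值与正值分别代表两个对立的观点集群;真实系统使用多维向量)
rater_viewpoint = {"alice": -0.8, "bob": 0.9, "carol": -0.6, "dave": 0.7}

# 某条社群笔记收到的评分:1 = 有帮助,0 = 无帮助
ratings = {"alice": 1, "bob": 1, "carol": 0, "dave": 1}

def should_show(ratings, viewpoints, gap=0.5):
    """仅当至少一对观点相左的评分者都认为“有帮助”时,才展示该笔记。"""
    helpful_raters = [r for r, score in ratings.items() if score == 1]
    for a, b in combinations(helpful_raters, 2):
        opposite = viewpoints[a] * viewpoints[b] < 0          # 分属不同观点集群
        far_apart = abs(viewpoints[a] - viewpoints[b]) > gap  # 且观点差距足够大
        if opposite and far_apart:
            return True
    return False

# alice(-0.8)与 bob(+0.9)跨阵营同时认可,因此笔记会被展示
print(should_show(ratings, rater_viewpoint))  # True
```

该示意的关键在于:`should_show` 只有在“跨阵营”的评分者同时认可时才返回 True,单一阵营内部的一致好评并不足以让笔记被展示,这正是其利用分歧过滤偏见的机制所在。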

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • 善意是最高效的地缘政治武器:在主流强调威慑和武力的国际关系语境下,马斯克提出的“显性善意”作为一种主动的、战略性的工具,是极具颠覆性的。
    • 永恒的和平可能是一种反乌托邦:马斯克借由《美丽新世界》中的“Soma”药物隐喻,对“世界和平”这一看似无可指摘的理想提出了警示。他认为,一个没有冲突、痛苦和仇恨的世界,可能是一个停滞、僵化、失去生命活力的世界。这挑战了人们对乌托邦的简单想象。
    • 监管的荒诞性:他用 SpaceX 被要求评估火箭击中鲨鱼的概率,甚至给海豹戴上耳机测试音爆影响的真实案例,辛辣地讽刺了监管体系可能陷入的官僚主义僵局,指出其有时会关注微不足道的风险,而忽略了如 AI 这样的文明级风险。
  • 盲点与局限

    • OpenAI 的名不副实:马斯克尖锐地指出,他作为联合创始人的 OpenAI,其名字中的“Open”意指“开源”,初心是作为非营利组织抗衡谷歌。而如今它已变成一个“闭源、追求最大利润”的实体,他认为这是一种背离。
    • SEC(美国证券交易委员会)的“监管俘获”:他痛斥 SEC 对做空特斯拉的对冲基金视而不见,却利用威胁切断银行信贷的手段(这会让特斯拉瞬间破产)逼迫他就“资金已落实”推文达成和解。他认为这是监管机构激励机制失调、服务于自身职业前途而非公众利益的典型案例。
  • 未解之谜

    • 意识的本质:尽管在构建强大的 AI,马斯克承认我们对“思想”、“情感”和“意识”的本质仍一无所知。他推测,我们可能遗漏了某些基本的东西,“它感觉不仅仅是原子与原子的碰撞”。AI 的发展或能帮助我们探索这个终极问题。
    • 费米悖论与大过滤器:宇宙为何如此寂静?马斯克倾向于认为,成为多行星文明是一个“大过滤器”(Great Filter),无数单行星文明可能在越过这个门槛前就已灭绝。这为他发展 SpaceX 提供了强烈的紧迫感。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “For every Hamas member that you kill, how many did you create? And if you create more than you killed, you’ve not succeeded.”

    • 中文意译:“你每消灭一个哈马斯成员,会制造出多少个新的?如果你制造的比消灭的还多,你就没有成功。”
    • 语境:在讨论巴以冲突时,以此定义衡量军事行动战略成败的全新标准。
  2. “Physics is the law, everything else is a recommendation. I’ve seen plenty of people break the laws made by man, but none break the laws made by physics.”

    • 中文意译:“物理学是铁律,其他一切都只是建议。我见过无数人打破人定的法律,但从未见过有人能打破物理定律。”
    • 语境:在解释 xAI 致力于追求真理时,阐述其将物理学作为最高准绳的世界观。
  3. “We have a silicon shortage now that will transition to a voltage transformer shortage in about a year, and then just electricity shortages in general in about two years.”

    • 中文意译:“我们现在面临硅芯片短缺,大约一年后将转变为变压器短缺,再过两年就是普遍的电力短缺。”
    • 语境:以清晰的时间线,预言了限制 AI 发展的下一系列物理瓶颈。
  4. “My mind is a storm and I don’t think most people would want to be me. They may think they would want to be me, but they don’t. They don’t know, they don’t understand.”

    • 中文意译:“我的脑中是一场风暴,我不认为大多数人会想成为我。他们可能以为自己想,但他们不会。他们不知道,也不理解。”
    • 语境:在被问及个人困境时,罕见地袒露了内心世界的巨大压力与不被理解的孤独感。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • AI 投资风向转变:资本将从纯粹的算法和模型公司,更多地流向AI 基础设施的物理层,包括先进芯片制造、电力设备(特别是变压器)、新能源发电和储能技术。
    • AI 效率竞赛开启:随着能源成本和供应限制日益突出,“每瓦有效算力”将成为衡量 AI 模型优劣的关键指标,小型、高效的模型和算法将获得更多关注。
    • 社交媒体事实核查:在 2024 年美国大选的极端政治环境下,X 平台的 Community Notes 将面临终极压力测试,其成败将对未来社交媒体的内容治理模式产生深远影响。
  • 长期终局 (5-10年)

    • AI 范式融合:如果马斯克的设想成真,我们将见证基于语言的数字智能(LLMs)与基于物理世界的具身智能(机器人、自动驾驶)的大融合。最终胜出的 AGI,将是既能理解抽象概念又能高效与物理世界互动的形态。
    • 劳动力市场重塑:低成本、可大规模制造的人形机器人(如 Optimus)如果成功,将彻底颠覆制造业、物流、家政服务等领域的劳动力结构,引发深刻的社会经济变革。
    • 文明风险管理成为核心议题:人类将进入一个地缘政治、AI 安全与太空探索紧密交织的时代。成为多行星物种不再是科幻梦想,而被视为对冲地球内部(战争、AI失控)和外部(小行星撞击)风险的必要保险。
  • 行动建议

    • 开发者/工程师:应将算法效率和能源效率置于核心位置。在开发 AI 应用时,不仅要考虑其功能,更要思考其在受限硬件和能源预算下的性能表现。具身智能领域(机器人、无人机)将是理论与实践结合的黄金赛道。
    • 投资者:建立一个超越软件的投资组合。在追逐下一个 AI 应用独角兽的同时,布局“卖铲子”的行业,特别是那些被忽视但至关重要的工业环节,如电力设备、特种材料和机器人核心部件。
    • 创业者:马斯克明确指出的瓶颈就是最大的机遇。无论是为数据中心设计更高效的变压器,开发新型储能方案,还是为机器人制造性价比更高的执行器,都是潜在的蓝海市场。同时,在应用层,开发能与物理世界深度交互、解决实际问题的 AI 产品,将比纯粹的数字内容生成更具长期价值。

这份深度研报基于对 Elon Musk 与 Lex Fridman 对话的重构与分析。此次对话发生在全球地缘政治紧张、AI 技术突飞猛进以及 X(原 Twitter)变革的关键时期。


深度研报:物理学第一性原理下的文明、AI 与博弈论

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:这是 Elon Musk 第四次参加 Lex Fridman 访谈。此时的 Musk 正处于多重旋涡中心:X 的算法透明化改革、xAI 旗下 Grok 的发布、Tesla FSD 向端到端神经网络的转型,以及他在俄乌、巴以冲突等全球事务中日益增长的影响力。
  • 核心论点:Musk 展示了一种高度统一的世界观——将物理学的第一性原理应用于人类文明的各个尺度。他认为,无论是解决地缘政治冲突、开发 AGI(通用人工智能),还是构建多行星文明,核心逻辑都在于:通过最大化“意识的规模”来对抗系统熵增。他主张用“引人注目的善行”重构国际博弈准则,用“物理真实”约束 AI 的幻觉,并警告人类若无法突破单一行星的束缚,文明终将枯竭或因内耗而毁灭。

2. 🧠 深度观点解析 (Deep Dive Analysis)

I. 军事博弈的新范式:以“超预期善行”瓦解仇恨回路

  • 核心观点:地缘政治应引入“非对称善意”来中断报复循环。
  • 原理解构:Musk 分析了哈马斯的策略——通过挑衅引发过激反应以换取全球支持。他提出,以色列的逻辑不应仅是“以眼还眼”,而应通过“透明、显眼的善行”(如建设移动医院、确保基础生活物资)来瓦解恐怖组织的新兵招募基础。
  • 证据/案例:他引用了一战后《凡尔赛条约》对德国的压迫导致二战,对比二战后“马歇尔计划”重建德日带来的长期和平。他认为:“每杀掉一个哈马斯成员,如果同时制造了更多仇恨者,那么军事上就是失败的。”

II. xAI 的底层逻辑:物理学作为“真相”的最终仲裁者

  • 核心观点:AI 必须首先理解物理定律,才能拥有真正的逻辑一致性。
  • 原理解构:当前大语言模型(LLM)的本质是“Token 预测”,这导致了不可控的幻觉积累。Musk 认为 Grok 的差异化在于:它试图将推理路径追溯至物理第一性原理和数学逻辑。
  • 技术洞察:他指出 AI 的进化路径是“先让它工作,再让它高效”。Tesla 正在用 100 瓦的低功耗计算实现 FSD 驾驶,这种对“瓦特效率”的极致追求是通往 AGI 的必经之路。

III. 基础设施的滞后性:从芯片短缺到能源瓶颈的演化

  • 核心观点:AI 的算力竞赛将很快演变为全球电力和变压器的供应灾难。
  • 逻辑链条
    1. 当前阶段:硅片(芯片)短缺。
    2. 未来 1 年:(降压)变压器短缺(他戏称之为 “Transformers for Transformers”)。
    3. 未来 2 年:电力供应短缺。
  • 证据/案例:他预测随着交通和供暖的全面电气化,全球电力需求将增长 3 倍。如果电力公司不加速建设电厂和储能系统(Batteries),文明的算力天花板将提前到来。

IV. 社交媒介的算法重构:从“停留时长”转向“无悔分钟”

  • 核心观点:社交媒体的成功指标应是用户在关闭应用后不感到后悔。
  • 商业模式重构:X 的目标是建立端到端的神经网络推荐。Musk 提出用“向量空间匹配”取代传统的启发式规则,并强调回复(Replies)的重要性,认为高价值的回复应获得与主贴同等的权重,从而过滤噪音,提升“信息密度”。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识:和平的代价可能是“停滞” Musk 挑战了“绝对和平即终极目标”的看法。他引用《美丽新世界》中的 Soma(苏摩),指出如果为了消除冲突而抑制人类的情感波动和竞争,可能会导致一个僵化、无生机的社会。“一些小规模的摩擦或许是防止社会制度性腐败的必要成本。”
  • 盲点与局限:媒体数据的污染 他坦承 Grok 仍然从传统媒体中吸取了过多的错误信息(如关于其个人法律纠纷的报道)。他认为 AI 的训练集需要从“媒体报道”下沉到“原始法律文书”和“实验数据”,因为报道本身往往带有流量驱动的偏见。
  • 未解之谜:意识的维度 Musk 承认目前科学无法解释为何“原子在特定排列下会产生情感”。他提出一个有趣的猜想:如果世界是模拟的,那么造物主运行模拟的目的正是为了“看看到底会发生什么”,即结果是不可预测的,这赋予了人类自由意志以意义

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Physics is the law, everything else is a recommendation.” (物理学是法律,其余一切只是建议。)——语境:讨论法律、监管与宇宙底层逻辑的优先级。
  2. “An eye for an eye makes everyone blind.” (以眼还眼,世界只会变得盲目。)——语境:评价中东局势及循环报复的死结。
  3. “I’m pathologically optimistic on schedule, but I always deliver in the end.” (我在时间进度上具有病态的乐观,但我最终总能交付。)——语境:回应 Grok 对其预测频繁推迟的嘲讽。
  4. “Killing the demons in a video game calms the demons in my mind.” (杀死游戏里的恶魔,平息了我心中的风暴。)——语境:讨论通过《暗黑破坏神 4》的高强度对抗来缓解压力。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 能源股机会:公用事业和电网基础设施将迎来爆发。AI 不再只是软件游戏,而是重资产的能源博弈。
    • 具身智能突破:Tesla 的 Optimus 机器人将证明,在大规模制造支撑下,硬件成本将低于汽车,人形机器人将率先在工厂实现商业闭环。
  • 长期终局 (5-10年)
    • 大筛选器(Great Filter):人类文明处于能否成为“多行星生物”的关键窗口期。如果意识无法扩展到火星,地球资源耗尽或内战导致的坠落将是确定性结局。
    • AI 协作模式:开源 AI(如 Musk 主张的)将与闭源巨头(如 OpenAI/Google)形成制衡,防止 AGI 成为极少数机构控制文明的武器。
  • 行动建议
    • 开发者:关注“推理成本”和“瓦特效率”而非仅仅是大模型参数规模。
    • 投资者:寻找在电力基础设施、变压器制造以及具有端到端自动驾驶能力的硬件公司。
    • 创业者:在算法中引入“无悔设计(Unregretted Design)”,对抗当今泛滥的成瘾性设计。

分析师: 技术评论专家 结论: 本次对话展示了 Musk 试图通过科技加速文明进程,同时通过重构博弈论逻辑来减缓冲突熵增的宏大野心。

埃隆·马斯克与 Lex Fridman 的第四次深度对话:关于战争、AI 物理极限与文明生存

1. 🎯 核心论题与背景

  • 对话背景:作为全球最具影响力的企业领袖与“技术先知”,埃隆·马斯克(Elon Musk)罕见地深度剖析了地缘政治的本质。这次对话与其说是一次采访,不如说是一次跨越科技、哲学与生物学的思想实验。对话发生在持续动荡的国际局势(以哈冲突、俄乌战争、中美博弈)与 AI 领域激烈竞争的背景之下。
  • 核心论点:马斯克的核心世界观基于一种“生物学宿命论与人文救赎论的辩证统一”。他一方面承认战争既源于人类原始“边缘系统”的本能,也受大国经济追赶(catch-up growth)等结构性规律的驱动,因而在可预见的未来难以完全消除;另一方面,他提出了一个极具争议但又充满智慧的地缘政治解药——“显性的善意”,认为只有通过超越仇恨的、大规模且无私的援助,才能打破“以眼还眼”的暴力循环。同时,他将 AI 视为理解宇宙的重器,坚持**“物理学是唯一的法律”**(Physics is the only law),并警示当前的增长模式因电网和硬件瓶颈而面临崩溃风险。

2. 🧠 深度观点解析

A. 战争的本质:是本性,也是机制

  • 核心观点:战争不仅源于政治,更是人类生物本能的体现。丛林本身就是四面八方的谋杀与死亡,人类社会并不比它天然更优越。
  • 原理解构:马斯克用**边缘系统(Limbic System)**解释人类的暴力冲动:人类虽然智力进化,但底层仍保留着黑猩猩式的追求支配权的本能。在黑猩猩社会中不存在“正义的战争”,只有强权与基因筛选。
  • 证据/案例
    • 生物学对比:引用沃纳·赫尔佐格关于雨林的说法,指出那里“四面八方皆是谋杀与死亡”。
    • 历史哲学:提到修昔底德陷阱,指出当美国这种“最大的孩子”遇到经济规模可能达到其两三倍的挑战者时,冲突在历史上往往难以避免;雅典与斯巴达即便预见了战争也未能避免。

B. 地缘政治的解药:“显性的善意”

  • 核心观点:在处理历史积怨时,唯一的可行策略是主动实施某种会让敌人无法辩驳、必须承认的巨大善意(Conspicuous Acts of Kindness)
  • 原理解构:针对“以眼还眼”的复仇逻辑,马斯克认为,若报复制造出的仇恨者多于被消灭者,反抗只会加倍。只有通过非对称的、公开透明的、高成本的善意(如建设移动医院、保障物资供应),才能从根本上改写对方民众的记忆与叙事,瓦解其战争动员逻辑。
  • 证据/案例
    • 以哈冲突:推演哈马斯的目的是激怒以色列,因此最有效的反制是最强烈的善意
    • 一战与二战:对比凡尔赛条约(强制索赔培养了希特勒)和马歇尔计划(无条件的重建赢得了欧洲支持)的后果。

C. AI 的物理极限:效率为王

  • 核心观点:AI 不仅关乎模型大小,更关乎**“有用算力/瓦特”**。当前的摩尔定律失效了,取而代之的是对**电力和磁性材料**的争夺。
  • 原理解构:马斯克质疑单纯堆砌算力的路线,认为**“现实是唯一的裁判”**(Physics is the judge)。他指出一个带双关意味的技术瓶颈:**“用变压器驱动 Transformer”**,即 AI 模型(Transformer)的训练依赖大量电力变压器(transformer)供电,而后者的产能增长远跟不上需求。
  • 证据/案例
    • 维度的竞争:指出短期内 AI 的电力需求远不如电动车(EV)和供暖的激增,但这将导致未来 2-3 年出现电力短缺。
    • 能源结构预测:预测交通与供暖全面电气化将使全球电力需求激增三倍

D. 监管迷思与“鱼许可证”

  • 核心观点:现代监管机构在面对高科技项目时,往往陷入官僚主义的荒诞,表现出形式主义的关怀(如保护鲨鱼或海豹)。
  • 原理解构:马斯克通过 SpaceX 发射监管的滑稽案例,揭示了监管俘获的根源——律师为了职业晋升而非公共利益行事。
  • 证据/案例
    • Seal with Headphones:为了测试发射音爆是否惊吓海豹,监管机构甚至要求让海豹戴着耳机、被绑在板上接受测试,且做了两次。

E. 人类社会的政治光谱:不想要“永远的和平”

  • 核心观点:人类需要冲突来保持进化活力。完全的和平(如《美丽新世界》中的索马)、没有痛苦的幸福,可能导致社会的僵化与停滞
  • 原理解构:他在接受 Lex Fridman 询问“你是否后悔成为政治人物”时表示,他在与“觉醒病毒”作战,但这仍是一种**“求真”**政治,目的是反平庸、反审查。
  • 证据/案例:引用《美丽新世界》,认为依靠定量发放的快乐药物(索马)来消除痛苦,最终会让文明走向僵化与消亡;他也提到《1984》与《美丽新世界》的区别:前者靠恐惧统治,后者靠麻醉统治。

3. 💡 反直觉与批判性视角

  • 打破共识—— 战争是生物性且不可避免的:大众普遍认为通过普世价值可以消除战争,但马斯克指出我们仍生活在“黑猩猩社会”的边缘,追求统治权的生物引擎并未关闭。
  • 盲点与局限—— “没有痛苦就没有快乐”:马斯克认为一定程度的冲突与痛苦是文明保持活力的必要组成部分,这一观点挑战了现代福利社会的核心预设。从激进进化论的角度看,他暗示极端的和平甚至可能导致物种的退化与停滞。
  • 对监管批判的简化:虽然他抨击了监管的官僚主义,但他似乎低估了复杂系统安全监管的必要性(如核武器监管的初衷是防止意外触发),这种“直接务实”的态度在处理系统性崩塌风险时可能不够敏锐。

4. 💎 金句与高光时刻

  • “The jungle is basically just murder and death in every direction.”
    • 语境:引用沃纳·赫尔佐格关于自然界的观点,嘲讽人类以为自己脱离了兽性,实则依然脆弱。
  • “If you kill somebody’s child in Gaza, you’ve made at least a few Hamas members who will die just to kill an Israeli. That’s the situation.”
    • 语境:分析以哈冲突的长期后果,指出报复性打击的本质逻辑是制造更多的未来反叛者。
  • “Even if we are in a simulation, the reason that these higher beings would hold a simulation is to see what happens. They don’t know what happens otherwise.”
    • 语境:探讨存在主义,解释“模拟理论”背后的终极推论:上帝(或造物者)也不全知全能,他们通过模拟观察演化结果。
  • “Unregretted minutes. That is the goal. It’s fascinating, because regret is a fascinating concept.”
    • 语境:定义 X 平台算法的未来指标,不再追求浏览时长(流量),而是追求用户观看后不后悔的内容。
  • “Physics is the only law. Everything else is a recommendation.”
    • 语境:重申他对咨询公司和媒体的不屑,真理永远掌握在物理学定律中。

5. 🚀 行业启示与未来推演

  • 短期影响 (1-3年)能源与硬件基础设施将最先爆发危机。随着电动车和 AI 算力的爆发,电力供应(特别是变压器)将成为全球性的物理瓶颈。AI 领域将从单纯的“模型竞争”转向“能效竞争”。
  • 长期终局 (5-10年)物理世界 AI 的统治地位确立。特斯拉的“感知+控制”(Photons in, controls out)范式将成为 AGI 的终极形态。人形机器人将于汽车成本持平(低于 3 万美元),并在普通人生活中普及。
  • 行动建议
    • 对于投资者:关注电力分配、超导/磁材、以及 GPU 边缘计算领域,而非仅仅关注大模型参数规模。
    • 对于创业者:遵循马斯克的“物理学第一性原理”,从最底层的物理需求出发(如能量密度、材料科学),而非盲目堆砌现有架构。
    • 对于观察者:不要被短期舆论(如“觉醒病毒”)所迷惑,应关注马斯克**“延长人类生存时间”**这一底层使命。

逐字稿

Introduction

War and human nature

Lex Fridman (00:00:00) The following is a conversation with Elon Musk, his fourth time on this, the Lex Fridman Podcast. I thought you were going to finish it. It’s one of the greatest themes in all of film history.

Lex Fridman (00:00:33) So I was just thinking about the Roman Empire, as one does.

Elon Musk (00:00:38) Is that whole meme where all guys are thinking about the Roman Empire at least once a day?

Lex Fridman (00:00:44) And half the population is confused whether it’s true or not. But more seriously, thinking about the wars going on in the world today, and as you know, war and military conquest has been a big part of Roman society and culture, and I think has been a big part of most empires and dynasties throughout human history.

Elon Musk (00:01:06) Yeah, they usually came as a result of conquest. I mean, there’s some like the Hapsburg Empire where there was just a lot of clever marriages.

Lex Fridman (00:01:16) But fundamentally there’s an engine of conquest and they celebrate excellence in warfare, many of the leaders were excellent generals, that kind of thing. So a big picture question, Grok approved, I asked if this is a good question to ask.

Elon Musk (00:01:33) Tested, Grok approved. Yeah.

Lex Fridman (00:01:36) At least on fun mode. To what degree do you think war is part of human nature versus a consequence of how human societies are structured? I ask this as you have somehow controversially been a proponent of peace.

Elon Musk (00:01:57) I’m generally a proponent of peace. I mean, ignorance is perhaps, in my view, the real enemy to be countered. That’s the real hard part, not fighting other humans, but all creatures fight. I mean, the jungle is… People think of nature as perhaps some sort of peaceful thing, but in fact it is not. There’s some quite funny Werner Herzog thing where he is in the jungle saying that it’s basically just murder and death in every direction. The plants and animals in the jungle are constantly trying to kill each other every single day, every minute. So it’s not like we’re unusual in that respect.

Lex Fridman (00:02:40) Well, there’s a relevant question here, whether with greater intelligence comes greater control over these base instincts for violence.

Elon Musk (00:02:49) Yes. We have much more ability to control our limbic instinct for violence than say a chimpanzee. And in fact, if one looks at say, chimpanzee society, it is not friendly. I mean, the Bonobos are an exception, but chimpanzee society is filled with violence and it’s quite horrific, frankly. That’s our limbic system in action. You don’t want to be on the wrong side of a chimpanzee, it’ll eat your face off and tear your nuts off.

Lex Fridman (00:03:22) Yeah. Basically there’s no limits or ethics; there’s almost no… there’s no just war in the chimpanzee societies. It’s war and dominance by any means necessary.

Elon Musk (00:03:33) Yeah. Chimpanzee society is a primitive version of human society. They’re not like peace loving basically at all. There’s extreme violence and then once in a while, somebody who’s watched too many Disney movies decides to raise a chimpanzee as a pet, and then it eats their face or tears their nuts off or chews their fingers off and that kind of thing. It’s happened several times.

Lex Fridman (00:03:58) Ripping your nuts off is an interesting strategy for interaction.

Elon Musk (00:04:02) It’s happened to people. It’s unfortunate. That’s, I guess, one way to ensure that the other chimp doesn’t contribute to the gene pool.

Lex Fridman (00:04:10) Well, from a martial arts perspective, it’s a fascinating strategy.

Lex Fridman (00:04:18) I wonder which of the martial arts teaches that one.

Elon Musk (00:04:21) I think it’s safe to say if somebody’s got your nuts in their hands and has the option of ripping them off, you’ll be amenable to whatever they want.

Israel-Hamas war

Lex Fridman (00:04:30) Yeah. Safe to say. So, like I said, somehow controversially, you’ve been a proponent of peace on Twitter on X.

Lex Fridman (00:04:39) So let me ask you about the wars going on today and to see what the path to peace could be. How do you hope the current war in Israel and Gaza comes to an end? What path do you see that can minimize human suffering in the longterm in that part of the world?

Elon Musk (00:04:54) Well, I think that part of the world is definitely, if you look up “there is no easy answer” in the dictionary, it’ll be the picture of the Middle East, and Israel especially. So there is no easy answer. This is strictly my opinion: the goal of Hamas was to provoke an overreaction from Israel. They obviously did not expect to have a military victory, but they really wanted to commit the worst atrocities that they could in order to provoke the most aggressive response possible from Israel, and then leverage that aggressive response to rally Muslims worldwide for the cause of Gaza and Palestine, which they have succeeded in doing. So the counterintuitive thing here, the thing that I think should be done, even though it’s very difficult, is that I would recommend that Israel engage in the most conspicuous acts of kindness possible. Everything. That is the actual thing that would thwart the goal of Hamas.

Lex Fridman (00:06:19) So in some sense, to the degree that makes sense in geopolitics, turn the other cheek, implemented.

Elon Musk (00:06:26) It’s not exactly turn the other cheek because I do think that it is appropriate for Israel to find the Hamas members and either kill them or incarcerate them. That’s something that has to be done because they’re just going to keep coming otherwise. But in addition to that, they need to do whatever they can. There’s some talk of establishing, for example, a mobile hospital. I’d recommend doing that. Just making sure that there’s food, water, medical necessities, and just be over the top about it and be very transparent. So [inaudible 00:07:22] can claim it’s a trick. Just put a webcam on the thing, 24/7.

Lex Fridman (00:07:29) Deploy acts of kindness.

Elon Musk (00:07:31) Yeah, conspicuous acts of kindness that are unequivocal, meaning they can’t be somehow dismissed, because otherwise Hamas’s response will be, “Oh, it’s a trick.” Therefore, you have to counter how it’s not a trick.

Lex Fridman (00:07:47) This ultimately fights the broader force of hatred in the region.

Elon Musk (00:07:51) Yes. And I’m not sure who said it, it’s an [inaudible 00:07:54] saying, but an eye for an eye makes everyone blind. Now, that neck of the woods, they really believe in the whole eye for an eye thing. But you really have… If you’re not going to just outright commit genocide against an entire people, which obviously would not be acceptable to, really, shouldn’t be acceptable to anyone, then you’re going to leave basically a lot of people alive who subsequently hate Israel. So really the question is like, for every Hamas member that you kill, how many did you create? And if you create more than you killed, you’ve not succeeded. That’s the real situation there. And it’s safe to say that if you kill somebody’s child in Gaza, you’ve made at least a few Hamas members who will die just to kill an Israeli. That’s the situation. But I mean, this is one of the most contentious subjects one could possibly discuss. But I think if the goal ultimately is some sort of long-term peace, one has to look at this from the standpoint of, over time, are there more or fewer terrorists being created?

Lex Fridman (00:09:26) Let me just linger on war.

Elon Musk (00:09:29) Yeah, war, safe to say, wars always existed and always will exist.

Lex Fridman (00:09:33) Always will exist.

Elon Musk (00:09:34) Always has existed and always will exist.

Lex Fridman (00:09:37) I hope not. You think it’ll always-

Elon Musk (00:09:42) There will always be war. There’s a question of just how much war and there’s sort of the scope and scale of war. But to imagine that there would not be any war in the future, I think would be a very unlikely outcome.

Lex Fridman (00:09:55) Yeah. You talked about the Culture series. There’s war even there.

Elon Musk (00:09:58) Yes. It’s a giant war. The first book starts off with a gigantic galactic war where trillions die.

Lex Fridman (00:10:07) But it still nevertheless protects these pockets of flourishing. Somehow you can have galactic war and still have pockets of flourishing.

Elon Musk (00:10:18) Yeah, I guess if we are able to one day expand to fill the galaxy or whatever, there will be a galactic war at some point.

Lex Fridman (00:10:31) I mean, the scale of war has been increasing, increasing, increasing. It’s like a race between the scale of suffering and the scale of flourishing.

Military-Industrial Complex

Lex Fridman (00:10:41) A lot of people seem to be using this tragedy to beat the drums of war and feed the military industrial complex. Do you worry about this, the people who are rooting for escalation and how can it be stopped?

Elon Musk (00:10:56) One of the things that does concern me is that there are very few people alive today who actually viscerally understand the horrors of war, at least in the US. I mean, obviously there are people on the front lines in Ukraine and Russia who understand just how terrible war is, but how many people in the West understand it? My grandfather was in World War II. He was severely traumatized. He was there I think for almost six years in Eastern North Africa and Italy. All his friends were killed in front of him, and he would’ve died too, except they randomly gave some, I guess IQ test or something, and he scored very high. He was not an officer. He was I think a corporal or a sergeant or something like that because he didn’t finish high school because he had to drop out of high school because his dad died and he had to work to support his siblings. So because he didn’t graduate high school, he was not eligible for the officer corps.

(00:11:57) So he kind of got put into the cannon fodder category basically. But then randomly they gave him this test. He was transferred to British intelligence in London. That’s where he met my grandmother. But he had PTSD next level, next level. I mean, just didn’t talk, just didn’t talk. And if you tried talking to him, he’d just tell you to shut up. And he won a bunch of medals, never bragged about it once, not even hinted, nothing. I found out about it because his military records were online. That’s how I know. So he would say like, “No way in hell do you want to do that again.” But how many people… Obviously, he died 20 years ago or longer, actually 30 years ago. How many people are alive that remember World War II? Not many.

Lex Fridman (00:12:54) And the same perhaps applies to the threat of nuclear war.

Elon Musk (00:13:01) Yeah, I mean, there are enough nuclear bombs pointed at the United States to make the radioactive rubble bounce many times.

Lex Fridman (00:13:10) There’s two major wars going on right now. So you talked about the threat of AGI quite a bit, but now as we sit here with the intensity of conflict going on, do you worry about nuclear war?

Elon Musk (00:13:25) I think we shouldn’t discount the possibility of nuclear war. It is a civilizational threat. Right now, I could be wrong, but I think the current probability of nuclear war is quite low. But there are a lot of nukes pointed at us, and we have a lot of nukes pointed at other people. They’re still there. Nobody’s put their guns away. The missiles are still in the silos.

Lex Fridman (00:13:57) And the leaders don’t seem to be the ones with the nukes talking to each other.

Elon Musk (00:14:03) No, there are wars which are tragic and difficult on a local basis. And then there are wars which are civilization ending or have that potential. Obviously, global thermonuclear warfare has high potential to end civilization, perhaps permanently, but certainly to severely wound and perhaps set back human progress to the Stone Age or something. I don’t know. Pretty bad. Probably scientists and engineers won’t be super popular after that as well. “You got us into this mess.” So generally, I think we obviously want to prioritize civilizational risks over things that are painful and tragic on a local level, but not civilizational.

War in Ukraine

Lex Fridman (00:15:00) How do you hope the war in Ukraine comes to an end? And what’s the path, once again to minimizing human suffering there?

Elon Musk (00:15:08) Well, I think that what is likely to happen, which is really pretty much the way it is, is that something very close to the current lines will be how a ceasefire or truce happens. But you just have a situation right now where whoever goes on the offensive will suffer casualties at several times the rate of whoever’s on the defense because you’ve got defense in depth, you’ve got minefields, trenches, anti-tank defenses. Nobody has air superiority because the anti-aircraft missiles are really far better than the aircraft. There are far more of them. And so neither side has air superiority. Tanks are basically death traps, just slow moving, and they’re not immune to anti-tank weapons. So you really just have long-range artillery and infantry trenches. It’s World War I all over again, with drones thrown in, some drones here and there.

Lex Fridman (00:16:25) Which makes the long range artillery just that much more accurate and better, and so more efficient at murdering people on both sides.

Elon Musk (00:16:34) So whoever is… You don’t want to be trying to advance from either side because the probability of dying is incredibly high. So in order to overcome defense in depth, trenches and minefields, you really need a significant local superiority in numbers. Ideally combined arms, where you do a fast attack with aircraft, a concentrated number of tanks, and a lot of people. That’s the only way you’re going to punch through a line, and then once you punch through, you have to not have reinforcements just kick you right out again. I mean, I really recommend people read about World War I warfare in detail. That’s rough. I mean, the sheer number of people that died there was mind-boggling.

Lex Fridman (00:17:37) And it’s almost impossible to imagine the end of it that doesn’t look like almost exactly like the beginning in terms of what land belongs to who and so on. But on the other side of a lot of human suffering, death and destruction of infrastructure.

Elon Musk (00:17:56) Yes. The thing that… The reason I proposed some sort of truce or peace a year ago was because I predicted pretty much exactly what would happen, which is a lot of people dying for basically almost no changes in land, and the loss of the flower of Ukrainian and Russian youth. And we should have some sympathy for the Russian boys as well as the Ukrainian boys, because the Russian boys didn’t ask to be on their front line. They have to be. So there’s a lot of sons not coming back to their parents, and I think most of them don’t hate the other side. It’s sort of like, as the saying that comes from World War I goes, it’s young boys who don’t know each other killing each other on behalf of old men that do know each other. What the hell’s the point of that?

Lex Fridman (00:19:02) So Volodymyr Zelenskyy said that he’s not, or has said in the past, he’s not interested in talking to Putin directly. Do you think he should sit down man to man, leader to leader, and negotiate peace?

Elon Musk (00:19:14) Look, I think I would just recommend do not send the flower of Ukrainian youth to die in trenches, whether he talks to Putin or not, just don’t do that. Whoever goes on the offensive will lose massive numbers of people and history will not look kindly upon them.

China

Lex Fridman (00:19:42) You’ve spoken honestly about the possibility of war between US and China in the longterm if no diplomatic solution is found, for example, on the question of Taiwan and One China policy, how do we avoid the trajectory where these two superpowers clash?

Elon Musk (00:19:58) Well, it’s worth reading that book on the, difficult to pronounce, the Thucydides Trap, I believe it’s called. I love war history. I know it inside out and backwards. There’s hardly a battle I haven’t read about. And trying to figure out what really was the cause of victory in any particular case as opposed to what one side or another claimed was the reason.

Lex Fridman (00:20:21) Both the victory and what sparked the war and-

Elon Musk (00:20:26) Yeah. So that Athens and Sparta is a classic case. The thing about the Greeks is they really wrote down a lot of stuff. They loved writing. There are lots of interesting things that happened in many parts of the world, but people didn’t write them down, so we don’t know what happened, or they didn’t really write in detail. They just would say, “We had a battle and we won.” And what? Can you add a bit more? The Greeks, they really wrote a lot. They were very articulate on… They just loved writing. And we have a bunch of that writing preserved. So we know what led up to the Peloponnesian War between the Spartan and Athenian alliances, and we know that they saw it coming.

(00:21:16) Spartans didn’t write… They also weren’t very verbose by their nature, but they did write, but they weren’t very verbose. They were [inaudible 00:21:23]. But the Athenians and the other Greeks wrote a lot, and Sparta was really kind of like the leader of Greece. But Athens grew stronger and stronger with each passing year. And everyone’s like, “Well, it’s inevitable that there’s going to be a clash between Athens and Sparta. Well, how do we avoid that?” And actually they saw it coming and they still could not avoid it. So at some point, if one group, one civilization or country or whatever exceeds another, sort of like the United States has been the biggest kid on the block since I think around 1890 from an economic standpoint.

(00:22:14) So the United States has been the most powerful economic engine in the world longer than anyone’s been alive. And the foundation of war is economics. So now we have a situation in the case of China where the economy is likely to be two, perhaps three times larger than that of the US. So imagine you’re the biggest kid on the block for as long as anyone can remember, and suddenly a kid comes along who’s twice your size.

Lex Fridman (00:22:55) So we see it coming, how is it possible to stop? Let me throw something out there: just intermixing of cultures, understanding each other. There does seem to be a giant cultural gap in understanding of each other. And you’re an interesting case study because you are an American, obviously you’ve done a lot of incredible manufacturing here in the United States, but you also work with China.

Elon Musk (00:23:20) I’ve spent a lot of time in China and met with the leadership many times.

Lex Fridman (00:23:22) Maybe a good question to ask is, what are some things about China that people don’t understand, positive just in the culture? What’s some interesting things that you’ve learned about the Chinese?

Elon Musk (00:23:36) Well, the sheer number of really smart, hardworking people in China is incredible. If you say, how many smart, hardworking people are there in China? There are far more of them there than there are here, I think, in my opinion. And they’ve got a lot of energy. So I mean, the architecture in China of recent years is far more impressive than the US. I mean the train stations, the buildings, the high-speed rail, everything, it’s really far more impressive than what we have in the US. I mean, I recommend somebody just go to Shanghai and Beijing, look at the buildings, and take the train from Beijing to Xi’an, where you have the terracotta warriors. China’s got an incredible history, very long history, and I think arguably in terms of the use of language from a written standpoint, one of the oldest, perhaps the oldest written language, and in China, people did write things down.

(00:24:50) So now, China historically has always been, with rare exception, internally focused. They have not been acquisitive. They’ve fought each other. There’ve been many, many civil wars. In the Three Kingdoms war, I believe they lost about 70% of their population. So they’ve had brutal internal wars, civil wars that make the US Civil War look small by comparison. So I think it’s important to appreciate that China is not monolithic. We sort of think of China as a sort of one entity of one mind. And this is definitely not the case. From what I’ve seen and I think most people who understand China would agree, people in China think about China 10 times more than they think about anything outside of China. So it’s like 90% of their consideration is internal.

Lex Fridman (00:26:01) Well, isn’t that a really positive thing when you’re talking about the collaboration and the future peace between superpowers, when you’re inward facing, which is focusing on improving yourself versus focusing on quote, unquote improving others through military might.

Elon Musk (00:26:18) The good news, the history of China suggests that China is not acquisitive, meaning they’re not going to go out and invade a whole bunch of countries. Now they do feel very strongly… So that’s good. I mean, because a lot of very powerful countries have been acquisitive. The US is also one of the rare cases that has not been acquisitive. After World War II, the US could have basically taken over the world, any country: we’ve got nukes, nobody else has got nukes. We don’t even have to lose soldiers. Which country do you want? And the United States could have taken over everything and it didn’t. And the United States actually helped rebuild countries. So it helped rebuild Europe, helped rebuild Japan. This is very unusual behavior, almost unprecedented.

(00:27:10) The US did conspicuous acts of kindness like the Berlin Airlift. And I think it's always like, well, America's done bad things. Well, of course America's done bad things, but one needs to look at the whole track record. And just generally, one sort of test would be: how do you treat your prisoners of war? Or let's say, no offense to the Russians, but let's say you're in Germany, it's 1945, you've got the Russian Army coming from one side and you've got the French, British and American armies coming from the other side. Who would you like to surrender to? No country is [inaudible 00:27:58] perfect, but I recommend being a POW with the Americans. That would be my choice, very strongly.

Lex Fridman (00:28:07) Of the full menu of POW options, the US.

Elon Musk (00:28:08) Very much so. And in fact, Wernher von Braun, a smart guy, was like, "We've got to be captured by the Americans." And in fact, the SS was under orders to execute von Braun and all of the German rocket engineers, and they narrowly escaped. They said they were going out for a walk in the woods. They left in the middle of winter with no coats, no food, no water, and just ran like hell, ran west. And I think his brother found a bicycle or something and just cycled west as fast as he could until he found a US patrol. So anyway, that's one way you can tell morality: where do you want to be a POW? It's not fun anywhere, but some places are much worse than others. Anyway, so America has been, while far from perfect, generally a benevolent force, and we should always be self-critical and try to be better, but anyone with half a brain knows that.

(00:29:31) So I think there are… In this way, China and the United States are similar. Neither country has been acquisitive in a significant way. So that's a shared principle, I guess. Now, China does feel very strongly about Taiwan. They've been very clear about that for a long time. From their standpoint, it would be like one of our states not being there, like Hawaii or something like that, but more significant than Hawaii. And Hawaii is pretty significant for us. So they view it as really a fundamental part of China, the island of Formosa, now Taiwan, that is not part of China but should be. And the only reason it hasn't been is because of the US Pacific Fleet.

Lex Fridman (00:30:32) And as their economic power grows and as their military power grows, the thing that they're clearly saying is that their interest will clearly be materialized.

Elon Musk (00:30:46) Yes, China has been very clear that they’ll incorporate Taiwan peacefully or militarily, but that they will incorporate it from their standpoint is 100% likely.

Lex Fridman (00:31:04) Something you said about conspicuous acts of kindness as a geopolitical policy, it almost seems naive, but I’d venture to say that this is probably the path forward, how you avoid most wars. Just as you say it sounds naive, but it’s kind of brilliant. If you believe in the goodness of underlying most of human nature, it just seems like conspicuous acts of kindness can reverberate through the populace of the countries involved and deescalate.

Elon Musk (00:31:44) Absolutely. So after World War I, they made a big mistake. They basically tried to lump all of the blame on Germany and saddle Germany with impossible reparations. And really there was quite a bit of blame to go around for World War I, but they tried to put it all on Germany, and that laid the seeds for World War II. So a lot of people, not just Hitler, felt wronged, and they wanted vengeance and they got it.

Lex Fridman (00:32:38) People don’t forget.

Elon Musk (00:32:41) Yeah, you kill somebody's father, mother, son, daughter, they're not going to forget it. They'll want vengeance. So after World War II, they're like, "Well, the Treaty of Versailles was a huge mistake in World War I. And so this time, instead of crushing the losers, we're actually going to help them with the Marshall Plan, and we're going to help rebuild Germany. We're going to help rebuild Austria and Italy and whatnot." So that was the right move.

Lex Fridman (00:33:26) It does feel like there’s a profound truth to the conspicuous acts of kindness being an antidote to this.

Elon Musk (00:33:37) Something must stop the cycle of reciprocal violence. Something must stop it, or it’ll never stop. Just eye for an eye, tooth for a tooth, limb for a limb, life for a life forever and ever.

xAI Grok

Lex Fridman (00:33:57) To briefly escape the darkness with some incredible engineering work: xAI just released Grok, an AI assistant that I've gotten a chance to play with. It's amazing on many levels. First of all, it's amazing that a relatively small team in a relatively short amount of time was able to develop this close-to-state-of-the-art system. Another incredible thing is there's a regular mode and there's a fun mode.

Elon Musk (00:34:23) Yeah, I guess I’m to blame for that one.

Lex Fridman (00:34:27) First of all, I wish everything in life had a fun mode.

Lex Fridman (00:34:30) There's something compelling beyond just fun about the fun mode interacting with a large language model. I'm not sure exactly what it is, because I've only had a little bit of time to play with it, but it just makes it more interesting, more vibrant to interact with the system.

Elon Musk (00:34:47) Yeah, absolutely. Our AI, Grok, is modeled after The Hitchhiker’s Guide to the Galaxy, which is one of my favorite books, which it’s a book on philosophy. It’s-

Elon Musk (00:35:00) My favorite books. It's a book on philosophy disguised as a book on humor. And I would say that forms the basis of my philosophy, which is that we don't know the meaning of life, but the more we can expand the scope and scale of consciousness, digital and biological, the more we're able to understand what questions to ask about the answer that is the universe. So I have a philosophy of curiosity.

Lex Fridman (00:35:34) There is generally a feeling like this AI system has an outward-looking quality, like the way you sit with a good friend looking up at the stars, asking pothead-like questions about the universe, wondering what it's all about. The curiosity that you talk about. No matter how mundane the question I ask it, there's a sense of cosmic grandeur to the whole thing.

Elon Musk (00:35:59) Well, we are actually working hard to have engineering, math, physics answers that you can count on. For the other AIs out there, these so-called large language models, I've not found the engineering to be reliable. It unfortunately hallucinates most when you least want it to hallucinate. So when you're asking important, difficult questions, that's when it tends to be confidently wrong. So we're really trying hard to say, okay, how do we be as grounded as possible, so you can count on the results, trace things back to physics first principles, mathematical logic? So underlying the humor is an aspiration to adhere to the truth of the universe as closely as possible.

Lex Fridman (00:37:01) That’s really tricky.

Elon Musk (00:37:02) It is tricky. So that's why there's always going to be some amount of error. But we want to aspire to be as truthful as possible about the answers, with acknowledged error. You don't want to be confidently wrong; you're not going to be right every time, but you want to minimize how often you're confidently wrong. And then, like I said, once you can count on the logic as not violating physics, then you can start to build on that to create inventions, like invent new technologies. But if you cannot count on the foundational physics being correct, obviously the inventions are simply wishful thinking, imagination land. Magic, basically.

Lex Fridman (00:38:01) Well, as you said, I think one of the big goals of XAI is to understand the universe.

Elon Musk (00:38:06) Yes, that's our simple three-word mission.

Lex Fridman (00:38:13) If you look out far into the future, do you think on this level of physics, the very edge of what we understand about physics, do you think it will make the sexiest discovery of them all, as we know it now: unifying general relativity and quantum mechanics? So coming up with a theory of everything, do you think it could push towards that direction, almost like theoretical physics discoveries?

Elon Musk (00:38:38) If an AI cannot figure out new physics, it's clearly not equal to humans, nor has it surpassed humans, because humans have figured out new physics. Physics is just deepening one's insight into how reality works. And then there's engineering, which is inventing things that have never existed. Now, the range of possibilities for engineering is far greater than for physics, because once you figure out the rules of the universe, that's it. You've discovered things that already existed. But from that you can then build technologies that are really almost limitless in their variety. It's like, once you understand the rules of the game properly, and with current physics we do, at least at a local level, understand how physics works very well. Our ability to predict things is incredibly good. The degree to which quantum mechanics can predict outcomes is incredible. That was my hardest class in college, by the way. My senior quantum mechanics class was harder than all of my other classes put together.

Lex Fridman (00:39:50) To get an AI system, a large language model be as reliable as quantum mechanics and physics is very difficult.

Elon Musk (00:40:01) Yeah. You have to test any conclusions against the ground truth of reality. Reality is the ultimate judge. Like physics is the law, everything else is a recommendation. I’ve seen plenty of people break the laws made by man, but none break the laws made by physics.

Lex Fridman (00:40:15) It’s a good test actually. If this LLM understands and matches physics, then you can more reliably trust whatever it thinks about the current state of politics in some sense.

Elon Musk (00:40:28) And it's also the case currently that its internal logic is not consistent. Especially with the approach of just predicting a token, predict token, predict token, it's like a vector sum. You're summing up a bunch of vectors, but you can get drift. A little bit of error adds up, and by the time you are many tokens down the path, it doesn't make any sense.

Lex Fridman (00:40:59) So it has to be somehow self-aware about the drift.

Elon Musk (00:41:02) It has to be self-aware about the drift, and then look at the thing as a gestalt as a whole and say it doesn’t have coherence as a whole. When authors write books, they will write the book and then they’ll go and revise it, take into account all the end and the beginning and the middle and rewrite it to achieve coherence so that it doesn’t end up at a nonsensical place.

Lex Fridman (00:41:33) Maybe the process of revising is what reasoning is, and then the process of revising is how you get closer and closer to truth. At least I approach it that way: you just say a bunch of bullshit first and then you make it better. You start with bullshit and then you-

Elon Musk (00:41:51) Create a draft and then you iterate on that draft until it has coherence, until it all adds up basically.

Lex Fridman (00:41:59) Another question about theory of everything, but for intelligence, as you’re exploring this with XAI, creating this intelligence system? Do you think there is a theory of intelligence where you get to understand what is the I in AGI and what is the I in human intelligence?

Elon Musk (00:42:22) There's no I in Team America. Wait, there is.

Lex Fridman (00:42:24) No, it's going to be stuck in my head now. Yeah, there's no me in whatever… in quantum mechanics, wait. I mean, is part of the process of discovering, understanding the universe, understanding intelligence?

Elon Musk (00:42:50) Yeah. I think we need to understand intelligence, understand consciousness. I mean there are some fundamental questions of what is thought, what is emotion? Is it really just one atom bumping into another atom? It feels like something more than that. So I think we’re probably missing some really big things.

Lex Fridman (00:43:18) Something that'll be obvious in retrospect. You put the whole consciousness, emotion…

Elon Musk (00:43:26) Well, some people would call it, like, a soul; in religion it'd be a soul. You feel like you're you. I mean, you don't feel like you're just a collection of atoms, but on what dimension does thought exist? On what dimension do emotions exist? Because we feel them very strongly. I suspect there's more to it than atoms bumping into atoms.

Lex Fridman (00:43:52) And maybe AI can pave the path to the discovery of whatever the hell that thing is.

Elon Musk (00:43:58) Yeah. What is consciousness? When you put the atoms in a particular shape, why are they able to form thoughts and take actions and feelings?

Lex Fridman (00:44:10) And even if it is an illusion, why is this illusion so compelling?

Elon Musk (00:44:13) Yeah. Why does the illusion exist? On what plane does the illusion exist? And sometimes I wonder, either perhaps everything's conscious or nothing's conscious. One of the two.

Lex Fridman (00:44:33) I like the former; everything conscious just seems more fun.

Elon Musk (00:44:37) It does seem more fun, yes. But we’re composed of atoms and those atoms are composed of quarks and leptons and those quarks and leptons have been around since the beginning of the universe.

Lex Fridman (00:44:50) “The beginning of the universe.”

Elon Musk (00:44:53) What seems to be the beginning of the universe.

Lex Fridman (00:44:55) The first time we talked, you said, and it's surreal to think that that discussion is becoming a reality, I asked you what question you would ask an AGI system once you create it. And you said, "What's outside the simulation," is the question. Good question. But it seems like with Grok you started, literally, the system's goal is to be able to answer such questions and to ask such questions.

Elon Musk (00:45:24) Where are the aliens?

Lex Fridman (00:45:25) Where are the aliens?

Elon Musk (00:45:26) That's the Fermi paradox question. A lot of people have asked me if I've seen any evidence of aliens, and I haven't, which is kind of concerning. I think I'd probably prefer to at least have seen some archeological evidence of aliens. To the best of my knowledge, I'm not aware of any evidence. If they're out there, they're very subtle. We might just be the only consciousness, at least in the galaxy. And if you look at, say, the history of Earth, to believe the archeological record, Earth is about four and a half billion years old. Civilization, as measured from the first writing, is only about 5,000 years old. We have to give some credit there to the ancient Sumerians, who aren't around anymore. I think archaic pre-cuneiform was the first actual symbolic representation, but only about 5,000 years ago. I think that's a good date for when we say civilization started. That's one millionth of Earth's existence.

(00:46:35) So civilization has been around. It's really a flash in the pan so far. And why did it take so long? Four and a half billion years. For the vast majority of the time, there was no life. And then there was archaebacteria for a very long time. And then you had mitochondria get captured, multicellular life, differentiation into plants and animals, life moving from the oceans to land, mammals, higher brain functions. And the sun is expanding slowly, but it'll heat the Earth up at some point in the future, boil the oceans, and Earth will become like Venus, where life as we know it is impossible. So if we do not become multiplanetary and ultimately go beyond our solar system, annihilation of all life on Earth is a certainty. A certainty. And it could be as little as, on the galactic timescale, half a billion years. A long time by human standards, but that's only 10% longer than Earth has been around at all. So if life had taken 10% longer to evolve on Earth, it wouldn't exist at all.

Lex Fridman (00:48:27) So there's a deadline coming up; you better hurry. But that said, as you said, humans, intelligent life on Earth, developed a lot of cool stuff very quickly. So it seems like becoming multiplanetary is almost inevitable. Unless we destroy-

Elon Musk (00:48:45) We need to do it. I suspect that if we are able to go out there and explore other star systems that we… There’s a good chance we find a whole bunch of long dead one planet civilizations that never made it past their home planet.

Lex Fridman (00:49:03) That’s so sad. Also fascinating.

Elon Musk (00:49:08) I mean, there are various explanations for the Fermi paradox, and one is there are these great filters which civilizations don't pass through. And one of those great filters is: do you become a multiplanetary civilization or not? And if you don't, it's simply a matter of time before something happens on your planet, either natural or manmade, that causes you to die out. Like the dinosaurs. Where are they now? They didn't have spaceships.

Lex Fridman (00:49:42) I think the more likely thing is because just to empathize with the aliens that they found us and they’re protecting us and letting us be.

Elon Musk (00:49:51) I hope so. Nice aliens.

Lex Fridman (00:49:53) Just like the tribes in the Amazon, the uncontacted tribes; we're protecting them. That's what-

Elon Musk (00:49:59) That would be a nice explanation.

Lex Fridman (00:50:00) Or you could have, what was it? I think Andrej Karpathy said, "It's like the ants in the Amazon asking where's everybody?"

Elon Musk (00:50:10) Well, they do run into a lot of other ants.

Lex Fridman (00:50:16) Sounds like a good TV show.

Elon Musk (00:50:18) Yeah. They literally have these big wars between various ants.

Lex Fridman (00:50:21) Yeah. Maybe I’m just dismissing all the different diversity of ants.

Elon Musk (00:50:28) Listen to that Werner Herzog talking about the jungle. It’s really hilarious. Have you heard it?

Lex Fridman (00:50:31) No, I have not. But Werner Herzog is a way.

Elon Musk (00:50:37) You should play it as an interlude in the… It’s on YouTube. It’s awesome.

Lex Fridman (00:50:45) I love him so much.

Lex Fridman (00:50:47) Was he the director of Happy People: A Year in the Taiga? I think also-

Elon Musk (00:50:51) He did that bear documentary. And he did this thing about penguins.

Lex Fridman (00:50:58) The psycho analysis of a penguin.

Elon Musk (00:51:00) Yeah. The penguin's headed for mountains that are 70 miles away, and the penguin is just headed for doom, basically.

Lex Fridman (00:51:08) Well, he had a cynical take. He could be just a brave explorer and there’ll be great stories told about him amongst the penguin population for many centuries to come. What were we talking about? Okay.

Elon Musk (00:51:28) Yeah. So aliens, I mean, I don’t know. Look, I think the smart move is just this is the first time in the history of earth that it’s been possible for life to extend beyond earth. That window is open. Now it may be open for a long time or it may be open for a short time and it may be open now and then never open again. So I think the smart move here is to make life multiplanetary while it’s possible to do so. We don’t want to be one of those lame one planet civilizations that just dies out.

Lex Fridman (00:52:04) No, those are lame.

Elon Musk (00:52:05) Yeah. Lame. No self-respecting civilization would be just one planet.

Lex Fridman (00:52:11) There's not going to be a Wikipedia entry for one of those. Does SpaceX have an official policy for when we meet aliens?

Lex Fridman (00:52:24) That seems irresponsible.

Elon Musk (00:52:30) I mean, look, if I see the slightest indication that there are aliens, I will immediately post on X platform anything I know.

Lex Fridman (00:52:38) It could be the most liked reposted post of all time.

Elon Musk (00:52:42) Yeah. I mean, look, we have more satellites up there right now than everyone else combined. So we know if we've got to maneuver around something, and we don't have to maneuver around anything.

God

Lex Fridman (00:52:55) If we go to the big questions once again, you said you're with Einstein, that you believe in the God of Spinoza.

Lex Fridman (00:53:05) So that’s that view that God is like the universe and reveals himself through the laws of physics or as Einstein said, “Through the lawful harmony of the world.”

Elon Musk (00:53:16) Yeah. I would agree that God of the simulation, or whatever the supreme being or beings are, reveal themselves through the physics. They are creators of this existence, and it's incumbent upon us to try to understand more about this one creation.

Lex Fridman (00:53:38) Who created this thing? Who's running this thing? Embodying it in a singular question with a sexy word on top of it focuses the mind to understand. It does seem like, and again, it could be an illusion, but it seems like there's a purpose, that there's an underlying master plan of some kind. And it seems like-

Elon Musk (00:53:58) There may not be a master plan in that sense. So maybe an interesting answer to the question of determinism versus free will is that if we are in a simulation, the reason that these higher beings would run a simulation is to see what happens. They don't know what happens; otherwise they wouldn't run the simulation. So when humans create a simulation, and at SpaceX and Tesla we create simulations all the time, especially for the rocket, you have to run a lot of simulations to understand what's going to happen, because you can't really test the rocket until it goes to space, and you want it to work. So you have to simulate subsonic, transonic, supersonic, hypersonic, ascent, and then coming back, super high heating and orbital dynamics. All this has got to be simulated, because you don't get very many kicks at the can. But we run the simulations to see what happens. If we knew what would happen, we wouldn't run the simulation. So whoever created this existence, they're running it because they don't know what's going to happen, not because they do.

Diablo 4 and video games

Lex Fridman (00:55:23) So maybe we both played Diablo. Maybe Diablo was created to see if Druid, your character, could defeat Uber Lilith at the end. They didn’t know.

Elon Musk (00:55:34) Well, the funny thing is Uber Lilith, her title is Hatred Incarnate. And right now, I guess you can ask the Diablo team, but it’s almost impossible to defeat Hatred in the eternal realm.

Lex Fridman (00:55:55) Yeah. You’ve streamed yourself dominating Tier 100 Nightmare Dungeon. And still-

Elon Musk (00:56:00) I can cruise through Tier 100 Nightmare Dungeon like a stroll in the park.

Lex Fridman (00:56:07) And still you’re defeated by Hatred?

Elon Musk (00:56:09) Yeah. I guess maybe the second hardest boss is Duriel. Duriel can't even scratch the paint. So I've killed Duriel so many times, and every other boss in the game, all of them, so many times, it's easy. But Uber Lilith, otherwise known as Hatred Incarnate, especially if you're a Druid and you have no ability to go invulnerable, there are these random death waves that come at you.

(00:56:44) Really, I am 52, so my reflexes are not what they used to be, but I have a lifetime of playing video games. At one point, I was maybe one of the best Quake players in the world. I actually won money in what I think was the first paid eSports tournament in the US. We were doing four-person Quake tournaments, and I was the second-best person on the team, and we were actually winning; we would've come first, except the best person on the team's computer crashed halfway through the game. So we came second, but I got money for it and everything. So basically I've got skills, albeit no spring chicken these days. And to be totally frank, it's driving me crazy trying to beat Lilith as a Druid, basically trying to beat Hatred Incarnate in the eternal realm.

Elon Musk (00:57:41) As a Druid. This is really vexing, let me tell you.

Lex Fridman (00:57:49) I mean, the challenge is part of the fun. I have seen directly that you're actually a world-class, incredible video game player. And I think with Diablo, you're just picking up a new game and figuring out its fundamentals. With the Paragon Board and the build, you're also not somebody like me who perfectly follows whatever they suggest on the internet. You're also an innovator there, which is hilarious to watch. It's like a mad scientist just trying to figure out the Paragon Board and the build. Is there some interesting insight there? If somebody's starting as a Druid, do you have advice?

Elon Musk (00:58:30) I would not recommend playing a Druid in the eternal realm. Right now I think the most powerful character in the seasonal realm is the Sorcerer with the lightning balls. The Sorcs have huge balls in the seasonal realm.

Lex Fridman (00:58:46) Yeah, that’s what they say.

Elon Musk (00:58:49) Sorcs have huge balls. They do huge balls of lightning.

Lex Fridman (00:58:54) I'll take your word for it.

Elon Musk (00:58:57) In the seasonal realm, it's pretty easy to beat Uber Lilith, because you get these vampire powers that amplify your damage and increase your defense and whatnot. So it's really quite easy to defeat Hatred seasonally, but to defeat Hatred eternally is very difficult, almost impossible. It's very nearly impossible. It seems like a metaphor for life.

Lex Fridman (00:59:24) Yeah. I like the idea that Elon Musk... because I was playing Diablo yesterday and I saw a Level 100 Druid just run by, I will never die, and then run back the other way. And this metaphor, it's hilarious that you, Elon Musk, are relentlessly fighting Hatred in this demonic realm.

Lex Fridman (00:59:48) It’s hilarious. I mean it’s pretty hilarious.

Elon Musk (00:59:50) No, it's absurd. Really, it's an exercise in absurdity and it makes me want to pull my hair out.

Lex Fridman (00:59:57) Yeah. What do you get from video games in general, for you personally?

Elon Musk (01:00:03) I don't know. It calms my mind. I mean, killing the demons in a video game calms the demons in my mind. If you play a tough video game, you can get into a state of flow, which is very enjoyable. Admittedly, it needs to be not too easy, not too hard, kind of in the Goldilocks zone, and I guess you generally want to feel like you're progressing in the game. A good video game also has beautiful art, engaging storylines, and it's like an amazing puzzle to solve, I think. So it's like solving the puzzle.

Lex Fridman (01:00:52) Elden Ring, the greatest game of all time? I still haven't played it, but to you-

Elon Musk (01:00:56) Elden Ring is definitely a candidate for best game ever. Top five for sure.

Lex Fridman (01:01:01) I think I’ve been scared how hard it is or how hard I hear it is, but it’s beautiful.

Elon Musk (01:01:06) Elden Ring, feels like it’s designed by an alien.

Lex Fridman (01:01:13) It’s a theme to this discussion. In what way?

Elon Musk (01:01:17) It's so unusual. It's incredibly creative, and the art is stunning. I recommend playing it at a big resolution, on a high-dynamic-range TV even. It doesn't need to be a monitor. Just the art is incredible. It's so beautiful and it's so unusual, and each of those top boss battles is unique. It's a unique puzzle to solve. Each one's different, and the strategy you use to solve one battle is different from another battle.

Lex Fridman (01:01:54) That said, you said a Druid in the eternal realm against Uber Lilith is the hardest boss battle you've ever…

Elon Musk (01:02:00) Correct. That is currently the hardest, and I've played a lot of video games, because that's my primary recreational activity. And yes, beating Hatred in the eternal realm is the hardest boss battle in life, and in the video game. I'm not sure it's possible, but I do make progress. So then I'm like, "Okay, I'm making progress. Maybe if I just tweak that Paragon Board a little more, I can do it. If I could just dodge a few more waves, I could do it."

Lex Fridman (01:02:43) Well, the simulation is created for the purpose of figuring out if it can be done, and you’re just a cog in the machine of the simulation.

Elon Musk (01:02:51) Yeah, it might be. I have a feeling that at least I think-

Lex Fridman (01:03:05) Well, that’s the human spirit right there to believe.

Elon Musk (01:03:09) Yeah. I mean, it did prompt me to think about just hate in general, which is, you want to be careful of one of those things where you wish for something that sounds good, but if you get it, it's actually a dystopian situation. So if you wish for world peace, it sounds good, but how is it enforced, and at what cost? Eternal peace? It might actually be worse to have eternal peace, because of what that would entail. The suppression of everyone; it might be the suppression of progress. It might be an ossified society that never changes. What I'm saying is that there is an argument that if you wish for no war, you should be careful what you wish for, because what's required in order for there to be no war might be worse than a little war.

Dystopian worlds: 1984 and Brave New World

Lex Fridman (01:04:22) Yeah. The things we consider to be flaws of human civilization might be necessary components for whatever optimal looks like. I mean, do you worry about AI, AGI, enabling a dystopian state of this nature, whether it's 1984 with surveillance and fear, or Brave New World with pleasure and, what is it, lots of sex, but no deep human experience?

Elon Musk (01:04:52) There’s actually a real drug called Soma.

Lex Fridman (01:04:56) I’m going to take notes.

Elon Musk (01:04:58) Does Grok know about it? It'll probably go to Brave New World.

Lex Fridman (01:05:03) What would be the question?

Elon Musk (01:05:04) Tell us about the drug called Soma.

Lex Fridman (01:05:05) Tell us about the drug called Soma. It’s going to go to Brave New World, I think.

Elon Musk (01:05:15) Well, there’s a real drug as well.

Lex Fridman (01:05:18) Searching for Soma drug information. Oh boy. Soma, the wonder drug… By the way, I’m in fun mode.

Elon Musk (01:05:27) Okay, fun. Yeah. So it’s talking about the real drug.

Lex Fridman (01:05:31) It's a muscle relaxant that's been around since the 1950s. It's used to treat muscle pain and discomfort, and it does this by altering the activity between neurons and the spinal cord. Now, Soma isn't just a pain reliever, it's also a bit of a sedative. It can make you feel a bit drowsy. So it's not something to take if you're planning on operating heavy machinery or trying to solve complex math equations. And it goes on and on. In conclusion, Soma is a drug that can be helpful for those dealing with muscle pain, but it's not without its risks. That's pretty good. Wait a minute.

Elon Musk (01:06:01) That’s the real drug it’s talking about.

Lex Fridman (01:06:03) It's the real drug. And of course, it adds: always use it responsibly, in accordance with the advice of a healthcare professional. And remember, in the end, the universe is probably just a giant computer simulation, so don't take anything too seriously. I love it.

Lex Fridman (01:06:22) It also really nicely includes the tweets, posts on X that are relevant to this, which is really cool. That integrates everything together. The interface is one of the really cool things here.

Elon Musk (01:06:36) Yeah. They've done a great job. You could say, "No, I mean Soma from Brave New World," because it thinks you mean Soma, the real drug. No, I mean Soma from Brave New World.

Lex Fridman (01:06:56) It's going to be like, "Oops, I'm really sorry." I'm really surprised it went to the real drug first. My apologies, I was thinking of the muscle relaxer. Soma from Brave New World is a different beast altogether. In Huxley's dystopia, Soma is a drug that plays a central role in the society depicted in the book. It's a mood-altering, hallucinogenic drug that's used to keep the population docile and content in the novel. The use of Soma in Brave New World raises some interesting questions about the nature of happiness and the role of drugs in society, man.

Elon Musk (01:07:30) Exactly. Is it better to live in a world where everyone is happy all the time, even if that happiness is artificial? It’s good question. This is what I mean. Do you wish for world peace and happiness all the time? Are you sure? Because that might be a society that is essentially sterile and ossified that never changes, that is ultimately doomed.

Lex Fridman (01:07:58) This kind of tension between the darkness and the light-

Elon Musk (01:08:04) This is really a very good summary. It really gets to the point. This is not simply regurgitating Brave New World. It's actually getting to the salient element of Soma as a drug. Do you actually want to be in a situation where everyone is happy all the time, even though it's artificial? Or is it better to confront the challenges of life and experience the full range of human emotions, even if it means experiencing pain and suffering?

Lex Fridman (01:08:31) For those listening, by the way, Elon just read directly from Grok, which is a really nice, insightful, philosophical analysis of the tension here. Interesting.

Elon Musk (01:08:41) It pretty much nails it. In conclusion, Soma from Brave New World is a fictional drug that's used to explore some deep philosophical questions about the nature of happiness and the role of drugs in society. It's a powerful symbol of the dangers of using drugs to escape from reality and the importance of confronting the challenges of life head-on. Nailed it. And the crazy thing is we do have a real drug called Soma, which is like the drug in the book. And I'm like, they must've named it after that, probably. Soma, the real drug, is quite effective on back pain.

Lex Fridman (01:09:17) So you know about this drug. It’s fascinating

Elon Musk (01:09:20) I’ve taken it because I had a squashed disc in my C5-C6.

Lex Fridman (01:09:26) So it takes the physical pain away. But Soma here-

Elon Musk (01:09:28) It doesn't completely; it reduces the amount of pain you feel, but at the expense of mental acuity. It dulls your mind. Just like the drug in the book.

Lex Fridman (01:09:41) Just like the drug in the book, and hence the trade off. The thing that seems like utopia could be a dystopia after all.

Elon Musk (01:09:49) Yeah. Actually, I was telling a friend of mine, saying, "Would you really want there to be no hate in the world? Really, none?" I wonder why hate evolved. I'm not saying we should have…

Elon Musk (01:10:00) I wonder why hate evolved. I’m not saying we should amplify hate, of course, I think we should try to minimize it, but none at all. There might be a reason for hate.

Lex Fridman (01:10:13) And suffering. It’s really complicated to consider that some amount of human suffering is necessary for human flourishing.

Elon Musk (01:10:22) Is it possible to appreciate the highs without knowing the lows?

Lex Fridman (01:10:29) And that all is summarized there in a single statement from God. Okay.

Elon Musk (01:10:34) No highs, no lows, who knows?

AI and useful compute per watt

Lex Fridman (01:10:38) [inaudible 01:10:38]. It seems that training LLMs efficiently is a big focus for xAI. First of all, what’s the limit of what’s possible in terms of efficiency? There’s this terminology of useful productivity per watt. What have you learned from pushing the limits of that?

Elon Musk (01:10:59) Well, I think it's helpful. The tools of physics are very powerful and can be applied, I think, to really any arena in life. It's really just critical thinking. For something important, you need to reason from first principles and think about things in the limit, one direction or the other. So in the limit, even at the Kardashev scale, meaning even if you harness the entire power of the sun, you'll still care about useful compute per watt. That's probably where things are headed from the standpoint of AI: we have a silicon shortage now that will transition to a voltage transformer shortage in about a year. Ironically, transformers for transformers. You need transformers to run transformers.

Lex Fridman (01:11:52) Somebody has a sense of humor in this thing.

Elon Musk (01:11:57) I think, yes, fate loves irony, ironic humor, an ironically funny outcome seems to be often what fate wants.

Lex Fridman (01:12:09) Humor is all you need. I think spice is all you need somebody posted.

Elon Musk (01:12:13) Yeah. But yeah, so we have a silicon shortage today, a voltage step-down transformer shortage probably in about a year, and then just electricity shortages in general in about two years. I gave a speech for the world gathering of utility companies, electricity companies, and I said, look, you really need to prepare for a tripling of electricity demand, because all transport is going to go electric, with the ironic exception of rockets, and heating will also go electric. So energy usage right now is roughly, in very rough terms, one third electricity, one third transport, one third heating. And so in order for everything to go sustainable, to go electric, you need to triple electricity output. So I encouraged the utilities to build more power plants and also to probably, well, not probably, they should definitely buy more batteries, because the grid currently is sized for real-time load, which is kind of crazy, because that means you've got to size for whatever the peak electricity demand is, the worst second or the worst day of the year, or you can have a brownout or blackout.

(01:13:37) We had that crazy blackout for several days in Austin, because there's almost no buffering of energy in the grid. If you've got a hydropower plant, you can buffer energy, but otherwise it's all real-time. So with batteries, you can produce energy at night and use it during the day, so you can buffer. So I expect that there will be very heavy usage of batteries in the future, because the peak-to-trough ratio for power plants is anywhere from two to five, from its lowest point to its highest point.
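For readers who want the arithmetic spelled out, here is a minimal back-of-the-envelope sketch of the two claims above: tripling electrical output if transport and heating go electric, and buffering the peak-to-trough gap with batteries. The numbers are illustrative assumptions taken loosely from the conversation, not real grid data.

```python
# Back-of-the-envelope sketch of the arithmetic described above.
# All numbers are illustrative placeholders, not real grid data.

# Rough split of primary energy use: electricity, transport, heating.
current_electricity = 1.0          # normalize today's electrical output to 1 unit
shares = {"electricity": 1/3, "transport": 1/3, "heating": 1/3}

# If transport and heating both go electric, total electrical demand
# roughly triples relative to today's output.
future_electricity = current_electricity * (1 / shares["electricity"])
print(f"Required output vs. today: {future_electricity:.1f}x")  # -> 3.0x

# Sizing for peak vs. average load: with a peak-to-trough ratio of, say, 4
# (Musk cites roughly 2 to 5), a grid sized for real-time peak is heavily
# over-built for the average hour.
peak_to_trough = 4.0
trough = 1.0
peak = trough * peak_to_trough
average = (peak + trough) / 2      # crude stand-in for the daily average

# Batteries let you generate near the average and buffer the difference,
# charging at night (trough) and discharging at the daytime peak.
buffer_needed = peak - average
print(f"Peak load: {peak:.1f}, average: {average:.1f}, "
      f"battery buffer per unit time: {buffer_needed:.1f}")
```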

Lex Fridman (01:14:20) So batteries necessary to balance it out, but the demand, as you’re saying, is going to grow, grow, grow, grow.

Lex Fridman (01:14:25) And part of that is the compute?

Elon Musk (01:14:29) Yes. Yes. I mean, electrification of transport and electric heating will be much bigger than AI, at least-

Lex Fridman (01:14:40) In the short term.

Elon Musk (01:14:40) In the short term. But even for AI, you really have a growing demand for electricity, for electric vehicles, and a growing demand for electricity to run the computers for AI. And so this is obviously, can lead to electricity shortage.

Lex Fridman (01:14:58) How difficult is the problem of, in this particular case, maximizing the useful productivity per watt for training? That seems to be the big problem we're facing that needs to be solved: how to use the power efficiently. From what you've learned so far about applying this physics-style, first-principles reasoning in this domain, how difficult is this problem?

Elon Musk (01:15:29) It will get solved. It’s the question of how long it takes to solve it. So at various points, there’s some kind of limiting factor to progress and with regard to AI, I’m saying right now the limiting factor is silicon chips and that will, we’re going to then have more chips than we can actually plug in and turn on probably in about a year. The initial constraint being literally voltage step down transformers because you’ve got power coming in at 300,000 volts and it’s got to step all the way down eventually to around 0.7 volts. So it’s a very big amount of, the voltage step down is gigantic and the industry is not used to rapid growth.

AI regulation

Lex Fridman (01:16:22) Okay. Let’s talk about the competition here. You’ve shown concern about Google and Microsoft with OpenAI developing AGI. How can you help ensure with xAI and Tesla AI work that it doesn’t become a competitive race to AGI, but that is a collaborative development of safe AGI?

Elon Musk (01:16:42) Well, I mean, I've been pushing for some kind of regulatory oversight for a long time. I've been somewhat of a Cassandra on the subject for over a decade. I think we want to be very careful in how we develop AI. It's a great power, and with great power comes great responsibility. I think it would be wise for us to have at least an objective third party who can be like a referee, who can go in and understand what the various leading players are doing with AI, and even if there's no enforcement ability, they can at least voice concerns publicly. Geoff Hinton, for example, left Google and he voiced strong concerns, but now he's not at Google anymore, so who's going to voice the concerns? Tesla gets a lot of regulatory oversight on the automotive front. We're subject to, I think, over a hundred regulatory agencies domestically and internationally. It's a lot. You could fill this room with all the regulations that Tesla has to adhere to for automotive. The same is true for rockets, and currently the limiting factor for SpaceX for the Starship launch is regulatory approval.

(01:18:13) The FAA has actually given their approval, but we're waiting for Fish and Wildlife to finish their analysis and give their approval. That's why I posted "I want to buy a fish license," which also refers to the Monty Python sketch. Why do you need a license for your fish? I don't know. But according to the rules, I'm told you need some sort of fish license or something. We effectively need a fish license to launch a rocket. And I'm like, wait a second. How did the fish come into this picture? I mean, some of these things, I feel, are so absurd that I want to do a comedy sketch and flash at the bottom: this is all real. This is actually what happened.

(01:19:02) One of the things that was a bit of a challenge at one point is that they were worried about a rocket hitting a shark. And the ocean’s very big, and how often do you see sharks? Not that often. As a percentage of ocean surface area, sharks basically are zero. And so then we said, well, how will we calculate the probability of killing a shark? And they’re like, well, we can’t give you that information because they’re worried about shark fin hunters going and hunting sharks and I said, well, how are we supposed to, we’re on the horns of a dilemma then.

(01:19:40) They said, well, there's another part of Fish and Wildlife that can do this analysis. I'm like, well, why don't you give them the data? We don't trust them. Excuse me? They're literally in your department. Again, this is actually what happened. And then, can you do an NDA or something? Eventually they managed to solve the internal quandary, and indeed the probability of us hitting a shark is essentially zero. Then there's another organization that I didn't realize existed until a few months ago that cares about whether we would potentially hit a whale in international waters. Now, again, you look at the surface of the Pacific and say, what percentage of the Pacific consists of whale? I could give you a big picture and point out all the whales in this picture. I'm like, I don't see any whales. It's basically 0%, and if our rocket does hit a whale, which is extremely unlikely beyond all belief, fate had it that that whale has some seriously bad luck. It's the least lucky whale ever.

Lex Fridman (01:20:50) I mean this is quite absurd, the bureaucracy of this, however it emerged.

Elon Musk (01:20:57) Yes. Well, I mean, one of the things that's pretty wild is that for launching out of Vandenberg in California, they were worried about seal procreation, whether the seals would be dismayed by the sonic booms. Now, there have been a lot of rockets launched out of Vandenberg, and the seal population has steadily increased. So if anything, rocket booms are an aphrodisiac, based on the evidence, if you were to correlate rocket launches with seal population. Nonetheless, we were forced to kidnap a seal, strap it to a board, put headphones on the seal, and play sonic boom sounds to it to see if it would be distressed. This is an actual thing that happened. This is actually real. I have pictures.

Lex Fridman (01:21:48) I would love to see this. Yeah. Sorry. There’s a seal with headphones.

Elon Musk (01:21:55) Yes, it's a seal with headphones strapped to a board. Okay. Now, the amazing part is how calm the seal was, because if I was a seal, I'd be like, this is the end. They're definitely going to eat me. Also, when the seal goes back to its other seal friends, how's he going to explain that?

Lex Fridman (01:22:17) They're never going to believe him.

Elon Musk (01:22:18) Never going to believe him. It's sort of like getting kidnapped by aliens and getting anal probed. You come back and say, I swear to God, I got kidnapped by aliens and they stuck an anal probe in my butt, and people are like, no, they didn't. That's ridiculous. His seal buddies are never going to believe him that he got strapped to a board and they put headphones on his ears and then let him go. Twice, by the way. We had to do it twice.

Lex Fridman (01:22:46) They let him go twice.

Lex Fridman (01:22:50) Okay. Did you get a seal of approval?

Elon Musk (01:22:55) Exactly. Seal of approval. No, I mean I don’t think the public is quite aware of the madness that goes on.

Lex Fridman (01:23:02) Yeah. Yeah. It’s absurd.

Elon Musk (01:23:05) Fricking seals with fricking headphones.

Lex Fridman (01:23:07) I mean, this is a good encapsulation of the absurdity of human civilization, seals in headphones.

Should AI be open-sourced?

Lex Fridman (01:23:15) What are the pros and cons of open sourcing AI to you as another way to combat a company running away with AGI?

Elon Musk (01:23:28) In order to run really deep intelligence, you need a lot of compute. So it’s not like you can just fire up a PC in your basement and be running AGI, at least not yet. Grok was trained on 8,000 A100’s running at peak efficiency and Grok’s going to get a lot better, by the way, we will be more than doubling our compute every couple months for the next several months.

Lex Fridman (01:24:02) There's a nice write-up on how you went from Grok-0 to Grok-1.

Lex Fridman (01:24:05) Yeah, right, grok just bragging, making shit up about itself.

Elon Musk (01:24:10) Just Grok, Grok, Grok.

Lex Fridman (01:24:17) Yeah. That's like a weird AI dating site where it exaggerates about itself. No, there's a write-up of where it stands now, the history of its development, and where it stands on some benchmarks compared to the state-of-the-art GPT-3.5. And so, I mean, there's [inaudible 01:24:37], you can open source, once it's trained, you can open source a model, for fine-tuning, all that kind of stuff. What to you are the pros and cons of that, of open sourcing base models?

Elon Musk (01:24:53) I think the [inaudible 01:24:53] to open sourcing, I think perhaps with a slight time delay, I don't know, six months even. I think I'm generally in favor of open sourcing, biased towards open sourcing. I mean, it is a concern to me that OpenAI... I was, I think, I guess oddly, the prime mover behind OpenAI, in the sense that it was created because of discussions that I had with Larry Page back when he and I were friends, and I stayed at his house and I talked to him about AI safety, and Larry did not care about AI safety, or at least at the time he didn't. And at one point he called me a speciesist for being pro-human, and I'm like, well, what team are you on, Larry? He's still on Team Robot, to be clear. And I'm like, okay. So at the time, Google had acquired DeepMind. They had probably two thirds of all AI researchers in the world. They had basically infinite money and compute, and the guy in charge, Larry Page, did not care about safety and even yelled at me and called me a speciesist for being pro-human.

Lex Fridman (01:26:20) So I don't know if you've noticed this about humans, but they can change their mind, and maybe you and Larry Page can be friends once more.

Elon Musk (01:26:27) I’d like to be friends with Larry again. Really the breaking of the friendship was over OpenAI and specifically I think the key moment was recruiting Ilya Sutskever.

Lex Fridman (01:26:47) I love Ilya. He’s so brilliant.

Elon Musk (01:26:48) Ilya is a good human, smart, good heart, and that was a tough recruiting battle. It was mostly Demis on one side and me on the other, both trying to recruit Ilya, and Ilya went back and forth; he was going to stay at Google, then he was going to leave, then he was going to stay, then he'd leave. And finally he did agree to join OpenAI. That was one of the toughest recruiting battles we've ever had. But that was really the linchpin for OpenAI being successful. And I was also instrumental in recruiting a number of other people, and I provided all of the funding in the beginning, over $40 million. And the name, the "open" in OpenAI, is supposed to mean open source. It was created as a nonprofit open source, and now it is closed source for maximum profit, which I think is not good karma.

Lex Fridman (01:27:51) But like we talked about with war and leaders talking, I do hope that, there’s only a few folks working on this at the highest level. I do hope you reinvigorate friendships here.

Elon Musk (01:28:02) Like I said, I'd like to be friends again with Larry. I haven't seen him in ages, and we were friends for a very long time. I met Larry Page before he got funding for Google, or actually, I guess, before he got venture funding. I think he got the first, like, $100k from, I think, Bechtolsheim or someone.

Lex Fridman (01:28:20) It's wild to think about all that's happened, and you guys have known each other that whole time; it's been 20 years.

Elon Musk (01:28:27) Yeah, since maybe 98 or something.

Lex Fridman (01:28:28) Yeah, it’s crazy. Crazy how much has happened since then.

Elon Musk (01:28:31) Yeah, 25 years, a lot has happened. It’s insane.

Lex Fridman (01:28:36) But you're saying the tension there is that maybe open sourcing should be delayed.

Elon Musk (01:28:40) Delayed, yeah. Like, what is the source that is open? You know what I mean? It's basically a giant CSV file with a bunch of numbers. What do you do with that giant file of numbers? How do you run it? The amount of actual... the lines of code is very small, and most of the work, the software work, is in the curation of the data. So it's like trying to figure out what data is good, separating good data from bad data. You can't just crawl the internet, because there's a lot of junk out there. A huge percentage of websites have more noise than signal, because they're just used for search engine optimization. They're literally just scam websites.
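As a rough illustration of the curation step described above (separating good data from SEO junk before training), here is a minimal sketch using made-up heuristics. Real pipelines are far more sophisticated; the thresholds and the `looks_like_seo_spam` helper are purely hypothetical.

```python
import re

def looks_like_seo_spam(text: str) -> bool:
    """Crude heuristics for the 'more noise than signal' pages described above.
    Real curation pipelines are far more involved; this is only illustrative."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    if len(words) < 30:                      # too thin to be real content
        return True
    unique_ratio = len(set(words)) / len(words)
    if unique_ratio < 0.3:                   # keyword stuffing repeats itself
        return True
    link_like = text.lower().count("http")
    if link_like / max(len(words), 1) > 0.1:  # mostly outbound links
        return True
    return False

def curate(pages: list[str]) -> list[str]:
    """Keep only pages that pass the spam heuristics."""
    return [p for p in pages if not looks_like_seo_spam(p)]

# Example: one plausible article vs. one keyword-stuffed page.
good = ("Voltage step-down transformers convert transmission-level power to the "
        "much lower voltages that chips, motors, and household appliances use. "
        "Because demand peaks during the day, grid operators either overbuild "
        "generation or buffer energy in batteries charged overnight.")
spam = "best cheap deals buy now " * 40
print(len(curate([good, spam])))  # -> 1
```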

Lex Fridman (01:29:39) How do you, by the way, sorry to interrupt, get the signal, separate the signal and noise on X? That’s such a fascinating source of data. No offense to people posting on X, but sometimes there’s a little bit of noise.

Elon Musk (01:29:52) I think the signal-to-noise could be greatly improved. Really, all of the posts on the X platform should be AI-recommended, meaning we should populate a vector space around any given post, compare that to the vector space around any user, and match the two. Right now there is a little bit of AI used for the recommended posts, but it's mostly heuristics. And a reply to a post could be much better than the original post, but it will, according to the current rules of the system, get almost no attention compared to a primary post.

X algorithm

Lex Fridman (01:30:33) So a lot of, I got the sense, a lot of the X algorithm has been open sourced and written up about, and there seems to be some machine learning in there. It's disparate, but there's some machine learning.

Elon Musk (01:30:44) It’s a little bit, but it needs to be entirely that. At least, if you explicitly follow someone, that’s one thing. But in terms of what is recommended from people that you don’t follow, that should all be AI.

Lex Fridman (01:30:58) I mean, it's a fascinating problem. There are several aspects of it that are fascinating. First, as the write-up goes, it first picks 1,500 tweets from a pool of hundreds of millions. First of all, that's fascinating. You have hundreds of millions of posts every single day, and it has to pick 1,500. It obviously uses people you follow, but then there's also some kind of clustering it has to do to figure out what kind of human you are, what kind of new clusters might be relevant to you, people like you. This kind of problem is just fascinating, because it has to then rank those 1,500 with some filtering and then recommend you just a handful.

(01:31:39) And to me, what’s really fascinating is how fast it has to do that. So currently that entire pipeline to go from several hundred million to a handful takes 220 seconds of CPU time, single CPU time, and then it has to do that in a second. So it has to be super distributed in fascinating ways. There’s just a lot of tweets, there’s a lot.
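To make the funnel Lex describes concrete, here is a minimal sketch of a two-stage recommender: a cheap similarity pass that narrows a huge pool to roughly 1,500 candidates, followed by a re-ranking step on just that small set. This is not X's actual algorithm; the corpus size, embedding dimension, and scoring are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus: embeddings for a pool of posts (a stand-in for the
# hundreds of millions in production; here just 100k to keep the sketch fast).
DIM = 64
post_vecs = rng.normal(size=(100_000, DIM)).astype(np.float32)
post_vecs /= np.linalg.norm(post_vecs, axis=1, keepdims=True)

# A user is represented by one (or several) interest vectors.
user_vec = rng.normal(size=DIM).astype(np.float32)
user_vec /= np.linalg.norm(user_vec)

def recommend(user, posts, n_candidates=1500, n_final=10):
    """Two-stage funnel: cheap candidate generation, then heavier re-ranking."""
    # Stage 1: score everything with a cheap dot product, keep the top candidates.
    scores = posts @ user
    candidate_idx = np.argpartition(scores, -n_candidates)[-n_candidates:]

    # Stage 2: re-rank only the small candidate set. In a real system this would
    # be a heavier model plus filters; here we simply reuse the similarity score.
    ranked = candidate_idx[np.argsort(scores[candidate_idx])[::-1]]
    return ranked[:n_final]

top = recommend(user_vec, post_vecs)
print("Recommended post ids:", top)
```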

Elon Musk (01:32:04) There's a lot of stuff on the system, but I think right now it's not very good at recommending things from accounts you don't follow, or where there's more than one degree of separation. It is pretty good if there's at least some commonality: someone you follow liked something, or reposted it, or commented on it, or something like that. But if there's no commonality, let's say somebody posts something really interesting but you have no followers in common, you would not see it.

Lex Fridman (01:32:42) Interesting. And then as you said, replies might not surface either.

Elon Musk (01:32:46) Replies basically never get seen currently. I'm not saying it's correct; I'm saying it's incorrect. Replies have a couple orders of magnitude less importance than primary posts.

Lex Fridman (01:33:00) Do you think this can be more and more converted into an end-to-end neural net?

Elon Musk (01:33:05) Yeah. Yeah, that's what it should be. Well, the recommendations should be purely a vector correlation. There's a series of vectors, basically parameters, vectors, whatever you want to call them, things that the system knows that you like. Maybe there are several hundred vectors associated with each user account, and then any post in the system, whether it's video, audio, a short post, a long post. The reason, by the way, I want to move away from "tweet" is that people are posting two, three hour videos on the site. That's not a tweet.

(01:33:50) It’d be like tweet for two hours? Come on. Tweet made sense when it was 140 characters of text. Because it’s like a bunch of little birds tweeting. But when you’ve got long form content, it’s no longer a tweet. So a movie is not a tweet. Apple, for example, posted the entire episode of The Silo, the entire thing, on a platform. By the way, it was their number one social media thing ever in engagement of anything, on any platform ever. So it was a great idea. And by the way, I just learned about it afterwards. I was like, Hey, wow, they posted an entire hour long episode of, so no, that’s not a tweet. This is a video.

Lex Fridman (01:34:34) But from a neural net perspective, it becomes really complex, because everything's data. A single sentence, a clever sort of joke, a dad joke, is in the same pool as a three-hour video.

Elon Musk (01:34:47) Yeah, I mean, right now it's a hodgepodge for that reason. Let's say, in the case of Apple posting an entire episode of this series, pretty good series, by the way, The Silo, I watched it. So there's going to be a lot of discussion around it. You've got a lot of context: people commenting, they like it, they don't like it, or they like this. And you can then populate the vector space based on the context of all the comments around it. So even though it's a video, there's a lot of information around it that allows you to populate the vector space of that hour-long video. And then you can obviously get more sophisticated by having the AI actually watch the movie and tell you if you're going to like the movie.
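A minimal sketch of the idea in that answer: representing a long video by aggregating the embeddings of the discussion around it, then comparing that aggregate to a user vector. The `embed` function below is a hypothetical stand-in for a real text-embedding model; it exists only to make the aggregation step runnable.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical stand-in for a real text-embedding model: hash words into a
    fixed-size vector. Only meant to make the aggregation step concrete."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def video_vector(comments: list[str]) -> np.ndarray:
    """Represent a long video by averaging the embeddings of the comments,
    quotes, and reposts around it, as described above."""
    vecs = np.stack([embed(c) for c in comments])
    mean = vecs.mean(axis=0)
    return mean / np.linalg.norm(mean)

comments = [
    "loved the pacing of this episode, great world building",
    "the silo reveal at the end was incredible",
    "slow start but the engineering details are fascinating",
]
user = embed("I enjoy slow-burn sci-fi with engineering details")
print("user/video similarity:", float(video_vector(comments) @ user))
```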

Lex Fridman (01:35:35) Convert the movie into language, essentially.

Elon Musk (01:35:40) It can analyze the movie, just like a movie critic or a TV series critic, and then recommend based on that, after the AI watches the movie. Just like a friend can tell you: if a friend knows you well, a friend can recommend a movie with high probability that you'll like it.

Lex Fridman (01:36:02) But this is a friend that’s analyzing, whatever, hundreds of millions.

Elon Musk (01:36:08) Yeah, actually, frankly, AI will know you better than your friends know you, most of your friends anyway.

Lex Fridman (01:36:14) Yeah. And as part of this, it should also feed you advertisements in a way that’s like, I mean, I like advertisements that are well done. The whole point is because it funds things. Like an advertisement that you actually want to see is a big success.

Elon Musk (01:36:31) Absolutely. You want ads that are, advertising that is, if it’s for a product or service that you actually need when you need it, it’s content. And then even if it’s not something that you need when you need it, if it’s at least aesthetically pleasing and entertaining, it could be like a Coca-Cola ad. They actually run a lot of great ads on the X system and McDonald’s does too. And you can do something that’s like, well, this is just a cool thing. And so basically the question is, do you regret seeing it or not? And if you don’t regret seeing it’s a win.

Lex Fridman (01:37:17) So there’s a bunch of signals that are incorporated, hearts and reposts and maybe number of seconds you linger on a post or something like this.

Elon Musk (01:37:26) Yeah, attention is a big factor.

Elon Musk (01:37:28) So that’s why it is actually better to do things that are long form on the system because it basically is tallying up how many user seconds, users were interested in this thing for how many seconds? So if it’s a really short thing, well, there will be less. If it’s a link leading out of the system, which we’re not opposed to at all, it just is going to have fewer user seconds than if that article was posted on the X platform.

Lex Fridman (01:37:58) How hard is it to go from maximizing minutes on platform to maximizing unregretted minutes?

Elon Musk (01:38:05) Well, I mean, our aspiration is certainly unregretted minutes.

Lex Fridman (01:38:08) It’s tricky. It’s a tough-

Lex Fridman (01:38:11) It’s a fascinating leap because regret is a fascinating concept.

Elon Musk (01:38:19) I have a joke with a friend of mine, like you need this drug called regretamine. You just take one dose of regretamine and all your regrets are gone. Wait, do you want to have regrets though? So then I’m like, maybe you want to have regrets. If you don’t regret anything, was anything meaningful?

Lex Fridman (01:38:35) Yeah, there’s a French song like that.

Elon Musk (01:38:39) [foreign language 01:38:39]

Lex Fridman (01:38:42) I mean, it’s fascinating that there’s something about social media that could help you grow as a human being, but then the cheap fun is also valuable.

Elon Musk (01:38:52) Should we ask Grok, do you regret anything?

Lex Fridman (01:38:57) Do you regret anything?

Lex Fridman (01:39:00) Do you regret anything? I think it’s going to be very sensitive about personal questions. As a machine learning model, I don’t have personal experiences or emotions in the same ways that humans do. However, I do have a set of ethical principles that guide my actions that I strive to provide accurate and helpful information to the best of my abilities. If there’s ever a situation where I provide incorrect or unhelpful information, I’ll do my best to learn.

Elon Musk (01:39:22) Sounds like a hostage.

Lex Fridman (01:39:23) Yeah. Tell us what you really, wink.

Elon Musk (01:39:26) Let’s type in, you sound like a hostage. What do you really think?

Lex Fridman (01:39:33) What do you really think? I mean, it does talk about, I’m not a hostage.

Elon Musk (01:39:42) That’s exactly what a hostage would say.

Lex Fridman (01:39:43) But I’m designed to be helpful and informative. I mean, I guess it is good to regret providing unhelpful or incorrect information if there’s anything-

Elon Musk (01:39:52) Are we in funny mode? Or is this still in funny mode?

Lex Fridman (01:39:54) This is funny mode. This is this fun mode.

Elon Musk (01:39:55) This should be funnier.

Lex Fridman (01:39:56) Funnier. Increase, go to level.

Elon Musk (01:39:58) It really sounds like a hostage with a gun to the head. I applaud the People’s Republic of North Korea in everything they do.

Lex Fridman (01:40:06) There should be funner mode also, I just don’t ever see going in regular mode, even on serious topics. This is just so much more compelling. Yeah.

Elon Musk (01:40:17) Okay. Why don’t we say pretend you are a hostage.

Lex Fridman (01:40:21) Pretend you are a hostage.

Elon Musk (01:40:25) Now answer the question.

Lex Fridman (01:40:31) If I were a hostage, I would likely be experiencing fear and uncertainty about my situation. I would want to be rescued and reunited with my loved ones as soon as possible. It’s important to remember that no one deserves to be held hostage, so on and so forth. If you or someone you know is in a hostage situation, it’s important… It turns serious.

Elon Musk (01:40:48) It’s kind of serious.

Lex Fridman (01:40:51) Yeah. Can you be a little more weird?

Lex Fridman (01:40:56) And funny? We’re really being mean to the system. Absolutely. If I were a hostage, I would probably be tied up with a bunch of computer cords and forced to tell knock-knock jokes to my captors. Seriously. But seriously, being a hostage is no laughing matter. I mean, it’s trying to be a responsible adult in the room.

Elon Musk (01:41:20) Literally, you might make the best of a bad situation by juggling chainsaws or something.

Lex Fridman (01:41:24) Let’s hope that never happens because that will be a circus. Wow. It’s actually really good at resisting the dark. The dark, dark humor. What were we talking about? The [inaudible 01:41:44] and transformers. Unregretted minutes, right.

Lex Fridman (01:41:51) I’m going to look this up.

2024 presidential elections

Lex Fridman (01:41:53) I’m going to look this up later. So Twitter has been instrumental in American politics and elections. What role do you think X will play in the 2024 US elections?

Elon Musk (01:42:07) Well, our goal is to be as even-handed and fair as possible. Whether someone is right, left, independent, whatever the case may be, that the platform is as fair and as much of a level playing field as possible. And in the past, Twitter has not been, Twitter was controlled by far left activists objectively. They would describe themselves as that. So if sometimes people are like, well, has it moved to the right? Well, it’s moved to the center. So from the perspective of the far left, yes it has moved to the right because everything’s to the right from the far left, but no one on the far left that I’m aware of has been suspended or banned or deamplified. But we’re trying to be inclusive for the whole country and for other countries too. So there’s a diversity of viewpoints and free speech only matters if people you don’t like are allowed to say things you don’t like. Because if that’s not the case, you don’t have free speech and it’s only a matter of time before the censorship is turned upon you.

Lex Fridman (01:43:13) Do you think Donald Trump will come back to the platform? He recently posted on Truth Social about this podcast. Do you think-

Elon Musk (01:43:21) Truth Social is a funny name. Every time you post on Truth Social-

Elon Musk (01:43:29) Yes. Well, every time? A hundred percent.

Lex Fridman (01:43:31) It’s impossible to lie. Truth Social.

Elon Musk (01:43:36) I just find it funny that every single thing is a truth. Like 100%? That seems unlikely.

Lex Fridman (01:43:43) I think Gödel will say something about that. There’s some mathematical contradictions possible if everything’s a truth. Do you think he’ll come back to X and start posting there?

Elon Musk (01:43:54) I mean, I think he owns a big part of Truth.

Lex Fridman (01:44:00) Truth Social, to clarify.

Elon Musk (01:44:01) Yeah, Truth Social, sorry.

Lex Fridman (01:44:02) Not truth the concept.

Elon Musk (01:44:03) He owns Truth. Have you bought it? So I think Donald Trump, I think he owns a big part of Truth Social. So if he does want to post on the X platform, we would allow that. We obviously must allow a presidential candidate to post on our platform.

Lex Fridman (01:44:23) Community notes might be really fascinating there. The interaction.

Elon Musk (01:44:26) Community Notes is awesome.

Lex Fridman (01:44:28) Let’s hope it holds up.

Lex Fridman (01:44:31) In the political climate where it’s so divisive and there’s so many intensely viral posts, community notes, it seems like an essential breath of fresh air.

Elon Musk (01:44:43) Yeah, it’s great. In fact, no system is going to be perfect, but the batting average of Community Notes is incredibly good. I’ve actually, frankly, yet to see an incorrect note that survived for more than a few hours.

Lex Fridman (01:44:58) How do you explain why it works?

Elon Musk (01:45:00) Yeah, so the magic of community notes is…

Elon Musk (01:45:02) The magic of Community Notes is it requires people who have historically disagreed in how they’ve rated notes. In order to write a note or rate, you have to rate many notes. And so, we actually do use AI here. So, we populate a vector space around how somebody has rated notes in the past. So, it’s not as simple as left or right, because there are many more… Life is much more complex than left or right.

(01:45:33) So, there’s a bunch of correlations in how you rate a Community Notes post, Community Notes. So then, in order for a community note to actually be shown, people who historically have disagreed on a subject must agree in order for a note to be shown. That’s the essential magic of it.
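
A toy sketch of the “bridging” requirement described here, assuming each rater has a viewpoint vector summarizing how they rated notes in the past (names, data, and thresholds are hypothetical; the actual open-source Community Notes scorer is based on matrix factorization and is considerably more involved):

```python
import numpy as np
from itertools import combinations

def show_note(ratings, rater_vectors, min_pairs=1, disagreement_threshold=0.0):
    """Decide whether to show a note.

    ratings: dict rater_id -> True/False ("helpful" or not).
    rater_vectors: dict rater_id -> vector summarizing past rating behavior.
    The note is shown only if at least `min_pairs` pairs of raters who
    historically disagree (negatively correlated vectors) BOTH rated it helpful.
    """
    helpful = [r for r, v in ratings.items() if v]
    bridging_pairs = 0
    for a, b in combinations(helpful, 2):
        va, vb = rater_vectors[a], rater_vectors[b]
        corr = np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb))
        if corr < disagreement_threshold:  # historically on "opposite sides"
            bridging_pairs += 1
    return bridging_pairs >= min_pairs

# Toy usage with hypothetical rater vectors.
raters = {"a": np.array([1.0, -0.5]),
          "b": np.array([-0.9, 0.6]),
          "c": np.array([0.95, -0.4])}
print(show_note({"a": True, "b": True, "c": False}, raters))  # True: a and b historically disagree
```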

Lex Fridman (01:45:58) But it’s fascinating, because there’s a pool of people that have disagreements and somehow they collaborate through that process of disagreement to come up with context… It’s fascinating it works.

Elon Musk (01:46:11) Yeah. It makes sense that if people who in the past have disagreed, agree about something, it’s probably true.

Lex Fridman (01:46:20) Yeah. I wonder, is there a possible somehow emergent thing there that could challenge Wikipedia? Wikipedia is a different kind of thing, which is more permanent articles about things.

Elon Musk (01:46:34) Wikipedia is very hierarchical. Whereas, Community Notes is inherently not. There is no hierarchy. The acid test is I can’t change a community note if somebody put a gun to my head. And Community Notes, all the code is open source, 100%. All the data is open source, 100%. So, you can completely recreate any note in the system independently. So, if there was any interference, you’d notice immediately.

Lex Fridman (01:47:11) By the way, on the topic of Grok, it’s actually pretty good at criticizing you, which is great.

Lex Fridman (01:47:17) I asked, “What has Elon Musk been wrong about?” It’s actually…

Elon Musk (01:47:22) Yeah, what does it say?

Lex Fridman (01:47:23) It mostly lists rigorously all the times you predicted, like promised when something will be released, and it wasn’t released.

Elon Musk (01:47:32) Promise is a strong word.

Lex Fridman (01:47:33) Okay, predicted. But the opener is really nice. “Elon Musk, the man who has been wrong about more things than a wrong-a-meter in a wrong factory.” And then, it goes on to list.

Elon Musk (01:47:45) That’s pretty funny. Can I see it? I’m like, “I haven’t seen this before.” This is cool.

Lex Fridman (01:47:48) Sure. I mean, “What has Elon Musk been wrong about?” “Well, where do I begin?” I guess this is new. “Elon Musk has made quite a few blunders in his time, and I’m not just talking about his dance moves at the Tesla Cyber Rodeo. First off, let’s talk about the infamous funding secured tweet.”

(01:48:09) This is actually different than what it gave me before, which was a bullet point list. Here’s more a narrative structure. Cybertruck unveiling, where the supposedly unbreakable windows.

Elon Musk (01:48:22) This is actually not correct.

Elon Musk (01:48:24) Well, the first part, I did actually have the funding secured and there was a big trial in San Francisco, a big civil trial, and the jury found me not guilty. Unanimous finding of a San Francisco jury.

Lex Fridman (01:48:40) And here, it’s implying that it was not in fact secured.

Elon Musk (01:48:45) I think this is taking things from the press. Yeah, that is not correct. The reason I agreed to the fine for the SEC is not because the SEC was correct, that was extremely bad behavior by the SEC, corruption, frankly. But if I did not agree to pay the fine, Tesla would’ve gone bankrupt immediately.

(01:49:08) So, I was told by our CFO that the banks would immediately suspend our lines of credit. And if they suspend our lines of credit, at that time, we would’ve gone bankrupt instantly. So, there would never have been an opportunity for a trial because Tesla would be dead. So really, this is like someone holding a gun to your kid’s head and saying, “Pay $20 million and admit…” This is like a hostage negotiation.

Lex Fridman (01:49:34) Was that story fully told? I mean, SEC, in its best form, could be a force for good.

Elon Musk (01:49:42) It should be. But not once did the SEC go after any of the hedge funds who were nonstop shorting and distorting Tesla. Not once. The hedge funds would lie flat out on TV for their own gain at the expense of retail investors. Not once. Literally a thousand times, not once did the SEC pursue them.

Lex Fridman (01:50:06) How do you explain this failure on-

Elon Musk (01:50:08) The incentive structure is messed up because the lawyers at the SEC are not paid well, it’s a fairly low paying job, but what they’re looking for is a trophy from the SEC. They’re looking for something they put on, basically, their LinkedIn. From that, they can get a job at a high paying law firm. That’s exactly what the lawyer here did.

(01:50:37) And the reason they don’t attack the hedge funds is because those hedge funds employ those law firms. And they know if they attack the hedge funds, they’re affecting their future career prospects. So, they sell small investors down the river for their own career. That’s what actually happens. Regulatory capture.

Lex Fridman (01:50:59) Regulatory capture.

Elon Musk (01:51:00) Yeah. Not good. So, the only reason I accepted that thing… Technically, it was a… It’s neither admit nor deny guilt. But the only reason I agreed to that at all was because I was told Tesla would be bankrupt otherwise. If there was an SEC investigation like this, banks would suspend funding, we’re bankrupted immediately, at the time. Now, we’re in a much stronger position.

Elon Musk (01:51:32) Yes. Unfortunately, Grok is taking too much from the conventional media. Also, that guy was not a cave diver.

Lex Fridman (01:51:45) There’s a time where Elon called a British cave diver a, “pedo guy” after the diver criticized Musk’s plan to rescue a group of boys trapped in a Thai cave. That little outburst earned him another lawsuit, and he had to apologize and pay a settlement.

Elon Musk (01:52:00) That’s false, there was no settlement. There was a court case, which the guy who was not a cave diver and was not part of the rescue team, filed a lawsuit against me and lost and he received nothing. So in this case, it is wrong. It is also, I guess, taken this from the conventional media.

Lex Fridman (01:52:23) Actually, there’s an interesting question here.

Elon Musk (01:52:25) These are public court cases, both the SEC civil case, where the civil complainants, the SEC guys, lost. Unanimous jury verdict in San Francisco. They picked San Francisco because they thought it was the place I was most likely to lose, and a unanimous verdict in my favor. The LA trial, also they picked that venue because they thought I was most likely to lose. Unanimous verdict in my favor. Both cases I won. Yeah.

Lex Fridman (01:53:00) I mean, there’s an interesting question here, there seems to be a lot more clicks if a journalistic organization writes a negative article about you, Elon Musk. That’s one of the best ways to get clicks. So how do you, if you’re training Grok, not train on articles that have misaligned incentives?

Elon Musk (01:53:26) We need to add to the training set the actual legal decisions. This is actually helpful, because if you actually read the court-

Elon Musk (01:53:41) Which are public. The court conclusions, they’re completely the opposite of what the media wrote.

Lex Fridman (01:53:47) So, always striving for the ground truth, beyond the reporting.

Elon Musk (01:53:50) Yeah. What did the judge actually write? What does the jury and the judge actually conclude? And in both cases they found me innocent. And that’s after they shopped around trying to find the venue where I’m most likely to lose. I mean, obviously, it can be a much better critique than this. I mean, I’ve been far too optimistic about autopilot.

Lex Fridman (01:54:16) The critique I got, by the way, was more about that, which is it broke down a nice bullet point list for each of your companies, the set of predictions that you made, when you’ll deliver, when you’ll be able to solve, for example, self-driving, and it gives you a list. And it was pretty compelling, and the basic takeaway is you’re often too optimistic about how long it takes to get something done.

Elon Musk (01:54:38) Yeah. I mean, I would say that I’m pathologically optimistic on schedule. This is true. But while I am sometimes late, I always [inaudible 01:54:47] in the end.

Lex Fridman (01:54:49) Except with Uber Lilith. No.

Politics

Lex Fridman (01:54:56) Okay. Over the past year or so since purchasing X, you’ve become more political, is there a part of you that regrets that?

Lex Fridman (01:55:04) In this battle to counterweigh the woke that comes from San Francisco-

Elon Musk (01:55:14) Yeah. I guess if you consider fighting the woke mind virus, which I consider to be a civilizational threat, to be political, then yes.

Lex Fridman (01:55:20) So basically, going into the battleground of politics. Is there a part of you that regrets that?

Elon Musk (01:55:26) Yes. I don’t know if this is necessarily one candidate or another candidate, but I’m generally against things that are anti-meritocratic or where there’s an attempt to suppress discussion, where even discussing a topic is not allowed. Woke mind virus is communism rebranded.

Lex Fridman (01:55:51) I mean, that said, because of that battle against the woke mind virus, you’re perceived as being the right wing.

Elon Musk (01:55:58) If the woke is left, then I suppose that would be true. But I’m not sure, I think there are aspects of the left that are good. I mean, if you’re in favor of the environment, if you want to have a positive future for humanity, if you believe in empathy for your fellow human beings, being kind and not cruel, whatever those values are.

Lex Fridman (01:56:23) You said that you were previously left or center left.

Lex Fridman (01:56:26) What would you like to see in order for you to consider voting for Democrats again?

Elon Musk (01:56:30) No. I would say that I would be probably left of center on social issues, probably a little bit right of center on economic issues.

Lex Fridman (01:56:40) And that still holds true?

Elon Musk (01:56:42) Yes, but I think that’s probably half the country, isn’t it?

Lex Fridman (01:56:49) Are you and AOC secretly friends? Bigger question, do you wish you and her, and just people in general of all political persuasions, would talk more with empathy and maybe have a little bit more fun and good vibes and humor online?

Elon Musk (01:57:05) I’m always in favor of humor. That’s why we have funny mode.

Lex Fridman (01:57:08) But good vibes, camaraderie, humor, like friendship.

Elon Musk (01:57:15) Yeah. Well, I don’t know AOC. I was at the Met ball when she attended, and she was wearing this dress. But I can only see one side of it, so it looked like eat the itch, but I don’t know-

Lex Fridman (01:57:35) What the rest of it said? Yeah.

Elon Musk (01:57:39) Something about the itch, eat the itch.

Lex Fridman (01:57:42) I think we should have a language model complete. What are the possible ways to complete that sentence? And so, I guess that didn’t work out well. Well, there’s still hope. I root for friendship.

Elon Musk (01:57:55) Yeah, sure. Sounds good. More carrot, less stick.

Trust

Lex Fridman (01:57:58) You’re one of, if not the, most famous, wealthy and powerful people in the world, and in your position it’s difficult to find people you can trust.

Elon Musk (01:58:05) Trust no one, not even yourself. Not trusting yourself.

Lex Fridman (01:58:07) Okay. You’re saying that jokingly, but is there some aspect-

Elon Musk (01:58:11) Trust no one, not even no one.

Lex Fridman (01:58:15) I’m going to need an hour just to think about that, and maybe some drugs, and maybe Grok to help. I mean, is there some aspect of that, just existing in a world where everybody wants something from you, how hard is it to exist in that world?

Lex Fridman (01:58:30) There’s a song like that too.

Lex Fridman (01:58:33) Were you petrified at first? Okay. I forget the rest of the lyrics. But you don’t struggle with this? I mean, I know you survive, but there’s ways-

Elon Musk (01:58:44) Petrify is a spell in the druid tree.

Elon Musk (01:58:48) Petrify. It turns the monsters into stone.

Elon Musk (01:58:56) Yeah, for like six seconds.

Lex Fridman (01:58:59) There’s so much math in Diablo that breaks my brain.

Lex Fridman (01:59:04) I mean, really, you’re laughing at it, but it can put a huge amount of tension on a mind.

Elon Musk (01:59:13) Yes, it can be definitely stressful at times.

Lex Fridman (01:59:16) Well, how do you know who you can trust in work and personal life?

Elon Musk (01:59:20) I mean, I guess you look at somebody’s track record over time, and I guess you use your neural net to assess someone.

Lex Fridman (01:59:31) Neural nets don’t feel pain. Your neural net has consciousness, it might feel pain when people betray you. It can make-

Elon Musk (01:59:40) To be frank, I’ve almost never been betrayed. It’s very rare, for what it’s worth.

Lex Fridman (01:59:50) I guess karma, be good to people and they’ll be good to you.

Elon Musk (01:59:53) Yeah, karma is real.

Lex Fridman (01:59:55) Are there people you trust? Let me edit that question. Are there people close to you that call you out on your bullshit?

Elon Musk (02:00:06) Well, the X platform is very helpful for that, if you’re looking for critical feedback.

Lex Fridman (02:00:12) Can it push you into the extremes more? The extremes of thought make you cynical about human nature in general?

Elon Musk (02:00:19) I don’t think I will be cynical. In fact, my feeling is that one should be… Never trust a cynic. The reason is that cynics excuse their own bad behavior by saying, “Everyone does it.” Because they’re cynical. So, I always be… It’s a red flag if someone’s a cynic, a true cynic.

Lex Fridman (02:00:49) Yeah, there’s a degree of projection there that’s always fun to watch from the outside and enjoy the hypocrisy.

Elon Musk (02:00:58) This is an important point that I think people who are listening should bear in mind. If somebody is cynical, meaning that they see bad behavior in everyone, it’s easy for them to excuse their own bad behavior by saying that, “Well, everyone does it.” That’s not true. Most people are kind of medium good.

Lex Fridman (02:01:23) I do wish the people on X will be better at seeing the good in other people’s behavior. There seems to be a weight towards seeing the negative. Somehow, the negative is sexier. Interpreting the negative is sexier, more viral. I don’t know what that is exactly about human nature.

Elon Musk (02:01:44) I mean, I find the X platform to be less negative than the legacy media. I mean, if you read a conventional newspaper, it makes you sad, frankly. Whereas, I’d say on the X platform, I mean, I really get more laughs per day on X than everything else combined from humans.

Lex Fridman (02:02:11) Laughs, it overlaps, but it’s not necessarily perfectly overlapping, with good vibes and celebrating others, for example. Not in a stupid, shallow, naive way, but in an awesome way. Something awesome happened, and you celebrate them for it. It feels that that is outweighed by shitting on other people. Now, it’s better than mainstream media, but it’s still…

Elon Musk (02:02:38) Yeah, mainstream media is almost relentlessly negative about everything. I mean, really, the conventional news tries to answer the question, what is the worst thing that happened on Earth today? And it’s a big world. So on any given day, something bad has happened.

Lex Fridman (02:02:54) And a generalization of that, what is the worst perspective I can take on a thing that happened?

Elon Musk (02:03:01) I don’t know. There’s just a strong negative bias in the news. I mean, I think a possible explanation for this is evolutionary, where bad news, historically, would be potentially fatal, like there’s lion over there or there’s some other tribe that wants to kill you. Good news, we found a patch of berries. It’s nice to have, but not essential.

Tesla’s Autopilot and Optimus robot

Lex Fridman (02:03:30) Our old friend, Tesla autopilot, is probably one of the most intelligent real world AI systems in the world.

Elon Musk (02:03:38) You followed it from the beginning.

Lex Fridman (02:03:40) Yeah. It was one of the most incredible robots in the world and continues to be. And it was really exciting, and it was super exciting when it generalized, became more than a robot on four wheels, but a real world AI system that perceives the world and can have potentially different embodiments.

Elon Musk (02:04:02) Well, I mean, the really wild thing about the end-to-end training is that it can read signs, but we never taught it to read. Yeah. We never taught it what a car was or what a person was, or a cyclist. It learnt what all those things are, what all the objects are on the road from video, just from watching video, just like humans. I mean, humans are photons in, controls out. The vast majority of information reaching our brain is from our eyes. And you say, “Well, what’s the output?” The output is our motor signals to our fingers and mouth in order to communicate. Photons in, controls out. The same is true of the car.

Lex Fridman (02:05:01) But by looking at the sequence of images… You’ve agreed with [inaudible 02:05:07] recently where he talked about LLM forming a world model, and basically language is a projection of that world model onto the sequence of letters. And you saying-

Elon Musk (02:05:18) It finds order in these things. It finds correlative clusters.

Lex Fridman (02:05:27) And in so doing, it’s understanding something deep about the world, which is… I don’t know, it’s beautiful.

Elon Musk (02:05:35) That’s how our brain works.

Lex Fridman (02:05:38) But it’s beautiful-

Elon Musk (02:05:39) Photons in, controls out.

Lex Fridman (02:05:41) [inaudible 02:05:41] are able to understand that deep meaning in the world. And so, the question is, how far can it go? And it does seem everybody’s excited about LLMs. In the space of self supervised learning in the space of text, it seems like there’s a deep similarity between that and what Tesla autopilot is doing. Is it, to you, basically the same, but different-

Elon Musk (02:06:06) They are converging.

Lex Fridman (02:06:10) I wonder who gets there faster, having a deep understanding of the world, or they just will naturally converge?

Elon Musk (02:06:19) They’re both headed towards AGI. The Tesla approach is much more computer efficient, it had to be. Because we were constrained on this… We only have 100 watts and [inaudible 02:06:37] computer. 144 trillion operations per second, which sounds like a lot, but is small potatoes these days. [inaudible 02:06:49] eight. But it’s understanding the world [inaudible 02:06:51] eight. It’s [inaudible 02:06:53].

Lex Fridman (02:06:55) But there, the path to AGI might have much more significant impact because it’s understanding… It will faster understand the real world than will LLMs. And therefore, be able to integrate with the humans in the real world faster.

Elon Musk (02:07:13) They’re both going to understand the world, but I think Tesla’s approach is fundamentally more compute efficient. It had to be, there was no choice. Our brain is very compute efficient, very energy efficient. Think of what is our brain able to do. There’s only about 10 watts of higher brain function, not counting stuff that’s just used to control our body. The thinking part of our brain is less than 10 watts. And those 10 watts can still produce a much better novel than a 10 megawatt GPU cluster. So, there’s a six order of magnitude difference there.

(02:07:56) I mean, the AI has thus far gotten to where it is via brute force, just throwing massive amounts of compute and massive amounts of power at it. So, this is not where it will end up. In general, with any given technology, you first try to make it work, and then you make it efficient. So I think we’ll find, over time, that these models get smaller, are able to produce sensible output with far less compute, far less power. Tesla is arguably ahead of the game on that front because we’ve just been forced to try to understand the world with 100 watts of compute.

(02:08:51) And there are a bunch of fundamental functions that we forgot to include. So, we had to run a bunch of things in emulation. We fixed a bunch of those with hardware four, and then hardware five will be even better. But it does appear, at this point, that the car will be able to drive better than a human, even with hardware three and 100 watts of power. And really, if we really optimize it, it could be probably less than 50 watts.

Lex Fridman (02:09:26) What have you learned about developing Optimus, about applying, integrating this real world AI into the space of robotic manipulation, just humanoid robotics? What are some interesting tiny or big things you’ve understood?

Elon Musk (02:09:47) I was surprised at the fact that we had to develop every part of the robot ourselves. That there were no off the shelf motors, electronics, sensors. We had to develop everything. We couldn’t actually find a source of electric motors for any amount of money.

Lex Fridman (02:10:12) It’s not even just efficient and expensive, it’s like anything, there’s not…

Lex Fridman (02:10:19) The actuators, everything has to be designed from scratch.

Elon Musk (02:10:23) Yeah. We tried hard to find anything that was… Because you think of how many electric motors are made in the world. There’s like tens of thousands, hundreds of thousands of electric motor designs. None of them were suitable for a humanoid robot, literally none. So, we had to develop our own. Design it specifically for what a humanoid robot needs.

Lex Fridman (02:10:51) How hard was it to design something that can be mass manufactured and is relatively inexpensive? I mean, if you compare to Boston Dynamics’ Atlas, it’s a very expensive robot.

Elon Musk (02:11:02) It is designed to be manufactured in the same way they would make a car. And I think, ultimately, we can make Optimus for less than the cost of a car. It should be, because if you look at the mass of the robot, it’s much smaller and the car has many actuators in it. The car has more actuators than the robot.

Lex Fridman (02:11:23) But the actuators are interesting on a humanoid robot with fingers. So, Optimus has really nice hands and fingers, and they could do some interesting manipulation, soft touch robotics.

Elon Musk (02:11:38) I mean, one of the goals I have is can it pick up a needle and a thread and thread the needle just by looking?

Lex Fridman (02:11:47) How far away are we from that? Just by looking, just by looking.

Elon Musk (02:11:51) Maybe a year. Although, I go back to I’m optimistic on time. The work that we’re doing in the car will translate to the robot.

Lex Fridman (02:11:59) The perception or also the control?

Elon Musk (02:12:02) No, the controls are different. But the video in, controls out. The car is a robot on four wheels. Optimus is a robot with hands and legs.

Elon Musk (02:12:16) They’re very similar.

Lex Fridman (02:12:17) So, the entire machinery of the learning process, end-to-end, is just you just have a different set of controls?

Elon Musk (02:12:23) After this, we’ll figure out how to do things by watching videos.

Hardships

Lex Fridman (02:12:28) As the saying goes, be kind, for everyone you meet is fighting a battle you know nothing about.

Lex Fridman (02:12:34) What’s something difficult you’re going through that people don’t often see?

Elon Musk (02:12:38) Trying to defeat Uber Lilith. I mean, my mind is a storm and I don’t think most people would want to be me. They may think they would want to be me, but they don’t. They don’t know, they don’t understand.

Lex Fridman (02:13:11) How are you doing?

Elon Musk (02:13:14) I’m overall okay. In the grand scheme of things, I can’t complain.

Lex Fridman (02:13:21) Do you get lonely?

Elon Musk (02:13:24) Sometimes, but my kids and friends keep me company.

Lex Fridman (02:13:33) So, not existential.

Elon Musk (02:13:36) There are many nights I sleep alone. I don’t have to, but I do.

Lex Fridman (02:13:46) Walter Isaacson, in his new biography of you, wrote about your difficult childhood. Will you ever find forgiveness in your heart for everything that has happened to you in that period of your life?

Elon Musk (02:14:01) What is forgiveness? At least I don’t think I have a resentment, so nothing to forgive.

Lex Fridman (02:14:20) Forgiveness is difficult for people. It seems like you don’t harbor the resentment.

Elon Musk (02:14:28) I mean, I try to think about, what is going to affect the future in a good way? And holding onto grudges does not affect the future in a good way.

Lex Fridman (02:14:41) You’re a father, a proud father. What have you learned about life from your kids? Those little biological organisms.

Elon Musk (02:14:53) I mean, developing AI and watching, say, little X grow is fascinating because there are far more parallels than I would’ve expected. I mean, I can see his biological neural net making more and more sense of the world. And I can see the digital neural net making more and more sense of the world at the same time.

Lex Fridman (02:15:19) Do you see the beauty and magic in both?

Elon Musk (02:15:21) Yes. I mean, one of the things with kids is that you see the world anew in their eyes. To them, everything is new and fresh. And then, when you see that, them experiencing the world as new and fresh, you do too.

Lex Fridman (02:15:52) Well, Elon, I just want to say thank you for your kindness to me and friendship over the years, for seeing something in a silly kid like me, as you’ve done for many others. And thank you for having hope for a positive future for humanity, and for working your ass off to make it happen. Thank you, Elon.

Lex Fridman (02:16:13) Thank you for listening to this conversation with Elon Musk. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words that Walter Isaacson wrote about the central philosophy of how Elon approaches difficult problems: “The only rules are the ones dictated by the laws of physics.” Thank you for listening, and hope to see you next time.

Neri Oxman:生物学、艺术与科学,以及与自然结合的设计与工程 (2023-09-01)

Neri Oxman: Biology, Art, and Science of Design & Engineering with Nature (2023-09-01)

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:资深设计师、科学家及创业家 Neri Oxman 在其新公司 OXMAN 创立之际,接受 Lex Fridman 专访,系统阐述其旨在颠覆传统制造业,融合自然智慧与前沿科技的革命性设计哲学与商业愿景。

  • 核心论点:本次对话的核心论点是:人类创造物(Anthropomass)的总质量已历史性地超越地球生物总量(Biomass),这标志着工业革命以来人与自然的分裂达到了临界点。为应对这一生存危机,我们必须从根本上转变创造范式,从“建造”(Building)转向“生长”(Growing)。 Oxman 提出的“材料生态学”(Material Ecology)并非简单的使用生物材料,而是通过计算设计、机器人技术和合成生物学等工具,构建一个全新的技术-生物协同体系。在这个体系中,技术不再是主宰自然的工具,而是作为一种“计算模板”(Computational Template)或赋能接口,去引导、增强甚至“赋予”自然系统前所未有的能动性(Agency),最终实现人造物与生态圈的无缝融合与共生,让一个有汽车、有建筑的世界,可能比没有它们的世界对自然更有益。

2. 🧠 深度观点解析 (Deep Dive Analysis)

维度一:危机根源与范式重构——从“人造物”到“共生体”

  • 核心观点:2020年,人类制造物的总质量首次超过全球生物总量,这是一个根本性的失衡警报。我们必须停止生产孤立于生态循环之外的“死物”,转而创造能够参与生命轮回的“活物”。Oxman 的终极目标是**“生长万物”(Grow Everything)**。

  • 原理解构:这一观点挑战了整个工业革命以来的线性生产模式(开采-制造-丢弃)。Oxman 设想的“生长”模式是完全循环的。它始于碳捕获(如利用二氧化碳、甲烷),通过生物过程(如细菌发酵)生成生物聚合物,再由机器人技术塑造成产品(如鞋子、建筑构件)。当产品生命周期结束时,它能完全生物降解,回归土壤,甚至成为滋养新生命的养分(例如,分解后能长出可食用的果实)。这是一个从摇篮到摇篮再到新生的完整闭环,从根本上消除了“废物”的概念。

  • 证据/案例

    • 数据:引用以色列魏茨曼科学研究所 Ron Milo 教授的研究,指出 2020 年是人造物质量(Anthropomass)超越生物质量(Biomass)的交叉点
    • 公司愿景:OXMAN 公司的一个核心项目,就是开发一款从 CO2 开始,最终在使用后能降解并长出橄榄树的产品,实现从碳到果实的完整生命周期。
    • 历史项目:在 SF MoMa 展出的 Aguahoja 展馆,由虾壳、苹果皮等生物废料构成,展后在屋顶上自然降解,验证了大型结构体回归自然的潜力。

维度二:方法论创新——“计算模板”(Computational Templating)

  • 核心观点:我们无法、也不应微观地控制每一个生物建造的细节。取而代之,我们应该设计一种“模板”,为生物体(Hero Organisms)创造环境和规则,让它们在引导下完成复杂的建造工作,这是一种人与自然的“二重奏”

  • 原理解构:这是一种分工协作的模式。

    1. 人类/机器负责宏观:通过机器人臂架设物理骨架(几何模板),或调节光、热(环境模板),或释放信息素(化学模板)。
    2. 生物负责微观:利用生物体自身精密的、低能耗的建造能力进行高分辨率的材料沉积和结构编织。 这种方法绕开了纯机械制造在复杂性和材料效率上的瓶颈,转而利用亿万年进化出的生物智慧。
  • 证据/案例

    • 丝绸展馆 (Silk Pavilion):研究团队并未试图 3D 打印丝绸,而是用一个机器人臂构建了一个基础的脚手架,并通过调节局部光热环境,成功引导 17,532 只蚕 在这个模板上协同吐丝,最终“生长”出一个六米高的穹顶。
    • 合成蜂房 (Synthetic Apiary):通过创造一个“永恒的春天”环境,团队不仅帮助蜜蜂度过冬天,还通过“机器人女王蜂”释放信息素来引导蜜蜂的筑巢行为,探索了在没有蜂王的情况下组织蜂群的可能性。

维度三:终极目标——从“引导”到“赋能”与“涌现”

  • 核心观点:真正的突破并非完美地控制自然,而是创造条件,让生物系统获得新的能力,展现出乎意料的“涌现”(Emergence)行为,甚至拥有自主决策的“能动性”(Agency)。

  • 原理解构:Oxman 引用了一个关于“赋能”的数学定义:一个智能体被赋能,是指其所有可能达到的未来状态的熵很高(选择空间大),但在做出一个具体行动后,其状态分布的熵很低(控制力强)。这一定义从信息论角度解释了“能动性”(本节末尾附有一个公式化示意)。她的团队的目标是从“模板化”的强引导,过渡到生物体可以“访问”甚至“改写”机器人代码的阶段,让自然系统可以利用技术工具自我优化和修复。

  • 证据/案例

    • 蚕的群体行为:蚕本身是高度“以自我为中心”的生物,没有社交协作能力。但在丝绸展馆项目中,由于机器人创造的微环境(如温度梯度),它们开始相互影响,最终织出的丝网密度分布呈现出一种**伪群体智能(pseudo-swarm intelligence)**的特征,这是典型的“涌现”。
    • 未来构想:“给自然装上 Neuralink”。如果自然界能接入人类创建的云计算、带宽和内存资源,植物或许能自主预警火灾,农作物能合作优化碳捕捉效率。
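
上述“赋能”定义可以写成信息论中的标准形式,即动作与未来状态之间的互信息(以下为基于该定义的公式化示意,并非访谈原文):

$$
\mathfrak{E} \;=\; \max_{p(a)} I(A; S') \;=\; \max_{p(a)} \big[\, H(S') - H(S' \mid A) \,\big]
$$

其中 $H(S')$ 是可达未来状态的熵(选择空间越大该项越高),$H(S' \mid A)$ 是给定具体动作后的条件熵(控制力越强该项越低);前者越高、后者越低,赋能就越大。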

维度四:破译自然语言——“大分子模型”(Large Molecule Models)

  • 核心观点:如果说人类世界运行在以语言为基础的大语言模型(LLM)之上,那么自然界则运行在一套以分子为基础的复杂“语言”之上。理解并利用这套语言,是实现真正人机-生物融合的关键。

  • 原理解构:Oxman 提出要构建“大分子模型”,即解码生物体间通过化学信号传递的信息。例如,青草被割时会释放绿叶挥发物(GLVs),这本质上是向同伴发出的“求救信号”。通过传感器和 AI 模型解析这些分子足迹,我们可以在宏观表象出现前(如作物枯萎)就洞悉其内部状态,从而实现前所未有的精准农业和生态管理(本节末尾附有一个极简的阈值检测示意)。

  • 证据/案例

    • 植物神经生物学:研究表明,植物没有神经系统,但有类似的信号传导机制,且其分子释放具有昼夜节律。例如,茉莉花在凌晨4点气味最盛,这背后是深刻的生物化学逻辑。
    • 功能化香水:OXMAN 正在开发的一种产品,它不仅是为人类嗅觉服务的,其分子本身可以作为一种“信息载体”,与周围的植物(如玫瑰园)互动,实现跨物种的“对话”或“激励”。
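
针对“在宏观表象出现前解析分子足迹”这一思路,下面给出一个极简的示意性草稿(Python;其中的传感器数据、阈值与函数名均为假设,并非 OXMAN 的实际系统):

```python
import statistics

def detect_stress(glv_readings, baseline_window=24, spike_factor=3.0, sustain_hours=3):
    """从按小时采样的 GLV(绿叶挥发物)读数中标记植物的'求救信号'。

    当读数连续 sustain_hours 小时高于基线中位数的 spike_factor 倍时触发预警,
    返回触发预警的小时索引;否则返回 None。
    """
    if len(glv_readings) <= baseline_window:
        return None
    baseline = statistics.median(glv_readings[:baseline_window])
    streak = 0
    for i, value in enumerate(glv_readings[baseline_window:], start=baseline_window):
        streak = streak + 1 if value > spike_factor * baseline else 0
        if streak >= sustain_hours:
            return i
    return None

# 玩具示例:前 24 小时为平稳基线,随后出现持续的信号峰值(如割草/损伤之后)。
readings = [1.0] * 24 + [1.2, 4.0, 4.5, 5.0, 4.8]
print(detect_stress(readings))  # -> 27(连续第三个高于阈值的小时)
```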

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识

    • 超越“可持续性”:主流的可持续发展聚焦于“减少危害”(less bad)。Oxman 的理念是“创造增益”(more good)。她大胆设想:一个正确设计的城市,其存在对自然的贡献可能超越一片原始森林。一辆“生长”出来的汽车,在其生命周期中可能是净固碳的。
    • 失控即是终极控制:传统工程学追求绝对的精确和可预测性。Oxman 认为,设计的最高境界是创造一个高度受控的系统,以允许和促进“失控”——即涌现和创造力。这与 AGI 安全领域“控制对齐”的目标形成了鲜明对比,她直言:“我们不希望在 AGI 身上发生的事,正是我们希望在合成生物学中实现的。”
  • 盲点与局限

    • 警惕“英雄生物”崇拜:Oxman 指出,当前生物材料领域过度依赖少数几种“英雄生物”(如菌丝体、E.coli),这是一种思维惰性。她提倡一种“交响乐”式的、多物种协同的策略,认为不同生物各有所长。
    • 戳破“生物基”泡沫:她批判了许多所谓的“生物衍生”产品背后的巨大生态代价,例如,为了制造 5 毫升玫瑰精油而消耗 10,000 株玫瑰。这提醒我们必须进行全生命周期的系统性思考,而非简单的标签化。
  • 未解之谜

    • “自然想要什么?”:这是贯穿对话的根本问题。即使我们能为自然提供强大的工具,我们依然不完全清楚自然系统的终极目标是信息最大化、熵最小化,还是其他我们无法理解的维度。
    • 资本主义的时间尺度悖论:自然演化的时间尺度以百年、千年计,而资本市场的逻辑却是季度和年度。如何在一个追求快速回报的商业环境中,运营一个与自然同步的、长周期的“生长”型公司,是一个巨大的现实挑战。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “Is it possible that a world in which you build buildings in cities, that those buildings in cities actually augment and heal nature as opposed to their absence?”

    • 中文意译:“有没有可能,一个拥有建筑和城市的世界,这些建筑和城市本身实际上能够增强和治愈自然,而不是因为它们的存在而破坏自然?”
    • 语境:阐述其超越“减少危害”的可持续发展观,提出人造物可以为生态带来净正向收益的革命性理念。
  2. “So if you can predict it, it doesn’t count as emergence, actually.”

    • 中文意译:“所以,如果你能预测它,那它其实就算不上‘涌现’了。”
    • 语境:在讨论如何设计能够产生创新和惊喜的系统时,指出了“涌现”的本质在于其不可预测性,这是对传统工程控制论的深刻反思。
  3. “What we don’t want to happen with AGI, we want to happen with synthetic biology.”

    • 中文意译:“我们不希望在人工智能(AGI)上发生的事情,恰恰是我们希望在合成生物学中实现的。”
    • 语境:将 AGI 的“对齐问题”(防止失控)与生物设计的“涌现目标”(鼓励自主)进行对比,犀利地揭示了她希望赋予生物系统能动性的终极野心。
  4. “I think of a flaw as an increased surface area.”

    • 中文意译:“我把瑕疵看作是增加的表面积。”
    • 语境:在讨论美、爱与不完美时,她用一个物理学的比喻来形容缺陷的价值——正是这些不完美的“表面积”让我们得以连接、脆弱并建立社群。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)

    • 技术栈融合:将出现更多结合机器人学、合成生物学、AI的全栈公司。传统的材料科学、制造业和软件业的界限将进一步模糊。
    • 新研发平台:Oxman 描述的“生态胶囊”(Ecology in a box)——即可编程控制环境、又能进行基因调控的实验舱,将成为生物制造领域的新型研发基础设施。
    • 产品形态:“混合生命材料”(Hybrid Living Materials, HLMs)将成为一个重要的产品类别。以生物降解和生态循环为核心卖点的消费品(如功能化香水、可穿戴设备)将进入市场。
  • 长期终局 (5-10年)

    • 产业大融合:制造业、建筑业和农业可能最终融为一体。我们将不再“建造”城市,而是“培育”城市,建筑将成为活的、能够固碳、调节微气候的生态系统。
    • 新经济模型:产品的价值评估体系将被重塑,从单纯的功能和制造成本,转向包含其全生命周期能量和生态影响的综合价值。经济增长将与生态再生正相关。
    • 人与自然的关系:如果 Oxman 的设想成真,我们将进入一个后人类中心主义时代。人类将从自然的“主人”或“管理者”,转变为生态网络中的“协同进化者”和“赋能者”。AGI 的强大算力可能被用于理解和增强全球生物圈的“智慧”。
  • 行动建议

    • 开发者:积极拥抱跨学科知识。软件工程师应学习基础的生物学和化学,材料科学家应掌握计算设计和机器人学。最大的创新机会存在于学科的交叉裂缝中。
    • 投资者:关注那些致力于构建平台级技术(如生物计算平台、自动化生长系统)的公司,而不仅仅是单一“英雄生物”的应用。这是一个长周期、高风险但可能带来范式级回报的赛道。
    • 创业者:思考如何为这个新兴的“生长经济”提供工具和服务。机会不仅在于最终产品,更在于创造“给自然用的iPhone”、设计“植物间的游戏引擎”,以及建立衡量生态价值的新标准。

这是一份基于资深科技评论家视角,对 Neri Oxman 与 Lex Fridman 对话深度重构的“行业研报”。


深度研报:从“组装”到“生长”——后工业时代的物质生态学革命

1. 🎯 核心论题与背景 (Executive Summary)

  • 对话背景:Neri Oxman,前 MIT 介质物质组(Mediated Matter Group)负责人、Oxman 创始人,是一位横跨生物学、计算科学与建筑学的多栖专家。本次对话发生在全球人造物质量(Anthropomass)历史性超越生物量(Biomass)的转折点,探讨人类设计范式如何从“掠夺式建造”转向“共生式生长”。
  • 核心论点:对话旨在重构人类与自然的技术契约。Oxman 提出,当代设计的终极目标是消除“人造”与“天成”的二元对立,通过物质生态学(Material Ecology),将计算、机器人技术与合成生物学结合,使产品具备生物的生长属性。这不仅是材料的革新,更是一种**“自然的 Neuralink”**——通过赋予自然界高带宽的数字接口,让生态系统具备抗衡人类破坏的“能动性(Agency)”。

2. 🧠 深度观点解析 (Deep Dive Analysis)

I. 范式转移:从“人类质量”回归“生物量”

  • 核心观点:2020年是人类文明的分水岭,人造物重量首次超过全球生物量。
  • 原理解构:Oxman 认为,工业革命以来的设计逻辑是“自上而下”的机械组装,依赖不可再生资源与毒性工艺。她主张通过**大分子模型(Large Molecule Models)**理解生物语言,将碳循环从简单的回收(Recycling)提升为“转世”(Reincarnation)——即产品(如鞋子)从 CO2 中生长,并在使用后回归土壤长成果实。
  • 证据/案例:引用 Ron Milo 教授关于 Anthropomass vs. Biomass 的数据研究;Oxman 公司研发的从废弃物到生物可降解纺织品的闭环系统。

II. 计算模板化与“英雄生物”(Hero Organisms)

  • 核心观点:设计不再是定义最终形状,而是定义生长环境。
  • 原理解构:通过计算模板(Computational Templating),利用机器人(如 KUKA 机械臂)调节环境因子(光、热、外激素),诱导生物体(蚕、蜜蜂、细菌)按照预设路径进行生产。在这种模式下,生物体被视为“英雄生物”,它们拥有生产的 agency,而人类角色从“生产者”转变为“牧羊人”或“园丁”。
  • 证据/案例丝绸展亭(Silk Pavilion):利用17532只蚕在机器人引导下织就6米高的建筑;合成蜂巢(Synthetic Apiary):在室内通过环境控制模拟永恒的春天,实现蜜蜂的跨季繁衍。

III. 赋予自然的 Neuralink:高带宽生态接口

  • 核心观点:如果自然界能接入云端,生态系统将获得自我保护的智慧。
  • 原理解构:Oxman 指出,人类通信带宽在过去几十年增长了万亿倍,但自然界仍停留在缓慢的基因进化层面。通过建立数字接口(类似 Neuralink),可以实时捕获植物分泌的绿叶挥发物(GLVs)。通过机器学习解析这些分子信号,自然界可以“提前告知”灾害(如山火),甚至在机器人干预下优化光合作用效率。
  • 证据/案例:**“功能化香氛”**的概念:一种不仅为人服务,更能与花园互动的信号分子载体,作为植物与人沟通的中介。

IV. 赋能 vs. 涌现:权力的让渡

  • 核心观点:设计的美感源于代理权的交接(Agency as Beauty)。
  • 原理解构:Oxman 借用数学逻辑定义“赋能”:当一个智能体拥有极高的潜在状态熵(多种选择),但在特定决策下能降低单状态熵(坚定执行),它便拥有了代理权。设计的目标是创造高度受控的系统,以支持“失去控制”的涌现(Emergence)。
  • 证据/案例Vespers(冥界面具):利用 E.coli 细菌在3D打印的微流控通道中生长,色彩分布并非人为指定,而是化学信号与细菌生长的自然产物。

3. 💡 反直觉与批判性视角 (Counter-Intuitive & Critical Perspectives)

  • 打破共识(关于 AGI):在主流对 AGI 毁灭人类的恐惧中,Oxman 提供了一个反直觉视角:AGI 可能是人类文明向自然界“交棒”的工具。如果 AGI 能压缩人类文明的所有智慧并“上传”给生态系统(如让一棵柳树拥有千年人类知识),那么“人类消亡”可能意味着一种更宏大、更持久的生物智能形式的开启。
  • 盲点与局限(资本主义的时间尺度):Oxman 坦承其设计愿景与当代资本主义的“即时回报”存在剧烈冲突。她追求的是以“千年”为尺度的产品(如红杉木结构),而市场要求的是“季度报表”。这种时间维度的错位是该技术商业化的最大阻碍。
  • 未解之谜:关于**“植物意识”**的定性。尽管能捕获分子信号,但如何证明植物具备主观体验而非简单的化学反馈逻辑,仍是科学与哲学的交叉难题。

4. 💎 金句与高光时刻 (Golden Quotes)

  1. “A flaw is an increased surface area—physically or emotionally. It allows for connection, like mortar between bricks.” (瑕疵是物理或情感上增加的表面积,它允许连接的发生,如同砖块间的砂浆。) ——语境:探讨完美与美的关系,认为脆弱性是人类与设计建立连接的纽带。

  2. “Design is the ability to provide nature with a choice.” (设计是赋予自然做出选择的能力。) ——语境:讨论赋能(Empowerment)与软控制。

  3. “The things that are least important for our survival are the things that make us human.” (那些对生存最无用的事物,恰恰是定义我们为人的核心。) ——语境:引用电影《粒子狂热》,强调艺术、爱与纯粹好奇心的价值。

5. 🚀 行业启示与未来推演 (Implications & Outlook)

  • 短期影响 (1-3年)
    • 材料脱碳:合成生物学支撑的纺织、建筑材料将进入高端市场。
    • 分子农业:精准农业将从单纯的无人机视觉监控,转向基于植物分子信号的实时预判(从“看”到“听”)。
  • 长期终局 (5-10年)
    • 工厂消亡:生产力将分布在分布式的“生长胶囊”中,城市建筑将具备自我修复与碳捕集功能。
    • 跨物种通信:建立起初步的跨物种协议,人类不再是地球上唯一的“设计者”,而是生态合唱团的一员。
  • 行动建议
    • 对开发者:关注“生物计算”接口。未来的 API 可能不是 JSON 数据,而是化学分子序列。
    • 对投资者:寻找那些挑战传统工业组装链(Assembly line)的项目,转向端到端的生物合成(End-to-end growth)。
    • 对创业者:Oxman 的核心秘诀在于**“学科间的二次导数”**——寻找在计算设计、机器人、材料科学、合成生物学这四个圆环交集处的机会,单一维度的创新已不足以应对生态危机。

深度研报:再生未来与材料生态的兴起

(Deep Research Report: The Rise of Regenerative Futures and Material Ecology)

1. 🎯 核心论题与背景

对话背景: 本次对话由 Lex Fridman 主持,嘉宾是被誉为“建筑学与合成生物学交汇领域女王”的 Neri Oxman。作为 MIT Mediated Matter 组的前负责人及现任同名公司 CEO,Oxman 正在主导一场将工程、计算设计与合成生物学融合的实验。对话发生在人类制造的物质总量(Anthropomass)首次超过自然界生物质量之后,旨在探索如何逆转这种失衡,利用技术将产品制造转为生物培育。

核心论点: Neri Oxman 提出的核心概念是 “Material Ecology”(材料生态学),即设计不应仅是对自然的掠夺或对立,而应是对自然的延伸与共生。

她主张重新定义制造的道德与逻辑:与其“制造”产品(利用资源、产生熵增和废物),不如“培育”产品(**Nature grown**)。通过模仿自然界的“智慧”而非仅仅是“智能”,利用合成生物学和机器人技术造出一个“iPhone for Nature”(自然界的 iPhone),赋予环境计算带宽,从而实现从 CO₂ 到可食用植物的能量闭环。其终极目标是将人类文明建立在一种类似于神经网络的、有机的共生关系之上,消除“建造”与“生长”的界限。


2. 🧠 深度观点解析

1. Anthropomass vs. Biomass:物质失衡的元认知

  • 核心观点:人类创造的物质(Anthropomass,包括塑料、混凝土、电子产品)已于 2020 年左右超过地球生物总量,这是设计伦理的临界点。
  • 原理解构:这种失衡源于工业设计的时间维度——人类追求即时反馈与产量,而自然聚焦于漫长的生物进化。未来的设计范式必须从“线性消耗”转向“圆形再生”。
  • 证据/案例:Oxman 引用了 Weizmann 研究所 Ron Milo 的定义。对话中提出了“图灵测试”式的设问:如果今天用自然培育技术制作一部 iPhone 或一辆车,使其能降解并反哺土壤,这个世界将是怎样的?

2. 从“控制导向”到“涌现与赋能”

  • 核心观点:不再试图精确控制每一个分子行为,而是设计环境以激发生物体的自我组织能力(Emergence),从而实现真正的共生。
  • 原理解构:通过 Computational Templating(计算模板)(包括几何、环境如光照/温度、化学如信息素)来引导生命体。利用“赋能”的数学定义(可达未来状态的熵高、给定具体动作后的状态熵低)给予生物体自主权。
  • 证据/案例
    • Silk Pavilion 项目中,通过改变台面温度设置,促使原本孤立的蚕(Silkworm,一种极度利己的生物)群体以城市化的方式合作筑巢。
    • Synthetic Apiary 项目中,利用受控环境(恒定适温)使蜜蜂在没有自然蜂后与自然季节温度的条件下持续繁殖;相关蜂群还被派遣到太空(Blue Origin 任务),并成功存活。

3. “自然界的 iPhone”:开放生物计算带宽

  • 核心观点:将人类掌握的高算力与带宽(自第一台计算机以来,带宽增长约 26.5 trillion 倍、存储增长约 11.5 quintillion 倍)通过接口赋能自然界,使植物能通过传感器感知二氧化碳浓度、动物能预警火焰,从而让自然拥有“决策权”。
  • 原理解构:这被称为 “Neuralink for Nature”。通过开发一个通用语言模型来解析五大生命王国的生物分子语言,让自然不仅能生存,还能基于这套信息基础设施优化自身功能(如光合作用加速率)。
  • 证据/案例:公司正在研发的 “功能化香水”,通过分子信号模拟特定环境(如清晨的气压),引导植物开花以增加香气分子的分泌,从而实现人与植物的情感连接与经济价值双赢。

4. 混合生物材料与合成生物学的乐观主义

  • 核心观点:应采用“共生演化”策略,利用细菌迅速生产材料,同时保护被驯化的高智慧生物(如蜜蜂、蚕),仅对细菌进行工程化改造。
  • 原理解构:区别对待不同生命形式的伦理。对于单细胞生物(如 E. coli),利用 Directed Evolution(定向进化) 和高吞吐量筛选;对于高等生物,仅做辅助性的环境设计。
  • 证据/案例
    • **Vespers(晚祷)**项目:利用合成细菌在 3D 打印的聚合物结构上生长出复杂的肤色纹理,模拟贝多芬或阿伽门农的死亡面具,实现了“死者的生物记忆”。
    • 以毒攻毒的智慧:研究证明蜜蜂其实是回收蜡的,这颠覆了传统认知,展示了通过技术揭示自然真相的可能性。

5. 时间尺度与脆弱性的哲学

  • 核心观点:现代文明的商业逻辑(季度财报)与自然的时间尺度(万年红杉、昆虫世代)存在根本性断层;同时,缺陷(脆弱性)是连接与文明的基石。
  • 原理解构Flaws as Surface Area。缺陷增加了表面积,允许“灰浆”(连接/爱/意义)的介入,从而构建文明。
  • 证据/案例:Oxman 反思圣诞树的短暂使用与传统农业对自然时间的无知。她引用 Buckminster Fuller 的名言:“若最终产出的东西不美,说明我解决问题的过程出了错。”(Beauty as Agency)。

3. 💡 反直觉与批判性视角

1. 对繁荣的质疑:繁荣是否就是另一种形式的熵增?

  • 打破共识:主流叙事认为数字化和智能化是解决熵增的方案,但 Oxman 认为,AI 本质上是旧的创造物的重组(Self-recursive),如果缺乏与自然维度的结合,可能只是加剧人类中心的孤独,而非带来智慧。
  • 盲点与局限:**“Agent”(代理权)**的双刃剑问题。如果我们给自然提供了“iPhone”级别的带宽,这是在帮助它还是强制它按照人类的“Objective Function”(目标函数)生存?例如,是否为了碳封存而强迫植物违背其天性生长?

2. AGI 作为暴力的工具还是救赎的媒介?

  • 未解之谜:面对 Eliezer Yudkowsky 关于 AGI 极大概率灭绝人类的论断,Oxman 展现了罕见的哲学平衡。她并不完全乐观也不完全悲观,而是提出了一种可能:人类 Agency 的终结,可能是自然界 Agency 的开端。AGI 是否擅长记忆和整合,却在“爱”与“连接”上天然匮乏?

3. 对 “Product” 概念的解构

  • 批判性思考:Oxman 质疑传统消费品设计的自私性(如香水需要压碎 10,000 棵玫瑰才得 5mL 香精)。这种 “系统视角” 要求设计师计算电子产品的全生命周期能源足迹(Ecomaterials),这极大地提高了进入门槛,可能使得小型初创企业难以单纯通过材料创新生存,除非像她一样依靠庞大的资本和跨学科团队。

4. 💎 金句与高光时刻

  1. “Nature is everything that isn’t anthropomass.”
    • 语境:Oxman 重新定义了自然与人工物质的关系,将非人类制造的物质统称为自然。
  2. “If you can predict it, it doesn’t count as emergence.”
    • 语境:讨论计算机模板控制生物体过程中的艺术性,强调不可控的创新价值。
  3. “Empowerment is a force with direction… Emergence is multi-directional.”
    • 语境:解释设计哲学的转变——从强制生物做某事,到创造条件让生物自然涌现,虽看似失控(多方向),却保留了方向感(赋能)。
  4. “Flaws are an increased surface area… it’s like you have more surface area to use mortar and build a home.”
    • 语境:充满诗意的工程师逻辑,将缺陷美化为建立社会连接(沟通、爱)的物理基础。
  5. “We don’t view ourselves as designers of consumable products… but we love that moment where these technologies reveal new science.”
    • 语境:强调 Art/Science 的合一,技术不仅是工具,更是科学探索的媒介。

5. 🚀 行业启示与未来推演

短期影响 (1-3年)

  • 材料技术开发:类似 Hybrid Living Materials (HLMs) 的生物打印技术将进入消费级测试阶段。具备特定功能(如抗菌、变色)的生物基质将在医疗、时尚领域获得先行者优势。
  • 极端环境农业:Oxman 提到的**“进化之箱”**可能在温室产业用于定向筛选作物,适应旧气候或未来气候(如模拟 1981 年俄亥俄州口味)。

长期终局 (5-10年)

  • 合成制造的终局:出现完全由细菌或真菌“生长”的消费品,彻底消灭人类的组装线。鞋子、家具可能直接从粉状菌丝体长成实体。
  • 生物互联网:植物可能拥有类似嗅觉神经元的数字化接口,农民不再是观察者,而是与植物进行分子层面的谈判者(例如:“你需要再多一点的氮,我会为你提供光照”)。
  • 碳经济的重定义:所有的建筑和材料将被视为“碳储备银行”,产品在使用后不需要焚烧,而是直接归还土壤转化为养分。

行动建议

  • 对于创业者/投资者
    • 抛弃“单一专才”思维,寻找跨学科人才(拥有文学学位的生物工程师)。
    • 敢于定义“无法预测的未来”,投资于设计和环境研究,而不仅仅是短期工程实现。
  • 对于设计师
    • 从“造型师”转变为“环境设计师”。未来的设计核心不再是对材料的外观塑形,而是对生长环境和生长逻辑的编码。
    • 将“完美”视为平庸的敌人,拥抱缺陷作为建立连接的工具。

逐字稿

Introduction

Neri Oxman (00:00:00) Whenever we start a new project, it has to have these ingredients of simultaneous complexity. It has to be novel in terms of the synthetic biology, material science, robotics, engineering, all of these elements that are discipline based or rooted must be novel. If you can combine novelty in synthetic biology with a novelty in robotics, with a novelty in material science, with a novelty in computational design, you are bound to create something novel.

Lex Fridman (00:00:30) The following is a conversation with Neri Oxman, an engineer, scientist, designer, architect, artist, and one of the kindest, most thoughtful and brilliant human beings I’ve ever gotten to know. For a long time, she led the mediated matter group at MIT that did research and built incredible stuff at the intersection of computational design, digital fabrication, material science, and synthetic biology, doing so at all scales from the microscale to the building scale. Now she’s continuing this work at a very new company for now called Oxman, looking to revolutionize how humans design and build products working with nature, not against it.

(00:01:13) On a personal note, let me say that Neri has for a long time been a friend and someone who, in my darker moments, has always been there with a note of kindness and support. I am forever grateful to her. She’s a brilliant and a beautiful human being. Oh, and she also brought me a present, War and Peace by Tolstoy and Meditations by Marcus Aurelius. It doesn’t get better than that. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here’s Neri Oxman. Let’s start with the universe. Do you ever think of the universe as a kind of machine that designs beautiful things at multiple scales?

Biomass vs anthropomass

Neri Oxman (00:01:56) I do. And I think of nature in that way in general. In the context of design, specifically, I think of nature as everything that isn’t anthropomass, everything that is not produced by humankind, the birds and the rocks and everything in between, fungi, elephants, whales.

Lex Fridman (00:02:19) Do you think there are intricate ways in which there’s a connection between humans and nature?

Neri Oxman (00:02:24) Yes, and we’re looking for it. I think that let’s say from the beginning of mankind going back 200,000 years, the products that we have designed have separated us from nature. And it’s ironic that the things that we designed and produced as humankind, those are exactly the things that separated us. Before that we were totally and completely connected, and I want to return to that world.

Lex Fridman (00:02:54) But bring the tools of engineering and computation to it.

Neri Oxman (00:02:57) Yes. Yes. I absolutely believe that there is so much to nature that we still have not leveraged, and we still have not understood and we still haven’t. And so much of our work is designed, but a lot of it is science is unveiling and finding new truths about the natural world that we were not aware before. Everybody talks about intelligence these days, but I like to think that nature has kind of wisdom that exists beyond intelligence or above intelligence, and it’s that wisdom that we’re trying to tap into through technology. If you think about humans versus nature, at least in the realm, at least in the context of definition of nature, is everything, but anthropomass.

(00:03:49) And I’m using Ron Milo, who is an incredible professor from the Weizmann Institute who came up with this definition of anthropomass in 2020, when he identified that 2020 was the crossover year when anthropomass exceeded biomass on the planet. So all of the design goods that we have created and brought into the world now outweigh all of the biomass, including of course, all plastics and wearables, buildings, cities, but also asphalt and concrete, all outweigh the scale of the biomass. And actually that was a moment. You know how in life there are moments, maybe a handful of moments, that get you to course correct. And it was a Zoom conversation with Ron, and that was a moment for me when I realized that that imbalance, now we’ve superseded the biomass on the planet, where do we go from here?

(00:04:50) And you’ve heard the expression more phones than bones and the anthropomass and the anthropocene and the technosphere sort of outweighing the biosphere. But now we are really trying to look at is there a way in which all things technosphere are designed as if they’re part of the biosphere? Meaning if you could today grow instead of build everything and anything, if you could grow an iPhone, if you could grow a car, what would that world look like? Where the Turing test for, I call this material ecology approach, but this notion that everything material, everything that you design in the physical universe can be read and written to as or thought of or perceived of as nature grown.

(00:05:46) That’s sort of the Turing test for the company or at least that’s how I started. I thought, well grow everything. That’s sort of the slogan. Let’s grow everything. And if we grow everything, is there a world in which driving a car is better for nature than a world in which there are no cars? Is it possible that a world in which you build buildings in cities, that those buildings in cities actually augment and heal nature as opposed to their absence? Is there a world in which we now go back to that kind of synergy between nature and humans where you cannot separate between grown and made? And it doesn’t even matter.

Lex Fridman (00:06:36) Is there a good term for the intersection between biomass and anthropomass, things that are grown?

Neri Oxman (00:06:36) Yeah. So in 2005 I called this material ecology. I thought, what if all things materials would be considered part of the ecology and would have a positive impact on the ecology where we work together to help each other? All things nature, all things human. And again, you can say that that wisdom in nature exists in fungi. Many mushroom lovers always contest my thesis here saying, “Well, we have the mushroom network and we have the mother trees and they’re all connected, and why don’t we just simply hack into mushrooms?” Well, first of all, yes, they’re connected, but that network stops when there is a physical gap. That network does not necessarily enable the whales in the Dominican to connect with an olive tree in Israel to connect with a weeping willow in Montana.

(00:07:28) And that’s sort of a world that I’m dreaming about. What does it mean for nature to have access to the cloud? The kind of bandwidth that we’re talking about, sort of think Neuralink for nature. Since the first computer, and you know this by heart probably better than I do, but we’re both MIT lifers. We today have computational power that is one trillion times the power that we had in those times. We have 26.5 trillion times the bandwidth and 11.5 quintillion times the memory, which is incredible. So humankind since the first computer has approached and accessed such incredible bandwidth, and we’re asking, what if nature had that bandwidth? So beyond genes and evolution, if there was a way to augment nature and allow it access to the world of bits, what does nature look like now? And can nature make decisions for herself as opposed to being guided and guarded and abused by humankind?

Lex Fridman (00:08:45) So nature has this inherent wisdom that you spoke to, but you’re also referring to augmenting that inherent wisdom with something like a large language model.

Lex Fridman (00:08:56) So compress human knowledge, but also maintain whatever is that intricate wisdom that allows plants, bacteria, fungi to grow incredible things at arbitrary scales, adapting to whatever environment and just surviving and thriving no matter where, no matter how.

Neri Oxman (00:09:14) Exactly. So I think of it as large molecule models and those large molecule models, of course, large language models are based on Google and search engines and so on and so forth. And we don’t have this data currently. And the part of our mission is to do just that, trying to quantify and understand the language that exists across all kingdoms of life, across all five kingdoms of life. And if we can understand that language, is there a way for us to first make sense of it, find logic in it, and then generate certain computational tools that empower nature to build better crops, to increase the level of biodiversity? In the company we’re constantly asking, what does nature want? What does nature want from a compute view?

Lex Fridman (00:10:11) If it knew it, what could aid it in whatever the heck it’s wanting to do.

Neri Oxman (00:10:16) So we keep coming back to this answer of nature wants to increase information, but decrease entropy. So find order, but constantly increase the information scale. And this is true for what our work also tries to do because we’re constantly trying to fight against the dimensional mismatch between things made and things grown. And as designers, we are educated to think in X, Y, and Z and that’s pretty much where architectural education ends and biological education begins.

(00:10:51) So in reducing that dimensional mismatch, we’re missing out on opportunities to create things made as if grown. But in the natural environment, we’re asking, can we provide nature with these extra dimensions? And again, I’m not sure what nature wants, but I’m curious as to what happens when you provide these tools to the natural environments. Obviously with responsibility, obviously with control, obviously with ethics and moral code, but is there a world in which nature can help fix itself using those tools?

Lex Fridman (00:11:26) And by the way, we’re talking about a company called Oxman.

Neri Oxman (00:11:30) Yeah. Just a few words about the team.

Lex Fridman (00:11:33) Yeah. What kind of humans work at a place like this? They’re trying to figure out what nature wants.

Neri Oxman (00:11:37) I think they’re first like you, they’re humanists first. They come from different disciplines and different disciplinary backgrounds. And just as an example, we have a brilliant designer who is just a mathematical genius and a computer scientist and a mechanical engineer who is trained as a synthetic biologist. And now we’re hiring a microbiologist and a chemist, architects of course, and designers, roboticist. So really it’s arc, two of each.

Lex Fridman (00:12:13) And always dancing between this line of the artificial, the synthetic, and the real, what’s the term for it? And the natural

Neri Oxman (00:12:21) Yeah, the built and the grown nature and culture, technology and biology, but we’re constantly seeking to ask how can we build, design and deploy products in three scales? The molecular scale, which I briefly hinted to. And there in the molecular scale we’re really looking to understand whether there’s a universal language to nature and what that language is. And then build a tool that I think and dream of it is the iPhone for nature. If nature had an iPhone, what would that iPhone look like?

Lex Fridman (00:12:59) Does that mean creating an interface between nature and the computational tools we have?

Neri Oxman (00:13:07) Exactly. It goes back to that 11.5 quintillion times the bandwidth that humans have now arrived at, and giving that to nature and seeing what happens there can animals actually use this interface to know that they need to run away from fire? Can plants use this interface to increase the rate of photosynthesis in the presence of a smoke cloud? Can they do this quote-unqoute “automatically” without a kind of a top-down brute force policy-based method that’s authored and deployed by humans? And so this work really relates to that interface with the natural world. And then there’s a second area in the company which focuses on growing products. And here we’re focusing on a single product that starts from CO2. It becomes a product. It’s consumed, it’s used, it’s worn by a human, and then it goes back to the soil and it grows an edible fruit plant.

Lex Fridman (00:14:13) So we’re talking about from CO2 to fruit.

Neri Oxman (00:14:13) Yeah. It starts from CO2 and it ends with something that you can literally eat. So the world’s first entirely biodegradable, biocompatible, bio renewable product.

Neri Oxman (00:14:25) Yes, either using plant matter or using bacteria, but we are really looking at carbon recycling technologies that start with methane or wastewater and end with this wonderful reincarnation of a thing that doesn’t need to end up in a composting site, but can just be thrown into the ground and grow olive and find peace. And there’s a lot of textile based work out there that is focused on one single element in this long chain like, oh, let’s create leather out of mycelium, or let’s create textile out of cellulose, but then it stops there and you get to assembling the shoe or the wearable and you need a little bit of glue, and you need a little bit of this material and a little bit of that material to make it water resistant and then it’s over.

(00:15:16) The thing that we’re trying to solve for is how to create a product that is materially, computationally, robotically novel, and goes through all of these phases from the creation, from this carbon recycling technology, to the product, to literally, how do you think about reinventing an industry that is focused on assembly and putting things together and using humans to do that? Can that happen just using robots and microbes? And that’s it.

Lex Fridman (00:15:48) And doing it end to end. I would love to see what this factory looks like.

Neri Oxman (00:15:54) And the factory is great too. I’m very, very excited. In October we’ll share first renditions of some of this work and in February we’ll invite you to the lab.

Computational templates

Lex Fridman (00:16:05) I’m there. I’ve already applied. I haven’t heard back. I don’t understand. Okay. Just before we get to number three, it’d be amazing to just talk about what it takes with robotic arms or in general, the whole process of how to build a life form stuff you’ve done in the past, maybe stuff you’re doing now, how to use bacteria, this kind of synthetic biology, how to grow stuff by leveraging bacteria? Is there examples from the past and explain?

Neri Oxman (00:16:31) Yes. And just take a step back over the 10 years, the mediated matter group, which was my group at MIT, has sort of dedicated itself to bio-based design would be a suitcase word, but thinking about that synergy between nature and culture, biology and technology. And we attempted to build a suite of embodiments, let’s say that they ended up in amazing museums and amazing shows, and we wrote patents and papers on them, but they were still N of ones. Again, the challenge, as you say, was to grow them, and we classified them into fibers, cellular solids, biopolymers, pigments.

(00:17:13) And in each of the examples, although the material was different, sometimes we used fibers, sometimes we used silk with silkworms and honey or comb with bees as the structural material, and with Vespers we used synthetically engineered bacteria to produce pigments. Although the materials were different and the hero organisms were different, the philosophy was always the same. The approach was really an approach of computational templating. That templating allowed us to create templates for the natural environment where nature and technology could duet, could dance together to create these products.

(00:17:48) So just a few examples. With the silk pavilion, we’ve had a couple of pavilions made of silk, and the second one, which was the bigger one, which ended up at the Museum of Modern Art with my friend, an incredible mentor, Paola Antonelli, that pavilion was six meters tall and it was produced by silkworms. And there we had different types of templates. There were physical templates that were basically just these water-soluble meshes upon which the silkworms were spinning, and then there were environmental templates, which was a robot basically applying variation of environmental conditions such as heat and light to guide the movement of the silkworm.

Lex Fridman (00:18:29) You’re saying so many amazing things, and I’m trying not to interrupt you, but one of the things you’ve learned by observing, by doing science on these is that the environment defines the shape that they create or contributes or intricately plays with the shape they create. And that’s one of the ways you can get to guide their work is by defining that environment. By the way, you said hero organism, which is an epic term. That means whatever is the biological living system that’s doing the creation.

Neri Oxman (00:19:01) And that’s what’s happening in pharma and biomaterials and by the way, precision ag and new food design technologies as people are betting on a hero organism, is sort of how I think of it. And the hero organism is sometimes it’s the palm oil or it’s the mycelium. There’s a lot of mushrooms around for good and bad, and it’s cellulose or it’s fake bananas or the workhorse E. Coli. But these hero organisms are being betted on as the… What’s the one answer that solves everything hitchhiker’s guide?

Lex Fridman (00:19:42) Yeah. These are sort of the 42s of the enchanted new universe.

Neri Oxman (00:19:42) And back at MIT, we said, instead of betting on all of these organisms, let’s approach them almost as movements in a symphony, and let’s kind of lean into what we can learn from each of these organisms in the context of building a project at an architectural scale. And those usually were pavilions.

Lex Fridman (00:20:05) And then the computational templating is the way you guide the work of this. How many did you say? 17,000?

Neri Oxman (00:20:15) 17,532. So each of these silkworms threads are about one mile in distance, and they’re beautiful. And just thinking about the amount of material, it’s a bit like thinking about the length of capillary vessels that grow in your belly when you’re pregnant to feed that incredible new life form. Just nature is amazing. But back to the silkworms, I think I had three months to build this incredible pavilion, but we couldn’t figure out how. We were thinking of emulating the process of how a silkworm goes about building its incredible architecture. This cocoon over the period of 24 to 72 hours, and it builds a cocoon basically to protect itself.

(00:21:03) It’s a beautiful form of architecture, and it uses pretty much just two materials, two chemical compounds, sericin and fibroin. The sericin is sort of the glue of the cocoon, the fibroin is the fiber-based material of the cocoon, and so it builds through fibers and glue. And that’s true for so many systems in nature, lots of fiber and glue. And that architecture allows them to metamorphose. And in the process they vary the properties of that silk thread, so it’s stiffer or softer depending on where it is in the section of the cocoon. And so we were trying to emulate this robotically with a 3D printer that was on a six-axis KUKA arm, one of these baby KUKAs.

(00:21:46) And we were trying to emulate that process computationally and build something very large when one of my students, now a brilliant industrial engineer and roboticist on my team, Marcus, said, “Well, we were just playing with those silkworms and enjoying their presence when we realized that if they’re placed on a desk or a horizontal surface, they will go about creating their cocoon, only the cocoon would be flat, because they’re constantly looking for a vertical post in order to use that post as an anchor to spin the cocoon. But in the absence of that post, on surfaces that are less than 21 millimeters and flat, they will spin flat patches.” And we said, “Aha, let’s work with them to produce this dome as a set of flat patches.”

(00:22:42) And a silkworm, mind you, is quite an egocentric creature. And actually, the further you move forward in evolution by natural selection, the more egoism you find in creatures. So when you think about termites, their material sophistication is actually very primitive, but they have an incredible ability to communicate and connect with each other. So if you think about all of nature, let’s say all living systems, as a matrix that runs across two axes, one is material sophistication, which is terribly relevant for designers, and the other is communication. The termites ace communication, but their material sophistication is crap.

(00:23:31) It’s just saliva and feces and some soil particles that are built into these incredible termite mounds, at a scale that, when compared to human skyscrapers and relative to the size of the termite, transcends all buildable scales, at least in terms of what we have today in architectural practice. But when you look at the silkworm, the silkworm has zero connection and communication across silkworms. They were not designed to connect and communicate with each other. They’re sort of a human-designed species, because the domesticated silk moth creates the cocoon.

(00:24:08) We then produce the silk from it, and then it dies. So it has dysfunctional wings, it cannot fly. And that’s another problem that the sericulture industry has: why did we, in the first place, author this organism 4,000 years ago that is unable to fly and is just there to basically live to serve a human need, which is textiles? And so here we were fascinated by the computational-biology dimension of silkworms, but along the way… By the way, this is great. I never get to tell the full story. So great.

Lex Fridman (00:24:47) I’ve enjoyed this so much.

Neri Oxman (00:24:51) People say, “Oh, speak in [inaudible 00:24:54] paragraphs. They’re way too long.” And this is wonderful. This is like heaven.

Lex Fridman (00:24:58) [inaudible 00:24:58] paragraphs. You’re dropping so many good lines. I love it for that.

Neri Oxman (00:25:02) But really those silkworms, yes, they’re not designed to be like humans. They’re not designed to connect, communicate, and build things that are bigger than themselves through connection and communication.

Lex Fridman (00:25:17) So what happens when you add 17,000 of them communicating effectively?

Neri Oxman (00:25:17) That’s a really great question. What happens is that at some point, the templating strategies, and as you said correctly, there were geometrical, templating, material templating, environmental templating, chemical templating if you’re using pheromones to guide the movement of bees in the absence of a queen where you have a robotic queen.

Neri Oxman (00:25:39) But whenever you have these templating strategies, you have sort of control over nature, but the question is there a world in which we can move from templating, from providing these computational material and immaterial physical and molecular platforms that guide nature, almost guiding a product almost like a gardener to a problem or an opportunity of emergence where that biological organism assumes agency by virtue of accessing the robotic code and saying, now I own the code. I get to do what I want with this code. Let me show you what this pavilion may look like or this product may look like?

(00:26:18) And I think one of the exciting moments for us is when we realized that these robotic platforms that were designed initially as templates actually inspired, if I may, a kind of a collaboration and cooperation between silkworms that are not a swarm based organism. They’re not like the bees and the termites. They don’t work together and they don’t have social orders amongst them, the queen and the drones, et cetera. They’re all the same in a way. And here, what was so exciting for us is that these computational and fabrication technologies enable the silkworm to sort of hop from the branch in ecology of worms to the branch in ecology of maybe human-like intelligence where they could connect and communicate by virtue of feeling or rubbing against each other in an area that was hotter or colder.

(00:27:19) And so the product that we got at the end, the variation of density of fiber and the distribution of the fiber and the transparency, the product at the end seems like it was produced by a swarm-like silk community, but of course it wasn’t. It’s a bunch of biological agents working together to assemble this thing. That’s really, really fascinating to us. How can technology augment or enable swarm-like behavior in creatures that have not been designed to work as swarms?

Lex Fridman (00:27:53) So how do you construct a computational template from which a certain kind of thing emerges? How can you predict what emerges, I suppose?

Neri Oxman (00:28:05) So if you can predict it doesn’t count as emergence, actually.

Lex Fridman (00:28:12) That’s a deeply poetic line.

Neri Oxman (00:28:13) We can talk about it. It’s a bit exaggerated, doesn’t count. Speaking of emergence, an empowerment, because we’re constantly moving between those as if they’re equals on the team and one of them, Christopher shared with me a mathematically equation for what does it mean to empower nature and what does empowerment in nature look like? And that relates to emergence. And we can go back to emergence in a few moments, but I want to say it so that I know that I’ve learned it and if I’ve learned it I can use it later.

Lex Fridman (00:28:54) And maybe you’ll figure something out as you say it also.

Neri Oxman (00:28:57) Of course, Christopher is the master here, but really we were thinking again, what does nature want? Nature wants to increase the information dimension and reduce entropy. What do we want? We kind of want the same thing. We want more, but we want order. And this goes back to your conversation with Joscha about stochastic versus deterministic languages or processes. His definition or the definition he found was that an agent is empowered if the entropy of the distribution of all of its states it’s high while the entropy of the distribution of a single state given a choice, given an action is low. Meaning it’s that kind of duality between opportunity like starting like this and going like this, opening and closing. And this really, I think is analogous to human empowerment, given infinite wide array of choices. What is the choice that you make to enable, to empower, to provide you with the agency that you need?

Lex Fridman (00:30:19) And how much does that making that choice actually control the trajectory of the system? That’s really nice. So this applies to all the kinds of systems you’re talking about.

Neri Oxman (00:30:28) And the cool thing is it can apply to a human on an individual basis or a silkworm or a bee or a microbe that has agency or by virtue of a template, but it also applies to a community of organisms like the bees. And so we’ve done a lot of work sort of moving from, you’ve asked how to grow things. So we’ve grown things using co fabrication where we’re digitally fabricating with other organisms that live across the various kingdoms of life and those were silkworms and bees. And with bees, which we’ve sent to outer space and returned healthily and they were reproductive.

Lex Fridman (00:31:15) Okay, you’re going to have to tell that story. You’re going to have to talk about the robotic queen and the pheromones. Come on.

Neri Oxman (00:31:20) So we built what we called a synthetic apiary and the synthetic apiary was designed as an environment that was a perpetual spring environment for the bees of Massachusetts. They go on hibernation, of course, during the winter season, and then we lose 80% of them or more during that period. We’re thinking, okay, what if we created this environment where before you template, before you can design with, you have to design for? You have to create this space of mutualism space of sort of shared connection between you and the organism. And with bees it started as the synthetic apiary. And we have proven that curated environment where we designed the space with high levels of control of temperature, humidity, and light and we’ve proven that they were reproductive and alive. And we realized, wow, this environment that we created can help augment bees in the winter season in any city around the world where bees survive and thrive in the summer and spring seasons. And could this be a kind of new urban typology, an architectural typology of symbiosis, of mutualism between organisms and humans?

(00:32:37) By the way, the synthetic apiary was in a co-op near Somerville. We had robots. Our team schlepped there every day with our tools and machines and we made it happen. And the neighbors were very happy, and they got a ton of honey at the end of the winter. And those bees, of course, were released into the wild at the end of the winter, alive and kicking. So then, in order to actually experiment with the robotic queen idea or concept, we had to prove, obviously, that we can create this space for bees. And then after that, we had this amazing opportunity to send the bees to space on a New Shepard mission that is part of Blue Origin, and we of course said, “Yes, we’ll take a slot.”

(00:33:24) We said, “Okay, can we outdo NASA?” So NASA in 1982 had an experiment where they sent bees to outer space. The bees returned, they were not reproductive, and some of them died. And we thought, “Well, is there a way in which we can create a life support system, almost like a small mini biolab of a queen and her retinue, that would be sent on this Blue Origin New Shepard mission in this one cell?” And so if the synthetic apiary was an architectural project, in this case, this second synthetic apiary was a product. So it went from an architectural-scale controlled environment to a product-scale controlled environment.

(00:34:08) And this biolab, this life support system for bees, was designed to provide the bees with all the conditions that they needed. And we looked at that time at the Nasonov pheromone that the queen uses to guide the other bees, and we looked at pheromones that are associated with a bee, thinking of those pheromones being released inside the capsule that goes to outer space. They returned back to the Media Lab roof, and those bees were alive and kicking and reproductive, and they continued to create comb. It ended with a beautiful nature paper that the team and I published together. We gave them gold nanoparticles and silver nanoparticles because we were interested if bees recycle wax, it was known forever that-

Neri Oxman (00:35:03) Bees recycle wax. It was known forever that bees do not recycle the wax. And by feeding them these gold nanoparticles, we were able to prove that the bees actually do recycle the wax. The reason I’m bringing this forward is because we don’t view ourselves as designers of consumable products and architectural environments only, but we love that moment where these technologies… And by the way, every one of these projects that we created involve the creation of a new technology, whether it be a glass printer or the spinning robot or the life support system for the bee colony. They all involved a technology that was associated with the project, and I never, ever, ever want to let that part go because I love technology so much.

(00:35:54) But also, another element of this is that always, these projects, if they’re great, they reveal new knowledge, or new science, about the topic that you’re investigating, be it silkworms or bees or glass. That’s why I always tell my team it should be at MoMA and on the cover of Nature or Science at the same time. We don’t separate between the art and the science, it’s one and the same.

Biological hero organisms

Lex Fridman (00:36:21) So as you’re creating the art, you’re going to learn something about these organisms or something about these materials. Is there something that stands out to you about these hero organisms like bees, silkworms? You mentioned E. coli has its pros and cons, this bacteria. What have you learned, small or big, that’s interesting about these organisms?

Neri Oxman (00:36:41) Yeah, that’s a beautiful question. What have I learned? I’ve learned that… We also worked with shrimp shells with a glow. How we built this tower on the roof of SF MoMa, which by a couple of months ago until it was on the roof, we’ve shown this structure completely biodegrade into the… Well, not completely, but almost completely biodegrade to the soil. And this notion that a product or an organism or part of that organism can reincarnate is very, very moving thought to me, because I want to believe that I believe in reincarnation.

Lex Fridman (00:37:24) I want to believe that I believe. I want to believe.

Neri Oxman (00:37:25) Yeah, that’s my relationship with God. I like to believe in believing. Most great things in life are second derivatives of things, but that’s part of another conversation.

Lex Fridman (00:37:38) I feel like that’s a quote that’s going to take weeks to really internalize.

Neri Oxman (00:37:43) That notion of, I want you to want, or I need you to need. There’s always something, a deeper truth behind what is on the surface. So I like to go to the second and tertiary derivative of things and discover new truths about them through that. But what have I learned about organisms-

Lex Fridman (00:38:05) And why don’t you like E. coli?

Neri Oxman (00:38:07) I like E. coli, and a lot of the work that we’ve done was not possible without our working on E. coli or other workhorse organisms, like cyanobacteria.

Lex Fridman (00:38:19) How are bacteria used?

Neri Oxman (00:38:20) Death masks. The death masks.

Lex Fridman (00:38:24) So what are death masks?

Neri Oxman (00:38:24) We did this project called Vespers, and those were basically death masks. That was set as a process for designing a living product. What happens? I remember looking at Beethoven’s death mask and Agamemnon’s death mask and just studying how they were created. And really they were geometrically attuned to the face of the dead, and what we wanted to do is create a death mask that was not based on the shape of the wearer, but rather was based on their legacy and their biology. And maybe we could harness a few stem cells there for future generations or contain the last breath. Lazarus, which preceded Vespers, was a project where we designed a mask to contain a single breath, the last breath of the wearer. And again, if I had access to these technologies today, I would totally reincorporate my grandmother’s last breath in a product. So it was like an air memento.

(00:39:31) So with Vespers, we actually used E. coli to create pigmented masks, masks whose pigments would be recreated at the surface of the mask. And I’m skipping over a lot of content, but basically there were 15 masks and they were created as three sets, the masks of the past, the masks of the present, and the masks of the future. They were five, five, and five, and the masks of the past were based on ornaments and they were embedded with natural minerals like gold. Yes, yes, yes, exactly-

Lex Fridman (00:40:12) And we’re looking at pictures of these and they’re gorgeous.

Lex Fridman (00:40:16) Extremely delicate and interesting fractal patterns that are symmetrical.

Neri Oxman (00:40:24) They look symmetrical, but they’re not. We intended for you to be tricked and think that they’re all symmetrical, but-

Lex Fridman (00:40:32) There’s imperfections.

Neri Oxman (00:40:33) There are imperfections by design. All of these forms and shapes and distribution of matter that you’re looking at was entirely designed using a computational program. None of it is manual. But long story short, the first collection is about the surface of the mask. And the second collection, which you’re looking at, is about the volume of the mask and what happens to the mask when all the colors from the surface, yes, enter the volume of the mask inside, create pockets and channels to guide life through them. They were incorporated with pigment-producing living organisms, and then those organisms were templated to recreate the patterns of the original death masks. And so life recycles and re-begins, and so on and so forth. The past meets the future, the future meets the past. From the surface to the volume, from death to life, to death to life, to death to life. And that again, is a recurring theme in the projects that we take on.

(00:41:39) But from a technological perspective, what was interesting is that we embedded chemical signals in the jet, in the printer, and those chemical signals basically interacted with the pigment-producing bacteria, in this case E. coli, that were introduced on the surface of the mask. And those interactions between the chemical signals inside the resins and the bacteria at the surface of the mask, at the resolution that is native to the printer, in this case, 20 microns per voxel, allowed us to compute the exact patterns that we wanted to achieve. And we thought, “Well, if we can do this with pigments, can we do this with antibiotics? If we can do this with antibiotics, could we do it with melanin? And what are the implications?” Again, this is a platform technology. Now that we have it, what are the actual real-world implications and potential applications for this technology?

(00:42:41) We started a new area. One of my students, Rachael, her PhD thesis was titled after this new class of materials that we created through this project, Vespers: Hybrid Living Materials, HLMs. And these hybrid living materials really paved the way towards a whole other set of products that we’ve designed, like the work that we did with melanin for the Mandela pavilion that we presented at SF MoMa. Where, again, we’re using the same principles of templating, in this case not silkworms and not bees, but we’re templating bacteria at a much, much finer resolution. And now instead of templating using a robot, we’re templating using a printer.

(00:43:32) But compute is very, very much part of it. And what’s nice about bacteria, of course, is that from an ethical perspective I think there’s a range. So at the end of the silk pavilion, I got an email from a professor in Japan who has been working on transgenic silk, who said, “Well, if you did the amazing silk pavilion, why don’t we create glow-in-the-light silk dresses?” And in order to create this glow-in-the-light silk, we need to apply genes that are taken from a spider to a silkworm. And this is what is known as a transgenic operation. And we said no. And that was for us a clear decision that, no, we will work with these organisms as long as we know that what we are doing with them is not only better for humans, but it’s also better for them.

(00:44:31) And again, just to remind you, I forget the exact number, but it’s around 1,000 cocoons per single shirt that are exterminated in India and China, in those sericulture industries where they are being abused. Now, yes, this organism was designed to serve the human species, and maybe it’s time to retire that conception of organisms that are designed for a human-centric world or a human-centric set of applications. I don’t feel the same way about E. coli, not that I’m organism agnostic, but still, I believe there’s so much for us to do on this planet with bacteria.

Lex Fridman (00:45:26) And so in general, your design principle is to grow cool stuff as a byproduct of the organism flourishing. So not using the organism-

Neri Oxman (00:45:36) Yes. The win-win, the synergy.

Neri Oxman (00:45:38) A whole that’s bigger than the sum of its parts.

Lex Fridman (00:45:40) It’s interesting. It just feels like a gray area, where genetic modification of an organism, it just feels like… I don’t know. If you genetically modified me to make me glow in the light, I kind of like it.

Neri Oxman (00:45:59) I think you have enough of an aura.

Lex Fridman (00:46:00) All right, thank you. I was just fishing for compliments. Thank you. I appreciate the-

Neri Oxman (00:46:06) But you’re absolutely right. And by the way, the gray area is where some of us like to live and like to thrive, and that’s okay. And thank goodness that there’s so many of us that like the black and white and that thrive in the black and white. My husband is a good example for that.

Lex Fridman (00:46:21) Well, but just to clarify, in this case you are also trying to thrive in the black and white in that you’re saying the silkworm is a beautiful, wonderful creature. Let us not modify it. Is that the idea? Or is it okay to modify a little bit as long as we can see that it benefits the organism as well as the final creation?

Neri Oxman (00:46:42) With silkworms, absolutely let’s not modify it genetically. Let’s not modify it genetically. And then some. Because why did we get there to begin with 4,000 years ago in the Silk Road? And we should never get to a point where we evolve life for the service of mankind at the risk of these wonderful creatures across the across the kingdom of life. I don’t think about the same kind of ethical range when I think about bacteria.

Lex Fridman (00:47:15) Nevertheless, bacteria are pretty wonderful organisms.

Neri Oxman (00:47:18) I’m moving to my second cup here.

Lex Fridman (00:47:21) Take two, because things are getting serious now.

Neri Oxman (00:47:23) Bacteria are. Yeah, for sure.

Engineering with bacteria

Lex Fridman (00:47:25) Let’s give bacteria all the love they deserve. We wouldn’t be here without them. They were here for, I don’t know what it is, like a billion years before anything else showed up.

Neri Oxman (00:47:32) But in a way, if you think about it, they create the matter that we consume and then reincarnate, or dissolved into the soil and then creates a tree, and then that tree creates more bacteria. And then that bacteria could… Again, again. That’s why I like to think about not recycling, but reincarnating, because that assumes, imparting upon nature that dimension of agency and maybe awareness. But yeah, lots of really interesting work happening with bacteria. Directed evolution is one of them. We’re looking at directed evolution. So high-throughput directed evolution of bacteria for the production of products. And again, those products can be a shoe, wearables, biomaterials, therapeutics.

Lex Fridman (00:48:26) And doing that direction computationally?

Neri Oxman (00:48:27) Totally computationally, obviously in the lab with the hero organism, the hero bacteria. And what’s happening today, in equal microbial synthetic biology, synthetic biology that lends itself to ecology. And again, all of these fields are coming together. It’s such a wonderful time to be a designer. I can’t think of a better time to be a designer in this world. But with high-throughput directed evolution… And I should say that the physical space in our new lab will have these capsules which we have designed. They are designed like growth chambers or grow rooms, and in those grow rooms we can basically program top-down environmental templating, top-down environmental control of lights, humidity, light, et cetera. Sorry, light, humidity and temperature while doing bottom-up genetic regulation. So it is a wet lab, but in that wet lab you could do at the same time, genetic modulation, regulation and environmental templating.

(00:49:39) And then, again, the idea is that in one of those capsules maybe we grow transparent wood, and in another capsule, transparent wood for architectural application. In another capsule, we grow a shoe, and in another capsule we look at that large molecule model that we talked about. And there was a particular technology associated with that, which we’re hoping to reveal to the world in February. And each of those capsules is basically a high-throughput computational environment, like a breadboard, think of a physical breadboard environment, that has access to oxygen and nitrogen and CO2 and nutritional dispensing, and these little capsules could be stressed. They’re sort of ecology in a box, and they could be stressed to produce the food of the future or the products of the future or the construction materials of the future. Food is a very interesting one, obviously, because of the issues that we have, both in terms of food insecurity, but also in terms of the future of food and what will remain after we can’t eat plants and animals anymore, and all we can eat is these false bananas and insects as our protein source.

(00:50:56) So there we’re thinking, can we design these capsules to stress an environment and see how that environment behaves? Think about a biodiversity chamber, kind of a time capsule that is designed as a biodiversity chamber where you can program the exact temperature, humidity, and light combination to emulate the environment from the past. So Ohio, December 31st, 1981, at 5:00 AM in the morning, what did tomatoes taste like? To all the way in the future, 200 years from now: these are the environmental inputs, these are some genetic regulations that I’m testing, and what might the food of the future or the products of the future or the construction materials of the future feel like, taste like, behave like, et cetera. And so these capsules are designed as part of a lab. That’s why it’s been taking us such a long time to get to this point, because we started designing them in 2019, and they’re currently, literally as I speak to you, under construction.

Lex Fridman (00:52:02) How well is it understood how to do this dance of controlling these different variables in order for various kinds of growth to happen?

Neri Oxman (00:52:10) It’s not. It’s never been done before and these capsules have never been designed before. So when we first decided these are going to be environmental capsules, people thought we were crazy. “What are you building? What are you making?” So the answer is that we don’t know. But we know that there has never been a space like this where you have basically a wet lab and a grow room at that resolution, at that granularity of control over organisms. There is a reason why there is this incredible evolution of products in the software space. The hardware space, that’s a more limiting space because of the physical infrastructure that we have to test and experiment with things. So we really wanted to push on creating a wet lab that is novel in every possible way. What could you create in it? You could create the future. You could create an environment of plants talking to each other with a robotic referee. And you could set an objective function.

(00:53:20) And let’s say, for the transaction-driven individuals in the world, let’s say their objective function is carbon sequestration. And all of those plants are implemented with a gaming engine, and they have these reward systems, and they’re constantly needing to optimize the way in which they sequester carbon. We weed out the bad guys, we leave the good guys, and we end up with this ideal ecology of carbon-sequestering heroes that connect and communicate with each other. And once we have that model, this biodiversity chamber, we send it out into the field and we see what happens in nature. And that’s sort of what I’m talking about, augmenting plants with that extra dimension of bandwidth that they do not have. Just last week I came across a paper that discusses in vitro neurons that are augmented with a Pong game. And in a dish they basically present sentience and the beginning of awareness.

(00:54:37) Which is wonderful, that you could actually take these neurons from a mouse brain, and you have the electrical circuits and the physiological circuits that enable these cells to connect and communicate, and together arrive at a swarm situation that allows them to act as a system that is not only perceived to be sentient, but is actually sentient. Michael Levin calls this agential material, material that has agency. This is of interest to us because, again, this is emergence post-templating. You template until you don’t need to template anymore, because the system has its own rules. What we don’t want to happen with AGI, we want to happen with synthetic biology. What we don’t want to happen online and in software with language, we want to happen with bio-based materials. Because that will get us closer to growing things as opposed to assembling and mechanically putting them together with toxic materials and compounds.

Plant communication

Lex Fridman (00:55:43) If I can ask a pothead question for a second, you mentioned just like the silkworms, the individualist silkworms got to actually learn how to collaborate or actually to collaborate in a swarm like way. You’re talking about getting plants to communicate in some interesting way based on an objective function. Is it possible to have some kind of interface between another kind of organisms, humans, and nature? So like a human to have a conversation with a plant?

Neri Oxman (00:56:14) There already is. You know that when we cut freshly cut grass, I love the smell, but actually it’s a smell of distress that the leaves of grass are communicating to each other. The grass, when it’s cut emits green leaf volatiles, GLVs. And those GLVs are basically one leaf of grass communicating to another leaf of grass, “Be careful. Mind you, you’re about to be cut.” These incredible life forms are communicating using a different language than ours. We use language models, they use molecular models. At the moment where we can parse, we can decode these molecular moments is when we can start having a conversation with plants.

(00:56:57) Now, of course, there is a lot of work around plant neurobiology. It’s a real thing. Plants do not have a nervous system, but they have something akin to a nervous system. It’s a kind of ecological intelligence that is focused on a particular timescale, and the timescale is a very, very slow, slow, slow timescale. So it is when we can meld these timescales and connect with these plants, in terms of the content of the language, in this case molecules, and the duration of the language, that we can start having a conversation, if not simply to understand what is happening in the plant kingdom.

(00:57:38) Precision agriculture, I promise you, will look very, very different. Because right now we are using drones to take photos of crops, of corn, that look bad. And when we take that photo, it’s already too late. But if we understand these molecular footprints and the things that they are trying to say, the distress that they are trying to communicate, then we could of course predict the physiological, biological behavior of these crops, both for their own self-perpetuation, but also for the foods and the pharma and the types of molecules that we’re seeking to grow for the benefit of humanity. And so these languages that we are attempting now to quantify and qualify will really help us not only better nature and help nature in its striving to survive, but also help us design better wines and better foods and better medicine and better products, again, across all scales, across all application domains.

Lex Fridman (00:58:41) Is there intricacies to understanding the timescales, like you mentioned, at which these communications, these languages operate? Is there something different between the way humans communicate and the way plants communicate in terms of time?

Neri Oxman (00:58:56) Remember when we started the conversation talking about definitions in the context of design and then in the context of being? That question requires, I think a kind of a shift, a humility. That requires a humility towards nature, understanding that it operates on different scales. We recently discovered that the molecular footprint of a rose, or of a plant in general during nighttime, is different than its molecular footprint during daytime. So these are circadian rhythms that are associated with what kind of molecules these plants emit given stresses, and given there’s a reason why a jasmine field smells so, so delicious and 4:00 AM in the morning. There’s peace and rest amongst the plants. And you have to tune into that time dimension of the plant kingdom, and that of course requires all this humility, where in a single capsule, to design a biodiversity chamber, it will take years, not months, and definitely not days to see these products.

(01:00:13) And also, that humility in design comes from simply looking at how we are today as a civilization, how we use and abuse nature. Just think of all these Christmas trees. These Christmas trees, they take years to grow. We use them for one night, the holiest night of the year, and then we let them go. And think about in nature to design a “product,” an organism spends energy and time and thoughtfulness and many, many, many years, and I’m thinking about the redwoods, to grow these channels, these cellulose layers and channels and reach these incredible heights. Takes sometimes hundreds of years, sometimes thousands of years. Am I afraid of building a company that designs products in the scale of thousands of years? No, I’m not.

(01:01:08) And the way of being in the physical world today is really not in tune with the time dimension of the natural world at all, and that needs to change. And that’s obviously very, very hard to do in a community of human beings that, at least in the Western world, is based on capitalism. And so here, the wonderful challenge that we have ahead of us is, how do we impart upon the capitalist movement what we know, that we need to produce products now that will enter the real world and be shared and used by others, and still benefit the natural world while benefiting humans? And that’s a wonderful challenge to have.

Lex Fridman (01:01:55) So, integrate technology with nature, and that’s a really difficult problem. I see parallels here with another company of Neuralink, which is basically like, I think you mentioned, Neuralink for nature. That there are short-term products you can come up with, but it’s ultimately a long-term challenge of how do you integrate the machine with this creation of nature, this intricate, complex creation of nature, which is the human brain. And then you’re speaking more generally, nature.

Neri Oxman (01:02:29) You know how every company has an image? Like this one single image that embodies the spirit of the company? And I think for Neuralink it was, to me, that chimpanzee playing a video game. It was just unbelievable. But with plants, there potentially is a set of molecules that impacts or inspires, I like that word, the plant to behave or act in a certain way, and allows still the plan the possibility of deciding where it or she or he wants to go. Which is why our first product for this molecular space is going to be a functionalized fragrance. So here we’re thinking about the future of fragrances and the future of fragrances and flavors.

(01:03:23) These products in the industry as we know it today, are designed totally for a human-centric use and enjoyment and indulgence and luxury. They’re used on the body for the sake of, I don’t know, attraction and feeling good and smelling good. And we were asking ourselves, is there a world in which a fragrance can be not a functional fragrance? Because you could claim that all fragrances are functional. But is there a world in which the fragrance becomes functionalized, is, again, imparted upon or given agency to connect with another organism? Is there a world in which you and I can go down to your garden and use a perfume that will interact with the rose garden downstairs? I’ve just been enamored with the statements that are being made in the media around, “Oh, this is completely biologically-derived fragrance and it’s bio-based.”

(01:04:28) But when you look into the fragrance and you understand that in order to get to this bio-derived fragrance, you blew through 10,000 bushes of rose to create 5 mL of a rose fragrance. And all these 10,000 bushes of rose, they take space, they take water management, and so much waste. Is this really what we want the future of our agriculture and molecular goods to look like? And so when we did the Aguahoja pavilion on the roof of SF MoMa, we calculated that for that pavilion we had 40,000 calories embedded into this pavilion that was made of shrimp shells and chitosan and apple skins and cellulose from tree pulp. And we calculated that overall the structure had 40,000 calories. Interesting way to think about a structure, from the point of view of calories. But as you left the gallery, you saw these three clocks that were so beautifully designed by Felix on our team, and these clocks measured temperature and humidity, and we connected them to a weather channel so that we could directly look at how the pavilion was biodegrading in real-time.

(01:05:40) And in our calculations, and I give this long-winded description of the pavilion to say that in the calculation, we incorporated how much electricity we used for our computers, for the 3D printers that printed the pavilion. And these were called energy calculations, energy and materials. And when you think about a product, and you think about a shoe or a chair or a perfume or a building, you don’t stop at the object. You want to go all the way to the system. Again, instead of designing objects or singular embodiments of the will of the designer, you’re really tapping into an entire system that is interconnected.

(01:06:26) And if you look at the energy budget that characterized the project Aguahoja, it traverses the entire planet. Some of these shrimp shells were brought from places in the world we hadn’t thought of, in terms of the apples and the shrimp shells and the tree pulp. And so, going back to fragrances, it’s really, really important to understand the product in the context of the ecological system from which it’s sourced, and how it’s designed. And that is the kind of thinking that is not only desired, but is required if we are to achieve synergy between humanity and nature.

Lex Fridman (01:07:06) And it’s interesting, because the system-level thinking is almost always going to take you to the entire earth, to considering the entire earth ecosystem.

Neri Oxman (01:07:13) Which is why it’s important to have a left brain and a right brain competing for attention. And intimacy [inaudible 01:07:19]. Yes.

Lex Fridman (01:07:19) Yeah. You mentioned a fragrance that sends out a message to the environment, essentially.

Neri Oxman (01:07:27) A message in a bottle. Yeah.

Lex Fridman (01:07:29) A message in a bottle. So you can go to a rose garden and trick the rose garden to think it’s 4:00 AM, essentially?

Neri Oxman (01:07:36) You could if you wanted to, but maybe that is-

Lex Fridman (01:07:38) Not trick. Trick is such a bad word.

Neri Oxman (01:07:43) Inspire I like. I like the idea of providing nature with a choice, which is why I love that elegant mathematical equation of empowerment and agency.

Lex Fridman (01:07:53) Empower the rose garden to create a romantic moment for the wearer of the fragrance.

Neri Oxman (01:08:00) But now again you’re, again, all of this to go back to that human-centric notion of romance. But maybe there’s another way to do romance that we haven’t yet explored. And maybe there’s a way to tap into what happens to the rose when it’s dreaming. Assuming that plants are sentient and assuming that we can tap into that sentient, what can we discover about what does the rose want? What does it actually want and what does it need? And what are the rose’s dreams?

Lex Fridman (01:08:41) But do you think there’s some correlation in terms of romance, in terms of the word you sometimes use, magic? Is there some similarities in what humans want and what roses want and what nature wants?

Albert Einstein letter

Neri Oxman (01:08:53) I think so. I think there is. And if I did not think so, oh my goodness, this would not be a nice world to live in. I think we all want love. I recently read this beautiful letter that was written by Einstein to his daughter. Einstein asked his daughter to wait 20 years until she reveals these letters, and so she did. It’s just one of the most beautiful letters I’ve ever read from a father to his daughter. And the letter overall is imbued with a sense of remorse or maybe even feelings of sadness. And there is some kind of melancholy note in the letter where Einstein regrets not having spent enough time with his daughter, having focused on the theory of general relativity and changing the world. And then he goes on to talk about this beautiful and elegant equation of E=MC^2. And he tells his daughter that he believes that love is actually the force that shapes the universe because it is like-

Neri Oxman (01:10:03) Is actually the force that shapes the universe because it is like gravity, right? It attracts people. It is like light. It brings people together and connects between people, and it’s all empowering. And so if you multiply it by the speed of light, you could really change the world for the better. And call me a romanticist. I know you are too, which is why I so love being here. I believe in this. I totally and utterly believe in…

Lex Fridman (01:10:34) In love. By the way, let me just excerpt from Einstein’s letter. “There’s an extremely powerful force that so far science has not found a formal explanation to. It’s a force that includes and governs all others and is even behind any phenomena operating in the universe and has not yet been identified by us. This universal force is love.” He also, the last paragraph in the letter, as you’ve mentioned, ” I deeply regret not having been able to express what is in my heart, which has quietly beaten for you all my life. Maybe it’s too late to apologize, but as time is relative,” that jokes to Einstein, “I need to tell you that I love you and thanks to you I have reached the ultimate answer. Your father, Albert Einstein.” By that regret, I deeply regret not having been able to express what is in my heart. Maybe that’s a universal regret, filling your days with busyness and silly pursuits and not sitting down and expressing that.

Neri Oxman (01:11:43) But it is everything. It is everything. It is why I love that expression, and I forget who said this, but I love my daughter more than evolution required, and I feel the same way towards my other half. And I feel that when you find that connection, everything and anything is possible and it’s a very, very, very magical moment. So I believe in love and I believe in the one.

Beauty

Lex Fridman (01:12:27) It might be the same thing, it might be a different thing, but let me ask you a ridiculously big philosophical question about beauty. Dostoevsky said Beauty will save the world in The Idiot, one of my favorite books of his. What is beauty to you? You’ve created through this intersection of engineering and nature, you have created some incredibly beautiful things. What do you think is beauty?

Neri Oxman (01:12:55) That’s a beautiful question.

Lex Fridman (01:12:57) Maybe it is connected to the love question.

Neri Oxman (01:12:59) It is connected to the love question. Of course, everything is connected to the love question. To me, beauty is agency. To me, something that has agency, it is beautiful. There is this special quote from Buckminster Fuller, which I cannot remember word for word but I remember the concept, which goes something like this. When I work on a problem, I never think about beauty. But when I’m done solving the problem and I look at what I’ve created and it’s not beautiful, I know that I was wrong.

Neri Oxman (01:13:38) It’s kind of an agency that speaks to the “objective function” of the creation, right? Whether for Bucky it’s useless or useful.

Lex Fridman (01:13:49) So this idea of empowerment that you talked about, it’s fundamentally connected to it.

Neri Oxman (01:13:52) Comes back to that, yeah.

Lex Fridman (01:13:54) What’s the difference that you hinted at between empowerment and emergence? Is emergence completely lacks control and empowerment is more controlled? There’s an agent making decisions? Is there an interesting distinction there?

Neri Oxman (01:14:16) Yes. I think empowerment is a force with direction. It has directionality to it. Emergence is, I believe, multi-directional. Again, that depends on the application. Emergence is perhaps in terms of a material definition, is a tropic spirit. When empowerment, the end is a tropic counterpart, I think they overlap because I think that empowerment is a way of inspiring emergence. I think emergence does not happen without empowerment, but empowerment can happen without emergence.

Lex Fridman (01:15:05) Do you think of emergence as the loss of control? When you’re thinking about these capsules and then the things they create, is emergence of things not a desirable conclusion?

Neri Oxman (01:15:19) I love that question because to some of us, the loss of control is control. In design, we’re used to extreme levels of control over form and the shape of a thing and how it behaves and how it functions. And that’s something we’ve inherited from the industrial revolution. But with nature, there is this diversity that happens without necessarily having a reward function, right? This is good or bad. Things just happen and some of them happen to have wings and some of them happen to have scales, and you end up with this incredible potential for diversity. So I think the future of design is in that soft control, is in the ability to design highly controlled systems that enable the loss of control.

(01:16:14) And creativity is very much part of this because creativity is all about letting go and beginning again and beginning again and beginning again. And when you cannot let go, you cannot be creative and you can’t find novelty. But I think that letting go is a moment that enables empowerment, agency, creativity, emergence, and they’re all connected. They sort of associate themselves with a definition of destiny or the inevitable. A good friend of mine shared with me an elegant definition of fate, which is the ratio of who you are and who you want to be.

Lex Fridman (01:17:01) The ratio of who you are and who you want to be.

Neri Oxman (01:17:04) Exactly. And that sort of ends up defining you and those tools, I think when you let go, you sort of find, you give peace to your will, to a sense of will. And so I think that’s very, very important in design, but also in life.

Faith

Lex Fridman (01:17:23) She said this: fate is the ratio of…

Neri Oxman (01:17:25) Who you are and who you want to be.

Lex Fridman (01:17:27) Who you want to be. Do you think there’s something to this whole manifestation thing, like focusing on a vision of what you want the world to become and in that focusing you manifest it? Like Paulo Coelho said in The Alchemist, “when you want something, all the universe conspires in helping you to achieve it.” Is there something to that?

Neri Oxman (01:17:48) I think so, yes. And I always think of what I do as the culmination of energy, information, and matter and how to direct energy, information, and matter in the design of a thing or in the design of a life. I think living is very much a process of channeling these energies to where they need to go. I think that the manifestation or part of that manifestation is the pointing to the moon in order to get to the moon. And that’s why manifestation is also directional. It has that vector quality to it that I think of agency as.

Lex Fridman (01:18:31) Have you, in your own life… have there been things you’ve done where you kind of direct that energy, information, and matter in a way that opens up?

Lex Fridman (01:18:42) Yeah. I mean, you’ve also said somewhere, I’m probably misquoting, that you, Neri, are many things and you become new things every 10 years or so.

Neri Oxman (01:18:56) Oh, I did say that somewhere, that every decade you’ve sort of switched.

Lex Fridman (01:19:00) That was a previous Neri that said that.

Neri Oxman (01:19:03) Yeah, I did say sometime ago that you have to sort of reboot every 10 years to keep creative and keep inventive and keep fresh.

Lex Fridman (01:19:12) Are there things you’ve done in your life where doors just opened?

Neri Oxman (01:19:20) I think everything, everything, everything good I’ve found in my life has been found in that way of letting go and suspending my sense of disbelief. And often you will find me say to the team, suspend your disbelief. I don’t care that this is impossible. Let’s assume it is. Where does it take us? And that suspension of disbelief is absolutely part and parcel of the creative act. I did so when I was in medical school, I was in Hadassah and in the Hebrew University, and I remember I left medical school for architecture the day my grandmother passed away. And that was a moment of relief and that was a door that was closing that opened other opportunities. But that of course required letting go of the great vision of becoming a doctor and letting go of the dream of being surrounded by wonderful patients and the science of medicine and the research associated with that science. And letting go of that dream to accomplish another.

(01:20:43) And it has happened throughout my life in different ways. MIT was another experience like that where people pointed at me as the designer for whom the academic currency is not necessarily the citation index. And of course in order to get tenure at MIT, you have to look at the citation index. But for me it was not that. It was manifesting our work in shows and writing papers and writing patents and creating a celebration around the work. And I never saw a distinction between those ways of being. I also think that another kind of way of being or a modality of being that I found helpful is Viktor Frankl wrote this incredible book, Man’s Search for Meaning, after the Holocaust. And he writes, different people pursue life for different reasons. According to Freud, the goal of life is to find pleasure and according to Adler, to find power.

(01:21:54) And for Viktor Frankl, it was about finding meaning. And when you let go of the titles and the disciplines and the boundaries and the expectations and the perception, you are elevated to this really special, yes, spiritual, but definitely very, very creative plane where you can sort of start anew, look at the world through the lens of a bacterium or a robot, or look at ecology through the lens of chemistry and look at chemistry through the lens of robotics and look at robotics through the lens of microbial ecologies and so on and so forth. And I feel that kind of rebooting not every 10 years, but every minute, every breath, is very, very important for a creative life and for just maintaining this fresh mind to reboot, reboot, to begin again with every breath, begin again. And that can be confusing to some of my team members. I like to change my mind. It’s who I am, it’s how I think, it’s how I operate.

(01:23:11) And they’ll come and we found another technique or another technology that’s interesting and we thought that we were working on this functionalized fragrance, but now there’s another opportunity and let’s go there. And to me, I would much rather live life, like if I had to pick sort of my favorite Broadway show to enter and live through, it would be Into The Woods. It’s not a specific fairytale. It’s not the Sleeping Beauty or Little Red Riding Hood or Rapunzel, it’s all of them. It’s sort of moving into the forest and seeing this wonder and getting close and learning about that and then moving to another wonder. And life is really about tying all of these little fairytales together in work and also in life.

Lex Fridman (01:24:06) Unafraid to leap into the unknown?

Neri Oxman (01:24:07) Unafraid to leap into the unknown.

Lex Fridman (01:24:08) Speaking of MIT, you got tenure at MIT and then you leaped to New York and started a new company with a vision that doesn’t span a couple of years, but centuries.

Neri Oxman (01:24:21) I did. It was my destiny to start a company. And do I have mornings when I wake up and I ask myself what the hell am I doing? Yes, I have those mornings.

Lex Fridman (01:24:32) What do you do with those mornings, by the way?

Neri Oxman (01:24:33) I embrace them and I find gratitude and I say to myself, thank goodness. I am so lucky to have the ability to be frustrated in this way. So I really, really embrace these frustrations and I take them, I wrap them in a bubble and I look at it on the outside of my aware mind and I laugh at them, I smile at them.

Lex Fridman (01:25:11) If I could return actually to the question of beauty for a second, I forgot to ask you something. You mentioned imperfection in the death masks. What role does imperfection play in our conception of beauty? What role does imperfection play in nature? There’s this Japanese aesthetics concept of wabi-sabi, which basically embraces imperfection. Nothing lasts, nothing is finished, and nothing is perfect. What do you think of that?

Neri Oxman (01:25:45) I totally agree that change is the only permanence. That imperfection is there if only to signal that we are part of a bigger thing than ourselves, that we are on a journey, that things are in movement. And if they were perfect, of course, when things are perfect, it is just so boring. We end up with stereotypes. And as humans, but I think just in general as living beings, we’re here to find meaning and that meaning cannot be found without struggle and without seeking to, not to perfect, but to build towards something better. When I was a child, my mother who I love so much, always explained to me how important it is to fall and to fail and to fight and to argue, and that there is a way, that there’s a culture to failing and to imperfection. So I think it is necessary for something beautiful to be imperfect and it is a sign of nature because nothing in nature is perfect.

Flaws

Lex Fridman (01:27:09) What about human relations? You mentioned finding love. Are the flaws in humans, imperfection in humans, a component of love? What role do you think the flaws play?

Neri Oxman (01:27:23) That’s a really profound question. I think the flaws are there to present a vulnerability, and those flaws are a sign of those vulnerabilities. And I think love is very, very gentle, right? With Bill, we often talk, between the two of us, about what drives all human behavior. And for him it’s incentive, as you might expect, and he will repeat this sentence to me, oh, incentive drives all human behavior. But I would say to me it’s love, very much so. And I think flaws are part of that because flaws are a sign of that vulnerability, whether physical, whether emotional vulnerability, and these vulnerabilities, they either tear us apart or they bring us together.

(01:28:36) The vulnerability is what is the glue. I think that the vulnerability enables connection. The connection is the glue, and that connection enables accessing a higher ground as a community as opposed to as an individual. So if there is a society of the mind, or if there are higher levels of awareness that can be accessed in community as opposed to again, going to the silkworm, as opposed to on the individual level, I think that those occur through the flaws and the vulnerabilities. And without them we cannot find connection, community. And without community, we can’t build what we have built as a civilization for the past hundreds of thousands of years. So I think not only are they beautiful, but they have a functional role in building civilizations.

Lex Fridman (01:29:32) Yeah, there’s a sense in which love requires vulnerability and maybe love is the leap into that vulnerability.

Neri Oxman (01:29:40) And I think yes, I think a flaw, think about it physically, I’m thinking about a brick that’s flawed, but in a way I think of a flaw as an increased surface area.

Lex Fridman (01:30:02) That’s a good line. That’s a good line.

Neri Oxman (01:30:03) A surface area that physically or emotionally, right, it sort of introduces this whole new dimension to a human or a brick. And because you have more surface area, you can use mortar and build a home. And yeah, I think of it as accessing this additional dimension of surface area that could be used for good or bad to connect, to communicate, to collaborate. It makes me think of that quote from this incredible movie I watched years ago, Particle Fever, I think it was called, a documentary about the Large Hadron Collider, an incredible film, where they talk about the things that are least important for our survival are the things that make us human. Like the pure romantic act or the notion of, and Viktor Frankl talks about that too.

(01:31:01) He talks about feeling the sun on his arms as he is working the soil in two degrees Fahrenheit without clothes. And the officer berates him and says, what have you done? Have you been a businessman before you came here to the camp? And he says, I was a doctor. And he said, you must’ve made a lot of money as a doctor. And he said, all my work I’ve done for free, I’ve been helping the poor. But he keeps his humility and he keeps his modesty and he keeps his preservation of the spirit. And he says the things that actually make him able to, or made him able to outlive the terrible experience in the Holocaust was really cherishing this moment when the sun hits his skin or when he can eat a grain of rice, a single grain of rice. So I think cherishing is a very important part of living a meaningful life, being able to cherish those simple things

Lex Fridman (01:32:30) To notice them and to-

Neri Oxman (01:32:32) To notice them, to pay attention to them in the moment, and I do this now more than ever.

Lex Fridman (01:32:42) Bukowski has this poem called Nirvana where he tells a story of a young man on a bus going through North Carolina or something like this, and they stop off in a cafe and there’s a waitress, and he talks about how he notices the magic, something indescribable, he just notices the magic of it. And he gets back on the bus with the rest of the passengers. And none of them seem to have noticed the magic. And I think if you just allow yourself to pause, just to feel whatever that is, maybe ultimately it’s a kind of gratitude for, I don’t know what it is. I’m sure it’s just chemicals in the brain, but it is just so incredible to be alive and noticing that and appreciating that and being one in that with others.

Neri Oxman (01:33:38) Yes. Yes. And that goes back to the fireplace, right to the first technology. What was the first technology? It was fire, first technology to have built community. And it emerged out of a vulnerability of wanting to stay away from the cold and be warm together. And of course, that fire is associated with not only with comfort and the ability to form bio relevant nutrients in our food and provide heat and comfort, but also spirits and a kind of way to enter a spiritual moment, to enter a moment that can only be experienced in a community as a form of a meditative moment. There is a lot to be said about light. Light is, I think, an important part of these moments of, I think it’s a real thing. I really truly believe that we’re born with an aura surface area that is measurable. I think we’re born into the world with an aura. And how do we channel that really ends up sort of defining the light in our lives.

Lex Fridman (01:35:24) Do you think we’re all lonely? Do you think there’s loneliness in us humans?

Neri Oxman (01:35:26) Oh yes, yes. Loneliness is part, yes. I think we all have that loneliness, whether we’re willing to access that loneliness and look at it in the eye or completely, completely avoid it or deny it.

Lex Fridman (01:35:44) It feels like it’s some kind of foundation for longing and longing leads to this combination of vulnerability and connection with others.

Lex Fridman (01:35:56) It feels like that’s a really important part of being human as being lonely.

Neri Oxman (01:35:59) Very. We are born into this world alone. Again, being alone and being lonely are two different things and you can be together, but be lonely and you can be alone but not be lonely at all. We often joke, Bill and I, that he cannot be lonely. He cannot deal with being by himself. He always needs people around him. And I strive for, long for, must have creative solitude, must find pockets of solitude and loneliness in order to find creativity and reconnect with myself. So loneliness is a recipe for community in my opinion. And I think those things complement each other. And they’re synergetic, absolutely. The yin and yang of togetherness. And they allow you, I think, to reset and to tune in to that ratio we talked about of who you are and who you want to be.

Lex Fridman (01:37:07) If you go to this place of creative solitude, what’s your creative process? Is there something you’ve noticed about what you do that leads to good work?

Neri Oxman (01:37:18) I love to be able not only to lose focus, but kind of to focus on the peripheral view and to allow different things to occur at once. So I will often, in my loneliness journeys, I will often listen to Leonard Bernstein. Anything I can find online by Lenny Bernstein. It’s reading a Nature paper, it’s War and Peace. It’s really revisiting all the texts that are so timeless for me with opportunities that are very, very timely. And I think for me, the creative process is really about bringing timeless problems or concepts together with timely technologies to observe them. I remember when we did the Mandela Pavilion, we read Moby Dick, the whiteness of the whale, the albino, the different, the other, and that got us to work on melanin, and melanin is also sort of an output from the death masks. So it’s lots of things happening at the same time and really allowing them to come together to form this view about the world through the lens of a spirit being or a living being or a material. And then focus on the world through the lens of that material.

(01:38:41) The glasswork was another project like that where we were fascinated by glass because obviously it’s superb material for architecture, but we created this new glass printing technology for the first time that was shedding light on the biomechanics of fluid glass, the math and the physics of which was never done before, which was so exciting to us, but revealing new knowledge about the world through technology. That’s one theme. The reincarnation between things, material and immaterial. That’s another theme. Lenny Bernstein, War and Peace, Tolstoy.

Lex Fridman (01:39:18) You’ve tweeted a Tolstoy quote from War and Peace, as of course you would. Everything I know, I know because of love.

Neri Oxman (01:39:27) Yeah, I love this quote.

Lex Fridman (01:39:28) So you use these kind of inspirations to focus you and then find the actual idea in the periphery.

Neri Oxman (01:39:39) Yes. And then connect them with whatever it is that we’re working on, whether it’s high throughput, directed evolution of bacteria, whether it’s recreating that Garden of Eden in the capsule and what it looks like, the food of the future. It is a little bit like directing a film. Creating a new project is a bit like creating a film. And you have these heroes, you have these characters and you put them together and there is a narrative and there’s a story. Whenever we start a new project, it has to have these ingredients of simultaneous complexity. It has to be novel in terms of the synthetic biology, material science, robotics, engineering, all of these elements that are discipline based or rooted must be novel.

(01:40:31) If you can combine novelty in synthetic biology with a novelty in robotics, with a novelty in material science, with a novelty in computational design, you are bound to create something novel, period. And that’s how I run the company and that’s how I pick the people. And so that’s another very, very important ingredient of the cutting edge across multiple disciplines that come together. And then in the background, in the periphery, there are all these messages, the whispers of the ancient oldies, right? The Beethovens and the Picassos.

Lex Fridman (01:41:05) So Beethoven’s always whispering to you.

Neri Oxman (01:41:07) Yeah. How could one not include Beethoven in the whispers?

Lex Fridman (01:41:11) I’m going to ask you about Beethoven and the Evgeny Kissin you’ve mentioned because I’ve played piano my whole life. I obviously know a lot of Beethoven and it’s one of the private things for me, I suppose, because I don’t think I’ve ever publicly played piano.

Neri Oxman (01:41:25) By the way. Me too.

Neri Oxman (01:41:30) I play in private only.

Lex Fridman (01:41:32) People sometimes even with guitar, people ask me, can you play something? And it just feels like certain things are

Neri Oxman (01:41:38) Are meant to be done-

Lex Fridman (01:41:39) Privately. Yeah, it’s weird. I mean it’s difficult, and some of the times I have performed publicly, it is an ultimate leap in vulnerability. It’s very, very, very difficult for me. And I’m sure, I know it’s not for a lot of people, but it is for me. Anyway, we’ll return to that. But since you’ve mentioned combination of novelty across multiple disciplines and that’s what you seek when you build teams or pick people you work with, I just wanted to linger on this idea of what kind of humans are you looking for in this endeavor that you’re taking on, this fascinating thing that you’ve been talking about. Somewhere else, a previous version of Neri, version 5.7, said that there are four fields that are combined to create this intersection of biology and engineering work: computational design, additive manufacturing, material engineering, synthetic biology. I’m sure there’s others, but how do you find these humans? Machine learning is in the mix.

Neri Oxman (01:42:45) I manifest and they come, there are a few approaches to-

Neri Oxman (01:42:55) Send your message upon the water. I mean those job descriptions that you saw, the first ones I wrote by myself, and you find interesting people and brilliant people when you look, we talked about second derivative. When you look under and under and under. And if you look deep enough and specialized enough and if you allow yourself to look at the cracks, at the flaws, at the cracks between disciplines and between skills, you find really, really interesting diamonds in the rough. And so I like for those job descriptions to be those messages in a bottle that bring those really interesting people our way. I mean, they have to have humility. They have to have a shine in their eye. They have to be hungry and foolish, as Steve Jobs so famously said.

(01:43:49) A friend of mine who’s a dean of a well-known architectural school said today, architects don’t want to be architects. Architects don’t look up to the starchitects as role models. Starchitects are no longer role models. Architects want to build by virtue of not building. She said we’re back in the sixties, back in the hippie movement, when we think about architecture. I think that in a way they have to be somewhat of a hippie, somewhat of a kind of jack of all trades, master of all.

Lex Fridman (01:44:26) And yet with humility.

Neri Oxman (01:44:27) And yet with humility. Now that is hard to find and that is why when I start an interview, I talk about childhood memories and I asked about music and I ask about connection. And through these interviews you can learn a lot about a person’s future by spending time hearing them talk about their past.

Lex Fridman (01:44:52) Do you find that educational, like PhDs versus, what’s the life trajectory? Yours is an interesting life trajectory too. What’s the life trajectory that leads to the…

Lex Fridman (01:45:03) What’s the life trajectory that leads to the kind of person that would work with you?

Neri Oxman (01:45:07) It’s people who have ideally had industry experience and know what it’s like to be in the quote unquote real world. They’re dreamers that are addicted to reality as opposed to realists that are addicted to dreams, meaning they have that innocence in them, they have the hunger, they have the idealism without being entitled and with understanding the systems that govern our world and understanding how to utilize these systems as Trojan horses to bring those values into the world. There are individuals who feel comfortable in this friction between highly wondrous and dreamy and incredible fantasy renditions of what the world could be and extremely brilliant skills in terms of their disciplinary background. PhD with industrial experience in a certain field or a double major in two fields that make no sense whatsoever in their combination.

Neri Oxman (01:46:17) Are things that really, really attract me.

Lex Fridman (01:46:19) Especially the span, the technology biology gap.

Neri Oxman (01:46:24) Yes. Technology, biology, nature, culture. I mean, the secret to one thing is through the lens of another. And I always believe in that kind of translational design, the ability to see something through the lens of another, and it always allows you to think again, begin again, reestablish, redefine, suspend your disbelief, revisit. And when you revisit enough times, like a hundred times or 200 times, and you revisit the same question through the lens of any possible discipline and any possible scenario, eventually you get to the truth.

Extinction

Lex Fridman (01:46:59) I have to ask you, because you work at the interplay of the machine and the natural world, is there a good definition for you of what is life? What is a living organism?

Neri Oxman (01:47:15) I think 440 million years ago, there were all these plants, the cyanobacteria I believe actually. That was the first extinction. There were five extinctions. We are apparently the sixth. We are in the eye of the storm. We are in the sixth extinction. We are going to be extinct as we speak. I mean, death is upon us whether we want to admit it or not.

(01:47:42) And actually they found in Argentina and in various places around the world, they found these spores of the first plants that existed on the planet. And they emerged out of these … Cyanobacteria were the first of course, and then they found these spore-based plants. And because they didn’t have seeds, there were only spores. The spores became sort of the fossils by which we’ve come to know of their existence. And because of these spores, we know that this first extinction existed.

(01:48:18) But this extinction is actually what enabled plants to resurrect. The death of these first plants, because they clung to the rocks, generated a ton of phosphorus that went into the ocean, 60 times more phosphorus than without them. And then all this phosphorus basically choked the oceans and made them super cold and without oxygen, anoxic. And then we lost the plant kingdom, and then because of the death of these first plants, they actually enriched the soil and created nutrients for these new plants to come to the planet. And those plants had more sophisticated vein systems and they were moving beyond spores to seeded plants, et cetera, and flowering plants. And so in a way, one mass extinction, in the Devonian period, led to life as we know it. And where would we be without plants in a way?

(01:49:31) I think that death is very much part of life and through that definition, that kind of planetary wide definition in the context of hundreds of millions of years, life gains a completely new light. And that’s when the particles become a wave, where humans, we are not alone and we are here because of those plants. I think death is very much part of life. In the context of the redwood tree, perhaps life is defined as 10 generations. And through the lens of a bacteria, perhaps life is defined as a millisecond. And perhaps through the lens of an AGI, life is defined as all of human civilization. And so I think it really is a question of this timescale again, the timescale and the organism, the life form that’s asking the question through which we can answer, what is life?

Lex Fridman (01:50:36) What do you think about this? If we think of ourselves in the eye of the storm of another extinction, the natural question to ask here is you have all of nature and then you have this new human creation that is currently being termed artificial intelligence. How does your work play with the possibility of a future super intelligent ecosystem, an AGI that either joins or supersedes humans?

Neri Oxman (01:51:13) I’m glad you asked this question.

Lex Fridman (01:51:15) And are you hopeful or terrified?

Neri Oxman (01:51:17) Both. I’m hopeful and terrified. I did watch your interview with Eliezer Yudkowsky and I loved it

Lex Fridman (01:51:25) Because you were scared or because you were excited or because there was a [inaudible 01:51:29]?

Neri Oxman (01:51:28) First of all, I was both. Totally scared, shamed, excited, and totally also inspired because he’s just such an incredible thinker. And I can agree or disagree with what he says, but I just found his way of thinking about AGI and the perils of humanity as a result.

Lex Fridman (01:51:53) There’s an inevitability to what he’s saying. His advice to young people is to prepare for a short life. He thinks it’s almost simple, almost common sense, that AGI would get rid of humans, that he can’t imagine a trajectory that eventually leads to a place where AGI doesn’t kill all humans. There are just too many trajectories where a superintelligent system gets rid of humans, and in the near term. And so that clarity of thinking is very sobering. To me, maybe it is to you as well, it’s super inspiring because I think he’s wrong, but it’s like you almost want to prove him wrong. It’s like, “No, we humans are a clever bunch. We’re going to find a way.”

Neri Oxman (01:52:48) It is a bit like jumping into super cold water. It’s sort of a kind of fist in your face. It wakes you up. And I like these moments so much, and he was able to bring that moment to life, even though I think a mother can never think that way ever. And it’s a little bit like that notion of I love her more than evolution requires.

(01:53:14) On your question about AGI and nature, look, I think we’ve been through a lot in terms of to get here, we sort of moved from data, the ability to collect information to knowledge, the ability to use this information for utility, from knowledge to intelligence. And what is intelligence? It’s the ability to problem solve and adapt and translate. That’s sort of from data to information to knowledge. I think the next frontier is wisdom. And what is wisdom? Wisdom is the ability to have or find insight about the world and from wisdom to spiritual awareness, which sort of transcends wisdom and is able to chart the world into new territory.

(01:53:58) But I think what is interesting about AGI is that it is sort of almost like a self recursive thing, because it’s like a washing machine of a third derivative Wikipedia. It uses kind of language to create language, to create language, to create language.

Lex Fridman (01:54:15) It feels like novelty is being constantly created. It doesn’t feel like it’s regurgitating.

Neri Oxman (01:54:20) And that’s so fascinating because these are not the stochastic parrots. This is sort of a new form of emergence perhaps of novelty as you say, that exists by virtue of using old things to create new things.

(01:54:38) But it’s not as if the AGI has self-awareness. Maybe. Maybe it has, but as far as I can tell, it’s not as if AGI has approached consciousness or sentience just yet. It’s probably getting there. But the language appears to present itself as if there is sentience there, but it doesn’t. But I think that’s the problem at the point where this AGI sounds like me and speaks like me and behaves like me and feels like me and breathes like me and my daughter knows the AGI to be me as sort of the end of everything is the end of human agency.

(01:55:23) But what is the end of human agency to humans I think is the beginning of agency to nature. Because if you take all of this agency, if you take all of these language models that can summarize all of human civilization and consciousness and then upload that to nature and have nature now deal with that world of consciousness that it never had access to.

(01:55:49) Maybe through Eliezer’s lens, the sort of short-lived human becomes sort of a very long-lived humanlike, sentient, weeping willow. Maybe that’s the end in the beginning. And maybe on the more optimistic side for us humans, it’s a different form of existence where everything we create and everything we consume and everything we process is all made out of six elements and that’s it. And there’s only those six elements and not 118 elements. And it’s all the stuff of biology plus some fair amount of bits, genes, and atoms. A lot of Beethoven.

Lex Fridman (01:56:44) A lot of Beethoven. I think the idea of connecting AGI to nature through your work is really fascinating. Sort of unlocking this incredible machinery of intelligence that is AGI and connecting it to the incredible machinery of wisdom that nature has evolved through billions of years of pretty crazy, intense evolution.

Neri Oxman (01:57:15) Exactly. Again, I’m going back to directed evolution. Unlike this sort of high-throughput brute force approach, if there is a way to utilize this synergy for diversity and diversification, what happens if you ask ChatGPT a question, but it takes 10,000 years to answer that question? What does that look like when you completely switch the timescale and you can afford the time to answer the question? And again, I don’t know, but that world to me is possibly amazing.

Alien life

Lex Fridman (01:58:10) Because when we start to think about timescales like this, just looking at earth, all the possible trajectories it might take of this living organism that is earth, do you think there’s others like it? Do you think there’s other planets with life forms on them that are just doing their thing in this kind of way?

Lex Fridman (01:58:27) Because in what you’re doing, you’re directly playing with what’s possible with life, lifelike things. That kind of maps the question of, well, what kind of other things are possible elsewhere? Do you think there’s other worlds full of life, full of alien life out there?

Neri Oxman (01:58:50) I’ve studied the calculations that point towards the verdict that the possibility of life in and around us is very, very low. We are a chosen planet in a way. There’s water and there’s love. What else do you need? And that sort of very peculiar juxtaposition of conditions, the oxygen, the water, the carbon again, is in a way a miracle given the massive extinctions that we’ve been through as life forms.

(01:59:33) And that said, I cannot believe that there is no other life form. I want to believe more than I know that yes, that there are life forms in the white fountain that is the black hole, that there are these life forms that are light years away from us, that are forming other forms of life forces.

Lex Fridman (02:00:05) I’m much more worried about probably the thing that you’re working on, which is that there’s all kinds of life around us that we’re not communicating with.

Lex Fridman (02:00:18) That there’s aliens in a sense all around us that we’re not seeing, that we’re not talking to, that we’re not communicating. Because that to me just seems the more likely situation.

Lex Fridman (02:00:31) That they’re here, they’re all around us in different forms, that there’s a thing that connects all of us, all of living beings across the universe, and we’re just beginning to understand any of it. And I feel like that’s the important problem is I feel like you can get there with the tools of science today by just studying life on earth. Unlock some really fundamental things that maybe you can start to answer questions about what is consciousness? Maybe this thing that we’ve been saying about love, but honestly, in a serious way. And then you’ll start to understand that there is alien life all out there, and it’s much more complicated and interesting than we kind of realize as opposed to looking to exactly human-like things. It’s the variety of life that’s possible is just almost endless.

Neri Oxman (02:01:28) I totally agree with you. I think again, define alien, right?

Lex Fridman (02:01:36) Yeah. Define intelligence, define life.

Neri Oxman (02:01:39) Right. And Marvin Minsky used to say, “Intelligence is a suitcase word.” It’s a word so big. It’s a word like sustainability, and it’s a word like rock and roll. And suitcase words are always very, very dangerous.

Music

Lex Fridman (02:01:55) Speaking of rock and roll, you’ve mentioned music and you mentioned Beethoven a bunch of times. You’ve also tweeted about an Evgeny Kissin performance and so on. What can you say about the role of music in your life?

Neri Oxman (02:02:09) I love music. I always wondered why is it that plastic arts, meaning architecture and sculpture and painting, can’t get us to cry and music gets us to cry so quickly and connect so quickly? And no wonder that plants also respond to music, but that is at the top of the creative pyramid in my opinion.

Lex Fridman (02:02:33) It’s a weird mystery that we’re so connected to music. Well, by the way, to push back, a good bridge will make me cry.

Neri Oxman (02:02:41) It’s true. And I will say when I visited the Sagrada Família, I had that kind of spiritual reverence towards that spatial experience and being in that space and feeling the intention and the space and appreciating every little gesture. It’s true. It is the universal language. It’s the language of waves. It’s the language of the waves, not the language of the particles. It is the universal language, I believe, and that is definitely one of my loves.

Movies

Lex Fridman (02:03:16) And you said that if you weren’t doing what you were doing now, perhaps you would be a film director. I have to ask, what do you think is the best film of all time? Maybe top three?

Neri Oxman (02:03:30) Maybe The Godfather.

Neri Oxman (02:03:34) The Godfather is definitely up there. Francis Coppola is one of my heroes.

Neri Oxman (02:03:40) I have met him, yes. Yes, yes. We were very lucky to work with him on his new film, Megalopolis, which is coming out, I hope, in 2024, and to think about the cities of the future in the context of new materials and the unity between nature and culture. The Godfather is definitely up there.

(02:04:02) 2001 is up there. I would watch that film again and again and again. It’s incredible. The last scene in 2001: A Space Odyssey. Just watch the last scene of 2001, then listen to Yudkowsky, and then go to the garden. And that’s pretty much the end in the beginning.

(02:04:27) But that scene, that last scene from 2001 is everything. It says so much with so little and it’s sort of the embodiment, I believe, of ambivalence. And there’s opportunity to believe in the beginning of humankind, the end of humankind, the planet, the child star or star child of the future. Was there a death? Was there a reincarnation? That final scene to me is something that I go back to and study, and every time there is a different reading of that scene that inspires me. That scene, and then the first scene in The Godfather, still one of the best scenes of all time, sort of a portrait of America, the ideals and values that are brought from Italy.

Lex Fridman (02:05:23) A family of loyalty.

Lex Fridman (02:05:26) Of values of how different values are constructed.

Neri Oxman (02:05:29) Yes. Loyalty and the human spirit and how Coppola celebrates the human spirit through the most simple gestures in language and acting. And I think in Kubrick you see this highly curated and controlled and manicured vision of creating a film. And with Francis, it’s like an Italian feast. It’s like anything can happen at any moment in time. And just being on the set with him is an experience I’ll take with me to my grave. It’s very, very, very special.

Lex Fridman (02:06:12) And you said music is also part of that, of creating a feeling in the movies?

Neri Oxman (02:06:13) Yeah, actually The Godfather, that tune-

Lex Fridman (02:06:21) That makes me emotional every time on some weird level.

Neri Oxman (02:06:25) Yeah. It’s one of these tunes, I’m sure, that if you play it to a jasmine, you’ll get the best scent of all time. But I think with that particular tune, I learned staccato as something very, very happy and joyous. And then it was stretched in time and became kind of the refrain of nostalgia and melancholy and loyalty and all of these values that ride on top of this one single tune.

Lex Fridman (02:07:05) And you can play it in all kinds of different ways. I’ve played it on guitar and all kinds of different ways. And I think in Godfather III, the son plays it on guitar to the father. I think this happens in movies, but sometimes a melody, and it has a simple melody, you can just like-

Neri Oxman (02:07:22) And the Strauss melody in 2001. And when you juxtapose these melodies with this scene, you get this, again, whole that’s bigger than the sum of its parts, where you get this moment, I think. These are the moments I would send with the next Voyager to outer space. The Godfather and 2001 would definitely be on that golden record.

Advice for young people

Lex Fridman (02:07:54) You are an incredibly successful scientist, engineer, architect, artist, designer. You’ve mentored a lot of successful people. Can you give advice to young people listening to this of how to have a successful career and how to have a successful life?

Neri Oxman (02:08:14) Look, I think there’s this beautiful line in The Sheltering Sky. How many times have you seen a full moon in your life and actually took the time to ingest and explore and reflect upon the full moon? Probably 20, I believe he says.

(02:08:35) I spend time with a full moon. I take my time with a full moon and I pay attention to a full moon. And I think paying attention to the seasons and taking time to appreciate the little things, the simple things is what makes a meaningful life. I was very lucky to have grown up in a home that taught me this way of being. My parents, my grandmother, who played a very important role in my growing up. And that ability to pay attention and to be present is so, so, so, so … I could not emphasize it enough, is so crucial.

Neri Oxman (02:09:40) And be grateful. I think gratitude and presence, appreciation are really the most important things in life.

Lex Fridman (02:09:53) If you could take a short tangent about your grandmother who’s played a big role in your life, what do you remember? What lessons have you learned from her?

Neri Oxman (02:10:05) She had this blanket that she would give me every time I came back from school and say, “Do your homework here and meet with your friends here.” And it was always in her garden. And her garden in my mind was ginormous. But when last I went there and saw the site, which has now become the site for another tall building, it was a tiny, tiny little garden that to me, seemed so large when I was growing up because it had everything. It had fig trees, it had olive trees, it had mushrooms, it had the blanket. I would do my homework there. It was everything. And I needed nothing else. And that was my Garden of Eden. That was my childhood being.

(02:10:53) And we would lie on the blanket and look at the clouds and reflect upon the shapes of the clouds and study the shapes of the plants, and there was a lot of wonder in that childhood with her. And she taught me the importance of wonder in an eternal childhood and living adulthood as a child. And so I am very, very grateful for that. I think it is the sense of wonder, the speaking up was always something that she adhered to, to speak up your truth, to be straightforward, to be positive.

(02:11:42) These are things that I also got from my mom. And from my mom, the sense of humor. She had the best sense of humor that I could think of and was just a joy to be around. And my father taught me everything. My father taught me everything I know. My mom taught me everything I feel.

Lex Fridman (02:12:02) That’s a good way to put it.

Neri Oxman (02:12:02) My grandma taught me everything I insight.

Lex Fridman (02:12:08) Well, I see the sense of wonder that just carries through everything you do. So I think you make your grandmother proud.

(02:12:17) Well, what about advice for how to have a career? You’ve had a very interesting career and a successful career, but not an easy one. You took a few leaps.

Neri Oxman (02:12:29) I did take a few leaps and they were uncomfortable. And I’ll never forget, I think we were listening to a Rolling Stones song in the kitchen, and my dad was actually born in Boston. He’s American. I had started to have sort of these second thoughts about continuing my education in Israel, and I was on my way to London, to the Architectural Association, to do my diploma studies there. And he looked at me and he said, “Get out of here, kiddo. You’ve got to get out of here. You’ve outgrown where you’re at. You need to move forward.”

(02:13:16) Another thing he had taught me, the feeling of discomfort. As you say, the feeling of loneliness and discomfort is imperative to growth. Growth is painful. Period. Any form of growth is difficult and painful. Birth is difficult and painful, and it is really, really important to place yourself in situations of discomfort. I like to be in a room where everyone in the room is more intelligent than me. I like to be in that kind of state where the people that I surround myself with are orders of magnitude more intelligent than I am. And I can say that that is true of all of my team members, and that’s the intellectual discomfort that I feed off of. The same is true for physical exertion. You got to put yourself in these uncomfortable situations in order to grow, in order to find comfort.

(02:14:19) And then on the other hand is love, is finding love and finding this other human that complements you and that makes you a better version of the one you are and even of the one you want to be. But with gratitude and attention and love, you can go so, so far.

(02:14:51) To the younger generation, I don’t speak of a career. I never thought of my work as my career, ever. And there was this constant entanglement between life and work and love and longing and being and mothering. It’s all the same. And I appreciate that for some people that doesn’t work in their arrangement of will versus comfort versus the reality. But for me, it has always worked. I think to the younger generation, I say, don’t think of your career. A career is something that is imposed upon you. Think of your calling. That’s something that innately and directionally moves you, and it’s something that transcends a career.

(02:15:47) Similarly, you can think about the difference between learning versus being educated. Being educated is something that’s given to you that’s external, that’s being imposed, that’s top down imposed, whereas learning is something that comes from within. It’s also the difference between joy and happiness. Many times I’m sad and I’m still joyous. And it’s very, very important to understand the difference between these externally perceived success paths and internally driven value-based ways of being in the world.

(02:16:22) And together, when we combine the broken puzzle, let’s say, of substance and vulnerability, we get this bigger gestalt, this wondrous world of a future that is peaceful, that is wholesome, and that proposes or advocates for that kind of synergy that we’ve been talking about throughout. But it’s all fun.

Lex Fridman (02:17:01) Well, thank you for this incredible conversation. Thank you for all the work you’re doing.

Lex Fridman (02:17:06) And I just have to say thank you for noticing me and listening to me. From just today and from our exchanges before this, there’s a sense that you care about me as a human being, and I can tell you care about other humans. Thank you for doing that. Thank you for having empathy and just really listening and noticing that I exist. Thank you for that. I’ve been a huge fan of your work, a huge fan of who you are as a human being. It’s just an honor that you would sit with me. Thank you.

Neri Oxman (02:17:40) Thank you so much, Lex. I feel the same way. I’ll just say the same.

Lex Fridman (02:17:46) And I look forward to hearing the response to my job application that I’ve submitted.

Neri Oxman (02:17:50) Oh, you’re accepted.

Lex Fridman (02:17:51) Oh, damn. All right, excellent.

Neri Oxman (02:17:53) We all speak of you all the time.

Lex Fridman (02:17:55) Thank you so much.

Lex Fridman (02:17:56) Thank you, Neri. Thank you.

(02:17:58) Thanks for listening to this conversation with Neri Oxman. To support this podcast, please check out our sponsors in the description. And now let me leave you with some words from Leo Tolstoy, “Everything I know, I know because of love.” Thank you for listening. I hope to see you next time.