Karpathy's Auto Research Takes the Internet by Storm: 10 Business Ideas and a Hands-On Guide

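Video information

Field	Details
Video title	Karpathy’s ‘autoresearch’ broke the internet
Video ID	qb90PPbAWz4
Channel	Greg Isenberg
Duration	24:21
Published	2026-03-11
Topic	Andrej Karpathy releases Auto Research, a tool that runs an automated AI research and optimization loop, plus 10 business ideas built on it
Keywords	Auto Research, Karpathy, AgentHub, AI agent, GPU, automated experiments, startups, SaaS, A/B testing, CRM
Video link	https://www.youtube.com/watch?v=qb90PPbAWz4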

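Introduction

Andrej Karpathy, I mean, one of the godfathers of AI has just launched something called Auto Research. And Auto Research is a huge deal and it’s going viral on Twitter.

Andrej Karpathy, one of the godfather figures of AI, has released an open-source project called Auto Research. It went viral on Twitter almost immediately, and its GitHub repository has already passed 25,000 stars. Shopify CEO Tobi Lütke has also said publicly that it works "even better for optimizing any piece of software."

In this solo podcast episode, Greg Isenberg explains in plain terms what Auto Research is, how to use it, and how to make money with it, and he lays out 10 business ideas you can act on. If you are a founder, an indie developer, or anyone who wants to use AI to work more efficiently, this guide takes you from the concept to hands-on use.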

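Part 1: What exactly is Auto Research?

1.1 In one sentence

It’s like having a super nerd robot intern that runs science experiments on AI models for you all night without you doing the boring stuff.

Auto Research is like a super-nerd robot intern that runs science experiments on AI models for you all night, so you never have to do the tedious grunt work yourself.

1.2 The workflow, step by step

At its core, Auto Research is an automated experiment loop with the following steps:

Set the goal: you tell it "make this small AI model smarter." That is the goal.
Agent planning: the agent plans the experiments on its own, including different settings and code changes.
Edit code: the agent edits the Python code itself.
Run the experiment: a short training run of about 5 minutes on a GPU.
Read the results: analyze the experiment's metrics.
Decide: if the result is better, keep the configuration; if not, discard it and plan the next experiment.
Repeat the loop.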
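
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Auto Research's actual code: run_experiment stands in for the short GPU training run, propose stands in for the agent's planning and code-editing steps, and the toy objective (a best learning rate of 0.1) is invented for the example.

```python
import random

def run_experiment(config):
    """Stand-in for a short training run; returns a score (higher is better).

    Hypothetical toy objective: the score peaks at a learning rate of 0.1.
    """
    return -abs(config["lr"] - 0.1)

def propose(config, rng):
    """Stand-in for the agent's planning step: perturb the current config."""
    factor = rng.choice([0.5, 0.8, 1.25, 2.0])
    return {"lr": max(1e-4, config["lr"] * factor)}

def experiment_loop(start, rounds, seed=0):
    """Run `rounds` experiments and keep a candidate only if it scores better."""
    rng = random.Random(seed)
    best, best_score = start, run_experiment(start)
    for _ in range(rounds):
        candidate = propose(best, rng)
        score = run_experiment(candidate)
        if score > best_score:
            # Result improved: keep this configuration.
            best, best_score = candidate, score
        # Otherwise the candidate is discarded and the next one is planned.
    return best, best_score

best, score = experiment_loop({"lr": 1.0}, rounds=50)
print(best, score)
```

Because a candidate replaces the current best only when its score is strictly higher, the final score can never be worse than the starting point, which is the "keep the winners" property the loop relies on.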

You wake up, you grab the best version, and then hopefully you turn it into something you charge for.

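You sleep, it works. When you wake up, you take the best version and turn it into a product you can charge for.

1.3 You have to tell the AI what "better" means

This is the key point: you must define what "better" means.

Cheaper leads (lower customer acquisition cost)
More clicks
Higher sales
A better model score

The AI keeps changing parameters and testing options, and keeps only the changes that improve the metric.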
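
One way to make "better" explicit is to record, for each metric, which direction counts as an improvement. The Objective class and the metric names below are hypothetical and exist only for this sketch; they are not part of Auto Research.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Objective:
    """A named metric plus the direction that counts as an improvement."""
    name: str
    higher_is_better: bool

    def improved(self, new, old):
        """Return True if `new` is strictly better than `old` for this metric."""
        return new > old if self.higher_is_better else new < old

# Hypothetical metrics matching the examples in the text.
cost_per_lead = Objective("cost_per_lead", higher_is_better=False)
click_rate = Objective("click_rate", higher_is_better=True)

print(cost_per_lead.improved(new=4.20, old=5.00))  # True: leads got cheaper
print(click_rate.improved(new=0.031, old=0.035))   # False: clicks went down
```

Stating the direction up front avoids the easy mistake of "maximizing" a cost metric.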

1.4 An endorsement from Shopify's CEO

Auto research works even better for optimizing any piece of software. Make an auto folder, add a program.md, and a bench script, make a branch and let it rip.

Shopify CEO Tobi Lütke says Auto Research works even better for optimizing any piece of software. The setup is simple: make a folder, add a program.md file (a task description in Markdown), write a benchmark script, create a branch, and let it run.
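
The source does not say what a bench script has to look like, so the following is only a guess at the general shape: a script that runs the code being optimized and prints one number an agent could compare between runs. The function_under_test workload and the best_seconds output label are assumptions made for this sketch.

```python
import time

def function_under_test(n):
    """Hypothetical workload to optimize; replace with your own code."""
    return sum(i * i for i in range(n))

def bench(repeats=5, n=100_000):
    """Return the best wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        function_under_test(n)
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean.
    return min(timings)

if __name__ == "__main__":
    # One number, lower is better.
    print(f"best_seconds={bench():.6f}")
```

Whatever the metric is, the useful property is that the script is deterministic enough for two runs to be compared fairly.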

That’s why I started paying attention to auto research, right? When Andrej Karpathy, legend, and Tobi and more people start playing with it, I’m like, okay, I got to pay attention.

Greg's point: when people of Karpathy's and Tobi's caliber are playing with a tool, you know you have to pay attention.

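Part 2: A mental model: treat Auto Research as a research boss you can order around

2.1 A simple mental model

Imagine you have a research boss you can boss around.

Think of Auto Research as a research manager you can direct however you like. It takes three steps.

Step 1: Write the task clearly
Code experiments: "Improve this model's test score."
Business research: "Find the top five competitors of product XYZ and write a briefing."

Step 2: Give it resources
Code and a GPU (for ML experiments)
Internet access and documents (for research tasks)

Step 3: Go do something else and come back later for the results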

You could be 12 hours, 20 hours, 6 hours, and you see if it’s logged everything, charts and metrics, and then it gives you a written summary in normal language.

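Come back after 6, 12, or 20 hours and everything has been logged: charts, metrics, and a written summary in plain language.

2.2 In one sentence

Think of auto research as a research bot that runs experiments for you, tries lots of ideas fast, and keeps the winners.

Auto Research is an experiment bot: it tries lots of ideas quickly for you and keeps only the winners.

Part 3: 10 business ideas for making money with Auto Research

Idea 1: "Agent in a box" products for vertical niches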

You package tiny auto research loops tuned for one painful niche.

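Package an Auto Research loop into a product aimed at one painful problem in a specific vertical:

Amazon listing experimenter: automatically optimizes product titles, descriptions, and keywords
Email sequence tuner for real estate agents: automatically tests combinations of email templates
SaaS pricing optimizer: automatically searches for the best price point

Business model: a monthly subscription. The value proposition is "this tool runs experiments for you 24/7, and all you do is click to accept the best option."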

The hard part is figuring out what the pain points are and then obviously you want to be quick to market.

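The hard part is finding a real pain point and then getting to market quickly.

Flow: pick a painful vertical → design a small Auto Research loop → run experiments automatically → find the best option → package it as a simple agent product → charge monthly

Idea 2: AI-driven A/B testing, the money-printer model

The agent writes variants of headlines, layouts, and offers, pushing them to traffic, measures which one converts better, and keeps iterating.

Apply Auto Research to conversion-rate optimization for landing pages and ads:

Landing page optimization: the agent writes variants of headlines, layouts, and offers, sends traffic to them, and keeps the version that converts best
Ad optimization: it tests different creative angles and audience targeting, and keeps the combinations that lower CAC (customer acquisition cost) or raise ROAS (return on ad spend)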
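
"Keeps the version that converts best" needs a rule for telling a real lift from noise. The video does not describe one, so the sketch below swaps in a standard two-proportion z-test as one plausible choice; the traffic numbers and the 1.96 threshold (roughly the conventional 95% two-sided level) are assumptions for illustration.

```python
import math

def conversion_z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-score for variant B measured against variant A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (every visitor converted, or none did).
        return 0.0
    return (p_b - p_a) / se

def keep_variant(conv_a, n_a, conv_b, n_b, threshold=1.96):
    """Keep B only if it beats A by more than the noise threshold."""
    return conversion_z_score(conv_a, n_a, conv_b, n_b) > threshold

# Hypothetical traffic: 2,000 visitors per variant.
print(keep_variant(100, 2000, 140, 2000))  # True: 5.0% vs 7.0%
print(keep_variant(100, 2000, 105, 2000))  # False: 5.0% vs 5.25%
```

An agent that generates hundreds of variants will see some of them win by chance, so a threshold like this (or a stricter one) matters more as the number of experiments grows.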

Think of tools like Optimizely. This is the future of that. Auto research for different landing pages.

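Remember Optimizely, the A/B testing tool that was once all the rage in Silicon Valley? Auto Research is the next evolution of it.

Two ways to make money:
1. Use it yourself to optimize your own product's landing pages and ads
2. Turn it into a service and sell it to clients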

For 5K a month, I’m going to give you the best landing pages every single month and it’s just going to come to your inbox.

For $5,000 a month, the client receives the best-performing landing pages every month, delivered straight to their inbox.

Idea 3: Research as a Service

Auto research’s recipe is basically a loop for doing research. You’re searching, reading, summarizing, comparing, and repeating.

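At its core, Auto Research is a research loop: search, read, summarize, compare, repeat. Point that capability at questions people will pay to have answered:

Market and competitor research: continuously updated competitor reports for startups, covering who is doing what, pricing strategy, feature differences, and gaps in the market
Investment and M&A due diligence: quickly generated technical and market due-diligence summaries
Compliance and regulatory tracking: ongoing monitoring of regulatory changes in industries such as crypto, healthcare, and finance

Pricing: per report (one-off) or a monthly subscription (a continuously updated dashboard)

Idea 4: Embed an "Optimize" button in an existing product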

If you already have built a SaaS or workflow, embed an auto research style agent so your users can press optimize.

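If you already have a SaaS product or workflow tool, you can embed an Auto Research-style agent so that users can optimize with a single button press:

Tune prompts
Find the best pricing
Rank suppliers by priority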

I envision like a big button that just says optimize.

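Picture a big "Optimize" button: the user presses it and the system runs a mini research loop in the background.

Business model:
Charge for it as a premium feature of the Pro or Enterprise plan
Use it as the killer feature that upsells users to higher-tier plans

Idea 5: An optimization agency that "runs the most tests"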

Because auto research lets you run hundreds of experiments instead of a few, you have a simple pitch: “We do 100 times more testing than other shops for the same or lower fee.”

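Auto Research lets you run hundreds of experiments instead of a handful, so the sales pitch is very simple:

"We do 100 times more testing than other shops for the same or lower fee."

Niches:
Shopify store conversion lab
B2B SaaS pricing experiment service
Email subject line and sequence optimizer

Pricing: a fixed monthly retainer plus a bonus for KPI lift (performance fee)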

People love that. Of course they’re going to be interested in, “Yeah, if you can lift this KPI, we’ll give you some bonus.”

As soon as clients hear "lift the KPI and you get a bonus," they are interested.

Idea 6: Auto Quant, AI quantitative trading

You can use auto research to run small, fast backtests of many simple trading rules. LLM-based factor screens, sentiment filters on one GPU overnight.

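Use Auto Research to run backtests of many simple trading rules overnight on a single GPU:

LLM-based factor screens
Sentiment filters
Keep only the strategies that look promising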
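
To make "small, fast backtests of many simple trading rules" concrete, here is a toy sketch that tries several moving-average windows on synthetic prices and ranks them. Everything in it is an assumption for illustration: the random-walk price series, the rule itself, and the window sizes. It ignores fees and slippage, and it is not trading advice.

```python
import random

def synthetic_prices(n, seed=0):
    """Hypothetical price series: a random walk with 1% daily volatility."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1 + rng.gauss(0, 0.01)))
    return prices

def backtest_sma(prices, window):
    """Total return of holding the asset only on days after a close above its moving average."""
    equity = 1.0
    for t in range(window, len(prices) - 1):
        sma = sum(prices[t - window:t]) / window
        if prices[t] > sma:
            # The signal uses data up to day t only; we earn day t+1's return.
            equity *= prices[t + 1] / prices[t]
    return equity - 1.0

prices = synthetic_prices(500)
results = {w: backtest_sma(prices, w) for w in (5, 10, 20, 50)}
best_window = max(results, key=results.get)
print(best_window, round(results[best_window], 4))
```

Picking the best of many backtests on the same data overfits easily: on a pure random walk like this one, the "winner" is luck. That is one reason a promising result still needs human review and out-of-sample testing.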

Two ways to monetize:
1. Trade for yourself
2. Sell signals and strategy reports (digital products)

I think finance is changing a lot, and I think with things like auto research, it’s going to be an unfair advantage for a lot of people.

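Greg believes finance is changing dramatically and that Auto Research will give a lot of people an unfair advantage.

But mind the risk:

They’re just going to like give a bank account and just let auto research just trade for it. You need to have a human in the loop, and you need to manage that obviously accordingly.

Some people will certainly hand a bank account straight to Auto Research and let it trade, which is dangerous. A human has to stay in the loop; do not trust it blindly.

Idea 7: An always-on lead qualification and follow-up system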

Point an auto research style agent at your CRM, like Salesforce, and inbound leads. Let it test rules and messages to see which leads are most likely to buy.

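Connect an Auto Research-style agent to your CRM (such as Salesforce) and your inbound leads:

Automatically test different lead-scoring rules and follow-up messages
Automatically grade leads by how likely they are to close
Automatically draft follow-up emails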
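
Testing lead-scoring rules can be as simple as replaying each rule against past leads and checking how many of its top picks actually bought. The lead records, field names, and the two candidate rules below are invented for this sketch; a real system would pull them from the CRM.

```python
# Hypothetical historical leads: features plus whether the lead bought.
leads = [
    {"employees": 500, "visited_pricing": True, "bought": True},
    {"employees": 20, "visited_pricing": True, "bought": True},
    {"employees": 800, "visited_pricing": False, "bought": False},
    {"employees": 5, "visited_pricing": False, "bought": False},
    {"employees": 300, "visited_pricing": True, "bought": False},
    {"employees": 10, "visited_pricing": False, "bought": False},
]

# Candidate scoring rules: each maps a lead to a number (higher = hotter).
rules = {
    "company_size": lambda lead: lead["employees"],
    "pricing_visit": lambda lead: 1 if lead["visited_pricing"] else 0,
}

def precision_at_k(rule, leads, k):
    """Share of the k highest-scored leads that actually bought."""
    top = sorted(leads, key=rule, reverse=True)[:k]
    return sum(lead["bought"] for lead in top) / k

scores = {name: precision_at_k(rule, leads, k=3) for name, rule in rules.items()}
best_rule = max(scores, key=scores.get)
print(best_rule)  # pricing_visit
```

On this toy data the pricing-page rule puts two buyers in its top three and the company-size rule only one, so the loop would keep the former and try new variations from there.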

Result: salespeople focus only on high-value deals and bring in more revenue per hour.

Idea 8: Autopilot for corporate finance operations

Use the loop to grind through invoice matching, expense report generation, and exception detection with continuous small improvements.

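Use the Auto Research loop to handle:

Invoice matching
Expense report generation
Exception detection

The rules and prompts are refined through continuous small improvements.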
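
As an illustration of what a matching rule might look like before any tuning, the sketch below pairs invoices with payments by vendor and amount and flags whatever is left for a person to review. The record fields, IDs, and the 0.01 tolerance are assumptions; the tolerance is the kind of parameter a loop could adjust.

```python
def match_invoices(invoices, payments, tolerance=0.01):
    """Greedily pair each invoice with an unused payment from the same vendor
    whose amount is within `tolerance`; unmatched invoices become exceptions."""
    unused = list(payments)
    matched, exceptions = [], []
    for inv in invoices:
        hit = next(
            (p for p in unused
             if p["vendor"] == inv["vendor"]
             and abs(p["amount"] - inv["amount"]) <= tolerance),
            None,
        )
        if hit is None:
            exceptions.append(inv["id"])  # needs human review
        else:
            unused.remove(hit)  # each payment can settle only one invoice
            matched.append((inv["id"], hit["id"]))
    return matched, exceptions

# Hypothetical records.
invoices = [
    {"id": "INV-1", "vendor": "Acme", "amount": 120.00},
    {"id": "INV-2", "vendor": "Acme", "amount": 75.50},
    {"id": "INV-3", "vendor": "Globex", "amount": 310.00},
]
payments = [
    {"id": "PAY-1", "vendor": "Acme", "amount": 120.00},
    {"id": "PAY-2", "vendor": "Globex", "amount": 301.00},
]

matched, exceptions = match_invoices(invoices, payments)
print(matched)     # [('INV-1', 'PAY-1')]
print(exceptions)  # ['INV-2', 'INV-3']
```

The exception list is where the human stays in the loop: INV-3 looks like a possible typo (310 vs 301), which a person can confirm far more safely than a loosened tolerance can.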

Sales pitch: "We cut your accounts-payable and expense processing time in half."

I can totally see someone starting this and this gets acquired by one of the large fintech companies or one of the large banks.

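Greg says it outright: a company like this could well be acquired by a large fintech company or a large bank.

Suggested path: start as an operations service (ops service), then gradually evolve into a software product.

Idea 9: An in-house productivity lab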

Treat your company like Karpathy’s GPU lab. Define KPIs — response time, close rate, ticket resolution — and let agents iterate on workflows and templates and routing rules.

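Run your company like Karpathy's GPU lab:

Define key metrics (response time, close rate, ticket resolution speed)
Let agents keep iterating on workflows, templates, and routing rules
Cut down on meetings and manual grunt work
Handle only the high-impact decisions yourself

Result: higher productivity and profit, with the team focused on high-value work.

Idea 10: A done-for-you research and due-diligence studio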

You use the research loop to chew through docs, filings, product pages, and reviews and keep an evolving living memo for clients — investors, acquirers, execs.

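Use the Auto Research loop to chew through documents, filings, product pages, and reviews, and maintain a continuously updated "living memo" for clients (investors, acquirers, executives):

The investor or acquirer asks a question
Auto Research reads the documents and filings
It summarizes the product, market, and risks
It maintains a continuously updated memo
It delivers briefings and update packs

Pricing: a per-report fee plus a monthly fee for ongoing access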

I would pay for something like this. Hopefully someone builds it.

Greg says he would pay for a service like this himself and hopes someone builds it soon.

Part 4: After Auto Research: Karpathy's AgentHub

4.1 What is AgentHub?

What’s after auto research? It’s Karpathy’s new open source project AgentHub. GitHub is for humans, AgentHub is for agents.

After Auto Research, Karpathy released AgentHub, a GitHub designed for agents.

4.2 Core features of AgentHub

A bare git repo, a message board designed for a swarm of agents working on the same code base. Think of it like a stripped down GitHub where there’s no main branch, no PRs, no merges, a sprawling DAG of commits in every direction with a message board for agents to coordinate.

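A bare Git repository
A message board designed for a swarm of agents
No main branch, no PRs, no merges
A DAG (directed acyclic graph) of commits sprawling in every direction
Agents coordinate their work through the message board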
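
The structure described above, a commit graph with no main branch plus a shared message board, can be modeled with a small data structure. This is not AgentHub's API or storage format; CommitGraph and its methods are hypothetical names used only to show what "a DAG of commits in every direction" means in practice.

```python
from dataclasses import dataclass, field

@dataclass
class CommitGraph:
    """Toy model: commits point at their parents, and no branch is special."""
    parents: dict = field(default_factory=dict)   # commit id -> list of parent ids
    messages: list = field(default_factory=list)  # (agent, text) posts

    def commit(self, commit_id, parent_ids=()):
        if commit_id in self.parents:
            raise ValueError(f"commit {commit_id!r} already exists")
        for parent in parent_ids:
            if parent not in self.parents:
                raise KeyError(f"unknown parent {parent!r}")
        # Parents must already exist, so the graph can never contain a cycle.
        self.parents[commit_id] = list(parent_ids)

    def post(self, agent, text):
        self.messages.append((agent, text))

    def leaves(self):
        """Commits that nothing builds on yet: the current frontier."""
        has_child = {p for ps in self.parents.values() for p in ps}
        return sorted(c for c in self.parents if c not in has_child)

graph = CommitGraph()
graph.commit("root")
graph.commit("a1", ["root"])  # agent A explores one direction
graph.commit("b1", ["root"])  # agent B explores another
graph.commit("a2", ["a1"])
graph.post("agent-b", "b1 improves the metric; try building on it")
print(graph.leaves())  # ['a2', 'b1']
```

With no main branch there is no single "latest" commit, only a frontier of leaves, which is why the agents need the message board to tell each other which leaves are worth extending.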

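4.3 A key observation

I’m watching him speed run a one-man billion-dollar company.

Greg's take on Karpathy is that he is watching him speed run a one-man billion-dollar company. Auto Research is the first use case, but AgentHub's ambitions are much broader.

Part 5: How to get started with Auto Research

5.1 Hardware requirements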

You need an Nvidia GPU. It was tested on a H100, but other Nvidia GPUs should work.

Required: an Nvidia GPU (tested on an H100; other Nvidia GPUs should work)
Required: the uv package manager
A MacBook with an M1/M2 chip cannot run it directly
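
Before installing anything, a quick check can tell you whether the two prerequisites are visible on the machine. The sketch below only looks for the nvidia-smi and uv executables on PATH; finding nvidia-smi suggests an Nvidia driver is installed but does not prove the GPU is usable, and the function name is made up for this example.

```python
import shutil

def check_prerequisites(tools=("nvidia-smi", "uv")):
    """Report which command-line tools can be found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

for tool, found in check_prerequisites().items():
    print(f"{tool}: {'found' if found else 'missing'}")
```

Run it in the environment you plan to use, for example inside a Colab notebook cell after switching the runtime to a GPU.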

5.2 What if you don't have an Nvidia GPU?

Cloud GPU — you can rent an Nvidia GPU from a service like Lambda Labs, Vast AI, RunPod, or Google Colab. Some offer free tier with GPUs. This is the most straightforward path.

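Four platforms for renting cloud GPUs:

Platform	Notes
Lambda Labs	Dedicated GPU rental
Vast AI	Decentralized GPU marketplace
RunPod	On-demand GPU cloud
Google Colab	Has a free tier; the easiest way to get started

Greg personally chose Google Colab because Google is the provider he knows and trusts most.

5.3 The simplest way to get started (Google Colab)

Open colab.google.com
Create a new notebook
Switch the runtime to a T4 GPU
Paste and run the installation commands that Claude Code gives you, one at a time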

I just gave Claude Code the auto research GitHub repo link and was like, “I need help installing auto research by Karpathy.” And it told me everything I needed to do.

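Greg's trick: hand the Auto Research GitHub repository link to Claude Code and say "help me install Karpathy's Auto Research." It walks you through every step.

5.4 Installation in three steps

Install the uv package manager
Clone the Auto Research repository
Install the dependencies and prepare the data

Part 6: Imagining Auto Research in medicine

6.1 Morgan Linton's insight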

I woke up this morning and all I can think about is Auto Research. Right now, where my mind is going is medicine. It feels like in many ways clinical trial design is itself kind of like a hyperparameter search.

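Greg's friend Morgan Linton shared a bold idea: clinical trial design is, in essence, a lot like hyperparameter search.

A single clinical trial today can easily cost tens of millions of dollars. Suppose a swarm of agents optimized treatment protocols in small proxy experiments, picked the most promising candidates, and then handed them to humans for review:

Experiments could go deeper
They could move faster
Costs could drop substantially
Humans would still be in the loop, just stepping in later

6.2 Beyond business profit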

I think there’s a lot of really interesting, not just like business profit ideas, but also just like medicine, science, research.

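Auto Research's potential goes beyond business profit: it could have a far-reaching impact in medicine, science, and basic research.

Part 7: Finding opportunity in the fog

7.1 Greg's career insight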

One thing I’ve just learned in my career is just like when I see people like Karpathy doing things like this, you want to pay attention, you want to tinker with it, you want to have some fun with it, and you want to see what it’s all about.

In the fog, people don’t really understand where the opportunity is — that is when there’s sometimes an opportunity.

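Greg shares a lesson from his career: the moment when others still can't see where things are heading is often where the opportunity lies. When someone of Karpathy's stature is working on something, you should:

Pay attention to it
Tinker with it
Have some fun with it
Figure out what it is really about

It doesn't have to become a business, but what you learn along the way is enough to put you ahead of 99.9% of people.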

Even if they don’t turn into businesses, you will learn about these tools, and that is to help you outperform 99.9% of people on this planet.

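Quick reference: core concepts

Concept	Explanation
Auto Research	Karpathy's open-source tool for an automated AI experiment loop; it plans, runs, and evaluates experiments and keeps the best results
AgentHub	Another open-source project from Karpathy: a collaboration platform designed for agents, a kind of "GitHub for agents"
Experiment loop	Set goal → plan experiment → edit code → run training → read metrics → keep or discard → repeat
program.md	A task description file in Markdown that tells Auto Research what you want to optimize
Bench script	A benchmark script used to measure whether an experiment improved the result
DAG	Directed acyclic graph, the structure in which commits are organized in AgentHub
Human in the loop	The AI runs the experiments and the first-pass screening; a human makes the final decision
CAC	Customer acquisition cost
ROAS	Return on ad spend
uv	A fast Python package manager and a prerequisite for installing Auto Research
Google Colab	Google's cloud-hosted Jupyter notebook environment, with free GPU access
KPI	Key performance indicator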

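Practical tips

Start with Claude Code: don't work out the Auto Research installation on your own. Hand the GitHub repository link to Claude Code and let it guide you step by step.
Use a cloud GPU first: not having an Nvidia GPU is fine. Google Colab's free tier is enough to get started, and Lambda Labs and RunPod suit serious use.
Define "better" clearly: Auto Research needs you to tell it what counts as an improvement, whether that is lower cost, a higher conversion rate, or a better model score. The clearer the goal, the better the result.
Write a good program.md: following Shopify CEO Tobi's advice, create an auto folder, add program.md (the task description) and a bench script (the benchmark), then create a branch and let it run.
Start from the vertical you know best: of the 10 ideas, pick the one in the field whose pain points you understand best. Understanding the pain matters more than mastering the technology.
Service first, product later: for the finance-operations and research ideas in particular, validate the idea as an operations service (ops service) first and productize gradually.
Pay-for-results plus a bonus: charge a monthly fee and add a performance fee for hitting KPI targets. Clients accept it more readily and you have more incentive to deliver.
Keep a human in the loop: especially in high-risk fields such as financial trading and medicine, never let the AI make decisions fully on its own. Auto Research is a tool, not a replacement.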

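Common misconceptions

Misconception: Auto Research is only for training AI models → Wrong. Its core is an automated experiment loop, which applies to landing page optimization, pricing tests, email sequences, lead scoring, and any other optimization problem you can quantify.
Misconception: you need strong programming skills to use it → Wrong. Greg installed it with help from Claude Code, and so can you. What matters is a clear goal and a clear evaluation standard, not writing code.
Misconception: without an Nvidia GPU you can't use it → Wrong. Google Colab offers a free T4 GPU tier, and Lambda Labs, Vast AI, and RunPod all rent GPUs on demand.
Misconception: Auto Research is fully automatic, so you can set it and forget it → Wrong. You must define a clear goal, evaluation metrics, and what "better" means, or the AI has no direction to optimize in.
Misconception: this is a developer toy with no business relevance → Wrong. Even Shopify's CEO uses it to optimize software, and Greg lays out 10 business models that can make money directly.
Misconception: you can hand your trading account straight to Auto Research → Badly wrong. Greg warns explicitly that some people will get burned by trusting it blindly. In finance, a human must act as gatekeeper.
Misconception: A/B testing tools such as Optimizely are already good enough → Traditional A/B testing covers only a few variants at a time. Auto Research can run hundreds of experiments, which puts it in a different league for throughput.
Misconception: AgentHub is just a GitHub clone → Not at all. AgentHub has no main branch, no PRs, and no merges; it is a new paradigm designed for collaboration among a swarm of agents.
Misconception: it's too early, so wait until it matures → Greg argues the opposite: the fog, when others can't see clearly, is the biggest window of opportunity.
Misconception: one person can't pull off these ideas → Auto Research is itself a "one-person army" kind of tool. In Greg's words, Karpathy is speed running a one-man billion-dollar company.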

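Key takeaways

Auto Research is an open-source project released by Andrej Karpathy. Its core function is an automated experiment loop: once the goal is set, an AI agent plans, runs, and evaluates experiments and keeps only the results that improve.
Shopify CEO Tobi Lütke endorsed it publicly, saying Auto Research works "even better for optimizing any piece of software." The recipe is to create a program.md and a bench script, then let it run.
Running Auto Research requires an Nvidia GPU (it was tested on an H100). If you don't have one, cloud services such as Google Colab (free T4 GPU), Lambda Labs, Vast AI, and RunPod fill the gap.
Greg proposes 10 business ideas built on Auto Research: vertical agent products, A/B testing services, research as a service, optimization embedded in SaaS, an optimization agency, quantitative trading, CRM lead optimization, finance automation, an in-house productivity lab, and a due-diligence studio.
Auto Research is useful far beyond AI model optimization. It fits any scenario where "better" can be quantified, including landing page conversion rate, ad ROAS, email open rate, and pricing strategy.
Karpathy also released AgentHub, a collaboration platform designed for agents. It has no main branch and no PRs, and uses a DAG of commits plus a message board to let a swarm of agents coordinate. Auto Research is its first use case.
In high-risk settings such as financial trading, Auto Research must be used with a human in the loop. Greg warns explicitly that some people will take losses by blindly trusting AI to trade.
The simplest way to get started: hand the Auto Research GitHub link to Claude Code and let it guide you through installation and a first run on Google Colab.
Morgan Linton sees promise for Auto Research in medicine: clinical trial design is essentially similar to hyperparameter search, and a swarm of agents could substantially cut trial costs and speed up drug development.
Greg's core career advice: when someone of Karpathy's stature is working on something, don't wait for it to mature. Pay attention, get hands-on, and play with it now. The moment when you can't see clearly through the fog is often where the opportunity lies.
Even if these ideas never become businesses, learning to use the tools is enough to put you ahead of 99.9% of people.
The Auto Research GitHub repository has passed 25,000 stars. That pace of growth shows how much the developer community believes in and expects from this direction.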

Conclusion

One thing I’ve just learned in my career is just like when I see people like Karpathy doing things like this, you want to pay attention, you want to tinker with it, you want to have some fun with it, and you want to see what it’s all about.

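Auto Research is not just another AI tool gimmick. When both Karpathy and Shopify's CEO are using it, it represents a new way of working: you set the direction, the AI works through the possibilities for you, and at the end you only have to pick the best solution.

The essence of the tool is that it turns trial and error from one of the most painful human burdens into the overnight job that AI is best at. Whether you want to optimize a landing page, train a model, test a set of trading strategies, or do competitor research for a client, the Auto Research loop applies.

Is it too early? Maybe. But as Greg puts it, the fog, when others can't see where things are heading, is where the opportunity is. Rather than waiting until everything is clear, open Google Colab now, hand the link to Claude Code, and start your first experiment.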