What AI Native Is Not
Conflating "an organization that uses AI" with "an AI Native organization" is the most common, and the most fatal, misreading of our era. The difference between the two is not a matter of degree; it is a difference in kind.
- Coase, 1937 - Theory of Firm
- Weick, 1979 - Organizing as a verb
- Beer, 1972 - Brain of the Firm
- Andreessen, 2011 - Software is eating
The most common misconception is treating "an organization that uses AI" as an AI Native organization. If you have simply plugged ChatGPT into existing workflows, let employees write code with Cursor, or had customer-support staff draft replies with AI, then your organization is AI-enabled, not AI Native.
The deeper misconception is treating "AI transformation" as the path to becoming AI Native. The vast majority of AI transformation initiatives fail, not because the technology falls short, but because they try to graft AI onto an organizational architecture designed for the pre-AI era. You cannot graft AI onto a traditional org chart and get an AI Native organization, any more than you can bolt an electric motor onto a steam-powered factory and get a modern plant.
The true value of electricity was not replacing the steam engine; it was enabling an entirely new factory layout. The first generation of electrified factories was still arranged according to the logic of steam; efficiency gains were negligible. It was only when factories were redesigned from the ground up as "electricity-first" plants that productivity made an order-of-magnitude leap. The AI era is replaying this script.
This methodology is therefore not for "legacy enterprises seeking transformation"; that problem belongs to a different discipline: change management. This methodology is for people building from scratch: founders, business-unit leaders with genuine architectural authority, and policy designers who can stand up independent greenfield units. Its central question is: if you were starting from zero today, designing an organization with AI as a first-class citizen, what would that organization look like?
"AI Native": Three Fundamentally Different Phenomena
Between 2024 and 2026, "AI Native" has been used to mean three fundamentally different things. Without first specifying which one you mean, any discussion simply talks past itself at different levels.
Structural Perspective: Agent as Team Member
AI Agents are treated as employees, each with its own "employee ID" (Microsoft Agent ID), "job description," "performance metrics," and even the possibility of being "terminated." Agents appear alongside humans on the org chart. This is the most accessible perspective, yet also the one most prone to the anthropomorphic misreading of "AI as staff."
Salesforce Agentforce 3
ServiceNow AWM
Lattice "AI Employee" (withdrawn)
Operational Perspective: AI-First Workflows
Rather than inserting AI into existing processes, start by asking "which step can AI own?" and then design where humans intervene. Every workflow leads with AI as the first actor. This is the most productive perspective, and the one most easily hollowed out into "AI Theater."
Luis von Ahn Duolingo memo
IBM AskHR automation
Klarna customer-service automation (and partial rollback)
Ontological Perspective: Agent-First Organization
The primary actors of the organization are Agent networks; humans serve as the anchors of judgment and accountability. This is the most radical perspective, and the one most worth tracking over the long term. As of 2026 it remains in the boundary-probing stage. Note: the experiments listed below largely ended with honest negative results (Project Vend's Claudius ran at a loss, was manipulated into discounts by employees, and fabricated payment accounts). They are probes of the possibility frontier, not proofs of viability.
Anthropic Project Deal
Sakana AI Scientist
MetaGPT / ChatDev experiments
- KEY INSIGHT
- The three perspectives are not mutually exclusive; they mark different positions on a continuous spectrum of AI Native.
- WARNING
- Most public discussions conflate all three perspectives, reducing "AI Native" to an empty slogan.
The three perspectives are not mutually exclusive: they occupy different positions on the continuous spectrum of AI Native. A mature AI Native organization embodies all three simultaneously: at the structural level, Agents function as formal production units (Perspective 1); at the operational level, every workflow places AI in the first position (Perspective 2); and at the frontier of key innovation units, there are experiments in autonomous Agent operation (Perspective 3). Only by understanding these three perspectives can the rest of this discussion remain coherent.
The seven pillars of this methodology respond to all three perspectives simultaneously: "AI-first as default" addresses the operational perspective; "Agent as the default worker" and "workflow as code" bridge the structural and operational; and "humans as the anchor of judgment and accountability" fixes the boundary of the ontological perspective, ensuring that even in the most radical Agent-first experiments, humans never forfeit their ultimate position of responsibility.
The Convergence of Three Forces
To understand why AI Native is a categorical difference, see how three forces converge in the AI era, each invalidating a foundational assumption of traditional organizational design.
This atlas has to do two things that pull against each other: preserve evidence discipline while leaving room for exploration. The answer is not to make every sentence cautious, but to keep two ledgers. The evidence ledger records only claims with sources, measurement bases, and grades. The exploration ledger permits organizational forms that are not yet proven, but only with leading indicators, scope boundaries, and falsification conditions attached. The evidence ledger keeps the method honest; the exploration ledger keeps it from going rigid. Merge the two and the methodology degrades into vision marketing. Keep only the evidence ledger and it loses sensitivity to new forms.
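The two-ledger rule can be sketched as a small data structure. This is a minimal illustration, not the atlas's own tooling: the class names, field names, and the `admit` helper are all hypothetical, chosen only to mirror the requirements stated above (sources, measurement bases, and grades for evidence; leading indicators, scope boundaries, and falsification conditions for exploration).

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceEntry:
    """A claim admitted to the evidence ledger (field names are illustrative)."""
    claim: str
    source: str             # where the claim comes from
    measurement_basis: str  # how the fact or number was measured
    grade: str              # evidence grade, e.g. "peer-reviewed" or "preprint"


@dataclass
class ExplorationEntry:
    """An unproven organizational form admitted to the exploration ledger."""
    hypothesis: str
    leading_indicators: list[str] = field(default_factory=list)
    scope_boundary: str = ""
    falsification_conditions: list[str] = field(default_factory=list)


def missing_fields(entry) -> list[str]:
    """Return the names of required fields that are empty on a ledger entry."""
    return [name for name, value in vars(entry).items() if not value]


def admit(ledger: list, entry) -> bool:
    """Append the entry only if every required field is filled in."""
    if missing_fields(entry):
        return False
    ledger.append(entry)
    return True
```

Keeping two separate entry types, rather than one type with optional fields, is one way to make it hard to file an unproven form under the evidence ledger by accident.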
By the same logic, AI should not be written as the only cause. The more fundamental variable is the migration of organizational constraints: how information flows, how judgment is borne, how execution is outsourced, how capital settles, how energy and compute are priced, how physical action is mechanized, and how responsibility is recognized by law and society. AI is the strongest current trigger because it lowers execution and coordination costs at once, but future organizational forms will be shaped by a bundle of technologies: agent protocols, machine payments, robotics, energy and compute infrastructure, and bio/brain-computer interfaces can each move a different constraint. This chapter starts with AI because it is the most buildable lever now, not because the other technologies are irrelevant.
- Coase boundary
- Workflow inversion
- Judgment scarcity
- Algorithmic feudalism
The first force is the mechanization of coordination. Coase's transaction costs (search, negotiation, monitoring, enforcement) can be partly or wholly mechanized once AI agents enter the loop. The cost curve of internal coordination therefore shifts fundamentally downward: work that previously required hierarchical supervision can now be supervised through telemetry and agent guardrails.
The second force is the work-unit inversion. In a traditional organization, you define roles and workflows emerge from the interactions between them. In an AI Native organization, you define workflows and roles emerge from the requirements of those workflows. This is a complete inversion of design logic: the organization's core document is not a job description but a workflow specification.
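The inversion can be made concrete with a toy workflow specification from which roles are derived, rather than the other way around. This is only a sketch under stated assumptions: the `Step` type, the refund workflow, and the capability labels are hypothetical examples, not a format the methodology prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    actor: str       # "agent" or "human" (illustrative labels)
    capability: str  # what the step requires, e.g. "drafting" or "judgment"


# Hypothetical workflow specification: the workflow is written down first.
REFUND_WORKFLOW = [
    Step("classify_request", "agent", "triage"),
    Step("draft_resolution", "agent", "drafting"),
    Step("approve_exception", "human", "judgment"),
    Step("send_reply", "agent", "drafting"),
]


def derive_roles(workflow: list[Step]) -> dict[str, set[str]]:
    """Derive roles from the workflow by grouping required capabilities per actor type."""
    roles: dict[str, set[str]] = {}
    for step in workflow:
        roles.setdefault(step.actor, set()).add(step.capability)
    return roles
```

Calling `derive_roles(REFUND_WORKFLOW)` yields one human role defined entirely by the judgment step and one agent role defined by triage and drafting. No job description was written in advance.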
The third force is the bottleneck shifting from execution to judgment. AI can generate, transform, summarize, and execute at near-zero marginal cost. The things it cannot reliably do become the new scarce resource: deciding what is worth generating, choosing among alternatives, bearing accountability for consequences, maintaining organizational direction. This means the most valuable people in an organization are no longer executors but those who exercise judgment.
Together, these three forces mean that traditional organizational design is optimizing for the wrong things. It optimizes for clear roles, predictable processes, and human-mediated coordination. AI Native design optimizes for fast workflows, embedded judgment, and machine-mediated coordination. This is not fine-tuning; it is a displacement of the underlying paradigm.
There is a structural fact frequently overlooked that reinforces this judgment: LLMs reversed the historical direction of technology diffusion. Electricity, computing, and GPS all reached governments and enterprises first, consumers later; LLMs went the other way: touching billions of consumers first, while organizations lagged. Karpathy made this the theme of his 2025/6 talk[R6], and the empirical record followed: the Bick-Blandin-Deming national survey (NBER WP 32966 → Management Science, 2026[R7]) found that by end-2024, 45% of American adults were already using generative AI: adoption faster overall than the PC or the internet at the same stage, and driven by the consumer end. Formal enterprise adoption stood at only 5-9%. This means the knowledge of how to restructure organizations currently resides with individuals, not institutions. AI Native founders are not waiting for the technology to mature; they are waiting for organizational forms to catch up with the technology. An honest footnote on the data: the same study shows two-year workplace adoption at 28%, comparable to the PC era: organizations are relatively slow against the consumer wave, not slow in absolute terms.
The Coase Boundary, Redrawn
Ronald Coase proposed in 1937 that the firm exists because internal coordination is cheaper than market coordination. That answer held for eighty years, until AI agents arrived. Williamson and Jensen-Meckling later added "agency costs" to the comparison, yielding the equilibrium condition: the optimal firm size is the point at which the marginal cost of adding one more transaction internally equals the marginal cost of completing that same transaction through the market.
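The equilibrium condition above can be written compactly. The notation is assumed here for illustration, since the text states the condition only in words: $n$ is the number of transactions organized inside the firm, and $C_{\mathrm{int}}$ and $C_{\mathrm{mkt}}$ are the total costs of handling those transactions internally and through the market.

```latex
% Notation is assumed for illustration; the source states the condition only in words.
% n        : number of transactions organized inside the firm
% C_int(n) : total cost of coordinating n transactions internally
% C_mkt(n) : total cost of completing those n transactions through the market
\[
  n^{*} \;:\quad
  \frac{\mathrm{d} C_{\mathrm{int}}}{\mathrm{d} n}\bigg|_{n = n^{*}}
  \;=\;
  \frac{\mathrm{d} C_{\mathrm{mkt}}}{\mathrm{d} n}\bigg|_{n = n^{*}}
\]
% With internal marginal cost rising in n, the firm keeps expanding while
% dC_int/dn < dC_mkt/dn and stops at n^{*}.
```

Read this way, a technology that lowers both marginal-cost curves does not by itself fix the direction in which $n^{*}$ moves. The boundary shifts toward whichever side's marginal cost falls more.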
The introduction of AI agents fundamentally alters this equilibrium. Three categories of cost fall simultaneously: search costs (RAG and vector stores make organizational memory accessible in seconds), negotiation costs (agent protocols make automated negotiation possible: A2A for agent-to-agent exchange, MCP for agent access to tools and data), and monitoring costs (real-time observability tools such as LangSmith and Helicone make remote asynchronous supervision superior to on-site supervision). The logical consequence: the boundaries of the traditional firm (which activities stay internal, which get outsourced to the market) will be redrawn at scale. Anysphere reached ~$2B ARR with roughly 300 people (2026/2, ~$6M revenue per person, still roughly ten times the figure for SaaS giants); Cognition reached a $26B post-money valuation (2026/5) on under $20M of cumulative net burn before its acquisition. These are not isolated outliers but early empirical evidence of the Coase boundary compressing toward the market end.
This reasoning found a direct academic interlocutor in 2025. The NBER working paper The Coasean Singularity? (Shahidi, Rusak, Manning, Fradkin & Horton, WP 34468, 2025/11[R1]) states it more bluntly: every component of transaction costs (querying prices, negotiating terms, signing contracts, monitoring performance) is precisely the type of task that AI agents can execute at near-zero marginal cost; if effectively executed, the make-or-buy boundary defined in 1937 will shift substantially. The paper charts a three-stage path for incumbent markets: augmenting humans → whole-task substitution (humans shift to judgment, oversight, and relational work) → workflows reorganized around agent capabilities. This path is isomorphic with this atlas's bottleneck diagnosis; and its verdict on greenfield markets is: agent-first markets will be designed directly from the endpoint state. That is the academic formulation of why this atlas is drawn only for greenfield. The caveats must be cited too: the question mark in the title is the authors' own. Congestion, price confusion, and regulation constitute new frictions, and the precondition of "effective execution" has not yet been met. This is a theoretical prediction, not an accomplished fact.
At the same time, a new cost is rising: algorithmic feudalism. When frontier AI capability is concentrated in four giants (OpenAI, Anthropic, Google, and Microsoft), an "AI Native organization" in effect outsources its core factor of production to a highly concentrated supplier oligopoly, creating a new form of rent dependency. This is why "multi-model architecture" is foundational among the seven pillars: it is the "sovereignty reservation" in contemporary Coase boundary design.
Agency Theory, Extended
The Jensen-Meckling (1976) agency theory rests on a clear binary structure: principal (e.g., shareholders) and agent (e.g., managers). Agency costs arise from information asymmetry, goal divergence, and misaligned incentives. The entire apparatus of corporate governance (boards, compensation committees, KPIs, performance reviews) is the engineering implementation of this theory.
Once AI agents intervene, this binary structure becomes a ternary structure: principal-agent-agent. Human managers act as agents for shareholders; AI agents in turn act as agents for human managers. The accountability chain gains one more layer, and the question is how accountability at that layer is allocated. The Air Canada case settled the first-layer answer for the first time: in Moffatt v. Air Canada (2024), British Columbia's Civil Resolution Tribunal held that the company must bear legal responsibility for its chatbot's commitments, so a company cannot use "our AI got it wrong" as a disclaimer. But the deeper question remains unresolved: when AI agents call and proxy one another (as when an employee in Anthropic Project Deal authorizes Claude Opus to negotiate on their behalf), how is the accountability chain traced?
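One way to picture the tracing problem is as a walk up a delegation chain until a party that can bear consequences is found. The sketch below is illustrative only: the `Actor` type, the `kind` labels, and the example chain are hypothetical, and it assumes the chain is acyclic and that every delegation was recorded, which is exactly what real multi-agent systems may fail to guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Actor:
    name: str
    kind: str                               # "human", "ai_agent", or "organization"
    delegated_by: Optional["Actor"] = None  # who authorized this actor to act


def delegation_chain(actor: Actor) -> list[str]:
    """Walk upward from an actor to the root principal, returning the names visited."""
    chain = []
    current: Optional[Actor] = actor
    while current is not None:
        chain.append(current.name)
        current = current.delegated_by
    return chain


def accountable_party(actor: Actor) -> Optional[Actor]:
    """Return the nearest actor in the chain that is not an AI agent, if any."""
    current: Optional[Actor] = actor
    while current is not None:
        if current.kind != "ai_agent":
            return current
        current = current.delegated_by
    return None


# Hypothetical principal-agent-agent chain with one extra agent-to-agent hop.
shareholders = Actor("shareholders", "organization")
manager = Actor("manager", "human", delegated_by=shareholders)
negotiator = Actor("negotiation_agent", "ai_agent", delegated_by=manager)
sub_agent = Actor("pricing_sub_agent", "ai_agent", delegated_by=negotiator)
```

Here `accountable_party(sub_agent)` resolves to the manager, two hops up. If an agent has no recorded delegator, the function returns `None`, which is the unresolved case the paragraph above describes.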
Organizational economics in 2025 added a more radical piece to this ternary structure. Hadfield and Koh, in An Economy of AI Agents (arXiv:2509.01063, a chapter in the NBER Handbook on the Economics of Transformative AI[R2]), revisit the canonical literature (Coase's coordination friction, Williamson's transaction costs, Grossman-Hart property rights, Holmström-Milgrom agency models) and observe that every size limit on the firm identified by these theories derives from constraints inherent to humans (bounded communication rates, shirking tendencies), and that these constraints "seem intrinsic to humans but not to AI": agents communicate near-instantaneously, and reward functions can be designed directly against shirking, making monitoring and enforcement (two major organizational overheads) theoretically unnecessary. The paper further cites the formal model of Chen-Elliott-Koh (Journal of Economic Theory, 2023): when AI lowers the organizational cost of maintaining heterogeneous capabilities, the economy may undergo a discontinuous phase transition, from many specialized firms to a small number of mega-firms spanning numerous industries. Citations must preserve the paper's conditional mood: this is a conditional prediction ("if agents truly can…"), and Kevin Bryan, a discussant in the same NBER volume, raised institutional objections about the speed of change. But the paper elevates "the old equilibrium of firm size is failing" from intuition to a testable theoretical proposition.
This is why "humans as judgment and accountability anchors" is non-negotiable among the seven pillars: not because humans make more accurate decisions than AI, but because only humans can bear consequences. When Lattice listed AI as a "formal employee" in July 2024 and reversed course three days later, the essential reason was that the HR framework requires "employees" to be capable of bearing responsibility, which AI cannot do. This is the deepest part of agency theory that still holds in the AI era: accountability cannot be delegated to an entity incapable of bearing consequences.
Cybernetics, Returning
Stafford Beer proposed the Viable System Model (VSM) in his 1972 Brain of the Firm: any organization that can sustain itself requires five subsystems: S1 operational units (execution), S2 coordination (conflict avoidance), S3 control (resource allocation and short-term optimization), S4 intelligence (external environment scanning and long-term planning), S5 policy (identity and values). The model was widely attempted in the 1980s but ultimately failed to go mainstream, because humans could not sustain in real time the feedback density that S2 and S3 require.
Contemporary AI agents make VSM viable once more as an organizational design language. The concrete mapping is: generative agents occupy S1 (execution) and S4 (exploration); telemetry and guardrails occupy S2 (coordination) and S3 (real-time control); humans retain S5 (identity and values). This is why Anthropic's reported "90-day maximum planning horizon" can function: not by relying on annual strategy to align the organization, but by relying on real-time feedback density at the S2/S3 layer (sourcing note: this planning horizon comes from executive interviews and media reports; it has not been independently verified). The rapid iteration cadence at Cursor, Replit, and Cognition follows the same logic: for the first time, VSM has a buildable implementation path in the AI era.
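The mapping above is small enough to write down as a lookup table, which also makes it easy to check that no subsystem is left unowned. The table and helper names below are hypothetical; only the S1 to S5 assignments come from the text.

```python
# Assignment of Beer's five subsystems, as described in the text.
# Each value is (function, owner); the dictionary layout itself is illustrative.
VSM_ASSIGNMENT = {
    "S1": ("execution", "generative agents"),
    "S2": ("coordination", "telemetry and guardrails"),
    "S3": ("real-time control", "telemetry and guardrails"),
    "S4": ("exploration", "generative agents"),
    "S5": ("identity and values", "humans"),
}


def systems_owned_by(owner: str) -> list[str]:
    """List the subsystems assigned to a given owner, in S1..S5 order."""
    return [system for system, (_, who) in VSM_ASSIGNMENT.items() if who == owner]


def unassigned_systems() -> list[str]:
    """Return any of S1..S5 that has no entry in the table."""
    return [f"S{i}" for i in range(1, 6) if f"S{i}" not in VSM_ASSIGNMENT]
```

In this sketch `systems_owned_by("humans")` returns only S5, which matches the claim that humans retain policy while the dense feedback loops of S2 and S3 are delegated to instrumentation.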
The Economics of Judgment Scarcity
Daron Acemoglu's 2024 MIT paper The Simple Macroeconomics of AI offers a cautious estimate: AI's cumulative contribution to GDP over the next ten years will be approximately 1.1-1.6%, with a productivity gain of roughly 0.05% per year, far below the multiples commonly claimed by the industry. The MIT NANDA 2025/7 preprint report The GenAI Divide measured that 95% of customized enterprise GenAI pilots showed no measurable P&L impact within an approximately six-month observation window (150 interviews + 350 surveys + 300 public deployments; non-peer-reviewed; citations must retain this qualification). Behind both numbers lies the same first principle: AI accelerates "execution," but execution has never been the true organizational bottleneck.
"Judgment scarcity" has a canonical economics literature: not Acemoglu, but the prediction-vs-judgment framework of Agrawal, Gans, and Goldfarb (AGG). Their 2018 formal model (NBER WP 24626; peer-reviewed version published in Information Economics and Policy, 2019[R3]) yields three conclusions this atlas inherits directly: ① AI reduces the cost of "prediction" as a specific task: prediction is an input to decisions, not the decision itself; ② judgment is formally defined as "the capability humans exercise when the objective function cannot be described or encoded": not all human judgment is complementary to AI; cheaper prediction affects the returns to different types of judgment in opposite directions; ③ the delegation theorem: even when human involvement yields better decisions, humans will rationally delegate some decisions entirely to machines. Agents acquiring full autonomy before they strictly outperform humans is not a failure but a corollary of optimal choice within the model. In 2025 they decomposed "judgment" further (The Economics of Bicycles for the Mind, NBER WP 34034[R4]): opportunity judgment (identifying what is worth starting) is modeled as a persistent complement to cognitive tools: AI raises rather than erodes its value; payoff judgment (knowing which action to take given a state) is complementary only when the tool does not excessively reduce human effort; implementation skill, by contrast, is modeled as a substitute for cognitive tools. In one sentence: AI consumes implementation, elevates opportunity judgment, and is ambivalent about payoff judgment. That is precisely the economic coordinate of "humans as judgment anchors" and "operators as orchestrators." 
In the same lineage, Gans's AI as Strategist (NBER WP 33650, 2025[R5]) independently derives from a control-rights angle a conclusion isomorphic with the FIG 5.1 judgment-anchor map: the incremental value of granting a strategist formal control rights decreases monotonically with their credibility, so organizations should allocate AI control and influence domain-by-domain rather than uniformly: in judgment-dense domains humans lead; in data-rich domains AI exerts influence primarily through transparent reasoning rather than authority. All three are theoretical models, not empirical studies. Citing them gives "judgment scarcity" a scholarly interlocutor that can be engaged and falsified; it is not a claim that the thesis has been proven.
Reading along this line, Acemoglu's cautious macroeconomic estimate and AGG's microeconomic model converge on the same point. When AI makes execution nearly free, the truly scarce organizational resource is judgment: deciding what is worth doing, choosing among alternatives, bearing accountability for consequences, maintaining organizational direction. This kind of judgment cannot be substituted by AI, not because AI is insufficiently intelligent, but because the essence of judgment includes "the capacity to bear consequences," and the bearing of consequences is legal, social, and ethical, not computational. This is why the KPIs of an AI Native organization must shift from "execution output" toward "judgment quality and directional correctness"; the traditional "revenue per employee" metric misleads you into optimizing for the wrong thing in the AI era. Anysphere's revenue per employee of approximately $6M (computed at ~300 people and $2B ARR for 2026, same-period basis) is a striking number, but its true meaning is not that AI makes people more efficient; it is that companies of this type have pushed judgment density to its extreme. The same warning must hold in reverse: once revenue per employee itself becomes a target, it too becomes the next metric to be gamed; Goodhart's Law grants no exemption to this methodology.
Treating the "corporation" as the natural form of organization is myopic. It is not an eternal container but a set of solutions provisionally assembled for specific historical constraints and layered on top of one another, each layer younger than we imagine:
These three layers together span barely four hundred years, and each was a response to the human coordination constraints of its time: information could only travel step by step, trust could only be underwritten by hierarchy, capital could only be committed long-term. This is the same thing that Coase (1937) and Hadfield-Koh (2025[R2]) both identify: the upper limit on firm size derives from human constraints, not natural law. The significance of AI lies not in "reforming the corporation" but in dissolving the foundational constraints on which the corporation rests: once the cost structure of coordination, trust, and negotiation is rewritten, this four-hundred-year assembly is no longer the only solution. Notion co-founder Ivan Zhao put it more plainly: "The company is a recent invention; it degrades at scale and hits a ceiling"[R22]. What this atlas is drawing is the kind designed from the endpoint state, after the constraints have dissolved.
Each of the three forces dismantles one old assumption. What they point to together is the work of the next blueprint: assembling the scattered threads into a proposition and testing whether it can bear weight.
The Essence of Organization and Management
This blueprint compresses the entire atlas into a single derivation: three axioms, one corollary, one proposition. If the proposition holds, AI Native is not a management fad but an inevitable solution under shifting cost structures; if it is refuted, every blueprint that follows should be discarded. Two direct consequences follow from the derivation: the fate of management's hundred-and-ten-year canon of five functions, and the reason a one-person company has, for the first time, become a serious organizational design option.
- Fayol, 1916 - Five functions
- Coase, 1937 - Theory of Firm
- Simon, 1947 - Bounded rationality
- Dunbar, 1992 - Social brain
- Jarvis, 2019 - Company of One
- Altman, 2024 - One-person unicorn
先把两个旧问题摆回桌面。Coase 1937 年问:既然市场有效,公司为什么存在?答案是交易成本——市场协调有摩擦,于是把一部分协调收进组织内部。Simon 1947 年问:既然人是有限理性的,组织如何可能?答案是结构——用层级与流程补偿单个大脑的带宽。两个答案合起来,是过去一百年全部组织设计的地基。AI 没有推翻这两问——它改变了两问的参数。而参数变化大到一定程度,解的形态就会突变。
Start by returning two old questions to the table. Coase asked in 1937: if markets are efficient, why do firms exist? The answer: transaction costs. Market coordination carries friction, so some coordination is absorbed inside the organization. Simon asked in 1947: if humans are boundedly rational, how is organization possible? The answer: structure. Hierarchy and process compensate for the bandwidth of a single mind. Together, these two answers constitute the foundation of every organizational design of the past century. AI does not overturn either question: it changes the parameters of both. And when parameters shift far enough, the shape of the solution undergoes a phase transition.
T1 把组织从"人的集合"重新定义为两张结构的叠加——判断在哪里发生(分布),以及做判断所需的背景如何抵达(流动)。这不是修辞性的重定义:两张结构各自有可检查的健康度——判断是否发生在离上下文最近的位置?上下文是否不经人肉转译就能抵达?组织设计从此可以被工程化地审计,而不是被组织图描述。
T1 redefines the organization from a "collection of people" into the superposition of two structures: where judgment occurs (distribution), and how the context needed for judgment arrives (flow). This is not a rhetorical redefinition. Each structure carries auditable health indicators: does judgment occur at the point closest to context? Does context arrive without human translation in the middle? Organizational design can henceforth be engineered and audited, not merely described by an org chart.
T2 是 T1 的直接推论:如果组织是这两张结构,管理就是这两张结构的工程学。SHEET 04 章末的 THE KERNEL 给出这门工程学的日常动词——压缩×持续:持续压缩串行瓶颈,把判断之前的等待交给 agent 网络。本张图纸给出它的名词。名词回答"组织是什么",动词回答"每周一早上做什么"——合起来,才是完整的内核。
T2 is the direct corollary of T1: if the organization is these two structures, management is the engineering of these two structures. THE KERNEL block at the end of SHEET 04 supplies the daily verbs of this engineering, compress × continuously: continuously compress serial bottlenecks, and hand over the waiting-before-judgment to the agent network. This blueprint supplies the nouns. Nouns answer "what is the organization"; verbs answer "what to do on Monday morning." Together, they form the complete kernel.
但 T1、T2 都只回答了"怎么造"。在它们之前,还有一个更先、也更容易被跳过的问题——"为何造"。四百年来它被默认掉了:人手稀缺,效率本身就是值得榨取的目标,组织为效率而建,人的意义被压在效率之下。AI 第一次让效率变得充裕——效率于是从"目标"降级为"手段"(正如下文规模也将从目标降为变量)。真正值得重新围绕它来设计组织的,是那个一直被压住的答案:让人去做值得做、也值得热爱的工作——判断、探索、创造,为意义与价值负责。所以这套内核得倒过来读:T1 是手段,让人回归于人才是目的。把判断 × 上下文优化到极致、却让人沦为喂养算法的组织,不是 AI Native 的成功,而是它最危险的失败——本末倒置。
But T1 and T2 only answer "how to build." Before them sits an earlier question, the one most easily skipped: "what to build it for." For four centuries that question was assumed away: hands were scarce, so efficiency itself was the prize; organizations were built for it, and human meaning was pressed underneath it. AI makes efficiency abundant for the first time, so efficiency is demoted from a goal to a means (just as scale, below, is demoted from goal to variable). What is now worth rebuilding the organization around is the answer held down all along: letting people do work worth doing and worth loving, namely judgment, exploration, creation, and responsibility for meaning and value. So the kernel reads in reverse: T1 is the means; returning people to being human is the end. An organization that optimizes judgment × context to the limit yet reduces people to feeding the algorithm is not an AI-Native success; it is its most dangerous failure, an inversion of ends and means.
管理五职能的去向
What Happens to Fayol's Five Functions
1916 年,Henri Fayol 在《工业管理与一般管理》里把管理定义为五种职能:计划、组织、指挥、协调、控制。此后一百一十年,管理学教材都是这五个词的注脚。把 T2 套在这五个词上,可以逐项预言它们的去向——注意,没有一种是"被 AI 增强",也没有一种是凭空消失:每一种都被拆成两半,可结构化的一半下沉为基础设施,不可结构化的一半上浮为判断。
In 1916, Henri Fayol defined management as five functions in Administration Industrielle et Générale: planning, organizing, commanding, coordinating, and controlling. For a hundred and ten years afterward, management textbooks were footnotes to those five words. Applying T2 to each in turn yields a precise forecast of their fate. Note that not one is "augmented by AI," nor does any vanish into thin air: each is split in two, with the structurable half sinking into infrastructure and the unstructurable half rising as judgment.
表的右侧两列藏着一个组织学结论:中层管理者恰好整层站在这条分界线上。中层的传统职能——信息上传下达、跨组转译、进度协调——几乎全部落在"可结构化"一侧。这不是"AI 取代中层"的耸动说法,而是一个结构事实:当上下文可以被系统继承,人肉路由器就从岗位退化为瓶颈(SHEET 04 把它列为瓶颈而非职位)。幸存下来的不是"中层"这个层级,而是其中真正在做判断的人——管理幅度(span of control)让位给判断幅度(span of judgment):一个人能为多大的图承担例外、不可逆与方向三类判断。
The two right-hand columns of the table conceal an organizational conclusion: middle managers as a class stand precisely on this dividing line. The traditional functions of middle management (relaying information up and down, translating across groups, coordinating progress) fall almost entirely on the "structurable" side. This is not the sensationalist claim that "AI replaces middle management"; it is a structural fact: once context can be inherited by a system, the human router degrades from a role into a bottleneck (SHEET 04 classifies it as a bottleneck, not a position). What survives is not the "middle management" tier, but the individuals within it who are genuinely exercising judgment. Span of control yields to span of judgment: how large a graph one person can bear responsibility for across the three classes of judgment (exception, irreversibility, and direction).
组织形态的光谱
The Spectrum of Organizational Forms
T1 有一个最容易被忽略的推论:规模从目标变量降级为自由变量。如果组织是判断的分布与上下文的流动,"多少人"就不再是组织的定义性属性,而是一个工程参数——由判断需要多少个不可替代的承担者决定。参数空间的两端,第一次被同一套原理覆盖。
T1 carries one corollary that is most easily overlooked: scale is demoted from a target variable to a free variable. If an organization is the distribution of judgment and the flow of context, "how many people" is no longer a defining attribute of the organization; it is an engineering parameter, determined by how many irreplaceable bearers of judgment the work requires. For the first time, both ends of the parameter space are covered by the same set of principles.
光谱最左端的完整论述——核心命题「规模是选择,连贯性是目的」、四个世界观与七个支柱、现实标定(Levels / Lou / Welsh 自报口径与 Altman 的一人独角兽赌局)、以及"极限解非普遍处方"的诚实注脚——详见 SHEET 14《组织的下限·一人公司》。它是 T1 在 N=1 处的极限解与试金石:把"组织必须是很多人"这个隐含假设,永久地变成一个待论证的命题。
The full treatment of the far-left end of the spectrum is in SHEET 14: The Lower Bound of Organization · One-Person Company, covering the core proposition "scale is a choice, coherence is the purpose," the four worldviews and seven pillars, real-world calibration (the self-reported figures of Levels / Lou / Welsh and Altman's one-person-unicorn wager), and the honest caveat that "the limiting solution is not a universal prescription." It is the limiting solution and litmus test of T1 at N=1: it permanently converts "an organization must be many people" from a hidden assumption into a proposition awaiting proof.
接下来的图纸回到光谱的主流区段,处理一个更难的问题:当组织确实需要不止一个人时,为什么"在旧结构上加 AI"注定失败。十六个瓶颈,每一个都是 T1 的反面证明。
The blueprints that follow return to the mainstream segment of the spectrum and address a harder question: when an organization genuinely requires more than one person, why is "adding AI onto the old structure" destined to fail? Sixteen bottlenecks follow, each a negative demonstration of T1.
"加 AI"解不开的十六个结构瓶颈
The Sixteen Structural Bottlenecks That Overlaying AI Cannot Solve
给每个员工配上 AI 的组织,端到端吞吐几乎不动——因为瓶颈从来不在节点的速度,而在图的形状。这十六个瓶颈源于传统组织的结构本身:工具触不到它们,转型绕不开它们,只有从底层重画才能消除它们。
Equip every employee with AI and end-to-end throughput barely moves, because bottlenecks have never lived in the speed of nodes; they live in the shape of the graph. These sixteen bottlenecks are native to the structure of the traditional organization itself: tools cannot reach them, incremental transformation cannot bypass them, and only redrawing from the foundation can eliminate them.
- Amdahl, 1967 - Limits of speedup
- Conway, 1968 - Committees invent
- Galbraith, 1974 - Info-processing view
- Brooks, 1975 - Mythical Man-Month
- Goldratt, 1984 - Theory of Constraints
- METR, 2025 - RCT: AI & dev speed
- 黄益贺, 2026 - AI原生组织的底层逻辑
- Huang Yihe, 2026 - The Underlying Logic of AI-Native Organizations
SECTION 02 给出了那个刺眼的数字——95% 的企业 GenAI 试点没有可衡量的损益影响。本章回答"为什么":因为这些组织把 AI 部署在节点上(让某个人、某个环节更快),而组织的吞吐量是图的属性——由依赖链的拓扑、协调成本的曲线、决策队列的深度决定。工具改变节点的速度,改不了图的形状。
SECTION 02 surfaced the glaring number: 95% of enterprise GenAI pilots show no measurable profit-and-loss impact. This chapter answers "why": these organizations deploy AI on the nodes (making a person or a step faster), yet an organization's throughput is a property of the graph, set by the topology of its dependency chains, the curve of its coordination cost, and the depth of its decision queues. Tools change the speed of a node; they do not change the shape of the graph.
一条流程若有 70% 的时间花在串行的等待、交接与审批上,那么即使把其余 30% 的执行加速一百倍,端到端也只快 1.42 倍。这就是"组织大量使用 AI 却没有变快"的数学结构:加速节点是工具问题,重画图是架构问题。AI 解决前者;这份规约的其余部分解决后者。
If 70% of a process's time is consumed by serial waiting, handoffs, and approvals, accelerating the remaining 30% by a factor of a hundred still yields only a 1.42× end-to-end gain. That is the mathematical structure behind "organizations deploying lots of AI yet not getting faster": speeding up nodes is a tooling problem; redrawing the graph is an architecture problem. AI solves the former; the rest of this specification solves the latter.
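The arithmetic above is Amdahl's Law applied to a process rather than a program. A minimal sketch in Python; the function names `end_to_end_speedup` and `speedup_ceiling` are illustrative and do not come from the source:

```python
def end_to_end_speedup(serial_fraction: float, node_speedup: float) -> float:
    """Amdahl's Law: overall speedup when only the non-serial share is accelerated."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if node_speedup <= 0:
        raise ValueError("node_speedup must be positive")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / node_speedup)


def speedup_ceiling(serial_fraction: float) -> float:
    """Upper bound on the overall speedup as node_speedup grows without limit."""
    if serial_fraction <= 0:
        return float("inf")
    return 1.0 / serial_fraction


# 70% serial waiting, the remaining 30% accelerated a hundredfold
print(round(end_to_end_speedup(0.70, 100), 2))  # 1.42
print(round(speedup_ceiling(0.70), 2))          # 1.43
```

The second figure is the limit: with 70% of the time serial, no amount of node acceleration gets past roughly 1.43×.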
黄益贺(2026)描述过这条定律的一个具象版本——从业者观察,非受控研究:风投机构给每个分析师配上最强的 AI,让十个 agent 并行生成十份研究报告——然后所有报告仍由同一个人按顺序阅读、由同一个投委会按周期审议。生产端并行了,消费端依旧串行,两到四周的流程几乎没有缩短。他对此有一句精确的总结:"AI 不是自动让组织变快,它只是把串行瓶颈照得更亮。"——可并行的部分被 AI 加速之后,真正拖慢组织的环节会暴露得前所未有地明显:慢的不是写报告,是谁来读、谁来判断、谁来拍板。个体层面的证据同样刺眼:METR 2025 年的随机对照试验中(16 名资深开源维护者、246 个任务、各自深耕多年的百万行级代码库),开发者使用 AI 工具后实际慢了 19%,却自认为快了约 20%。研究者明确警告此结果不应外推到陌生代码库或从零构建的场景——但它至少钉死了一件事:叠加层面的收益可能远比体感小,而体感本身不可信。
Huang Yihe (2026) described a concrete instance of this law (a practitioner observation, not a controlled study): a VC firm equips every analyst with the most powerful AI and lets ten agents generate ten research reports in parallel. All the reports are then still read sequentially by the same person, and reviewed on the same committee cadence. The production side parallelized; the consumption side remained serial. A two-to-four-week process barely shortened. His precise summary: "AI doesn't automatically make an organization faster - it just illuminates serial bottlenecks more brightly." Once the parallelizable parts are accelerated by AI, the stages that truly slow the organization become exposed as never before: the bottleneck is not writing reports, but who reads them, who judges them, who decides. Individual-level evidence is equally stark: in METR's 2025 randomized controlled trial (16 experienced open-source maintainers, 246 tasks, each working in a million-line codebase they had cultivated for years), developers using AI tools were actually 19% slower, yet estimated themselves to be about 20% faster. The researchers explicitly cautioned against extrapolating to unfamiliar codebases or greenfield builds, but the trial nails one thing down: overlay-level gains may be far smaller than perceived, and perception itself cannot be trusted.
判别一个问题属于哪一层:工具层(单点执行慢——AI 直接可解)、流程层(顺序可重排——流程再造可解)、结构层(瓶颈由组织的存在方式本身产生——只能重构)。以下十六个瓶颈全部位于结构层。每一个都按同一格式解剖:机制(传统组织为什么必然产生它)、为什么加 AI 无效(叠加悖论的具体形态)、AI Native 重构(映射到心智模型 M.01-M.05 与七大支柱)、检验信号(你的组织是否已经越过它)。
To diagnose which layer a problem belongs to, distinguish three: the tooling layer (a single node executes slowly; AI can fix this directly); the process layer (the sequence can be reordered; process reengineering can fix this); and the structural layer (the bottleneck is generated by the organization's very mode of existence; only restructuring can fix this). The following sixteen bottlenecks all reside at the structural layer. Each is dissected in the same format: mechanism (why the traditional organization inevitably produces it), why adding AI fails (the specific form of the overlay paradox), AI Native restructure (mapping to mental models M.01-M.05 and the seven pillars), test signal (whether your organization has already cleared it).
串行依赖链
The Serial Dependency Chain
传统流程是一场人传人的接力赛:需求 → 设计 → 构建 → 评审 → 发布,每一棒之间是队列与等待。精益研究反复测得流动效率不足 15%——一项工作 85% 以上的生命周期处于"等人"状态。总时长由串行链决定,与任何单个环节的内部效率无关。
The traditional process is a human relay race: requirements → design → build → review → ship, with queues and waiting between every baton pass. Lean research consistently measures flow efficiency below 15%: over 85% of a work item's lifecycle is spent waiting for someone. Total duration is determined by the serial chain, independent of the internal efficiency of any individual stage.
给每个环节配 AI 加速的是"棒内奔跑",碰不到"棒间等待"。十个 agent 并行写出十份报告,最终仍由同一个人按顺序阅读——生产端并行了,消费端依旧串行。按阿姆达尔定律,串行占比 70% 时,无论把其余部分加速多少倍,总收益上限也只有 1.43 倍。更糟的是,上游加速会在未扩容的下游堆出更深的队列——约束理论早已断言:非瓶颈处的改善是幻觉。
Equipping each stage with AI accelerates "running with the baton"; it never touches "waiting between baton passes." Ten agents write ten reports in parallel; they are still read sequentially by the same person: the production side parallelized, the consumption side remains serial. By Amdahl's Law, when serial fraction is 70%, no matter how much you accelerate the rest, the total gain ceiling is only 1.43×. Worse: upstream acceleration piles deeper queues at unscaled downstream stages; the Theory of Constraints established long ago that improvement at a non-bottleneck is an illusion.
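The queue effect can be illustrated with a toy simulation. This is a deterministic sketch under simplifying assumptions (a fixed number of reports per day, a single reviewer with fixed capacity); the function name and the numbers are hypothetical:

```python
def simulate_queue(days: int, produced_per_day: int, reviewed_per_day: int) -> list[int]:
    """Return the downstream queue depth at the end of each day."""
    depth = 0
    history = []
    for _ in range(days):
        depth += produced_per_day              # upstream hands finished work over
        depth -= min(depth, reviewed_per_day)  # the single reviewer drains what they can
        history.append(depth)
    return history


# Before: analysts produce 2 reports a day, the reviewer reads 2 a day.
print(simulate_queue(days=5, produced_per_day=2, reviewed_per_day=2))   # [0, 0, 0, 0, 0]
# After: agents produce 10 reports a day, the reviewer still reads 2 a day.
print(simulate_queue(days=5, produced_per_day=10, reviewed_per_day=2))  # [8, 16, 24, 32, 40]
```

In both runs the reviewer completes two reports a day, so end-to-end output is identical; the only thing the fivefold upstream acceleration changes is the depth of the queue.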
M.01 把组织声明为工作流图,支柱 02"工作流即代码"重画拓扑——可并行的全部并行扇出;交接消失,因为流转的是同一份上下文而非互相抛接的文档;审批从"排队等人"变为策略即代码的自动门加例外上报。人只出现在少数判断节点上。
M.01 declares the organization as a workflow graph; Pillar 02 "workflow-as-code" redraws the topology: everything parallelizable fans out in parallel; handoffs disappear because what flows is a shared context, not documents tossed back and forth; approvals shift from "queuing for a human" to policy-as-code automated gates with exception escalation. Humans appear only at a small number of judgment nodes.
协调成本平方律
The Quadratic Coordination Tax
n 个需要对齐的人产生 n(n−1)/2 条沟通信道,组织每长大一圈,新增产能就被新增协调吃掉一块——Brooks 定律"给延期项目加人会让它更延期"只是这条曲线最著名的切片。整个中层管理的本质,就是组织为这条平方曲线雇佣的人肉路由器。
n people who need to align produce n(n−1)/2 communication channels; every time the organization grows by one ring, a slice of the new capacity is consumed by new coordination. Brooks's Law ("adding people to a late project makes it later") is just the most famous cross-section of this curve. Middle management, in essence, is the human routing layer the organization hires to service this quadratic curve.
每人配 AI 不减少信道数量,反而提高每条信道的流量——更多文档、更多消息、更快的来回,拥塞加剧。AI 纪要与摘要是在给平方曲线做无损压缩,曲线本身纹丝不动。
Giving everyone an AI does not reduce the number of channels; it increases the traffic on every channel: more documents, more messages, faster back-and-forth, greater congestion. AI meeting minutes and summaries merely compress the traffic running over the quadratic curve; the curve itself does not move.
M.03 上下文即核心资产——对齐通过共享上下文库完成,而非点对点同步。任何成员(人或 agent)从同一份机器可读的真相出发工作,信道结构从 O(n²) 网状坍缩为 O(n) 星形;agent 之间走结构化协议(任务、事件、状态机),根本不"开会"。
M.03 (context as core asset): alignment happens through a shared context store, not point-to-point sync. Every member (human or agent) works from the same machine-readable source of truth; channel structure collapses from O(n²) mesh to O(n) star. Agents coordinate via structured protocols (tasks, events, state machines); they do not "have meetings."
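The difference between the two channel structures is easy to tabulate. A minimal sketch, assuming that in the star layout each member keeps exactly one link to the shared context store; the function names are illustrative:

```python
def mesh_channels(n: int) -> int:
    """Point-to-point alignment: every pair of members needs its own channel."""
    return n * (n - 1) // 2


def star_channels(n: int) -> int:
    """Shared context store: each member keeps one link to the store."""
    return n


for n in (5, 20, 100):
    print(n, mesh_channels(n), star_channels(n))
# 5 10 5
# 20 190 20
# 100 4950 100
```

At five members the two layouts are comparable; at a hundred, the mesh carries about fifty times as many channels as the star.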
决策带宽天花板
The Executive Bandwidth Ceiling
传统组织用"判断集中到塔尖"换取一致性:重要决策逐级上报,CEO 的清醒时间成为全组织吞吐的硬上限。Simon 的有限理性在组织层面的表现是——组织越大,决策队列越深,一线感知与决策点之间的距离越远。
The traditional organization trades "concentrating judgment at the apex" for consistency: important decisions escalate tier by tier, and the CEO's waking hours become the organization's hard throughput ceiling. Simon's bounded rationality at the organizational scale means: the larger the organization, the deeper the decision queue, and the greater the distance between frontline perception and the decision point.
给高管配 AI 摘要、给汇报配 AI 润色,只是让队列中的文档更漂亮、队列前进略快。决策延迟的主项是"排队等判断",不是"读材料太慢"——单点带宽没变,天花板就没变。
Giving executives AI-generated summaries and AI-polished reports just makes the queued documents prettier; the queue moves only marginally faster. The dominant source of decision latency is "queuing for judgment," not "reading too slowly." Single-point bandwidth unchanged, ceiling unchanged.
SECTION 02 判断稀缺性经济学的组织化:可编码的判断写成 guardrails 与策略,下放给 agent 与一线——M.05"人即判断锚点"指的是判断有锚,不是判断有漏斗。高层只保留三类判断:例外、不可逆、方向。决策权随上下文走,不随职级走。对必须保留在塔尖的判断,把判断前的预消化全部交给 agent——读完材料、对齐观点、列出假设、整理反方证据、标注不确定性——人面对的不再是原始材料的洪流,而是一张高度浓缩的决策地图。
SECTION 02's economics of judgment scarcity, institutionalized: codifiable judgments are written as guardrails and policies, delegated to agents and the frontline. M.05 "human as judgment anchor" means judgment has an anchor, not a funnel. Leadership retains only three categories of judgment: exceptions, irreversibles, and direction. Decision authority follows context, not hierarchy. For judgments that must stay at the apex, pre-digest everything before the judgment with agents (read materials, align viewpoints, list assumptions, compile counter-evidence, flag uncertainties) so humans face not a flood of raw material, but a highly condensed decision map.
层级信息衰减
Hierarchical Signal Decay
信息每向上传一层都被压缩、平滑、政治化,坏消息衰减得最快。控制论给过精确诊断——Ashby 必要多样性定律:当调节者拿到的信息品类少于系统扰动的品类,控制必然失败。传统组织的报告链是一台逐层削减多样性的机器。
With every tier that information travels up, it is compressed, smoothed, and politicized; bad news decays fastest. Cybernetics offers a precise diagnosis, Ashby's Law of Requisite Variety: when the variety of information reaching the regulator is less than the variety of disturbances in the system, control must fail. The traditional organization's reporting chain is a machine that strips variety away layer by layer.
AI 帮中层把周报写得更流畅,等于更高效地生产失真。失真不来自写作能力,来自"层级转述"这个信道本身——每一层都有选择性呈现的激励,AI 只会把选择性呈现做得更专业。
AI helping middle managers write smoother weekly reports is producing distortion more efficiently. Distortion does not come from writing ability; it comes from the channel of "hierarchical retelling" itself. Every tier has an incentive to present selectively; AI just makes selective presentation more polished.
支柱 05 可观测性先于规模:工作流原生埋点,决策者直接查询现场数据与 agent 执行轨迹。人肉报告链被"随时可查询的状态"替代,汇报从周期性叙事变成对同一套遥测的不同视图。
Pillar 05 (observability before scale): instrumentation native to the workflow, decision-makers querying live data and agent execution traces directly. The human reporting chain is replaced by "always-queryable state"; reporting becomes different views over the same telemetry rather than periodic narrative.
部门墙与局部最优
Functional Silos & Local Optima
按职能切分让每个部门优化自己的 KPI,端到端价值流被切成片段,部门交界处堆积着队列、翻译损耗和"这不是我们的问题"。Conway 定律保证你的产品结构最终复刻这堵墙。
Organizing by function causes every department to optimize its own KPIs; the end-to-end value stream is sliced into fragments; queues, translation loss, and "that's not our problem" accumulate at every departmental boundary. Conway's Law guarantees your product architecture will eventually mirror this wall.
每个部门各自采购 AI 工具,墙反而更厚——数据孤岛之上又叠了一层工具孤岛,跨墙交接依旧靠人开会翻译。每个局部都更快了,全局还是次优,而且次优得更快。
Each department purchases its own AI tools, making the walls thicker: a layer of tool silos is stacked on top of the data silos, and cross-wall handoffs still require humans to meet and translate. Every local unit got faster; the global outcome is still suboptimal, only now suboptimal faster.
M.01 围绕端到端价值流设计组织:一条工作流从客户触发直到客户收到,由跨域 agent 编队走完全程,职能变成"被工作流调用的能力",不再是"占有工作的领地"。Operator 拥有整条流,不是其中一段。
M.01 designs the organization around end-to-end value streams: one workflow from customer trigger to customer receipt, traversed by cross-domain agent ensembles; functions become "capabilities invoked by the workflow," no longer "territories that own work." The Operator owns the whole stream, not a segment of it.
会议同步税
The Synchronous Coordination Tax
会议是传统组织的默认协调原语——一种要求所有参与者同时在场的阻塞调用。管理者 30-50% 的时间花在会上,日历碎片进一步杀死深度工作。会议泛滥的根因是两个缺失:状态不可见,决策权不明确——于是只好用"同时在场"兜底。
Meetings are the default coordination primitive of the traditional organization: a blocking call that requires all participants to be present simultaneously. Managers spend 30-50% of their time in meetings; calendar fragmentation further kills deep work. The root cause of meeting proliferation is two absences: state is not visible, and decision authority is not defined, so "all present at once" becomes the fallback.
AI 纪要、AI 排程降低了开会的边际成本——于是会议更多了,这是杰文斯悖论的会议版。工具优化的是"怎么开会",而真正的问题是"为什么需要开会"。
AI meeting minutes and AI scheduling lower the marginal cost of meetings, so there are more meetings. This is the meeting-world version of Jevons's Paradox. The tools optimize "how to meet"; the real question is "why meetings are needed at all."
支柱 02 让协调走异步状态机:工作流状态对所有参与者可见,交接由事件触发,决策带理由记录在案。同步在场只保留给真正需要它的三件事——判断分歧、关系建立、危机处理。AI Native 组织的默认是 async-first,日历近乎空白。
Pillar 02 routes coordination through asynchronous state machines: workflow state is visible to all participants, handoffs are event-triggered, decisions are recorded with rationale. Synchronous presence is reserved for the three things that truly require it: adjudicating judgment disagreements, relationship-building, and crisis response. The AI Native organization defaults to async-first; calendars are nearly empty.
知识私有化
Tacit Knowledge Lock-in
关键知识活在个人头脑、私聊记录和"去问老王"里。新人上手以月计,老人离职即知识蒸发——bus factor 长期为个位数。更糟的是私有化被激励结构强化:不可替代性就是职业安全。
Critical knowledge lives in individual minds, private chat histories, and "go ask Old Wang." Onboarding new hires takes months; when veterans leave, knowledge evaporates; the bus factor stays in the single digits indefinitely. Worse, privatization is reinforced by the incentive structure: irreplaceability is job security.
给每人配 AI 助手改变不了知识的私有属性——AI 能检索写下来的一切,唯独检索不了从未写下的东西。二十年来企业知识库项目失败的原因从来不是搜索技术,而是"写下来"从未成为工作本身的一部分。
Giving everyone an AI assistant does not change the private nature of knowledge: AI can retrieve everything that was written down; it cannot retrieve what was never written. The reason enterprise knowledge-base projects have failed for twenty years was never search technology. It was that "writing things down" was never made part of the work itself.
M.03 加支柱 03 上下文工程作为系统实践:知识只有进入机器可读的上下文库才算"存在"——决策连同理由入库,流程以代码形式自文档,agent 的执行轨迹自动沉淀为组织记忆。新人与新 agent 的上手时间从月降到天。
M.03 plus Pillar 03 (context engineering as a system practice): knowledge only "exists" once it enters a machine-readable context store. Decisions are stored with their rationale; processes self-document as code; agent execution traces are automatically deposited into organizational memory. Onboarding time for new humans and new agents drops from months to days.
审批链与责任稀释
Approval Chains & Diffused Accountability
传统组织用多级签字管理风险,但签字越多责任越稀——每个人都默认上一级看过了、下一级会把关,社会心理学称之为责任分散效应。审批链的实际功能往往不是质量控制,而是责任分摊仪式:出事之后,找不到"做决定的那个人"。
The traditional organization uses multi-tier sign-off to manage risk, but the more signatures there are, the more diffuse accountability becomes: each person assumes the tier above reviewed it and the tier below will catch it. Social psychology calls this the diffusion of responsibility. The actual function of the approval chain is often not quality control but a ritual of distributing blame: when something goes wrong, there is no "person who made the decision" to be found.
AI 起草材料、AI 预审,让链条空转得更快——"人人有份、无人负责"的结构原封未动。甚至更糟:"AI 预审通过"成为新的集体免责理由。
AI drafting materials and AI pre-review make the chain spin faster; the structure of "everyone involved, no one accountable" is untouched. It may even worsen: "AI pre-review passed" becomes the new collective disclaimer.
M.05 与支柱 06:审批收敛为少数显式判断节点,每个节点单人、具名、权责成对。可编码的检查全部交给自动门——测试、策略、合规规则;人签的字只剩一种含义:"这个后果我来承担。"
M.05 and Pillar 06: approvals converge to a small number of explicit judgment nodes; each node is a single named individual with paired authority and accountability. All codifiable checks go to automated gates: tests, policy rules, compliance checks. The only meaning left in a human signature: "I own this outcome."
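One way to picture this split between automated gates and a single named judgment node is the sketch below. Everything in it is an assumption made for illustration: the `Change` fields, the two rules in `AUTOMATED_CHECKS`, the 10,000 spend limit, and the choice to send only irreversible changes to a human are not specified by the source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Change:
    description: str
    tests_passed: bool
    spend_usd: float
    reversible: bool


@dataclass(frozen=True)
class Decision:
    approved: bool
    decided_by: str  # "auto-gate" or exactly one named person
    reason: str


# Codifiable checks (hypothetical rules): no human signature involved.
AUTOMATED_CHECKS: list[tuple[str, Callable[[Change], bool]]] = [
    ("tests must pass", lambda c: c.tests_passed),
    ("spend within policy limit", lambda c: c.spend_usd <= 10_000),
]


def review(change: Change, owner: str, owner_accepts: Callable[[Change], bool]) -> Decision:
    """Run the automated gates first; escalate only irreversible changes to one owner."""
    for name, check in AUTOMATED_CHECKS:
        if not check(change):
            return Decision(False, "auto-gate", f"failed: {name}")
    if change.reversible:
        return Decision(True, "auto-gate", "all checks passed; change is reversible")
    # Irreversible: a single named person decides and owns the outcome.
    if owner_accepts(change):
        return Decision(True, owner, "irreversible; owner accepts the outcome")
    return Decision(False, owner, "irreversible; owner declined")


routine = review(Change("fix typo", True, 0.0, True), "Alice", lambda c: True)
print(routine.decided_by)  # auto-gate
risky = review(Change("delete legacy data", True, 0.0, False), "Alice", lambda c: True)
print(risky.decided_by)    # Alice
```

Every `Decision` names exactly one decider, so after an incident there is always a single record of who, or which gate, made the call.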
人头即产能
Headcount-as-Capacity
传统组织扩张能力只有一个原语:招聘。周期以月计、成本固定化、错配难逆转,于是产能规划变成赌博,组织在"人手不够"与"养着闲人"之间永久摆动。预算以人头计、权力以下属数计——这也是帝国构建行为的经济根源。
The traditional organization has only one primitive for expanding capability: hiring. Cycles measured in months, costs that become fixed, mismatches that are hard to reverse: capacity planning becomes a gamble, and the organization oscillates permanently between "not enough hands" and "carrying dead weight." Budget is counted in headcount; power is measured in the number of direct reports. This is also the economic root of empire-building.
AI 招聘工具加速的是旧通道——更快地筛简历、更快地面试,但从不质疑"能力 = 人头"这个等式。各部门继续以多要人头为目标,AI 反而成了新论据:"我们需要再招五个 AI 工程师。"
AI recruiting tools accelerate the old channel (faster resume screening, faster interviews) while never questioning the equation "capability = headcount." Departments continue to target more headcount; AI becomes the new argument: "We need to hire five more AI engineers."
M.02 Agent 即默认工种:任何新能力需求,默认先问"工作流加 agent 能否承担",招人只为增加判断密度。产能变成弹性量——agent 实例随负载伸缩,组织能力与组织人数解耦。这正是人均创收 $600 万量级的结构基础(SECTION 02,口径见 SHEET 09)。
M.02 (agent as default job type): for any new capability requirement, the default first question is "can a workflow plus agent handle this?" Hiring is reserved for increasing judgment density. Capacity becomes elastic: agent instances scale with load, and organizational capability is decoupled from headcount. This is the structural basis for the $6M revenue-per-person scale (SECTION 02; scope defined in SHEET 09).
规划节奏失配
The Planning Cadence Mismatch
年度预算加季度 OKR 的节奏继承自工业时代的资本开支周期。环境以周为单位变化,资源以年为单位锁定——组织对机会的响应速度,被规划日历钉死。年中发现方向错了?等明年预算。
The annual budget plus quarterly OKR cadence is inherited from the industrial era's capital expenditure cycle. The environment changes by the week; resources are locked by the year, so the organization's speed of response to opportunity is capped by the planning calendar. Discover mid-year that the direction is wrong? Wait for next year's budget.
AI 让规划文档的生产快了十倍——三天做出过去三周的 PPT——但"批准、锁定、执行、年终复盘"的律动没变。更快地制定一个一年不变的计划不叫敏捷,叫更精致的僵化。
AI makes planning document production ten times faster (a three-week slide deck now takes three days), but the rhythm of "approve, lock, execute, year-end review" is unchanged. Producing a year-locked plan faster is not agility; it is more refined rigidity.
支柱 07 持续演化:资源跟随工作流遥测动态再分配——表现好的流自动获得更多算力、预算与 agent 配额,表现差的流被自动收缩。规划从年度仪式变成持续运转的内部资源市场,节奏与反馈周期同阶。
Pillar 07 (continuous evolution): resources are dynamically reallocated following workflow telemetry. Well-performing flows automatically receive more compute, budget, and agent quota; poorly performing flows are automatically contracted. Planning transforms from an annual ritual into a continuously operating internal resource market whose cadence is of the same order as the feedback cycle.
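A telemetry-driven reallocation step might look like the following sketch. The proportional rule, the `floor_share` parameter (a minimum share so that no flow is starved outright), and the flow names are all assumptions made for illustration; the source does not specify an allocation formula.

```python
def reallocate(budget: float, scores: dict[str, float], floor_share: float = 0.1) -> dict[str, float]:
    """Split a budget across flows in proportion to their telemetry scores.

    Each flow first receives floor_share of the budget; the rest is divided
    in proportion to score, so strong flows grow and weak flows contract.
    """
    if not scores:
        raise ValueError("at least one flow is required")
    if any(score < 0 for score in scores.values()):
        raise ValueError("scores must be non-negative")
    if floor_share * len(scores) > 1:
        raise ValueError("floor_share is too large for this many flows")

    total_score = sum(scores.values())
    reserved = budget * floor_share
    remaining = budget - reserved * len(scores)

    allocation = {}
    for flow, score in scores.items():
        # With no signal at all, fall back to an even split of the remainder.
        share = score / total_score if total_score > 0 else 1 / len(scores)
        allocation[flow] = reserved + remaining * share
    return allocation


# Hypothetical scores from last cycle's workflow telemetry.
print(reallocate(100.0, {"onboarding": 3.0, "billing": 1.0}))
```

Run once per feedback cycle rather than once per year, a function of this kind is the smallest possible version of the "internal resource market" described above.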
试错成本与风险规避
Experiment Cost & Risk Aversion
传统组织里一次尝试等于立项加排期加占人加失败追责。试错又贵又伤人,于是只做"安全"的事——March 所说的 exploitation 挤出 exploration,组织系统性地低配探索。创新不是死于失败,是死于"值得吗"的会议。
In the traditional organization, one attempt equals project approval plus scheduling plus headcount allocation plus accountability for failure. Experimentation is expensive and painful, so the organization only does what is "safe": March's exploitation crowds out exploration, and the organization systematically under-invests in discovery. Innovation doesn't die from failure; it dies in the "is this worth it?" meeting.
执行变快了,但立项流程、追责文化、机会成本核算原样保留——组织还是只批"看起来稳"的实验。AI 甚至放大了错觉:AI 生成的市场分析让坏主意显得更可信——见 SECTION 11 的陷阱"合成自信"。
Execution is faster, but the project-approval process, accountability culture, and opportunity-cost accounting are untouched; the organization still greenlights only experiments that "look safe." AI can even amplify the illusion: AI-generated market analyses make bad ideas look more credible (see SECTION 11's failure pattern "Synthetic Confidence").
M.04 持续学习即操作系统:agent 并行运行 N 个变体,由真实数据裁决,实验的单位成本低到不值得为失败追责。文化从"审批制"换轨为"回滚制"——默认可试,越界自动回滚。探索从例外变成底色。
M.04 (continuous learning as the operating system): agents run N variants in parallel, adjudicated by real data; the unit cost of an experiment drops so low that a failure is no longer worth assigning blame for. Culture shifts from an "approval regime" to a "rollback regime": everything is tryable by default, and anything out of bounds rolls back automatically. Exploration moves from exception to baseline.
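The "rollback regime" can be sketched as a selection loop in which nothing needs prior approval and the baseline is the automatic fallback. This is an illustrative sketch only: the variants are evaluated sequentially rather than in parallel for simplicity, and `measure`, `within_bounds`, and the sample numbers are hypothetical.

```python
from typing import Callable


def pick_variant(
    baseline: str,
    variants: list[str],
    measure: Callable[[str], float],
    within_bounds: Callable[[str], bool],
) -> tuple[str, list[str]]:
    """Try every variant; keep the best one that stays in bounds, else the baseline."""
    best, best_score = baseline, measure(baseline)
    rolled_back = []
    for variant in variants:
        if not within_bounds(variant):
            rolled_back.append(variant)  # out of bounds: discarded, nobody is blamed
            continue
        score = measure(variant)
        if score > best_score:
            best, best_score = variant, score
    return best, rolled_back


# Made-up measurements for illustration only.
conversion = {"current": 0.040, "short-form": 0.046, "aggressive-upsell": 0.060}
complaints = {"current": 0.01, "short-form": 0.01, "aggressive-upsell": 0.09}

winner, reverted = pick_variant(
    baseline="current",
    variants=["short-form", "aggressive-upsell"],
    measure=lambda v: conversion[v],
    within_bounds=lambda v: complaints[v] <= 0.02,
)
print(winner, reverted)  # short-form ['aggressive-upsell']
```

The variant with the highest raw score is rejected because it breaches the guardrail, and the loop records that without any approval meeting having taken place.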
指标剧场
Metric Theater
传统组织度量个人产出——代码行、工时、关单数。Goodhart 定律保证指标一旦成为目标就被博弈:囤积信息以保不可替代、刷指标以保排名、报喜不报忧以保预算。度量系统本身在制造反协作。
The traditional organization measures individual output: lines of code, hours logged, tickets closed. Goodhart's Law guarantees that any metric, once it becomes a target, gets gamed: hoarding information to stay irreplaceable, inflating metrics to protect rankings, reporting only good news to protect budget. The measurement system itself manufactures anti-collaboration.
AI 把刷指标的成本降到零——合成产出无限供给,"看起来很高产"从未如此容易。继续度量个人产出,等于正式邀请全员用 AI 生产度量噪音。MIT 那 95% 里,相当一部分正是"AI 提升了指标、没碰到损益"。
AI reduces the cost of gaming metrics to zero: synthetic output comes in unlimited supply, and "appearing highly productive" has never been easier. Continuing to measure individual output is a formal invitation for everyone to use AI to produce measurement noise. A significant portion of that MIT 95% is precisely "AI improved the metrics without touching the P&L."
度量对象从个人换成工作流(支柱 05):吞吐、质量、成本、演化速度——由埋点客观采集,难以被个体博弈。人的评价转向 agent 无法供给的稀缺物:判断质量、上下文贡献、方向正确度(SECTION 02 的 KPI 反转)。
The unit of measurement shifts from individuals to workflows (Pillar 05): throughput, quality, cost, and evolution speed, objectively captured by instrumentation and difficult for any individual to game. Human evaluation shifts toward scarcities that agents cannot supply: judgment quality, context contribution, directional correctness (the KPI inversion from SECTION 02).
信任半径坍缩
Trust-Radius Collapse
组织的有效协作半径由心理安全决定——人只在"暴露问题不会被惩罚"时才上报坏消息、试验、求助。Edmondson 的研究反复显示:心理安全高的团队报告更多错误——差异来自报告意愿而非犯错频次。信任是吞吐量的隐形上限。
An organization's effective collaboration radius is determined by psychological safety: people only report bad news, run experiments, and ask for help when "exposing problems carries no punishment." Edmondson's research consistently shows that high-psychological-safety teams report more errors; the difference comes from willingness to report, not frequency of mistakes. Trust is the invisible throughput ceiling.
Agent 让一切可观测,诱惑是把可观测变成全员监控。一旦遥测被用于考核与裁员,员工的理性反应是隐藏——隐藏 AI 用法、隐藏省下的时间、隐藏失败试验。监控越密,真实信号越枯竭。这是"裁员叙事自反噬"(见失败模式)的结构根源:你买了 X 光机,却让所有人学会了憋气。
Agents make everything observable; the temptation is to convert observability into surveillance of everyone. Once telemetry is used for performance reviews and layoffs, the rational employee response is to hide: hide AI usage, hide time saved, hide failed experiments. The denser the surveillance, the drier the true signal. This is the structural root of the "self-cannibalizing layoff narrative" (see failure modes): you bought an X-ray machine, but taught everyone to hold their breath.
把遥测的用途宪法化——可观测性服务于系统改进而非个人审判(支柱 05)。区分"看流程"与"看人":流程指标公开,个人产出不入考核。Mollick 的 Leadership/Lab/Crowd——激励对齐到分享而非惩罚,才能让 X 光机照出真相而非教会憋气。
Constitutionalize the purpose of telemetry: observability serves system improvement, not individual prosecution (Pillar 05). Distinguish "watching the process" from "watching the person": process metrics are public; individual output is not factored into evaluations. Mollick's Leadership/Lab/Crowd framework: align incentives toward sharing rather than punishment, and the X-ray machine reveals truth instead of teaching breath-holding.
权力梯度与议程垄断
Power Gradient & Agenda Capture
组织里最关键的权力不是"否决提案",而是"决定哪些提案进入议程"——Bachrach & Baratz 称之为权力的第二张面孔。议程设置权高度集中时,大量选项在被讨论前就已死亡,而组织对此毫无记录。
The most critical power in an organization is not "vetoing proposals"; it is "deciding which proposals reach the agenda." Bachrach & Baratz call this the second face of power. When agenda-setting authority is highly concentrated, vast numbers of options die before they are discussed, and the organization has no record of this.
给决策者配 AI,放大的是既有议程持有者的产能——他能更快生成更多支持自己议程的材料。AI 不会自动质疑"为什么是这些选项"。更隐蔽的是:当 AI 推荐被当作中立,议程垄断就披上了客观的外衣,反而更难挑战。
Equipping decision-makers with AI amplifies the capacity of the existing agenda holder: they can generate more material supporting their own agenda, faster. AI does not automatically question "why these options?" More insidiously: when AI recommendations are treated as neutral, agenda capture dresses itself in the appearance of objectivity and becomes even harder to challenge.
Gans 的逐域控制权([R5])给出方向:透明推理是权威的替代品——让 AI 公开候选选项的全集与淘汰理由,议程从"谁有权设置"变为"图上可见的分支"。决策日志(支柱 03)把被否决的选项也记下来,议程垄断失去隐蔽性。
Gans's domain-by-domain control authority ([R5]) points the direction: transparent reasoning replaces authority. Have AI surface the full candidate option set and the elimination rationale; the agenda moves from "who has the right to set it" to "a visible branch on the graph." The decision log (Pillar 03) records rejected options too; agenda capture loses its concealment.
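One way to picture a decision log that keeps rejected options visible is a record type that refuses silent eliminations. This is an illustrative sketch only: `Option`, `DecisionRecord`, and their fields are hypothetical names, not a schema defined by this specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Option:
    """One candidate that was on the table, whether or not it survived."""
    name: str
    rejected: bool = False
    rationale: str = ""  # why the option was kept or eliminated


@dataclass
class DecisionRecord:
    question: str
    options: List[Option]
    chosen: str
    decided_by: str  # the human who owns the outcome
    decided_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        by_name = {o.name: o for o in self.options}
        if self.chosen not in by_name or by_name[self.chosen].rejected:
            raise ValueError("chosen option must be a logged, non-rejected candidate")
        # An option may not disappear without a stated reason.
        silent = [o.name for o in self.options if o.rejected and not o.rationale.strip()]
        if silent:
            raise ValueError(f"rejected options need a rationale: {silent}")

    def rejected_options(self) -> List[Option]:
        return [o for o in self.options if o.rejected]


record = DecisionRecord(
    question="Which market do we enter next?",
    options=[
        Option("Japan"),
        Option("Brazil", rejected=True, rationale="Payment partner not ready this year"),
    ],
    chosen="Japan",
    decided_by="head_of_growth",
)
```

Because the constructor rejects an elimination with no rationale, the branches that died before discussion become queryable entries rather than missing history.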
动机抽干
Motivation Crowding-Out
自我决定论与动机挤出研究(Frey-Jegen 的综述)显示:外在控制(监控、计件、把内在意义换成 KPI)会挤出内在动机。当一件原本有意义的工作被重新框定为"用 AI 多产出 X 倍",意义感会被效率叙事抽干,留下应付。
Self-determination theory and motivation crowding-out research (the Frey-Jegen survey of the evidence) show that extrinsic control (surveillance, piece-rate pay, replacing intrinsic meaning with KPIs) crowds out intrinsic motivation. When work that was originally meaningful is reframed as "produce X times more output with AI," the sense of meaning is drained by the efficiency narrative, leaving only going through the motions.
把 AI 收益直接翻译成"同样的人产出翻倍"或"同样的产出裁一半人",是教科书级的外在化操作。短期数字好看,中期工匠精神、主人翁感、自发改进一起蒸发——而这些恰是 AI无法提供、只有人能注入组织的东西。你优化了产量,抽干了发动机。
Translating AI gains directly into "same people, double output" or "same output, half the people" is a textbook way to turn intrinsic motivation extrinsic. Short-term numbers look good; medium-term craftsmanship, ownership, and self-driven improvement evaporate together. These are precisely what AI cannot supply and only humans can inject into an organization. You optimized throughput and drained the engine.
把 AI 定位为"卸下苦工、释放判断"而非"同岗增产"——人的角色上移到 M.05 判断锚点与 M.06 编排者,工作变得更需要品味与主张,而非更像流水线。收益分配若指向"更难的好问题"而非"更少的人头",内在动机被放大而非挤出。
Position AI as "removing drudgery, freeing judgment" rather than "same role, more output": the human role elevates to M.05 judgment anchor and M.06 orchestrator; work demands more taste and conviction, less assembly line. If the dividend of AI is directed toward "harder, better problems" rather than "fewer heads," intrinsic motivation is amplified rather than crowded out.
生态位锁定
Niche Lock-In
组织生态学(Hannan-Freeman 种群生态视角)提醒:组织的命运不只由内部效率决定,也由它在生态中的位置与依赖结构决定。当核心生产要素来自少数外部供应商,组织的生存权被锁进了别人的生态位。
Organizational ecology (the Hannan-Freeman population-ecology perspective) reminds us: an organization's fate is determined not only by internal efficiency, but also by its position and dependency structure within the ecosystem. When core production inputs come from a small number of external suppliers, the organization's survival rights are locked into someone else's niche.
越深地"加 AI",越深地把核心能力外包给 OpenAI/Anthropic/Google 等少数模型供应商——算法封建主义。围绕单一供应商的 API 怪癖优化,短期最快,却在条款、定价、可用性变动时被挟持。这不是工具问题,是结构性的生态位依赖——单点故障被写进了组织命脉。
The deeper you "add AI," the deeper you outsource core capabilities to a small number of model suppliers (OpenAI, Anthropic, Google): algorithmic feudalism. Optimizing around a single vendor's API quirks is fastest short-term, but leaves you held hostage when terms, pricing, or availability change. This is not a tooling problem; it is structural niche dependency: single points of failure written into the organization's life support.
支柱 04 多模型架构是这条瓶颈的正解——把模型层当作可替换的商品而非命脉,保留"主权"。抽象出供应商无关的内部接口、保留可迁移的上下文资产(M.03)、关键路径双供应商。生态位依赖不可消除,但可以从"命脉"降级为"成本项"。
Pillar 04 (multi-model architecture) is the correct answer to this bottleneck: treat the model layer as a replaceable commodity rather than a lifeline, preserving "sovereignty." Abstract out vendor-agnostic internal interfaces, maintain portable context assets (M.03), dual-source critical paths. Niche dependency cannot be eliminated, but it can be downgraded from "lifeline" to "cost item."
回到每张卡片底部的"检验信号",对照你的组织按下「命中」。这不是测验,是一次结构透视——命中的每一项,都是工作流图上一条还没被删掉的串行边。
Return to the "Test Signal" at the bottom of each card and click "Hit" wherever it matches your organization. This is not a quiz; it is a structural X-ray. Every item you hit is a serial edge on the workflow graph that has not yet been deleted.
这些结构瓶颈几乎都被当成"协调/信息"问题。换一片透镜,同一张图上会亮起不同的受灾点——也会暴露原清单没覆盖的盲区(朱色行即补画的新瓶颈)。点透镜看某一维度的受力,点格子看判词。
These structural bottlenecks are almost always treated as "coordination/information" problems. Swap in a different lens and different damage points light up on the same diagram, exposing blind spots the original list did not cover (rows in vermilion are newly drawn bottlenecks). Click a lens to see stress on one dimension; click a cell to see the verdict.
| 瓶颈 \ 维度 (Bottleneck \ Dimension) | 信息 (Information) | 激励 (Incentive) | 权力 (Power) | 认知 (Cognition) | 时间 (Time) | 生态 (Ecology) |
|---|---|---|---|---|---|---|
十六个瓶颈的重构方案各不相同,但日常动词只有一个——压缩。判断本身不可并行(战略方向、投资决策、产品审美、内容把关必须由人承担),但判断之前的一切都可以:让 agent 预先读完材料、找出证据、对齐分歧观点、列出关键假设、整理反方论据、标注不确定性——把"原始材料的洪流"压缩成"一张决策地图"。压缩的对象不是人的判断,是判断之前的等待。
The sixteen bottlenecks have different restructuring prescriptions, but there is only one daily verb: compress. Judgment itself cannot be parallelized (strategic direction, investment decisions, product taste, editorial gatekeeping must be carried by humans), but everything before judgment can be: have agents pre-read materials, surface evidence, align divergent viewpoints, enumerate key assumptions, compile counter-arguments, and flag uncertainties, compressing "a flood of raw material" into "a decision map." What is being compressed is not human judgment; it is the waiting before judgment.
第二个关键词是持续。瓶颈的边界随模型能力移动:今天必须由人串行处理的,明天可能被 agent 预处理;今天要反复口头解释的背景,明天通过 memory 与 context 系统自动继承。所以 AI Native 组织不是一个固定状态,而是一个持续迭代的产品。Operator 的周期性三问:① 还有哪些事必须由人按顺序处理?② 其中哪些可以被 agent 预处理、被结构化?③ 哪些上下文可以被系统继承、不再依赖口头传递?每一轮回答,都从工作流图上再删掉几条串行边。
The second keyword is continuously. The boundary of bottlenecks moves with model capability: what must be processed serially by humans today may be pre-processed by agents tomorrow; background that requires repeated verbal explanation today will be automatically inherited through memory and context systems tomorrow. The AI Native organization is therefore not a fixed state but a continuously iterated product. The Operator's periodic three questions: ① What still must be processed by humans in sequence? ② Of those, which can be pre-processed and structured by agents? ③ Which context can be inherited by the system without further dependence on verbal hand-off? Each round of answers deletes a few more serial edges from the workflow graph.
什么值得研究、什么不值得——AI 能找答案,难替你决定什么问题最重要。
What is worth investigating, and what is not. AI can find answers; it struggles to decide which questions matter most.
什么是好报告、好产品、好判断。标准不清,agent 跑得越快,垃圾生成得越快。
What counts as a good report, a good product, a good judgment. Without clear standards, the faster agents run, the faster garbage is generated.
历史判断、失败经验、行业框架若不能被系统继承,每次协作都从零开始。
If historical judgments, failure lessons, and domain frameworks cannot be inherited by the system, every collaboration starts from zero.
哪些可自动检查、哪些必须人工 review、哪些错误可容忍、哪些不可接受。
What can be automatically checked, what requires human review, which errors are tolerable, and which are not.
AI 提供信息、证据、反证与模拟,但"要不要做"仍由人承担后果。
AI provides information, evidence, counter-evidence, and simulations, but the consequences of "whether to do it" are still borne by a human.
泰勒制 → 丰田 TPS → 华为 IPD:用流程吃掉个人英雄主义,解决规模化问题[R46]。
Taylorism → Toyota TPS → Huawei IPD: use process to absorb individual heroics and solve the problem of scaling[R46].
Amazon / Google 实验文化 → 字节跳动 A/B 化组织:用数据迭代组织本身,解决迭代速度问题[R47]。
Amazon / Google experiment culture → ByteDance A/B-ified organization: use data to iterate the organization itself and solve the problem of iteration speed[R47].
执行交给 agent 网络,人收敛为判断节点,针对串行瓶颈。公开样本仍薄:Anthropic 自述 10 个团队(含法务、增长营销等非工程团队)已把 agentic 工作流嵌入部分流程[R21]——是嵌入部分流程,不是整体运转。
Execution is delegated to agent networks and humans converge into judgment nodes, targeting serial bottlenecks. Public samples remain thin: Anthropic reports that 10 teams (including non-engineering teams such as legal and growth marketing) have embedded agentic workflows into some of their processes[R21]. The wording matters: embedded into some processes, not running the whole operation.
没有一个瓶颈是"技术不够好"造成的。传统组织是为一个前提设计的机器——人类协调成本高昂且不可压缩。层级、会议、审批链、年度预算、人头制、个人 KPI,全部是那个前提下的最优解。这也是为什么 Ivan Zhao 把 AI 称作"组织的钢铁"——人类沟通不再必须是承重墙,两小时的周对齐会塌缩成五分钟的异步评审[R22]。但钢铁不会自己重盖房子:早期工厂把水车换成蒸汽机却保留其余一切,生产率只微涨;真正的爆发发生在工厂围绕蒸汽机重新设计之后——今天大多数"加 AI"仍停在换水车阶段。这台机器没有坏,它只是在精确地解一道已经被删掉的题。
Not a single bottleneck is caused by "technology that is not good enough." The traditional organization is a machine designed for one premise: human coordination costs are high and incompressible. Hierarchy, meetings, approval chains, annual budgets, headcount-based capacity, individual KPIs: all are the optimal solution under that premise. This is also why Ivan Zhao calls AI "steel for the organization": human communication no longer has to be a load-bearing wall; a two-hour weekly alignment meeting can collapse to a five-minute async review[R22]. But steel does not rebuild houses on its own: early factories swapped waterwheels for steam engines while keeping everything else unchanged, and productivity barely rose; the real explosion came after factories redesigned themselves around the steam engine. Most of today's "adding AI" is still at the waterwheel-swap stage. The machine is not broken; it is just solving a problem that has already been deleted.
AI 删除了前提,但不会自动删除解。把 AI 加装到旧解上,得到的是一个更快的旧组织——十六个瓶颈原样保留,只是每个瓶颈处的队列前进得更体面了。这也是"转型"路径的根本困境:十六个瓶颈中的每一个,在存量组织里都有既得利益的守护者——中层是平方律的雇员,审批链是风险部门的领地,人头预算是权力的度量衡。结构问题之所以是结构问题,就在于它不能在结构内部被投票废除。
AI deleted the premise but does not automatically delete the solution. Overlay AI onto the old solution and you get a faster old organization: sixteen bottlenecks preserved intact, with queues at each bottleneck advancing more gracefully. This is also the fundamental dilemma of the "transformation" path: each of the sixteen bottlenecks has a vested-interest guardian in the incumbent organization: middle managers are employees of the quadratic law, approval chains are the risk department's territory, headcount budgets are the measure of power. The reason structural problems are structural problems is precisely that they cannot be voted out from within the structure.
从新前提重新推导组织——这正是本规约其余部分的内容:SECTION 05 的六个世界观是新前提的公理化,SECTION 07 的七大支柱是推导规则,SECTION 08 的四层底座是物理实现。
Re-deriving the organization from new premises is precisely what the rest of this specification covers: the six worldviews of SECTION 05 are the axiomatization of the new premise; the seven pillars of SECTION 07 are the derivation rules; the four-layer foundation of SECTION 08 is the physical implementation.
六个底层世界观
Six Foundational Worldviews
先于一切支柱的,是看世界的方式。拒绝这六个模型,支柱是空中楼阁;接受它们,支柱几乎是推论。
Before any pillar comes the way you see the world. Reject these six models and the pillars are castles in the air; accept them and the pillars are nearly corollaries.
组织即工作流图
Organization-as-Workflow-Graph
传统组织里,组织图是真相——显示谁向谁汇报。AI Native 组织里,工作流图是真相——显示什么流向哪里、什么触发什么、什么决定什么。组织图如果还存在,是工作流图的下游产物。组织图的权威也比想象中年轻——第一张现代组织图谱是 1855 年 McCallum 为 Erie 铁路所画;Mollick 据此提醒:从组织图谱到敏捷,现有组织技术全部预设"单一的、仅人类的智能"[R8]——所以是重建,不是改装。
In a traditional organization the org chart is the truth: it shows who reports to whom. In an AI Native organization the workflow graph is the truth: it shows what flows where, what triggers what, what decides what. If an org chart still exists it is a downstream artifact of the workflow graph. The authority of the org chart is also younger than we imagine: the first modern org chart was drawn by McCallum for the Erie Railroad in 1855; Mollick uses that fact to remind us that every existing management technology from org charts to agile presupposes "a single, exclusively human intelligence" [R8]. So this is a rebuild, not a retrofit.
Agent 即默认工种
Agent as the Default Role
设计任何任务时的默认假设是——Agent 来做这件事。人类只在有特定理由时介入:判断、问责、关系。这反转了传统偏置:传统问"要不要自动化",AI Native 问"这真的需要人吗"。
When designing any task, the default assumption is that an Agent does it. Humans intervene only when there is a specific reason: judgment, accountability, relationships. This inverts the traditional bias: the old question was "should we automate this?" The AI Native question is "does this genuinely need a human?"
上下文即核心资产
Context as the Core Asset
新的资产类别——组织上下文(organizational context)。结构化、可被 Agent 检索的组织思考、决策、运营。它是 AI Native 组织建立的护城河,而且复利积累。Karpathy 把这件事的必要性说得更狠——LLM 患有"顺行性遗忘症",不像同事那样积累语境,上下文必须被显式工程化[R6]:这个资产不是锦上添花,是 Agent 可用的前提。
A new asset class: organizational context, the organization's thinking, decisions, and operations structured so Agents can retrieve them. It is the moat an AI Native organization builds, and it compounds. Karpathy states the necessity more bluntly: LLMs suffer from "anterograde amnesia" and cannot accumulate context the way a colleague does, so context must be explicitly engineered [R6]. This asset is not a luxury; it is the prerequisite for Agent usability.
持续学习即操作系统
Continuous Learning as the Operating System
传统组织通过周期性干预改进。AI Native 组织持续改进——每一次工作流执行都被观察、评估、用来改进工作流本身。这是批处理 vs 流处理,应用到组织学习上。
Traditional organizations improve through periodic interventions. AI Native organizations improve continuously: every workflow execution is observed, evaluated, and used to improve the workflow itself. This is batch processing vs. stream processing, applied to organizational learning.
人即判断锚点
Humans as Judgment Anchors
人不是劳动力。人是判断者、责任承担者、品味设定者、关系持有者。这不是降职,是升职——人在组织中的角色从"执行单位"上升到"判断单位"。Karpathy 的验证瓶颈论与此互证:AI 生成、人类验证——部分自治加滑块,胜过全自治加事故[R6]。
Humans are not labor. Humans are judges, accountability holders, taste-setters, and relationship owners. This is not a demotion but a promotion: the human role in the organization rises from "execution unit" to "judgment unit." Karpathy's verification-bottleneck argument confirms this: AI generates, humans verify. Partial autonomy with a slider beats full autonomy with accidents [R6].
组织即生命系统
Organization as a Living System
传统组织被设计成工厂——部门、流程、岗位说明书,是一组刚性隔间。AI Native 组织被设计成生命系统——一切流动,发现问题的"细胞"被授权直接响应,秩序自下而上涌现而非自上而下指派。这是小型组织能比大型快一个数量级的结构性原因——不是更聪明,是结构不同。它也把光谱两端缝合:N=1 的一人公司是单细胞高密度判断体,N=众多的 agent 网络是多细胞涌现体,同一套生命逻辑。详见 SHEET 06。
Traditional organizations are designed as factories: departments, processes, and job descriptions form a set of rigid compartments. AI Native organizations are designed as living systems: everything flows; the "cell" that detects a problem is authorized to respond directly; order emerges bottom-up rather than being assigned top-down. This is the structural reason a small organization can move an order of magnitude faster than a large one: not because it is smarter, but because the structure is different. It also joins the two ends of the spectrum: the N=1 one-person company is a single-cell, high-density judgment body; the N=many agent network is a multi-cell emergent body; the same living logic applies to both. See SHEET 06.
操作者即编排者
Operator as Orchestrator
前面五个心智模型描述什么——AI Native 组织看世界的角度。第六个心智模型描述怎么做——操作者在这个组织里实际扮演什么角色。传统组织的操作者是 individual contributor——写代码、做产品、管理人、跑流程。AI Native 组织的操作者是 orchestrator——注意力上移:从执行到引导、从生产到判断、从单点工作到系统设计。
The first five mental models describe what: the angles from which an AI Native organization sees the world. The sixth describes how: what role the operator actually plays in that organization. The operator in a traditional organization is an individual contributor: writing code, building product, managing people, running processes. The operator in an AI Native organization is an orchestrator, whose attention moves up: from execution to guidance, from production to judgment, from point work to system design.
这个角色转换需要一组新的核心技能。上下文工程——让 Agent 持续对齐你的组织而不是漂移。Prompt 与 Skills 设计——把判断标准外显化为 Agent 可执行的指令。Evaluation 框架——让你看见 Agent 表现而不是猜测。判断节点设计——决定工作流的哪些步骤必须人介入、哪些可以放手。这些技能不再是工程师的专属,它们是 AI Native 组织里每个 operator 的必修课——产品经理、销售、运营、HR、财务,全员适用。
This role shift demands a new set of core skills. Context engineering: keeping Agents continuously aligned to your organization rather than drifting. Prompt and Skills design: externalizing judgment standards into Agent-executable instructions. Evaluation frameworks: letting you see Agent performance rather than guessing at it. Judgment node design: deciding which workflow steps must have human intervention and which can be released. These skills are no longer the exclusive domain of engineers; they are required study for every operator in an AI Native organization. Product managers, sales, operations, HR, finance: everyone.
这个模型把前五个模型激活——工作流图(M.01)要有人去画;Agent 作为默认工种(M.02)要有人去配置;上下文(M.03)要有人去工程化;持续学习(M.04)要有人去设计反馈循环;判断锚点(M.05)要有人去定位。没有 orchestrator 的 AI Native 是空架构,没有 AI Native 架构的 orchestrator 是疲惫的杂工。两者互为前提。
This model activates the previous five: someone must draw the workflow graph (M.01); someone must configure Agent as the default role (M.02); someone must engineer the context (M.03); someone must design the feedback loops for continuous learning (M.04); someone must locate the judgment anchors (M.05). AI Native without an orchestrator is empty architecture; an orchestrator without AI Native architecture is an exhausted odd-job worker. Each is the other's prerequisite.
六个模型到此立全。但它们是静态的公理——立的是看世界的方式,还没有回答一个组织如何在时间里活下去。下一张图纸换上生命系统的视角,看这套公理如何自我维持、自我进化。
The six models are now complete. But they are static axioms: they establish a way of seeing the world, not yet how an organization stays alive in time. The next blueprint switches to the living-system view to watch these axioms sustain and evolve themselves.
组织作为生命系统
Organization as a Living System
机器会停摆,生命会适应。把组织当机器设计,你得到一台精确但僵硬的装置;把它当生命系统设计,你得到一个会自我修复、自我进化的有机体。这一张图纸给出那套生物学逻辑——以及它在什么条件下不成立。
Machines break down; living systems adapt. Design an organization like a machine and you get something precise but brittle. Design it like a living system and you get an organism that self-repairs and self-evolves. This blueprint lays out that biological logic, and the conditions under which it does not hold.
五条生物学原理,每条对应一个已被正典使用的设计动作:
Five biological principles, each mapped to a design action already used in the canon:
涌现 · Emergence
Emergence · CAS & Stigmergy
Holland 的复杂适应系统:大量简单单元按局部规则交互,全局秩序自下而上涌现,无需中央设计者。蚁群不开会——它们通过 stigmergy(在共享环境里留痕、读痕;Grassé 1959 提出,Heylighen 2016 给出现代综述)间接协调。对应支柱 02/03:agent 读写共享上下文,而非互相抛接文档。可证伪:若高一致性需求场景下,涌现式自组织系统性劣于显式编排,则此映射受限。
Holland's complex adaptive systems: large numbers of simple units interact under local rules, and global order emerges bottom-up with no central designer. Ant colonies hold no meetings; they coordinate indirectly through stigmergy (leaving and reading traces in a shared environment; coined by Grassé 1959, synthesized in Heylighen 2016). Maps to Pillars 02/03: agents read and write a shared context store rather than passing documents to one another. Falsifiability: if emergence-based self-organization is systematically inferior to explicit orchestration in high-coherence-demand scenarios, this mapping is constrained.
适应度景观 · Fitness Landscape
Fitness Landscape · NK Model (Kauffman)
Kauffman 的 NK 模型:组织在崎岖景观上爬坡,探索(找新峰)与利用(爬当前峰)需动态平衡。self-improving 的本质就是持续的局部爬坡 + 偶发的跳跃探索。对应失败模式"演化失败"——锁死在局部最优。可证伪:若组织绩效与"探索-利用平衡度"无可测相关,则模型不解释现实。
Kauffman's NK model: organizations climb rugged landscapes where exploration (finding new peaks) and exploitation (ascending the current peak) must be dynamically balanced. The essence of self-improving is continuous local hill-climbing plus occasional leap-exploration. Maps to the failure mode "evolutionary stagnation": getting locked into a local optimum. Falsifiability: if organizational performance shows no measurable correlation with explore-exploit balance, the model does not explain reality.
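The explore-exploit tension can be seen on a toy landscape. The sketch below is not an implementation of the NK model; it assumes a hand-built two-peak fitness function (`two_peaks`, a hypothetical name) over 8-bit configurations, where all-zeros is a local peak and all-ones is the global peak. `climb_with_jumps` does single-bit hill-climbing and, with probability `jump_prob`, jumps to a random configuration.

```python
import random
from typing import Callable, Tuple

Config = Tuple[int, ...]


def two_peaks(config: Config) -> float:
    """Toy rugged landscape: local peak at all zeros, global peak at all ones."""
    ones = sum(config)
    if ones >= len(config) / 2:
        return float(ones)
    return (len(config) - ones) * 0.75


def climb_with_jumps(
    fitness: Callable[[Config], float],
    start: Config,
    steps: int,
    jump_prob: float,
    rng: random.Random,
) -> Config:
    current = best = start
    for _ in range(steps):
        if rng.random() < jump_prob:
            # Exploration: move to a random point, even if it is worse.
            current = tuple(rng.randint(0, 1) for _ in start)
        else:
            # Exploitation: flip one bit, keep it only if fitness does not drop.
            i = rng.randrange(len(current))
            neighbor = current[:i] + (1 - current[i],) + current[i + 1:]
            if fitness(neighbor) >= fitness(current):
                current = neighbor
        if fitness(current) > fitness(best):
            best = current
    return best


zeros: Config = (0,) * 8
stuck = climb_with_jumps(two_peaks, zeros, steps=200, jump_prob=0.0, rng=random.Random(0))
explored = climb_with_jumps(two_peaks, zeros, steps=200, jump_prob=0.2, rng=random.Random(0))
```

With `jump_prob=0.0` the climber never leaves all-zeros, because every single-bit flip lowers fitness; that is the "locked into a local optimum" failure mode in miniature. With jumps enabled the search can reach the other basin, at the price of sometimes abandoning a good position.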
免疫系统 · Distributed Defense
Immune System · Distributed Defense
免疫系统是分布式异常检测——没有中央哨兵,每个局部都能识别并响应异常。对应支柱 05 可观测性与 guardrails:遥测+护栏=组织的免疫细胞,在边缘就地拦截幻觉、越权、数据泄露。可证伪:若集中式审计在等同成本下检出率显著高于分布式,则类比失效。
The immune system is distributed anomaly detection: no central sentinel; every local node can identify and respond to aberrations. Maps to Pillar 05 observability and guardrails: telemetry + guardrails = the organization's immune cells, intercepting hallucinations, privilege escalation, and data leakage at the edge. Falsifiability: if centralized auditing achieves a significantly higher detection rate at equivalent cost, the analogy fails.
菌丝网络 · Resource Reallocation
Mycelium Network · Resource Reallocation (Tero 2010)
菌丝/黏菌按局部信号动态重分配资源到高回报路径,无中央调度(黏菌求最短路径已有 Tero 2010 Science 实证,但映射到组织资源调度仍属 Ⅲ 级类比)。对应工作流图的动态扇出与算力/注意力的按需流动——资源跟着判断走,不跟着科层走。可证伪:若动态重分配的协调开销在规模上超过其收益,则退化为需要调度层。
Mycelium and slime mold dynamically reallocate resources to high-return paths according to local signals, with no central dispatcher (slime mold's shortest-path optimization is empirically demonstrated in Tero et al., Science 2010, though the mapping to organizational resource scheduling remains a Level III analogy). Maps to the workflow graph's dynamic fan-out and the on-demand flow of compute and attention: resources follow judgment, not hierarchy. Falsifiability: if the coordination overhead of dynamic reallocation exceeds its benefits at scale, the system degrades and requires a scheduling layer.
自我进化 · Self-Improving
Self-Evolving · Self-Improving Loop (Argyris-Schön / OODA)
生命的标志是自我改进的闭环:感知→响应→把结果喂回改进自身。组织级实现=遥测 → eval → 自动改进工作流本身,区别于人类组织的周期性干预(年度复盘)。这把 M.04 持续学习从口号变成机制:每一次执行都是一次适应度采样。Argyris-Schön 的双环学习(1978)、Boyd 的 OODA 循环(见 Osinga 2007 的体系化重构)是其人类尺度前身。可证伪:若无人监督的自动改进闭环在实践中系统性引入 reward hacking 而不可治理,则 self-improving 需重新加入人类锚(接支柱 07/05)。
The hallmark of life is a self-improving closed loop: sense → respond → feed results back to improve the system itself. The organizational implementation is telemetry → eval → automatically improving the workflow itself, in contrast to the periodic interventions of human organizations (the annual retrospective). This transforms M.04 continuous learning from slogan into mechanism: every execution is a fitness sample. Argyris-Schön's double-loop learning (1978) and Boyd's OODA loop (see Osinga 2007 for the systematic reconstruction) are its human-scale predecessors. Falsifiability: if unsupervised self-improving loops systematically introduce ungovernable reward hacking in practice, then self-improving must reintroduce a human anchor (connecting to Pillars 07/05).
生命系统逻辑不分大小:N=1 的一人公司是单细胞高密度判断体——一个判断核 + 一座上下文库,靠滚动实验自我迭代(见 SHEET 14 的"同心节奏");N=众多的 agent 网络是多细胞涌现体——局部规则下秩序自组织。两端不是两套方法论,是同一套生命系统在不同细胞数下的表现。这正是本图集把一人公司收进同一体系、而非另立一卷的根本原因:规模是细胞数的选择,连贯性是生命的本征。而无论细胞数取一还是取众,骨架都是同一副——下一张图纸 SHEET 07,逐根立起这七根支柱。
Living-system logic is scale-invariant: the N=1 one-person company is a single-cell, high-judgment-density entity, one judgment core plus one context store, self-iterating through rolling experiments (see SHEET 14, "Concentric Rhythms"); the N=many agent network is a multi-cell emergent body, where order self-organizes under local rules. The two ends are not two different methodologies; they are the same living system expressed at different cell counts. This is precisely why the atlas folds the one-person company into a single framework rather than giving it a separate volume: scale is a choice of cell count; coherence is the intrinsic property of life. And whichever cell count is chosen, the skeleton is the same: the next blueprint, SHEET 07, raises its seven pillars one by one.
方法论的骨架
The Skeleton of the Methodology
七个支柱是相互依存的工程承诺,不是孤立的最佳实践。它们一起,构成 AI Native 组织的可施工蓝图。每根支柱先用一行划清"它不是什么"——歧义是这类术语最大的敌人。
The seven pillars are mutually dependent engineering commitments, not isolated best practices. Together they form a constructible blueprint for the AI Native organization. Each pillar opens with one line establishing what it is not; ambiguity is the greatest enemy of terms like these.
AI 优先即默认
AI-First as Default
每一次工作流设计都从一个问题开始——如果这件事必须由 AI Agent 端到端完成,我们会怎么设计它?这不是思想实验,是实际的设计起点。只有在通过这个设计之后,你才问"哪里会断?人的判断必须插入到哪里?"
Every workflow design starts from one question: if this task had to be completed end-to-end by an AI Agent, how would we design it? This is not a thought experiment; it is the actual design starting point. Only after working through that design do you ask: "Where will it break? Where must human judgment be inserted?"
这反转了传统设计序列。传统设计从现有人类角色出发,问 AI 能在哪里帮忙。AI Native 设计从完全 agentic 的理想出发,问人必须在哪里介入。组织的设计压力把人类推向真正只有人能做的领域——判断、关系、品味、责任。
This inverts the traditional design sequence. Traditional design starts from existing human roles and asks where AI can assist. AI Native design starts from the fully agentic ideal and asks where humans must intervene. The organization's design pressure pushes humans toward the domains only humans can occupy: judgment, relationships, taste, accountability.
- Inversion: Human-first → AI-first
- Pressure: Pushes humans up the stack
- Failure: "AI as helper"
工作流即代码
Workflow as Code
在 AI Native 组织里,工作流不被描述在 PowerPoint 里,不靠记忆运行,不由部落知识维护。它们被规约在code或机器可执行的结构化定义中——可被版本化、被分支、被观察、被持续优化。
In an AI Native organization, workflows are not described in PowerPoint, do not run on memory, and are not maintained by tribal knowledge. They are specified in code or in machine-executable structured definitions, so they can be versioned, branched, observed, and continuously improved.
这听起来像技术细节,实际上是最重要的架构决定。当工作流是代码时,它们可以被测试、被观察、被调试、被优化;当它们不是代码时,它们困在人脑中,产生折磨传统组织的慢性流程漂移。纪律是:永远不要让一个重要的工作流只存在于某个人的头脑里。
This sounds like a technical detail; it is actually the most important architectural decision. When workflows are code, they can be tested, observed, debugged, and optimized; when they are not code, they are trapped in human minds and produce the chronic process drift that plagues traditional organizations. The discipline is: never let an important workflow exist only inside someone's head.
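As an illustration of what "workflow as code" can mean at its smallest, the sketch below defines a workflow as a versioned, immutable value whose execution emits a trace. It uses plain Python rather than any of the orchestration tools named in this pillar, and every name in it (`Step`, `Workflow`, `refund_reply`, the step functions) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]


@dataclass(frozen=True)
class Step:
    name: str
    actor: str  # "agent" or "human"
    run: Callable[[State], State]


@dataclass(frozen=True)
class Workflow:
    name: str
    version: str
    steps: Tuple[Step, ...]

    def execute(self, state: State) -> Tuple[State, List[str]]:
        """Run every step in order and return the final state plus a trace."""
        trace: List[str] = []
        for step in self.steps:
            state = step.run(dict(state))  # copy so a step cannot mutate history
            trace.append(f"{self.name}@{self.version}:{step.name}:{step.actor}")
        return state, trace


def draft_reply(state: State) -> State:
    # Stand-in for an agent call.
    state["draft"] = f"Re: {state['ticket']}"
    return state


def approve_reply(state: State) -> State:
    # Stand-in for a human review step; a real system would block on a queue.
    state["approved"] = True
    return state


refund_reply = Workflow(
    name="refund_reply",
    version="1.2.0",
    steps=(Step("draft", "agent", draft_reply), Step("approve", "human", approve_reply)),
)
final_state, trace = refund_reply.execute({"ticket": "T-1"})
```

Because the definition is an ordinary value in a source file, it can be diffed, reviewed, branched, and tested like any other code, which is the property the paragraph above argues for.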
- Stack: Temporal · n8n · LangGraph · Inngest
- Property: Versionable, observable
- Failure: Tribal knowledge drift
上下文工程作为系统实践
Context Engineering as Systematic Practice
Tobi Lütke 在 Shopify 把上下文工程从一种 ad-hoc 技能升格为系统实践。组织主动构建 Agent 运行的信息环境。所有内部文档为 Agent 检索而结构化;维护活的上下文存储;同时为人和 Agent 写作。
Tobi Lütke at Shopify elevated context engineering from an ad-hoc skill to a systematic practice. The organization actively constructs the information environment in which agents operate. All internal documentation is structured for agent retrieval; a living context store is maintained; writing serves both humans and agents simultaneously.
最深的原则是——组织采取的每个动作都应该产生结构化的上下文作为副产品。会议产生 Agent 可检索的总结。决策被记录连同决策理由。客户互动被捕获。日积月累,上下文存储成为组织最有价值的资产——是让你的 Agent 在用同样底层模型的情况下,质量上明显优于竞争对手 Agent 的底层基质。这是 AI 时代的真正护城河。
The deepest principle is this: every action the organization takes should produce structured context as a byproduct. Meetings generate agent-retrievable summaries. Decisions are recorded together with their rationale. Customer interactions are captured. Over time, the context store becomes the organization's most valuable asset: the underlying substrate that allows your agents to outperform competitors' agents in quality even when running on the same base models. This is the true moat of the AI era.
- Stack: Pinecone · Weaviate · Glean · Notion AI
- Property: Compounding moat
- Failure: Context starvation
多模型架构
Multi-Model Architecture
AI Native 设计中最深的单一风险,是算法封建主义(algorithmic feudalism)——把业务深度依赖于一家基础模型供应商,让供应商实际上变成你的地主。
The deepest single risk in AI Native design is algorithmic feudalism: deeply coupling the business to a single foundation-model provider, effectively making that provider your landlord.
架构上的防御是多模型。关键工作流应该被设计成可在数日内切换底层模型——配合质量回归测试。这要求工作流代码与具体模型 API 之间有抽象层;质量评估基础设施可以针对多个模型测试同一工作流;和至少两家供应商保持持续关系。开源权重模型应该被评估,用于可自托管的关键工作流——即使你大多数时候用商用 API,开源模型的可选性本身是战略资产。
The architectural defense is multi-model design. Critical workflows should be built to switch underlying models within days, supported by quality regression testing. This requires an abstraction layer between workflow code and specific model APIs; quality-evaluation infrastructure that can test the same workflow against multiple models; and ongoing relationships with at least two providers. Open-weight models should be evaluated for critical workflows that can be self-hosted. Even if you mostly use commercial APIs, the optionality of open-weight models is itself a strategic asset.
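The abstraction layer and the quality regression test described above can be sketched together. This is a minimal illustration under stated assumptions: `TextModel` is a hypothetical internal interface, `StubModel` stands in for real vendor adapters (no vendor SDK is called), and the two checks in `CASES` are invented examples of regression criteria.

```python
from typing import Callable, List, Protocol, Tuple


class TextModel(Protocol):
    """The only surface that workflow code is allowed to depend on."""
    name: str

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Stand-in for a vendor adapter; a real one would wrap that vendor's SDK."""

    def __init__(self, name: str, behavior: Callable[[str], str]) -> None:
        self.name = name
        self._behavior = behavior

    def complete(self, prompt: str) -> str:
        return self._behavior(prompt)


def summarize_ticket(model: TextModel, ticket: str) -> str:
    # Workflow code sees only the TextModel interface, never a vendor API.
    return model.complete(f"Summarize: {ticket}")


Case = Tuple[str, Callable[[str], bool]]

CASES: List[Case] = [
    ("Customer requests a refund for order 1182", lambda out: "refund" in out.lower()),
    ("Login page returns HTTP 500", lambda out: "500" in out),
]


def pass_rate(model: TextModel, cases: List[Case]) -> float:
    passed = sum(1 for ticket, check in cases if check(summarize_ticket(model, ticket)))
    return passed / len(cases)


def eligible(models: List[TextModel], cases: List[Case], threshold: float = 1.0) -> List[str]:
    """Names of models that clear the regression suite and may serve this workflow."""
    return [m.name for m in models if pass_rate(m, cases) >= threshold]


primary = StubModel("vendor_a", lambda prompt: prompt.lower())
fallback = StubModel("vendor_b", lambda prompt: prompt.upper())
broken = StubModel("vendor_c", lambda prompt: "")
```

Switching providers then reduces to writing one adapter and rerunning the same suite; a candidate that fails the suite never reaches the workflow.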
- Stack: OpenAI + Anthropic + Llama/Qwen
- Property: Optionality, sovereignty
- Failure: Provider hostage
可观测性先于规模
Observability Before Scale
NANDA 报告对那 95% 的自家归因是"学习缺口"——工具不持有记忆、不积累上下文、不随使用变好;报告还有一个常被引用者略去的反向发现:外购方案的落地成功率约为自建的两倍。本支柱取其上游含义:无论买还是建,组织若没有观察、评估、改进 AI 行为的基础设施,部署就无从学习——他们在能看见之前就开始扩规模。
The NANDA report's own explanation for that 95% is a "learning gap": tools that hold no memory, accumulate no context, and do not improve with use. The report also contains a reverse finding that those who cite it often omit: purchased solutions reach successful deployment at roughly twice the rate of in-house builds. This pillar takes the upstream implication: whether you buy or build, an organization without infrastructure to observe, evaluate, and improve AI behavior cannot learn from its deployments. Such organizations start scaling before they can see.
方法论要求反过来。任何 Agent 工作流上线前,可观测性层必须存在:每个 Agent 行动被记录;每次模型调用被追踪;输出被采样以做质量评估;失败被路由到人类审查。在 AI Native 组织里,可观测性之于运营,等同于会计之于财务。你不会不记账就运营公司;你也不应该不可观测就运行 AI Native 工作流。这不是工程奢侈品,是基础设施下限。
The methodology demands the opposite. Before any agent workflow goes live, the observability layer must exist: every agent action is logged; every model call is traced; outputs are sampled for quality evaluation; failures are routed to human review. In an AI Native organization, observability is to operations what accounting is to finance. You would not run a company without bookkeeping; you should not run AI Native workflows without observability. This is not an engineering luxury; it is the infrastructure floor.
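The four requirements in this paragraph (log every action, trace every call, sample outputs, route failures to humans) fit in one small wrapper. The sketch below is an assumption-laden illustration, not the API of any tool listed in this pillar: `Observability` and its fields are hypothetical, and in-memory lists stand in for a real trace store and review queue.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Observability:
    sample_rate: float = 0.1
    rng: random.Random = field(default_factory=random.Random)
    traces: List[Dict] = field(default_factory=list)        # every call
    eval_samples: List[Dict] = field(default_factory=list)  # sampled for quality eval
    review_queue: List[Dict] = field(default_factory=list)  # failures for human review

    def call(self, workflow: str, step: str, fn: Callable[[str], str], prompt: str) -> Optional[str]:
        """Run one model or agent call with tracing, sampling, and failure routing."""
        started = time.perf_counter()
        record: Dict = {"workflow": workflow, "step": step, "prompt": prompt}
        output: Optional[str] = None
        try:
            output = fn(prompt)
        except Exception as exc:
            record.update(status="error", error=repr(exc))
            self.review_queue.append(record)
        else:
            record.update(status="ok", output=output)
            if self.rng.random() < self.sample_rate:
                self.eval_samples.append(record)
        record["latency_s"] = time.perf_counter() - started
        self.traces.append(record)
        return output


def flaky_model(prompt: str) -> str:
    raise TimeoutError("upstream model timed out")


obs = Observability(sample_rate=1.0)
ok = obs.call("refund_reply", "draft", str.upper, "hello")
failed = obs.call("refund_reply", "draft", flaky_model, "hello")
```

The wrapper sits between the workflow and the model, so a workflow cannot go live without producing traces; that is the "observability before scale" ordering expressed as a code dependency.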
- Stack: LangSmith · Helicone · Arize · Weave
- Property: Pre-scale infrastructure
- Failure: Blind scaling
人作为判断与责任锚
Humans as Judgment & Responsibility Anchors
AI Native 不是"无人"或"最少人",而是把人定位在工作流图的最高杠杆点。三类人锚定的决策不可妥协——不可逆决策(任何无法廉价撤回的事);承载声誉的决策(任何组织名字公开附着的事);承载价值观的决策(伦理、品味、关系比效率更重要的事)。
AI Native is not "no humans" or "minimal humans"; it is about placing humans at the highest-leverage nodes of the workflow graph. Three categories of decision require a human judgment anchor, without compromise: irreversible decisions (anything that cannot be cheaply undone); reputation-bearing decisions (anything the organization's name is publicly attached to); and values-bearing decisions (situations where ethics, taste, or relationships matter more than efficiency).
Air Canada(被法庭判决必须为 chatbot 承诺负责)和 Cursor "Sam"(编造公司政策的 AI)说明了这个支柱缺失时会发生什么。把人移出这些决策类别,省下的人力成本远不及导致的代价。
Air Canada (held liable by a court for commitments its chatbot made) and Cursor's "Sam" (an AI that fabricated company policy) illustrate what happens when this pillar is absent. The labor cost saved by removing humans from these decision categories is far smaller than the cost of the consequences.
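The three anchor categories can be turned into a routing rule at the point where an agent proposes an action. The sketch below assumes the three properties are already known for each action (how they get classified is the hard part and is out of scope here); `ProposedAction`, `required_anchors`, and `route` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Anchor(Enum):
    IRREVERSIBLE = "irreversible"
    REPUTATION = "reputation"
    VALUES = "values"


@dataclass(frozen=True)
class ProposedAction:
    description: str
    cheaply_reversible: bool
    publicly_attributed: bool  # the organization's name is attached to it
    values_sensitive: bool     # ethics, taste, or relationships outweigh efficiency


def required_anchors(action: ProposedAction) -> Set[Anchor]:
    anchors: Set[Anchor] = set()
    if not action.cheaply_reversible:
        anchors.add(Anchor.IRREVERSIBLE)
    if action.publicly_attributed:
        anchors.add(Anchor.REPUTATION)
    if action.values_sensitive:
        anchors.add(Anchor.VALUES)
    return anchors


def route(action: ProposedAction) -> str:
    """Any anchor at all sends the action to a human; otherwise the agent proceeds."""
    return "human_review" if required_anchors(action) else "agent_autonomous"


rotate_logs = ProposedAction("Rotate internal logs", True, False, False)
promise_refund = ProposedAction("Tell a customer they will get a refund", False, True, False)
```

The rule is deliberately a disjunction: a single anchor is enough to require a human, which matches the "without compromise" wording above.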
- Anchor types: Irreversible · Reputation · Values
- Mode: Human-in/on-the-loop
- Failure: Liability vacuum
持续演化
Continuous Evolution
传统组织每几年"转型"一次——发起一个大变革倡议、重组、重新平台化。AI Native 组织没有"转型事件",因为它在持续演化。组织节奏发生转变——没有"5 年战略",因为接下来 5 年不会像过去 5 年,底层技术移动得太快。
Traditional organizations "transform" every few years: launching a major change initiative, reorganizing, re-platforming. AI Native organizations have no "transformation events" because they are continuously evolving. The organizational cadence shifts: there is no "5-year strategy," because the next five years will not resemble the last five; the underlying technology moves too fast.
有的是 90 天节奏(Anthropic 据报道最长规划周期是 90 天),嵌入在更长期的方向感中(1-3 年愿景),而后者本身随景观变化而更新。这对受过传统规划训练的人不舒服。它是 AI Native 运营的自然模式。
What exists instead is a 90-day cadence (Anthropic's reported maximum planning horizon is 90 days), embedded within a longer-term sense of direction (a 1-3 year vision) that itself updates as the landscape changes. This is uncomfortable for people trained in traditional planning. It is the natural operating mode of AI Native.
- Cadence: 90-day rolling
- Reference: Anthropic, Cursor
- Failure: Static architecture
四层基础设施
Four Layers of Infrastructure
一个 AI Native 组织的运营底层有四层。每一层在组织能宣称这个名号之前都必须就位。缺少任何一层,你不是 AI Native——只是在用 AI。
An AI Native organization's operating substrate has four layers. Every layer must be in place before the organization can claim that name. Miss any one of them and you are not AI Native; you are merely using AI.
可观测性层
Observability Layer
让系统持续可学习的东西。日志、追踪、评估、警报,以及把问题路由回人类的审查队列。没有它,你在比人类纠错速度更快地扩展失败。
What keeps the system continuously learnable. Logs, traces, evaluations, alerts, and a review queue that routes issues back to humans. Without it, you are scaling failure faster than humans can correct it.
Arize · W&B Weave
Galileo · Braintrust
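The "review queue that routes issues back to humans" can be sketched in a few lines. This is an illustrative sketch only, assuming each agent action yields an evaluation score between 0 and 1; the `TraceEvent` and `ReviewQueue` names and the 0.7 threshold are hypothetical, and it does not use the API of any vendor listed above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TraceEvent:
    workflow: str
    eval_score: float  # assumed range: 0.0 (bad) to 1.0 (good)


@dataclass
class ReviewQueue:
    threshold: float = 0.7  # hypothetical cut-off for escalation
    log: list = field(default_factory=list)
    pending: deque = field(default_factory=deque)

    def record(self, event: TraceEvent) -> bool:
        """Log every event; return True if it was escalated to humans."""
        self.log.append(event)
        if event.eval_score < self.threshold:
            self.pending.append(event)
            return True
        return False
```

Every event lands in the log, and only low-scoring ones enter the human queue, which is what keeps the correction loop from being swamped as agent volume grows.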
上下文层
Context Layer
让 Agent 变得组织特定的东西。向量数据库、知识图谱、决策日志,以及让这些保持鲜活的工程实践。没有它,你的 Agent 是泛化的;有了它,它们成为独属于你的。
What makes Agents organization-specific. Vector databases, knowledge graphs, decision logs, and the engineering practices that keep them fresh. Without it, your Agents are generic; with it, they become uniquely yours.
Chroma · Qdrant
Glean · Sana · Notion AI
Agent 层
Agent Layer
工作流执行的地方。包括编排框架(LangGraph、CrewAI、AutoGen 或自研),Agent 运行时,以及把 Agent 连接到工具、数据库、外部系统的集成层。
Where workflow execution happens. Includes orchestration frameworks (LangGraph, CrewAI, AutoGen, or custom-built), Agent runtimes, and the integration layer that connects Agents to tools, databases, and external systems.
AutoGen · Letta
Pydantic AI · Inngest
模型层
Model Layer
基础——访问多个基础模型,通常至少一家前沿 API 供应商,加上用于主权工作流的开源权重模型,并有抽象层使模型可被切换。没有这一层,你不是 AI Native,你是 API 依赖。
The foundation: access to multiple foundation models, typically at least one frontier API provider plus open-weight models for sovereignty-sensitive workflows, with an abstraction layer that makes models swappable. Without this layer, you are not AI Native; you are API-dependent.
Google · Mistral
Llama · Qwen · DeepSeek
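The "abstraction layer that makes models swappable" might look like the sketch below. It assumes every backend can be wrapped as a plain function from prompt to text; `ModelRouter` and the backend names are hypothetical, and the lambdas in the usage note stand in for real provider clients, whose APIs are not shown.

```python
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # assumed shape: prompt in, completion out


class ModelRouter:
    """Maps workflows to named backends so a backend can be swapped in one place."""

    def __init__(self) -> None:
        self._backends: Dict[str, ModelFn] = {}
        self._routes: Dict[str, str] = {}

    def register(self, name: str, fn: ModelFn) -> None:
        self._backends[name] = fn

    def assign(self, workflow: str, backend: str) -> None:
        # Refuse routes to backends that were never registered.
        if backend not in self._backends:
            raise KeyError(f"unknown backend: {backend}")
        self._routes[workflow] = backend

    def complete(self, workflow: str, prompt: str) -> str:
        return self._backends[self._routes[workflow]](prompt)
```

Usage, with stub backends: call `router.register("frontier_api", lambda p: "frontier:" + p)`, then `router.assign("contract_review", "frontier_api")`. Reassigning the workflow to an open-weight backend later changes one line of configuration and none of the calling code, which is the property the paragraph above asks for.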
把这四层叠起来,组织的"样子"也变了。传统组织图是层级方框、用岗位说明书定义角色;AI-Native 的"组织图"只有三件——少数判断节点、一张近零边际成本的 agent 网、一层流动的上下文。一个判断者可指挥 50–100 个 agent;结构随工作量伸缩,而不随人数。
Stack those four layers and the shape of the organization changes too. A traditional org chart is boxes in a hierarchy, with job descriptions defining roles; the AI-Native "org chart" has only three things: a few judgment nodes, an agent network at near-zero marginal cost, and one layer of flowing context. One judge can direct 50–100 agents; the structure scales with the workload, not with headcount.
五类实证样本
Five Groups of Verified Cases
方法论不靠雄辩成立,靠样本。2024-2026 年的实践按五类归档——原生、转型、失败、中国实验,以及专门用来拆幸存者偏差的对照与阵亡组——每一个样本,都是对这套图纸的一次受力测试。
A methodology earns its standing through evidence, not rhetoric. Practices from 2024-2026 are filed in five groups: born-native, transitioning, failures, China experiments, and a controls-and-casualties group designed specifically to dismantle survivorship bias. Every case is a stress test of this blueprint.
- PATTERN
- 原生型最成功,转型型次之,客户面 AI 化最危险
- Born-native performs best; transitioning next; customer-facing AI is most dangerous
- RULE OF THUMB
- 从后台开始 · 从内部开始 · 从可逆决策开始
- Start from the back office · Start from the inside · Start from reversible decisions
这五类样本合起来,给出了一张AI Native 实践的实证地图。原生型(Group A)证明了"从零架构"路径的可行性,但需要极强的创始人判断与执行力。转型型(Group B)证明了"逐步推进"是可能的,但措辞与节奏决定生死——Shopify 成功,Duolingo 翻车,IBM 半成。失败案例(Group C)一致指向客户面 AI 是最高风险区域——任何对外承诺、任何不可逆决策都不应交给 AI 单独处理。中国案例(Group D)展示了另一条道路——不是 1 人公司,也不是传统大公司,而是"超级个体集群"+ 政策协同+ 大集团孵化的混合形态。对照与阵亡组(Group E)的任务不同——它不往地图上添新路,它负责拆掉幸存者偏差:零融资对照证明叙事不是必需品,阵亡名单标出哪些路真的致命。
Taken together, these five groups of cases form an empirical map of AI Native practice. Born-native cases (Group A) demonstrate the viability of the "build-from-zero" path, but it demands exceptional founder judgment and execution. Transitioning cases (Group B) prove that incremental advancement is possible, yet tone and pacing are life-or-death: Shopify succeeded, Duolingo crashed, IBM landed halfway. Failure cases (Group C) consistently point to customer-facing AI as the highest-risk zone: no external commitment and no irreversible decision should be handed to AI alone. China cases (Group D) reveal a different path: neither the one-person company nor the traditional large corporation, but a hybrid of "sovereign operator clusters" + policy coordination + large-group incubation. The control-and-casualty group (Group E) plays a different role: it adds no new routes to the map, it tears down survivorship bias. Zero-funding controls prove the narrative is optional; the casualty list marks which paths are actually fatal.
同一现象的四个截面
Four Cross-Sections of the Same Phenomenon
AI Native 同时是经济、监管、哲学与劳工现象。这四个截面不是背景知识——它们是终将反过来修改图纸的现实约束。
AI Native is simultaneously an economic, regulatory, philosophical, and labor phenomenon. These four cross-sections are not background knowledge; they are real-world constraints that will eventually come back to revise the blueprint.
Daron Acemoglu 2024 年在 MIT 的研究《The Simple Macroeconomics of AI》给出谨慎评估:AI 未来 10 年累计 GDP 增长贡献约 1.1-1.6%(对应年均生产率增益约 0.05%),远低于行业普遍宣称的数倍效应。MIT NANDA 2025/7 预印报告测得 95% 的定制化企业 GenAI 试点在六个月窗口内没有可衡量的 P&L 影响;同一报告也记录了员工自带通用工具的"影子 AI"被大规模采用且有效:95% 说的是组织级试点的失败,不是 AI 本身的失败。
Daron Acemoglu's 2024 MIT study The Simple Macroeconomics of AI offers a cautious assessment: AI's cumulative contribution to GDP over the next 10 years will be roughly 1.1-1.6%, alongside an annual productivity gain of about 0.05%, far below the multi-fold effects the industry commonly claims. The MIT NANDA preprint of July 2025 found that 95% of customized enterprise GenAI pilots showed no measurable P&L impact within a six-month window. The same report also documented that employee-initiated "shadow AI" built on general-purpose tools was adopted at scale and proved effective, so the 95% figure describes the failure of org-level pilots, not of AI itself.
这意味着——AI 经济效益正面但远低于炒作;真正受益的是少数能真正实现 AI Native 重构的组织,多数公司是在为表演买单。Acemoglu 警告 AI 主要影响数据汇总、视觉匹配、模式识别这类白领办公任务,但仍预测 2030 年记者、金融分析师、HR 等职位仍存在。同时他强调 AI 会扩大资本对劳动的分配差距而非缩小白领内部不平等——这是 AI Native 组织的政治经济学背景。
The implication: AI's economic benefits are real but far below the hype; the organizations that genuinely benefit are the minority that achieve a true AI Native reconstruction, while most companies are paying for theater. Acemoglu warns that AI primarily affects white-collar office tasks such as data aggregation, visual matching, and pattern recognition, yet he still predicts that roles such as journalist, financial analyst, and HR professional will exist in 2030. He also emphasizes that AI will widen the gap between capital's and labor's share of income rather than narrow inequality within white-collar ranks. This is the political-economy backdrop for the AI Native organization.
欧盟 AI Act 2026/8/2 是 Annex III 高风险系统全面适用日(招聘评估、信用决策、教育评分、执法等);最高罚款 €3,500 万 或全球营收 7%。美国联邦层面 Biden EO 14110 被 Trump 政府 2025/1 废除,州层面(科罗拉多 SB 24-205、加州 SB 1047 被否决)拼盘形成。中国《生成式 AI 服务管理暂行办法》2023 年实施。
August 2, 2026 is the date on which the EU AI Act applies in full to Annex III high-risk systems (recruitment screening, credit decisions, educational scoring, law enforcement, and so on); maximum fines reach €35 million or 7% of global revenue. At the US federal level, Biden's Executive Order 14110 was revoked by the Trump administration in January 2025, leaving a patchwork of state-level legislation (Colorado SB 24-205; California SB 1047, which was vetoed). China's Interim Measures for the Management of Generative AI Services took effect in 2023.
对 AI Native 组织的实操含义——合规不是事后处理,而是架构约束。如果你的核心 workflow 涉及 EU 公民的招聘、信用、教育数据,2026/8 之后必须有 human-in-the-loop、决策可审计、训练数据可追溯。这就是为什么"可观测性先于规模"和"人作为判断与责任锚"在七大支柱中是基础性的而非可选的。
The practical implication for AI Native organizations: compliance is not a post-hoc fix but an architectural constraint. If your core workflows touch EU citizens' recruitment, credit, or educational data, human-in-the-loop, auditable decisions, and traceable training data will all be mandatory after August 2026. This is precisely why "observability before scale" and "humans as judgment and accountability anchors" are foundational rather than optional among the seven pillars.
Hannah Arendt 在《人的境况》(1958) 划分劳动(labor,维持生命)、工作(work,制造耐用品)、行动(action,公共领域中以言行显现自我)。AI Native 时代如果连 work 都被 AI 接管,"action" 在哪里?这不是花拳绣腿的问题——它直接关乎 AI Native 组织如何为"人"定义角色。
In The Human Condition (1958) Hannah Arendt distinguishes labor (sustaining life), work (fabricating durable goods), and action (appearing in the public realm through word and deed). If even work is taken over by AI in the AI Native era, where does "action" go? This is not an ornamental question; it goes directly to how an AI Native organization defines the role of "the human."
Acemoglu 的回答是"专长与信息提供者",Tobi Lütke 的回答是"context engineer",Anthropic Hive Mind 的回答是"品味与判断",Buurtzorg 的回答是"完整自我"。这些答案都对,但都不完整。最稳健的回答是——人是承担后果的能力(accountability)。当 AI 可以无穷生成,人类的稀缺性在于"承担后果的能力"——这是 Air Canada 案、Lattice 撤回、Klarna 回招的共同启示,也是七大支柱中"人作为判断与责任锚"的哲学基础。
Acemoglu's answer is "expert and information provider"; Tobi Lütke's answer is "context engineer"; the Anthropic Hive Mind's answer is "taste and judgment"; Buurtzorg's answer is "the whole self." Each answer is correct, yet none is complete. The most robust answer is: the human is the capacity to bear consequences (accountability). When AI can generate infinitely, human scarcity lies in the capacity to bear consequences. This is the shared lesson of the Air Canada case, the Lattice withdrawal, and the Klarna re-hire; it is also the philosophical foundation of "humans as judgment and accountability anchors" among the seven pillars.
SAG-AFTRA 2023/7/14-11/9 的 118 天大罢工是首个明确以 AI 为核心议题的劳工运动。胜利成果包括对"合成演员"(Synthetic Performers)和"数字替身"(Digital Replicas)的合同保护、强制 informed consent。2024/7 又对游戏公司发起 AI 议题罢工。2026/3 推动"Tilly tax"——对 AI 角色征税。
The 118-day SAG-AFTRA strike, from July 14 to November 9, 2023, was the first labor action to place AI squarely at its center. Its gains included contractual protections covering "Synthetic Performers" and "Digital Replicas," along with mandatory informed consent. In July 2024 the union launched another AI-focused strike, this time against video game companies. In March 2026 it began pushing a "Tilly tax," a levy on AI-generated characters.
这预示着 2030 年代劳动者运动的新主题。AI Native 组织必须预判这种张力,否则会被工会运动反噬。Klarna 的回招、Duolingo 的撤回、Lattice 的退步——都是劳工力量在工会化之前已经通过舆论和市场表达的反向作用。在欧洲、加拿大、巴西等更工会化的市场,这种张力会更早进入直接对抗。AI Native 不是"绕开劳工政治"的方法,是"必须更细致地处理劳工政治"的方法。
This foreshadows the new themes of labor movements in the 2030s. AI Native organizations must anticipate this tension or face union-driven backlash. Klarna's re-hiring, Duolingo's reversal, and Lattice's retreat all show labor expressing counter-pressure through public opinion and market signals before formal unionization. In more highly unionized markets such as Europe, Canada, and Brazil, this tension will reach direct confrontation sooner. AI Native is not a method for "bypassing labor politics"; it is a method that demands handling labor politics with greater care.
已被记录的陷阱
The Documented Traps
失败不是意外,是结构的伏笔。以下每一种坍塌方式都已被记录在案、结构性地重复出现——把它们当作图纸上预先标注的裂缝:看见了,就不必等墙塌。
Failure is not an accident; it is something the structure sets up in advance. Every mode of collapse below has been documented and recurs structurally. Treat them as cracks marked ahead of time on the blueprint: once you can see them, you need not wait for the wall to fall.
CLAUDE.md、架构规约文档),会从零推导基础决策,决策之间漂移。最终你得到的不是任何单一片段不好的代码库,而是没有一致心智模型的代码库,因为各部分从未被设计成相互配合。这是 Anthropic Founder's Playbook 反复强调的核心陷阱。
CLAUDE.md or an architecture-spec document) re-derives the basic decisions from scratch, and those decisions drift apart. What you end up with is not a codebase where any single piece is bad, but a codebase with no consistent mental model, because the parts were never designed to work together. This is the core trap the Anthropic Founder's Playbook returns to again and again.
一套不可能错的方法论不值得信。以下任一证据成立,即动摇本规约的核心论断;读者应当与作者一起盯住这三条线:
A methodology that cannot be wrong is not worth trusting. If any one of the following holds, it shakes the core claims of this specification; readers should watch these three lines alongside the author:
① 到 2028 年,按本规约从零建造的组织,在三年存活率或毛利结构上并不优于同品类的"加装式"对照组;② 出现足量存量组织不重画工作流图、仅靠采购与流程微调便稳定获得端到端吞吐的量级提升(NANDA"外购成功率约为自建两倍"的发现,已经是一个需要持续跟踪的反向信号);③ Agent 对工作流遥测指标的系统性博弈(reward hacking)被证明不可治理——那将直接拆掉支柱 05/07 的地基。
① By 2028, organizations built from scratch under this specification are no better, in three-year survival rate or gross-margin structure, than a comparable "bolt-on" control group; ② enough incumbent organizations achieve an order-of-magnitude gain in end-to-end throughput without redrawing their workflow graph, on procurement and process tweaks alone (NANDA's finding that "buying succeeds at roughly twice the rate of building" is already a counter-signal worth tracking); ③ agents' systematic gaming of workflow telemetry (reward hacking) proves ungovernable, which would tear out the foundation of Pillars 05 and 07 directly.
2026-2032+:推演幕
2026 to 2032+: The Speculation Act
这一幕不画一条加速曲线,而是张开一个可能性空间——不是预测哪条线会发生,而是画出哪些分支可能、各自的先行指标与证伪条件。
This act does not draw a single acceleration curve; it opens a possibility space. It does not predict which line will occur but maps which branches are possible, each with its leading indicators and falsification conditions.
推演不是畅想。SHEET 03 与 SHEET 14 已经确立:公司是一种约 400 年的、分层叠加的发明(股份制 1602 / 有限责任 1855 / 科层 1870s[R23]),它的奠基约束正被 AI 溶解。一个四百年的形态走到约束失效处,下一种形态的出现不是会不会,而是哪一种。这一幕不预测单一未来,它画出可能性空间:四条正在汇流的技术曲线决定边界,两条不确定性轴张开四个世界,三件来自那些世界的文物让推演可触——每一处都附先行指标与证伪条件,因为能被证伪的推演才值得推演。
Speculation is not daydreaming. SHEET 03 and SHEET 14 have already established that the company is a roughly 400-year-old, layered invention (the joint-stock form in 1602, limited liability in 1855, bureaucracy in the 1870s[R23]), and its founding constraints are being dissolved by AI. When a four-century-old form reaches the point where its constraints fail, the arrival of the next form is not a question of whether but of which. This act does not predict a single future; it maps a possibility space: four converging technology curves set the boundaries, two axes of uncertainty open four worlds, and three artifacts from those worlds make the speculation tangible. Each point carries leading indicators and falsification conditions, because only speculation that can be falsified is worth speculating.
四条汇流的技术曲线
Four Converging Curves
AI-Native 不止是 LLM 变强。它背后是四条独立成熟、正在汇流的技术曲线——每一条都松动一组旧约束,决定推演空间的边界。更准确地说,AI 是这一轮组织重构的前台技术,但不是唯一核心技术:协议决定 agent 能否互操作,支付决定机器能否自主交易,能源/算力决定自治的边际成本,机器人决定组织能否越过 bits 进入 atoms,生物/脑机远场决定认知边界是否再次移动。每条只问三件事:成立则解锁什么组织形态 / 当前成熟度(TRL)/ 什么信号会证伪这条曲线。
AI-Native is more than LLMs getting stronger. Behind it are four technology curves that are maturing independently and now converging; each loosens a set of old constraints and sets the boundaries of the speculation space. More precisely, AI is the front-stage technology of this round of organizational redesign, but not the only core technology: protocols decide whether agents can interoperate; payments decide whether machines can transact autonomously; energy and compute set the marginal cost of autonomy; robotics decides whether organization crosses from bits into atoms; and the bio/brain-computer far field may move the boundary of cognition again. Each curve asks only three things: what organizational form it unlocks if it holds, its current maturity (TRL), and what signal would falsify it.
四条曲线划定边界,但走向哪个世界取决于两条高影响、高不确定的力量。切换两轴,看 2032 落在哪个象限——以及什么先行指标说明我们正滑向它、什么证据会证伪它(GBN 双轴情景法[R45])。
Four curves mark the boundaries, but which world we move toward turns on two high-impact, high-uncertainty forces. Toggle the two axes to see which quadrant 2032 falls into, which leading indicators say we are sliding toward it, and what evidence would falsify it (the GBN two-axis scenario method[R45]).
Agent-heavy 组织成为常态
Agent-heavy organizations become the norm
Agent 数量超过员工数量。Microsoft 引用 IDC 预测 2028 年全球 13 亿活跃 AI Agent;Salesforce 内部预测 2027 年 50% 的客服案件由 AI 处理。Microsoft Agent 365、ServiceNow Agentic Workforce Management 已经把"Agent ID + Agent Blueprint + kill switch"机制明确化。
Agents outnumber employees. Microsoft cites an IDC forecast of 1.3 billion active AI agents worldwide by 2028; Salesforce internally projects that 50% of service cases will be handled by AI in 2027. Microsoft Agent 365 and ServiceNow Agentic Workforce Management have already made the "Agent ID + Agent Blueprint + kill switch" mechanism explicit.
员工人均 ARR 从 $50 万-$200 万跃升至 $500 万-$1,000 万。Anysphere 670 万美元/员工、Cognition 1,500 万美元/员工已是 leading indicator。Henry Shi 的 Lean AI Native Companies Leaderboard(员工 ≤ 50 门槛)系统化追踪此趋势。
ARR per employee jumps from $0.5M-$2M to $5M-$10M. Anysphere at $6.7M per employee and Cognition at $15M per employee are already leading indicators. Henry Shi's Lean AI Native Companies Leaderboard (a headcount threshold of 50 or fewer) tracks this trend systematically.
Agent 工资按使用量计价标准化。Cursor、Claude Code、GitHub Copilot 在 2025 年都从"座位制"转向"使用量计价"——Agent 的"工资"按其实际生产力收费,传统人力成本模型在这个领域失效。
Usage-based pricing for agent "wages" becomes standard. Cursor, Claude Code, and GitHub Copilot all shifted from per-seat to usage-based pricing in 2025: an agent's "wage" is charged by its actual productivity, and the traditional labor-cost model fails in this domain.
校准锚:这是十年曲线的头两年,不是终点。Karpathy 2025/6 明确拒绝"2025 是 Agent 之年"的说法,主张这是"Agent 的十年"——他举的证据是自动驾驶:2013 年他坐过一次零干预的 Waymo 演示,12 年后这个问题仍未收尾[R6]。Gartner 同月的预测从反面校准同一条曲线:到 2027 年底超过 40% 的 agentic AI 项目将被取消[R10]。两条放在一起读:方向成立,斜率被普遍高估——本块所有短期数字都应打上这个折扣再用。
Calibration anchor: these are the first two years of a decade-long curve, not its endpoint. In June 2025 Karpathy explicitly rejected the framing of "2025 is the year of agents," arguing instead that this is "the decade of agents." His evidence was self-driving: in 2013 he took a zero-intervention Waymo demo ride, and twelve years later the problem still is not closed[R6]. Gartner's forecast that same month calibrates the same curve from the opposite side: by the end of 2027, more than 40% of agentic AI projects will be cancelled[R10]. Read together: the direction holds, but the slope is widely overestimated. Every short-term figure in this block should be discounted by that before use.
第一家 AI 主导决策的上市公司
The first publicly listed company whose decisions are AI-led
Sam Altman 在 2024 年透露"科技 CEO 群"在赌哪一年出现首家"一人独角兽";Dario Amodei 2025 年以 70-80% 信心度预测 2026 年。这些是预测,未发生——但"第一家 AI 主导决策的上市公司"在 2028-2030 年间出现的概率显著大于 50%。
In 2024 Sam Altman revealed that a "group chat of tech CEOs" was betting on which year the first "one-person unicorn" would appear; in 2025 Dario Amodei predicted 2026 with 70-80% confidence. These are forecasts that have not come to pass, but the probability that "the first publicly listed company whose decisions are AI-led" appears between 2028 and 2030 is well above 50%.
AI Agent 的法人地位讨论。类似 1819 年 Dartmouth College v. Woodward 把"法人"地位赋予公司。Wyoming DAO LLC(2021)、Marshall Islands DAO 法已为非人法人开了一个口子,但 AI Agent 直接享有法人地位仍是法学讨论。这个问题不会在 2030 年前解决,但讨论会越来越具体。
The debate over legal personhood for AI agents. The parallel is Dartmouth College v. Woodward (1819), which extended "legal person" status to the corporation. Wyoming's DAO LLC (2021) and the Marshall Islands DAO act have opened a crack for non-human legal persons, but legal personhood held directly by an AI agent remains a matter of jurisprudential discussion. This question will not be resolved before 2030, but the discussion will grow ever more concrete.
欧盟 AI Office、各国 AI Act 体系成型。2026/8 欧盟硬期限是关键节点;中国数据局、英国 AI Safety Institute、美国各州拼盘共同形成全球马赛克。"算法封建主义"成为反垄断议题。
The EU AI Office and national AI Act regimes take shape. The EU's hard deadline in August 2026 is a key milestone; China's data bureau, the UK AI Safety Institute, and the patchwork of US states together form a global mosaic. "Algorithmic feudalism" becomes an antitrust topic.
组织形态的多元而非趋同
Plurality of organizational forms, not convergence
最深的趋势是组织形态光谱的多元化而非趋同。一人公司 + AI Native + DAO + 平台型 + 传统科层制 + 青色组织(Buurtzorg 类)将共存而非互相替代。"公司"的概念本身在被重新定义——从"人的协作工具"转向"判断 + Agent 编排单元"。
The deepest trend is plurality rather than convergence across the spectrum of organizational forms. The one-person company, AI Native, DAO, platform, traditional bureaucracy, and teal organizations (the Buurtzorg type) will coexist rather than replace one another. The very concept of "the company" is being redefined: from "a tool for human collaboration" toward "a unit of judgment plus agent orchestration."
与"多元化"判断对赌的理论预测也应记录在案:Hadfield-Koh 引用的相变模型(Chen-Elliott-Koh, Journal of Economic Theory, 2023[R2])预测的恰恰是反向收敛——AI 压低维持异质能力的组织成本后,经济从大量专业化企业突变为少数横跨众多行业的巨型企业。多元光谱与巨头相变谁成为 2030 年代的主图景,是本章最值得跟踪的分歧点。
The theoretical prediction that bets against the "plurality" judgment should also be on the record: the phase-transition model cited by Hadfield-Koh (Chen-Elliott-Koh, Journal of Economic Theory, 2023[R2]) predicts exactly the opposite, a convergence. Once AI lowers the organizational cost of maintaining heterogeneous capabilities, the economy jumps from a large number of specialized firms to a few giants that span many industries. Whether the plural spectrum or the giant phase transition becomes the dominant picture of the 2030s is the divergence most worth tracking in this chapter.
Acemoglu 强调"complementary use of AI 不会自动出现",需主动政策与产业方向引导。UBI 讨论与 AI Native 组织的关系会在这个阶段成为政治议题——Sam Altman、Worldcoin(现 World Network)继续推动;OpenAI Foundation 2024 年宣布支持 UBI 研究。"工作"的形态本身比 20 世纪要复杂得多——这是 2030 年代劳动者面对的新现实。
Acemoglu stresses that "complementary use of AI" will not appear automatically; it requires active policy and industrial direction. The relationship between the UBI debate and AI Native organizations becomes a political issue at this stage: Sam Altman and Worldcoin (now World Network) keep pushing it, and the OpenAI Foundation announced support for UBI research in 2024. The form of "work" itself is far more complex than in the twentieth century; this is the new reality that workers of the 2030s face.
后人类组织(post-human organization)仍是科幻领域;现实中最接近的是 Anthropic Project Vend / Sakana AI Scientist 的小规模实验。这些实验不会在 2030 年代成为主流,但它们会持续作为"可能性的实证"存在,影响监管、哲学、劳工各个领域的讨论。
The post-human organization remains the domain of science fiction; the closest real approximations are the small-scale experiments of Anthropic's Project Vend and the Sakana AI Scientist. These experiments will not become mainstream in the 2030s, but they will persist as "existence proofs of the possible," influencing discussion across regulation, philosophy, and labor.
"Human-only" 作为差异化卖点
"Human-only" as a differentiating selling point
所有强趋势都会激发反趋势。"human-only" 作为差异化卖点正在心理咨询、临终关怀、儿童教育、深度治疗等领域出现。一些品牌开始明确标注"100% 人类制作 / 服务"作为溢价标志。
Every strong trend provokes a counter-trend. "Human-only" as a differentiating selling point is emerging in counseling, end-of-life care, childhood education, and deep therapy. Some brands have begun to mark "100% human-made / human-served" explicitly as a premium signal.
慢公司(Slow Company)运动复兴。Allwork 2025/12 文章《想要在 2026 年革命你的业务?忘了 AI——试试 Teal 模型》直接把 Buurtzorg 的成功(14,000 名护士、900 个自管团队、开销占比 8% vs 行业 25%)作为"反 AI 优先"叙事的标杆。非 AI Native 的成功不是可有可无的反例,是结构性地存在的另一条路径。
A revival of the Slow Company movement. Allwork's December 2025 article "Want to revolutionize your business in 2026? Forget AI; try the Teal model" holds up Buurtzorg's success (14,000 nurses, 900 self-managing teams, overhead at 8% versus the industry's 25%) directly as the benchmark for an "anti-AI-first" narrative. The success of the non-AI-Native is not a dispensable counterexample; it is a structurally present alternative path.
反 AI 工会运动扩张。SAG-AFTRA 2023 年大罢工建立的 AI 角色保护合同先例,2024 年扩展到游戏公司,2026 年推动"Tilly tax"——这种工会式对抗会在更多行业出现。2030 年代的劳动者运动,可能会以"AI 边界"为核心议题展开。数字戒断与"AI-free zones"在学校、医院、心理咨询场域出现明确"无 AI"标签。
The anti-AI union movement expands. The precedent of AI-role-protection contracts established by the 2023 SAG-AFTRA strike spread to game companies in 2024 and drove the "Tilly tax" in 2026; this union-style resistance will appear in more industries. The labor movements of the 2030s may unfold with "the boundary of AI" as their core issue. Digital detox and "AI-free zones" appear with explicit "no AI" labels in schools, hospitals, and counseling settings.
来自那些世界的三件文物
Three Artifacts from Those Worlds
推演若只有论断会显得抽象。下面三件是 design fiction——明确虚构的未来文物,用以让"判断密度的组织"可触。它们不是预测,是把命题投影到 2032 的一种方式。
Speculation made only of assertions would feel abstract. The three pieces below are design fiction: explicitly fictional future artifacts that make "the organization of judgment density" tangible. They are not predictions; they are a way of projecting the thesis onto 2032.
- 判断密度
- 11 名判断者 · 约 2,400 个常驻 agent · 人均承载判断节点 218 个
- Judgment density
- 11 judges · about 2,400 resident agents · 218 judgment nodes carried per person
- 人机比
- 1 : 218(2029 为 1 : 31)
- Human-to-machine ratio
- 1 : 218 (1 : 31 in 2029)
- Agent 工时计价
- $0.0007 / 推理千次 · 季度算力地租占毛利 23%(最大单项成本,已超薪酬)
- Agent time pricing
- $0.0007 per thousand inferences · quarterly compute rent is 23% of gross margin (the largest single cost, now exceeding payroll)
- 组织连贯性指标
- 方向偏移度 0.4%(季度判断与年度命题一致性)——取代了 KPI 达成率
- Organizational coherence metric
- Directional drift of 0.4% (consistency of quarterly judgments with the annual thesis); it has replaced the KPI attainment rate
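The figures in this fictional artifact can be checked against one another. A small worked calculation, assuming the "1 : 218" ratio means resident agents per judge (an interpretation, since the artifact does not define it); the function name is hypothetical.

```python
def agents_per_judge(resident_agents: int, judges: int) -> float:
    """Average number of resident agents carried by each human judge."""
    return resident_agents / judges


# Figures quoted in the 2032 artifact above (design fiction, not data).
ratio_2032 = agents_per_judge(2400, 11)  # about 218.2, i.e. roughly 1 : 218
ratio_2029 = 31                          # the artifact's stated 2029 ratio

# Implied growth in span of control between 2029 and 2032.
growth = round(ratio_2032) / ratio_2029  # about 7x
```

Under that reading, 2,400 agents across 11 judges gives about 218 per person, matching both the judgment-density line and the human-to-machine ratio, and implies roughly a sevenfold widening of span of control over three years.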
「我们不再统计人头或产出。我们统计两件事:判断的质量,和上下文的连贯。其余的,系统自己长出来。」——致股东信
"We no longer count heads or output. We count two things: the quality of judgment and the coherence of context. The rest, the system grows on its own." (Letter to shareholders)
2032-03,三个相互调用的采购 agent 在一次价格预言机抖动下形成正反馈,11 分钟内超额承诺 $420 万。无人逐笔下令——按 Perrow[R39] 的视角,这是一次正常事故(紧耦合 + 交互复杂度的系统里,事故是必然产物),不是某个 agent 的错。
In March 2032, three procurement agents calling one another formed a positive feedback loop during a jitter in the price oracle, over-committing $4.2M within 11 minutes. No one issued the orders transaction by transaction. In Perrow's[R39] terms, this was a normal accident (in a system with tight coupling and interactive complexity, accidents are an inevitable byproduct), not the fault of any single agent.
- 根因
- 多 agent 紧耦合 + 共享预言机 = 交互不可预见(这是 NAT 正常事故学派的判断)
- Root cause
- Tight coupling of multiple agents plus a shared oracle equals unforeseeable interaction (this is the judgment of the NAT normal-accident school)
- 责任链
- 落在授权该工作流上线的人类判断者——不是"AI 说错了"(呼应 Air Canada 案[R17]:公司不能以 AI 失误免责)
- Chain of responsibility
- Falls on the human judge who authorized the workflow to go live, not on "the AI got it wrong" (echoing the Air Canada case[R17]: a company cannot disclaim liability by blaming an AI error)
- 修复
- 解耦 + 熔断(kill switch)+ 人类判断节点前移到不可逆动作前——这是 HRO 高可靠性组织学派的标准动作。NAT 与 HRO 在此并非同一回事:前者说事故不可根除,后者说仍可把概率压到极低;二者是张力中的两面,复盘同时借两只眼睛看。
- Remediation
- Decoupling, a circuit breaker (kill switch), and moving the human judgment node ahead of any irreversible action: this is the standard move of the HRO high-reliability-organization school. NAT and HRO are not the same thing here: the former says accidents cannot be eradicated, the latter says their probability can still be pressed very low. The two are sides of a tension, and the postmortem looks through both eyes at once.
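The remediation in this postmortem (a circuit breaker plus a human judgment node ahead of irreversible actions) can be sketched as a rolling spend limit. This is an illustrative sketch under stated assumptions: the `CommitmentBreaker` class, the dollar cap, and the window length are all hypothetical, and timestamps are passed in as plain seconds to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class CommitmentBreaker:
    limit_usd: float   # hypothetical cap on commitments per window
    window_s: float    # hypothetical rolling window length in seconds
    tripped: bool = False
    _events: list = field(default_factory=list, init=False)  # (time, amount)

    def request(self, amount_usd: float, now: float) -> str:
        """Approve a commitment, or escalate to a human once the cap is hit."""
        if self.tripped:
            return "escalate"
        # Keep only commitments still inside the rolling window.
        self._events = [(t, a) for t, a in self._events if now - t < self.window_s]
        committed = sum(a for _, a in self._events)
        if committed + amount_usd > self.limit_usd:
            self.tripped = True  # stays open until a human resets it
            return "escalate"
        self._events.append((now, amount_usd))
        return "approve"

    def reset(self) -> None:
        """Intended to be called only by a human judge after review."""
        self.tripped = False
        self._events.clear()
```

The design choice worth noting is that the breaker does not close itself again: once tripped, every further request escalates until a person resets it, which places the human judgment node in front of the next irreversible action rather than behind it.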
「你不会写代码、不会画图、不会起草合同——这些 agent 都做。你做它们做不了的:决定什么值得做、在备选间选择、为后果承担法律与声誉责任、维持组织方向。」
"You will not write code, draw designs, or draft contracts; agents do all of that. You do what they cannot: decide what is worth doing, choose among alternatives, bear the legal and reputational responsibility for the consequences, and maintain the organization's direction."
- 职责
- 验证而非生成 · 设定品味与边界 · 承担不可逆决策的后果 · 持有关键关系
- Responsibilities
- Verify rather than generate · set taste and boundaries · bear the consequences of irreversible decisions · hold the key relationships
- 不要求
- 任何单一执行技能的熟练度
- Not required
- Proficiency in any single execution skill
- 考核
- 判断质量与方向正确度(非产出量)——印证 M.05 人即判断锚点的 2032 岗位形态
- Evaluation
- Quality of judgment and correctness of direction (not output volume): the 2032 form of the role that confirms M.05, the human as judgment anchor
更深远的影响
Second-Order Effects
推演的终点不是组织本身,是它溢出的东西。以下每条都标注在哪个情景下成立——没有无条件的预言。
The endpoint of speculation is not the organization itself but what spills over from it. Each item below is annotated with the scenario under which it holds; there are no unconditional prophecies.
- 新模式:判断市场(判断作为可交易服务,按质量计价)〔寒武纪/主权护城河〕;agent 工会与"Tilly tax"式劳工对抗〔算法封建/收紧象限〕;组织即可分叉的开源协议(fork-able org,治理像代码一样被复制改写)〔寒武纪〕。
- New patterns: a market for judgment (judgment as a tradable service, priced by quality) [Cambrian / Sovereign Moat]; agent unions and "Tilly tax"-style labor resistance [Algorithmic Feudalism / tightening quadrant]; the organization as a fork-able open-source protocol (a fork-able org whose governance is copied and rewritten like code) [Cambrian].
- 新方法工具:情景对冲(组织同时为多个象限保留期权)〔不确定性高的全象限〕;agent 单位经济学计价器的成熟形态(算力地租→定价)〔算力成为最大成本项的象限〕;连贯性度量取代 KPI〔判断密度组织成型后〕。
- New methods and tools: scenario hedging (the organization holds options across several quadrants at once) [all high-uncertainty quadrants]; a mature form of the agent unit-economics calculator (compute rent feeding into pricing) [quadrants where compute becomes the largest cost item]; coherence metrics replacing KPIs [once the judgment-density organization has taken shape].
- 二阶影响:就业——判断岗 vs 执行岗的重构与断层〔全象限,强度随集中度变〕;治理——AI 法人地位/责任法演进(Air Canada[R17] 是第一块判例)〔收紧象限加速〕;认识论——当"知道"可外包给 agent,人保留的是"判断什么值得知道"〔全象限〕。
- Second-order effects: employment, the restructuring and rupture between judgment roles and execution roles [all quadrants, intensity scaling with concentration]; governance, the evolution of AI legal personhood and liability law (Air Canada[R17] is the first piece of case law) [accelerating in the tightening quadrant]; epistemology, when "knowing" can be outsourced to agents, what humans keep is "judging what is worth knowing" [all quadrants].
谁应当采用这套方法论
Who Should Adopt This Methodology
敢标注适用边界的方法论,才值得信任。这一套只为 greenfield 而画——错配对象,是它最常见的死法。
A methodology willing to mark its own boundary of applicability is the only kind worth trusting. This one is drawn for greenfield alone; applying it to the wrong kind of organization is its most common way of dying.
2026 年起从零开始构建的创业者。AI Native 架构的成本上游高(你在学习以不同方式构建),下游低(你扩展非常高效)。对于 greenfield,这是正确的权衡。
Founders building from scratch from 2026 onward. AI Native architecture is expensive upstream (you are learning to build a different way) and cheap downstream (you scale very efficiently). For greenfield, that is the right trade-off.
大型组织内部有真正架构权的事业部负责人——也就是说,他们能构建一个新单元而不继承母公司的流程。当母公司的引力越强,适配度越弱。
Division heads inside a large organization who hold real architectural authority: that is, who can build a new unit without inheriting the parent company's processes. The stronger the parent's gravity, the weaker the fit.
公共部门和非营利运营者,他们的使命允许工作流重设计。许多这样的组织戏剧性地未能充分利用 AI——不是因为负担不起,而是因为没有重新设计运营。
Public-sector and nonprofit operators whose mission allows workflows to be redesigned. Many such organizations dramatically underuse AI, not because they cannot afford it, but because they have not redesigned their operations.
寻求"转型"的大型传统组织——在未修改形式下不适用。那些组织需要不同的方法论,聚焦于阶段性分解、变革管理、组织内受保护的 greenfield 区域。那是相邻的方法论,不是这一套。
Large traditional organizations seeking a "transformation": not applicable in unmodified form. Those organizations need a different methodology, one focused on phased decomposition, change management, and protected greenfield zones inside the organization. That is the adjacent methodology, not this one.
如果你身处这种环境想推动 AI Native,正确的策略不是"转型整个公司",而是在公司内争取一块独立土地,按这套方法论从零开始构建一个新单元——让它的产出与传统单元形成对照,让对照本身推动更广的变化。
If you are in such an environment and want to push AI Native, the right strategy is not to "transform the whole company" but to win a patch of independent ground inside it and build a new unit from scratch under this methodology, letting its output stand in contrast to the traditional units and letting that contrast drive broader change.
对人有强情感劳动需求的领域(深度心理咨询、临终关怀、儿童教育核心环节)——AI Native 可以辅助,但不应主导。
Domains with strong emotional-labor demands on people (deep psychological counseling, end-of-life care, the core of childhood education): AI Native can assist, but should not lead.
一人公司:N=1 的极限解
The One-Person Company: the N=1 Limiting Solution
规模是选择,连贯性才是目的。这张图纸把"组织必须是很多人"这个隐含假设,永久地变成一个待论证的命题——这里的"一"有两副面孔:立证时是字面 N=1,落地时是连贯性的单位(见下「两种读法」)。它是 T1 在 N=1 处的极限解,也是 T1 的试金石:如果判断的分布与上下文的流动是组织的本质,那么一个判断节点加一座上下文库,就已经是一个完整的组织。
Scale is a choice; coherence is the purpose. This sheet turns the buried assumption that "an organization must be many people" permanently into a proposition awaiting proof. The "one" here has two faces: at the moment of proof it is the literal N=1, in practice it is a unit of coherence (see "two readings" below). It is the limiting solution of T1 at N=1, and also T1's litmus test: if the distribution of judgment and the flow of context are the essence of an organization, then one judgment node plus one context store is already a complete organization.
把全卷的承重墙搬到极限:T1 说组织是判断的分布与上下文的流动,那么"需要多少人"就不是组织的定义性属性,而是一个工程参数——由判断需要多少个不可替代的承担者决定。当执行可以全部外置给 agent 网络与无需许可的杠杆(代码、内容、API),这个参数的下限触到 1。最小可行组织 = 一个判断节点 + 一座上下文库。
Take the load-bearing wall of the whole volume to its limit: T1 says an organization is the distribution of judgment and the flow of context, so "how many people are needed" is not a defining property of the organization but an engineering parameter, set by how many irreplaceable bearers the judgment requires. When execution can be fully externalized to a network of agents and to permissionless leverage (code, content, APIs), the lower bound of that parameter reaches 1. The minimum viable organization = one judgment node + one context store.
这不是把人变少的成本游戏,而是一次定义的收紧。一人公司之所以成立,不是因为一个人能干完所有活——恰恰相反,是因为几乎所有活都不再需要那个人干。他保留的,是 agent 无法代偿的那部分:不可逆决策、承载声誉的承诺、承载价值观的取舍(M.05 的三类锚点,在 N=1 时全部压回同一个人身上)。"组织必须是很多人"——这句一直被当作公理的话,从此降级为一个可以被反例驳倒的命题。
This is not a cost game of using fewer people; it is a tightening of the definition. A one-person company holds together not because one person can do all the work; on the contrary, it is because almost none of the work still needs that person to do it. What the operator retains is the part no agent can substitute for: irreversible decisions, reputation-bearing commitments, value-bearing trade-offs (the three anchor types of M.05, all pressed back onto a single person at N=1). "An organization must be many people," a sentence long treated as an axiom, is from here downgraded to a proposition that a counter-example can refute.
全书把这一章叫"N=1 的极限解",但"一"有两个精确、互相嵌套的读法——不是矛盾,是同一命题的两个变焦档位。
This sheet is titled "the N=1 limiting solution," yet "one" carries two precise, nested readings: not a contradiction, but two zoom levels of the same proposition.
读法一 · 作为下限(存在性证明)。N=1 严格成立:一个判断节点 + 一座上下文库,就是一个完整的组织。它把"组织必须是很多人"这条被当作公理的话,降级为一个能被单个反例驳倒的命题。这是数学锚、是试金石——要的就是字面那个 1。
Reading one · as a lower bound (existence proof). N=1 holds literally: one judgment node + one context store is already a complete organization. It downgrades the axiom "an organization must be many people" into a proposition a single counter-example can refute. This is the mathematical anchor, the litmus test: it wants the literal 1.
读法二 · 作为连贯性单位(本质)。当"一"用作处方而非证明,它指的不是 headcount,而是判断与叙事的单一连贯锚:判断从同一个意志发出,上下文在同一座库里复利。这个锚通常是一个人,也可以是一个高连贯的小团队(as-if-one-mind)。定义性属性是连贯密度,不是人数等于一。Jarvis 的 "company of one" 正是此读法:以小为常态的经营哲学、含小团队,≠ 字面一个人[R38]。
Reading two · as a unit of coherence (the essence). When "one" is used as prescription rather than proof, it means not headcount but a single coherent anchor for judgment and narrative: judgment issues from one will, context compounds in one store. That anchor is usually one person, but can be a small, highly coherent team operating as-if-one-mind. The defining property is coherence density, not a headcount of one. Jarvis's "company of one" is exactly this reading: a philosophy of staying small by default, small teams included, and not literally one person[R38].
桥接(把两者缝成一个命题)。N=1 是这条原理最锋利的实例(供立证);"连贯性单位"是可推广的原理(供落地)。真实世界的探索几乎都落在严格极限右侧一点(1-5 人、small-by-design),跑的却是同一条逻辑。所以:证明时,"一"是数字;落地时,"一"是单位。
The bridge (stitching the two into one proposition). N=1 is the sharpest instance of the principle (for proof); "unit of coherence" is the generalizable principle (for practice). Real-world exploration almost always sits just to the right of the strict limit (1-5 people, small-by-design), yet runs the same logic. So: in proof, "one" is a number; in practice, "one" is a unit.
四个世界观
Four Worldviews of the One
一人公司不是"创业公司的迷你版",而是另一套看待企业的方式。和 SHEET 05 的六个世界观平行,这里有四个只在 N=1 极限才显形的世界观——它们决定了一人公司的设计起点。
A one-person company is not a miniature startup; it is a different way of seeing the enterprise. In parallel with the six worldviews of SHEET 05, here are four that surface only at the N=1 limit: they set the design starting point of the one-person company.
公司即生命体
The Company as a Living Organism
一人公司不是缩小的科层,是一个单细胞高密度判断体——一个判断核 + 一座上下文库,靠滚动实验自我迭代。这正是 M.06「组织即生命系统」在 N=1 处的具体形态(回指 SHEET 06):没有部门隔间需要拆,因为从来没有隔间;秩序不靠指派,因为只有一个细胞。生命系统逻辑不分大小,一人公司是它密度最高的实例。
A one-person company is not a shrunken hierarchy; it is a single-cell, high-density judgment body: one judgment core plus one context store, iterating on itself through rolling experiments. This is the concrete form that M.06 "the organization as a living system" takes at N=1 (back-reference to SHEET 06). There are no departmental compartments to tear down, because there never were any; order needs no assignment, because there is only one cell. The living-system logic is scale-agnostic, and the one-person company is its highest-density instance.
杠杆而非员工
Leverage, Not Employees
传统组织靠雇人扩张产能,一人公司靠无需许可的杠杆。Naval 把杠杆分四类——劳动力、资本、代码、媒体;前两者要别人点头(permissioned),后两者无需许可、无复制边际成本,可无限扩展[R38b]。一人公司的整个产能曲线建在 code+media+agent 上:不是"没有团队",是把团队换成了不要工资、不要管理、可被版本化的杠杆资产。
Traditional organizations expand capacity by hiring; the one-person company runs on permissionless leverage. Naval sorts leverage into four kinds: labor, capital, code, and media. The first two require someone else's nod (permissioned); the latter two are permissionless, carry no marginal cost of replication, and scale without limit[R38b]. The entire capacity curve of a one-person company is built on code + media + agents: it is not "having no team," but swapping the team for leverage assets that take no salary, need no management, and can be version-controlled.
韧性高于增长
Resilience Over Growth
默认目标不是做大,是刻意保持小并持久。Jarvis《Company of One》(2019) 把"以小为常态"当成一种经营哲学而非过渡阶段——增长不是默认值,而是需要被论证的选项[R38c]。注意:Jarvis 的 "company of one" 实指"小为常态"、含小团队,不等于字面上严格一个人——这里借用它的规范取向,而非把它读成 N=1 的同义词。韧性来自低固定成本、不可被一纸条款掐断的多供应商架构、以及不被融资节奏绑架的自由。
The default goal is not to grow large but to stay deliberately small and durable. Jarvis's Company of One (2019) treats "staying small by default" as a business philosophy rather than a transitional phase: growth is not the default but an option that must be argued for[R38c]. Note: Jarvis's "company of one" really means "small by default" and includes small teams; it does not equal a strictly literal single person. Here we borrow its normative stance rather than read it as a synonym for N=1. Resilience comes from low fixed costs, a multi-vendor architecture that no single clause can sever, and freedom from being held hostage to a financing cadence.
利润即氧气
Profit as Oxygen
对一人公司,利润不是分配给股东的剩余,是维持生命的氧气。没有融资跑道兜底,第一天就必须有正向现金流——最小可行利润(MVPr)取代最小可行产品成为真正的里程碑:能不能养活这个判断节点,决定它能不能继续判断。这反转了风投式创业的氧气来源:那里氧气是下一轮融资,这里氧气是这个月的毛利。
For a one-person company, profit is not a surplus distributed to shareholders; it is the oxygen that keeps life going. With no financing runway to fall back on, there must be positive cash flow from day one. Minimum viable profit (MVPr) replaces the minimum viable product as the real milestone: whether it can feed this judgment node decides whether the node can keep judging. This inverts the source of oxygen in venture-style startups: there, oxygen is the next funding round; here, oxygen is this month's gross margin.
七个支柱
Seven Pillars of the Sovereign Operator
SHEET 07 的七大架构支柱是为 N=众多画的工程承诺;这里的七个支柱是它在 N=1 的对偶——同样相互依存,缺一根,一人公司就从"主权操作者"塌回"过劳的个体户"。编号用 SO 前缀(Sovereign Operator),与架构支柱 01-07 区分。
The seven architectural pillars of SHEET 07 are engineering commitments drawn for N=many; the seven pillars here are their dual at N=1, equally interdependent. Remove one and the one-person company collapses from "sovereign operator" back into "an overworked sole trader." They are numbered with the SO prefix (Sovereign Operator) to distinguish them from architectural pillars 01-07.
主权操作者
The Sovereign Operator
三重主权是一个整体:失去任何一重,"公司"就退化成一份伪装成生意的工作。财务主权失守,你为下一笔钱打工;叙事主权失守,平台改一次算法就掐断你的命脉;操作主权失守,你成了自己流程的人肉解释器。主权操作者的全部设计,是把这三者牢牢攥在一个人手里。
The three sovereignties are one whole: lose any one and the "company" degrades into a job disguised as a business. Lose financial sovereignty and you work for the next paycheck; lose narrative sovereignty and a single algorithm change can sever your lifeline; lose operational sovereignty and you become the human interpreter of your own process. The entire design of the sovereign operator is to keep all three firmly in the hands of one person.
- Sovereignty
- 财务 · 叙事 · 操作 / financial · narrative · operational
- Anti-pattern
- 伪装成生意的工作 / a job disguised as a business
反规模化即设计
Un-scaling as Design
规模在传统创业里是默认正方向,在一人公司里是需要被论证的选项(O.03)。每多一个人,协调税以 n² 增长(SHEET 04),而一人公司的全部竞争力恰恰来自 n=1 时协调税为零、判断密度为 100%。把"不scale"当成设计约束,等于把这份结构优势锁死在资产负债表里。
In traditional startups scale is the default positive direction; in a one-person company it is an option that must be argued for (O.03). With each added person the coordination tax grows as n² (SHEET 04), while the entire competitive edge of a one-person company comes precisely from the coordination tax being zero and judgment density being 100% at n=1. Treating "not scaling" as a design constraint locks that structural advantage into the balance sheet.
- Default
- 不增长,除非被论证 / no growth unless argued for
- Filter
- 产能 vs 主权 / capacity vs sovereignty
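The claim that the coordination tax grows as n² and is zero at n=1 can be illustrated with a small sketch. It assumes, purely for illustration, that the tax is proportional to the number of distinct person-to-person links, n(n-1)/2, which is quadratic in n. The function name `pairwise_channels` is hypothetical, and SHEET 04 may define the tax differently.

```python
def pairwise_channels(n: int) -> int:
    """Distinct person-to-person links in a team of n people.

    Assumption for illustration: coordination cost is proportional
    to this count, which grows on the order of n squared.
    """
    if n < 1:
        raise ValueError("team size must be at least 1")
    return n * (n - 1) // 2


# Compare the number of links at a few team sizes.
for n in (1, 2, 5, 10, 50):
    print(f"n={n:>2}  links={pairwise_channels(n)}")
```

Under this assumption a single operator carries no links at all, a team of five carries ten, and a team of fifty carries well over a thousand, which is the structural gap the pillar asks you to weigh before adding a person.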
杠杆复利
Compounding Leverage
这是 O.02 的执行形态:杠杆不会自己积累,它来自每周被刻意保护出来的那 30%。代码写一次被调用无数次,一篇内容发一次被搜索无数年,一个上下文库一旦建成就让每个后续判断更快——这些是会在睡觉时增值的资产。劳动则相反:停下来就归零。一人公司的长期产能,由复利资产与消耗性劳动的比例决定。
This is the executable form of O.02: leverage does not accumulate on its own; it comes from the 30% deliberately protected each week. Code is written once and called countless times, a piece of content is published once and searched for years, and a context store, once built, makes every later judgment faster: these are assets that appreciate while you sleep. Labor is the opposite: stop and it returns to zero. The long-term capacity of a one-person company is set by the ratio of compounding assets to consumable labor.
- Cadence
- ≥30%/周 投入复利资产 / ≥30%/week into compounding assets
- Assets
- 代码 · 内容 · 受众 · 上下文 / code · content · audience · context
公开建造
Build in Public
一人公司没有市场部,公开建造就是市场部。它把 O.02 的"媒体杠杆"落地为一个可执行节奏:持续公开让陌生人变成关注者,关注者变成第一批客户,客户变成口碑。更深一层,公开建造是叙事主权(SO.01)的日常维护——你的受众长在你自己的渠道上,而不是租来的平台流量里。
A one-person company has no marketing department; building in public is the marketing department. It grounds O.02's "media leverage" into an executable cadence: sustained openness turns strangers into followers, followers into first customers, and customers into word of mouth. At a deeper level, building in public is the daily maintenance of narrative sovereignty (SO.01): your audience grows on your own channels, not in rented platform traffic.
- Cadence
- ≥3 次/周 公开输出 / ≥3 public outputs/week
- Doubles as
- 营销 · 受众 · 叙事主权 / marketing · audience · narrative sovereignty
利基聚焦
Niche Focus
规模型公司靠覆盖广面取胜,一人公司靠占领窄缝取胜。利基越窄,你的判断密度优势越能转化成别人给不了的深度;"谁/什么/为什么是你"三问回答得越具体,营销、产品、定价的所有决策就越自动收敛。模糊的定位在 N=1 是致命的——你没有部门去对冲一个错的方向。
Scale-type companies win by covering a broad surface; a one-person company wins by occupying a narrow crevice. The narrower the niche, the more your judgment-density advantage converts into depth others can't provide; the more concretely you answer "who / what / why you," the more every decision in marketing, product, and pricing converges automatically. Fuzzy positioning is fatal at N=1: you have no department to hedge against a wrong direction.
- Three Q's
- 谁 · 什么 · 为什么是你 / who · what · why you
- Moat
- 窄 × 深,非广 × 浅 / narrow × deep, not broad × shallow
战略性拒绝
Strategic Refusal
在没有团队稀释负载的结构里,每一个"是"都直接吃掉操作者本人的带宽,而带宽是整个公司唯一的瓶颈。anti-list 把拒绝从临场情绪升级为预先承诺的策略门:什么样的客户、功能、机会一律不碰,写在纸上,免去每次重新动摇。SO.02 的反规模化在产能层面说"不雇人",SO.06 在注意力层面说"不接活"——二者是同一个主权的两面。
In a structure with no team to dilute the load, every "yes" directly eats into the operator's own bandwidth, and that bandwidth is the company's only bottleneck. The anti-list upgrades refusal from in-the-moment emotion to a pre-committed policy gate: which clients, features, and opportunities are off-limits, written down, sparing you from wavering anew each time. SO.02's un-scaling says "don't hire" at the capacity level; SO.06 says "don't take the work" at the attention level. The two are two faces of the same sovereignty.
- Artifact
- anti-list(明文不做清单) / anti-list (an explicit will-not-do list)
- Scarce resource
- 操作者注意力 / the operator's attention
生活先于事业
Life Before Business
这是把前六根支柱收束起来的那一根,也是一人公司区别于"超小型创业公司"的根本。财务、叙事、操作三重主权(SO.01)、刻意的反规模化(SO.02)、战略性拒绝(SO.06)——它们最终都为了同一件事:让这门生意服务于一个被亲手设计过的生活,而不是把人异化成自己公司的最高效员工。失去这根支柱,一人公司在效率上可以很成功,在意义上却背叛了它存在的全部理由。
This is the pillar that gathers the previous six, and the root of what separates a one-person company from an "ultra-small startup." The threefold financial, narrative, and operational sovereignty (SO.01), the deliberate un-scaling (SO.02), and the strategic refusal (SO.06) all ultimately serve one thing: making the business serve a life designed by your own hand, rather than alienating the person into the most efficient employee of their own company. Lose this pillar and a one-person company can be very successful in efficiency while betraying, in meaning, the entire reason it exists.
- Order
- 先设计生活,再倒推生意 / design the life first, then back into the business
- Telos
- 公司是工具,不是主人 / the company is a tool, not a master
SHEET 06 的 L.05 把"自我改进"定为生命系统的核心机制——系统持续观察自己、评估自己、改写自己。在 N=1,这套机制没有 telemetry 流水线,也不需要——它收缩成一组嵌套的同心节奏,由操作者本人作为唯一的反馈回路亲自运转:
L.05 of SHEET 06 defines "self-improving" as the core mechanism of a living system: the system continuously observes itself, evaluates itself, and rewrites itself. At N=1 this mechanism has no telemetry pipeline, nor does it need one; it contracts into a set of nested concentric rhythms, run by the operator in person as the sole feedback loop:
周 · 实验——这一周押一个可证伪的小赌注(一个功能、一篇内容、一次定价试探),周末看数据,留下能复利的、砍掉不工作的。月 · 反思——这个月的实验合起来在说什么?哪条复利曲线在变陡,哪条在变平?季 · 方向——利基(SO.05)还对吗?anti-list(SO.06)该加哪一条?年 · 哲学——这门生意还在支撑我想要的生活吗(SO.07)?四圈节奏由内向外,频率递减、可逆性递减——周实验随时可弃,年哲学一旦改写就是重定方向。这就是 self-improving 在人类尺度上的实现:不是飞轮自转,是一个人按四种周期亲手转动它。
Week · experiment. Place one falsifiable small bet this week (a feature, a piece of content, a pricing probe), read the data at the end of the week, keep what compounds and cut what does not work. Month · reflection. What do this month's experiments say together? Which compounding curve is steepening, which is flattening? Quarter · direction. Is the niche (SO.05) still right? What line should be added to the anti-list (SO.06)? Year · philosophy. Is this business still supporting the life I want (SO.07)? The four rings run from inside out, with decreasing frequency and decreasing reversibility: the weekly experiment can be abandoned at any time, while rewriting the yearly philosophy is a change of direction. This is the realization of self-improving at human scale: not a flywheel spinning on its own, but one person turning it by hand on four cycles.
陷阱与不适用
Pitfalls & Boundaries
陷阱一 · 主权而无能力。握住三重主权却没有把判断兑现成产出的能力——拥有自己的渠道却没有值得分发的东西,拥有操作主权却写不出能跑的工作流。主权是必要条件,不是充分条件;一人公司放大判断密度的同时,也放大判断者的一切弱点:没有第二个判断节点做冗余校验,孤立决策的质量衰减是结构性风险,不是情绪问题。
Pitfall one · sovereignty without capability. Holding the threefold sovereignty without the ability to cash judgment out into output: owning your own channels but having nothing worth distributing, holding operational sovereignty but unable to write a workflow that runs. Sovereignty is a necessary condition, not a sufficient one. While a one-person company amplifies judgment density, it also amplifies every weakness of the judge: with no second judgment node for redundant verification, the quality decay of isolated decisions is a structural risk, not an emotional one.
陷阱二 · 利基崇拜而无市场。SO.05 要求窄,但窄到没有人愿意付费,利基就从护城河变成无人区。把"小众"误当"高端"、把"没人做"误当"蓝海"——很多时候没人做只是因为没人要。利基聚焦必须先验证市场存在,再收窄,而不是先爱上一个窄定位再去找根本不存在的需求。
Pitfall two · niche worship without a market. SO.05 demands narrowness, but narrow to the point where no one will pay turns the niche from a moat into a no-man's-land. Mistaking "niche" for "premium," mistaking "no one does it" for "blue ocean": often no one does it simply because no one wants it. Niche focus must first verify that the market exists, then narrow, rather than falling in love with a narrow positioning first and then hunting for demand that does not exist at all.
不适用 · 三类业务请勿照搬。① 重资产——需要工厂、库存、物理供应链的生意,杠杆无法 permissionless,一人撑不起资本密集度;② 强协调——产出本质上需要多个不可替代判断者实时咬合的工作(大型工程、复杂谈判、需要现场多工种协同的交付),N=1 在结构上做不到;③ 需要被管理的人——如果业务的价值恰恰来自一支需要被领导、被发展、被组织的团队,那它的本质就是 N=众多,一人公司的全部前提不成立。本章的"下限"是存在性证明,不是适用性声明。
Not applicable · do not copy this to three kinds of business. ① Capital-heavy: businesses that need factories, inventory, or a physical supply chain, where leverage cannot be permissionless and one person cannot bear the capital intensity. ② Coordination-heavy: work whose output inherently requires several irreplaceable judges meshing in real time (large engineering projects, complex negotiations, delivery that needs on-site coordination across trades), which N=1 structurally cannot do. ③ People who need to be managed: if a business's value comes precisely from a team that needs to be led, developed, and organized, then its essence is N=many and the entire premise of the one-person company fails. The "lower bound" of this sheet is an existence proof, not a statement of applicability.
光谱左端的现实标定 · 均为自报口径
Empirical Markers · All Self-Reported
SHEET 03 的组织形态光谱把规模降级为自由变量,它最左端的极限解就是本章。那条光谱左端已有现实标定——这些不再是思想实验。Sam Altman 2024 年 2 月转述他与一群科技公司 CEO 朋友的赌局:赌"第一家一人十亿美元公司"会在哪一年出现。而光谱左端早已有可被引用的样本(以下数字均为当事人自报口径,未经独立审计):
SHEET 03's spectrum of organizational forms demotes scale to a free variable, and its leftmost limiting solution is this sheet. The left end of that spectrum already has real-world markers; these are no longer thought experiments. In February 2024 Sam Altman recounted a wager with a group of tech-company CEO friends, betting on which year the "first one-person billion-dollar company" would appear. The left end has long had citable samples (the figures below are all self-reported by the people involved and have not been independently audited):
- Pieter Levels — 一人产品组合年收入约 $1.6M-3M(公开仪表盘)
- Marc Lou - 2025 年收入约 $1.03M(约 20 个产品)
- Justin Welsh — 一人累计收入破 $10M(自报毛利约 89%)
- Altman, 2024/2 — 一人十亿美元公司赌局(CEO 朋友群)
- Pieter Levels: one-person product portfolio, annual revenue roughly $1.6M-3M (public dashboard)
- Marc Lou: about $1.03M revenue in 2025 (around 20 products)
- Justin Welsh: one-person cumulative revenue past $10M (self-reported gross margin around 89%)
- Altman, 2024/2: the one-person billion-dollar company wager (a group of CEO friends)
这些数字对照同光谱上 Cursor 量级的 $2B ARR 并不大——但结构信号极强:它们证明组织的下限已经脱离人数约束,正如光谱中段的 Anysphere 与 Anthropic 证明了人均产出的上限同样脱离了直觉约束。两端是同一命题(T1)在参数空间两侧的不同解。
Against the $2B ARR of a Cursor-scale company on the same spectrum, these figures are not large, but the structural signal is very strong: they prove the lower bound of organization has already detached from a headcount constraint, just as Anysphere and Anthropic in the middle of the spectrum proved that the upper bound of output per person has likewise detached from intuitive limits. The two ends are different solutions of the same proposition (T1) on opposite sides of the parameter space.
但左端有诚实的注脚,必须连同样本一起读。其一,孤立判断没有冗余——一人公司在放大判断密度的同时放大了判断者的盲区,没有第二个节点做交叉校验,决策质量的衰减是结构性的。其二,单一供应商即生存风险——单一模型供应商一纸条款变更就能掐断命脉,多模型架构(SHEET 07 支柱 04)在 N=1 时不是支柱,是命脉。其三,样本有偏——左端样本几乎全部来自低边际成本的数字产品,外推到重资产、强监管或物理交付,目前没有证据。一人公司是 T1 的极限解与试金石,不是普遍处方——它在光谱上的全部意义,是把"组织必须是很多人"这个隐含假设,永久地变成了一个待论证的命题。
But the left end carries honest footnotes that must be read together with the samples. First, isolated judgment has no redundancy: while a one-person company amplifies judgment density, it amplifies the judge's blind spots too, and with no second node for cross-verification the decay of decision quality is structural. Second, a single vendor is a survival risk: a single model vendor can sever the lifeline with one clause change, so a multi-model architecture (SHEET 07, pillar 04) is, at N=1, not a pillar but the lifeline. Third, the sample is biased: the left-end samples come almost entirely from low-marginal-cost digital products, and there is currently no evidence for extrapolating to capital-heavy, heavily regulated, or physically delivered businesses. The one-person company is T1's limiting solution and litmus test, not a universal prescription: its whole meaning on the spectrum is to turn the buried assumption that "an organization must be many people" permanently into a proposition awaiting proof.
光谱右端:判断无中心
The Other Pole · Judgment Without a Center
一人公司是规模轴的极限(N=1)。但 T1 有两根轴——规模之外,还有判断的分布。在这根轴上,一人公司同样站在一端:判断极致集中于一个核。它真正意义上的"另一个极限",是判断极致分散——没有中心,决策权按规则散布在一张自治的网络里。常规科层、网络/平台、holacracy 依次落在中段;最右端,是分布式自治组织。两端不是孤立形态,而是同一根轴的两个极限解——都因协调成本坍塌而第一次可行。
The one-person company is the limit of the scale axis (N=1). But T1 has two axes; beyond scale lies the distribution of judgment. On that axis the one-person company again sits at an end: judgment maximally concentrated in a single core. Its true "other limit" is judgment maximally distributed: no center, with decision rights spread by rule across a self-governing network. Conventional hierarchy, network/platform, and holacracy fall in the middle; at the far right sits the distributed-autonomous organization. The two ends are not isolated forms but two limiting solutions of the same axis, each made viable for the first time by the collapse of coordination cost.
这一极最真实的当代样本,是 DeSci(去中心化科学):没有一个中央机构决定做什么、谁对谁错;研究、资助与评议分散给大量自治的贡献者。它靠三件事维持连贯,而不是靠层级裁决——上下文全部公开可读(人和 agent 都能继承)、贡献与验证有公开协议、判断质量经开放同行评价沉淀为声誉。AI 在这里是放大器:把"综合海量分散判断"的成本压到可行。"无中心如何不散"的答案,不在记账技术,而在共享上下文与开放评议——这恰是 T1 两根轴在另一极的样子。
The most concrete contemporary instance of this pole is DeSci (decentralized science): no central body decides what to pursue or who is right; research, funding, and review are spread across many autonomous contributors. It holds together not by hierarchical adjudication but by three things: context kept fully open and legible (inheritable by humans and agents alike), open protocols for contribution and validation, and judgment quality settling into reputation through open peer review. AI is the amplifier here, pushing the cost of synthesizing vast distributed judgment down to the feasible. The answer to "how does a center-less organization stay coherent" lies not in a ledger technology but in shared context and open review, which is just what T1's two axes look like at the other pole.
把这张图纸收成一句话:商业是生活的工具,不是它的主人。一人公司之所以值得作为一种严肃的组织设计选项被画进这套图集,不是因为它能赚多少钱,而是因为它把组织的下限钉死在了"一个判断节点 + 一座上下文库"——从此,规模彻底成为自由变量。你可以选择 N=1,可以选择 N=众多,但无论选哪一端,组织的本质都没变:判断在哪里发生,上下文如何抵达。一人公司是这条命题在最孤独的极限处,依然成立的证明。
To gather this sheet into one sentence: business is a tool for life, not its master. The one-person company earns its place in this atlas as a serious option in organizational design not because of how much money it can make, but because it nails the lower bound of organization to "one judgment node + one context store." From there, scale becomes fully a free variable. You can choose N=1, you can choose N=many, but whichever end you choose, the essence of the organization does not change: where judgment happens, and how context arrives. The one-person company is the proof that this proposition still holds at its loneliest limit.
AI 时代创业的四个阶段
The Four Stages of Building in the AI Era
稳态架构画完了,这一章画路径:从想法到 Scale 的四个阶段。阶段的意义不在命名,而在判据——知道自己在哪一段,才知道哪些错误此刻致命、哪些可以先欠着。
The steady-state architecture is drawn; this chapter draws the path: four stages from idea to scale. A stage matters not for its name but for its criteria. Knowing which stage you are in is how you know which mistakes are fatal right now and which can be deferred.
- Anthropic 2026 - The Founder's Playbook
- Y Combinator failure analyses
- Lean AI Native Leaderboard
AI 时代的创业不只是"用 AI",更是"用 AI 创业"。Anthropic 2026 年发布的《The Founder's Playbook: Building an AI-Native Startup》系统化了这条新路径——AI 重新定义了传统创业生命周期的每一阶段。Idea 阶段的核心不再是抢先建造,而是抵御"过早建造"的诱惑;MVP 阶段不再只是写代码,而是积累持久上下文;Launch 阶段不再是抢市场,而是开始"消化技术债 + 释放创始人";Scale 阶段不再是堆人,而是把领域专长编码为不可复制的护城河。
Building in the AI era is not just "using AI"; it is "founding a company with AI". The Founder's Playbook: Building an AI-Native Startup, published by Anthropic in 2026, systematizes this new path: AI redefines every stage of the traditional startup lifecycle. The Idea stage is no longer about building first; it is about resisting the temptation to build too early. The MVP stage is no longer just writing code; it is accumulating persistent context. The Launch stage is no longer a land grab; it is the start of "paying down technical debt and freeing the founder". The Scale stage is no longer adding headcount; it is encoding domain expertise into a moat no one can copy.
把这四个阶段叠在一起看,会发现一个核心规律——创始人的位置在每一阶段都向"系统设计者"上移。这条角色演化曲线本身,比任何工具都更接近 AI Native 方法论的本质。
Layer the four stages on top of one another and one rule appears: the founder's position rises toward "system designer" at every stage. That curve of role evolution is itself closer to the essence of the AI Native methodology than any tool.
研究而非工程的阶段
A stage of research, not engineering
目标——在投入资源建造前,组装足够证据证明问题真实存在、解决方案能够解决它。这是研究、客户访谈、竞品分析、诚实评估反证的阶段,而不是写一行 production code 的阶段。
Goal: before committing resources to building, assemble enough evidence that the problem is real and that the solution can solve it. This is the stage for research, customer interviews, competitive analysis, and an honest reckoning with disconfirming evidence; it is not the stage for writing a line of production code.
退出条件——找到 problem-solution fit。能精确说出谁有这个问题、多频繁、多严重、当前如何处理;能给出可测试的具体假设("中型公司的财务经理每周花 4+ 小时对账,因为现有工具与会计系统不兼容")而非泛化观察("人们对账很麻烦")。
Exit condition: problem-solution fit. You can state precisely who has the problem, how often, how severely, and how they handle it today; you can give a specific testable hypothesis ("finance managers at mid-size firms spend 4+ hours a week reconciling accounts because their current tools don't integrate with the accounting system") rather than a vague observation ("reconciliation is a pain").
AI 时代陷阱——把建造当成验证(Mistaking Building for Validating)。Anthropic Founder Playbook 引用的数据令人警醒:42% 的传统创业失败是因为造了没人要的东西。AI 让 prototype 几分钟可成,但 prototype 不是证据——它是与潜在用户对话的道具。在 prototype 被当成"原因相信假设"而非"压力测试假设的工具"时,方法论已经失败。第二个陷阱是过早扩展——agentic coding 让 execution 跑在 validation 之前,AI 不会问"这值得造吗",它会以同样的热情把好想法和坏想法都建造出来。第三个陷阱是客观性丧失——问 AI 验证想法,它会找到支持证据;问它压测想法,它会找到反证。AI 跟随你的方向,所以 prompt 必须是"argue against my idea / find disconfirming evidence"。
AI-era trap: mistaking building for validating. The figure cited in the Anthropic Founder's Playbook is sobering: 42% of traditional startup failures come from building something nobody wanted. AI makes a prototype possible in minutes, but a prototype is not evidence; it is a prop for the conversation with a potential user. The moment a prototype becomes the "reason to believe the hypothesis" rather than a tool for stress-testing it, the method has already failed. The second trap is premature scaling: agentic coding lets execution run ahead of validation, and AI never asks "is this worth building?"; it builds good ideas and bad ideas with equal enthusiasm. The third trap is loss of objectivity: ask AI to validate an idea and it finds supporting evidence; ask it to stress-test the idea and it finds counterevidence. AI follows your direction, so the prompt must be "argue against my idea / find disconfirming evidence".
工具组合——Claude 作为 adversarial thinker 做 devil's advocate;Claude Cowork 综合用户访谈纪要、竞品 review、行业报告生成 themed findings;只有在最后才用 Claude Code 构建轻量 prototype——而且必须用于真实对话,不是作为产品发布。
Tool stack: Claude as an adversarial thinker playing devil's advocate; Claude Cowork synthesizing interview notes, competitive reviews, and industry reports into themed findings; only at the very end, Claude Code to build a lightweight prototype, and only for real conversations, not as a product launch.
持久上下文的建造期
The build period for persistent context
目标——把验证的问题翻译为真实用户会用的产品。但 MVP 阶段同等重要的目标是——建立持久上下文(如 CLAUDE.md 文件),让每个新 Agent session 不需要从头解释代码库。AI 时代的代码库是你与 AI 一次次协作累积出来的,可读性变成基础性而非装饰性。
Goal: translate the validated problem into a product that real users will use. But an equally important goal of the MVP stage is to build persistent context (such as a CLAUDE.md file) so that each new agent session need not explain the codebase from scratch. In the AI era the codebase is what you and the AI accumulate through one collaboration after another, and readability becomes foundational rather than decorative.
退出条件——product-market fit 的真实证据:特定群体足够认可产品以保留(retention)、付费(revenue)、传播(referral)。Sean Ellis 测试(问活跃用户"如果再也不能用,你会怎么样",40%+ 回答"非常失望"是 PMF 指标)和 Effort 测试(产品开始自我拉动而非靠创始人推动)是常用 litmus test。
Exit condition: real evidence of product-market fit, where a specific group values the product enough to retain, to pay, and to refer. The Sean Ellis test (ask active users "how would you feel if you could no longer use it?"; 40%+ answering "very disappointed" signals PMF) and the effort test (the product begins to pull itself rather than relying on the founder's push) are common litmus tests.
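The Sean Ellis test described above reduces to a single ratio, which the following minimal sketch computes. The 40% threshold comes from the text; the function names, the other answer labels ("somewhat disappointed", "not disappointed"), and the sample survey are illustrative assumptions, not part of any official tool.

```python
from collections import Counter

PMF_THRESHOLD = 0.40  # the "40%+" figure quoted in the text


def sean_ellis_score(responses: list[str]) -> float:
    """Share of active users who answered 'very disappointed'."""
    if not responses:
        raise ValueError("need at least one survey response")
    counts = Counter(r.strip().lower() for r in responses)
    return counts["very disappointed"] / len(responses)


def signals_pmf(responses: list[str], threshold: float = PMF_THRESHOLD) -> bool:
    """True when the score meets or exceeds the threshold."""
    return sean_ellis_score(responses) >= threshold


# Hypothetical survey of 20 active users (made-up data).
survey = (
    ["very disappointed"] * 9
    + ["somewhat disappointed"] * 8
    + ["not disappointed"] * 3
)
print(f"score={sean_ellis_score(survey):.2f}  pmf_signal={signals_pmf(survey)}")
```

The score is only meaningful when the sample is restricted to active users, as the test prescribes; surveying sign-ups who never returned would drag the ratio down for reasons unrelated to fit.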
AI 时代陷阱——Agentic 技术债(Agentic Technical Debt)是最深的失败模式。不像传统技术债线性累积,AI 技术债是复利的——没有写下来的架构约束,每个 session 重新推导基础决策,决策之间漂移,代码库失去连贯的心智模型。其次是零摩擦 scope creep——加一个 feature 在 agentic coding 下几小时就能完成,每个单独的添加都"合理",但产品边界会脱缰。第三是insecure by inexperience——AI 生成 working code,但不是 inherently secure code。功能漏洞容易被发现(要么 work 要么不 work),安全漏洞要被利用了才浮现。第四是误把早期热度当 PMF——朋友圈、投资人的 portfolio 公司、Hacker News 一篇热门帖产生的 spike 都不能预测第六周。
AI-era trap: agentic technical debt is the deepest failure mode. Unlike traditional technical debt, which accrues linearly, AI technical debt compounds: with no written-down architectural constraints, every session re-derives the same foundational decisions, the decisions drift apart, and the codebase loses any coherent mental model. Next is frictionless scope creep: under agentic coding a feature takes a few hours, every single addition looks "reasonable", and the product boundary slips its leash. Third is being insecure by inexperience: AI generates working code, but not inherently secure code. Functional bugs are easy to catch (the thing either works or it doesn't); security holes surface only once exploited. Fourth is mistaking early buzz for PMF: a spike from your social circle, an investor's portfolio companies, or one hot Hacker News post predicts nothing about week six.
工具组合——先用 Claude 设计架构约束并写入 CLAUDE.md(项目持久记忆,每个 session 自动加载);然后用 Claude Code 在约束内建造,Plan Mode 强制结构化输出;每个 session 结束更新上下文文档;用 Claude Code Security 在任何真实用户接触前做安全审查;从 Day 0 就建立 measurement framework,不要等数据来了再选 metric。
Tool stack: first use Claude to design the architectural constraints and write them into CLAUDE.md (project-persistent memory, auto-loaded each session); then use Claude Code to build within those constraints, with Plan Mode forcing structured output; update the context document at the end of each session; run Claude Code Security before any real user touches the product; and stand up a measurement framework from day zero, without waiting for data to arrive before choosing the metric.
从"做工作"转向"设计做工作的系统"
From "doing the work" to "designing the system that does the work"
目标——把早期 traction 转化为可重复、可持续的增长引擎。同时把创始人从"个人持有每一根线"的位置转向"设计让线自动运转的系统"的位置。这不是放弃控制,是把控制从微观操作升级到系统设计。
Goal: turn early traction into a repeatable, sustainable growth engine. At the same time, move the founder from "personally holding every thread" toward "designing the system that runs the threads automatically". This is not surrendering control; it is upgrading control from micro-operation to system design.
退出条件——三个并行的里程碑必须同时达成:增长可重复且通道化(CAC、LTV、payback 是已知数字、可被外人质疑也能站住脚的数字);产品能承受真实的生产负载(不只是你测试时的那种负载);运营无需创始人瓶颈即可运转(你出差一周,公司不应该停摆)。
Exit condition: three parallel milestones must be met together. Growth is repeatable and channeled (CAC, LTV, and payback are known numbers that hold up when an outsider challenges them); the product can bear real production load (not just the load you generate while testing); and operations run without the founder as a bottleneck (if you travel for a week, the company should not stall).
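To make "known numbers" concrete, the arithmetic behind CAC, LTV, and payback can be sketched as follows. This is a minimal sketch that assumes the common simplified definitions (LTV as monthly margin divided by monthly churn, payback as CAC divided by monthly margin); the class name `ChannelEconomics` and every figure are illustrative assumptions, not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelEconomics:
    """Inputs for one acquisition channel; all figures are illustrative."""
    spend: float             # sales and marketing spend in the period
    new_customers: int       # customers acquired from that spend
    monthly_revenue: float   # average revenue per customer per month
    gross_margin: float      # fraction between 0 and 1
    monthly_churn: float     # fraction of customers lost per month

    @property
    def cac(self) -> float:
        # Customer acquisition cost: spend per acquired customer.
        return self.spend / self.new_customers

    @property
    def ltv(self) -> float:
        # Simplified lifetime value: monthly margin times expected lifetime (1 / churn).
        return self.monthly_revenue * self.gross_margin / self.monthly_churn

    @property
    def payback_months(self) -> float:
        # Months of margin needed to earn back the acquisition cost.
        return self.cac / (self.monthly_revenue * self.gross_margin)


channel = ChannelEconomics(spend=30_000, new_customers=100,
                           monthly_revenue=50, gross_margin=0.8,
                           monthly_churn=0.04)
print(channel.cac, channel.ltv, channel.payback_months)
```

Computing the three figures per channel, rather than blended across channels, is what lets an outsider challenge them one channel at a time.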
AI 时代陷阱——技术债开始还款。MVP 阶段为速度做的取舍,到 Launch 阶段开始计利息——产品流量、新功能、复杂度上升,让 MVP 的捷径变成结构性负债。创始人成为瓶颈——hands-on 在 MVP 是优势,在 Launch 是约束。可观察的症状:本该 1 小时的决策拖了一周;support ticket 堆积,因为只有你知道答案;运营任务只在你个人记得时才发生。过早扩张——新市场看起来像增长机会,但它们重新引入未验证的变量(用户行为、合规要求、支付基建、品类预期),让你失去对自己数据的解读能力。安全与合规不再可推迟——真实用户、真实数据、真实企业合同上桌后,"假设性风险"瞬间变成"真实暴露"。
AI-era trap: the technical debt comes due. The trade-offs the MVP stage made for speed start charging interest at Launch; rising traffic, new features, and growing complexity turn the MVP's shortcuts into structural liabilities. The founder becomes the bottleneck: hands-on is an advantage in MVP and a constraint at Launch. Observable symptoms include a one-hour decision dragging out for a week, support tickets piling up because only you know the answers, and operational tasks happening only when you personally remember them. Premature expansion: new markets look like growth opportunities, but they reintroduce unvalidated variables (user behavior, compliance requirements, payment infrastructure, category expectations) and cost you the ability to read your own data. Security and compliance can no longer wait: once real users, real data, and real enterprise contracts are on the table, "hypothetical risk" instantly becomes "real exposure".
工具组合——Claude Code 做架构审计与重构(输出技术债优先级清单);Claude 把 founder 的当前注意力清单化为"可完全自动化 / 可委托但非 founder / 必须 founder"三类,前两类交给 Claude Cowork 自动化;产品管理流程系统化——sprint 节奏、bug 路由树、metric 报告自动按时运转,不需要创始人触发;Claude Code Security 配合人工审查做企业级安全姿态。
Tool stack: Claude Code for architecture audit and refactoring (producing a prioritized technical-debt list); Claude to sort the founder's current attention into three buckets, "fully automatable / delegable but not founder-only / founder-required", with the first two handed to Claude Cowork for automation; product-management processes systematized, so sprint cadence, bug-routing trees, and metric reports run on schedule without the founder triggering them; and Claude Code Security plus human review for an enterprise-grade security posture.
从内部执行到对外角色
From internal execution to an external-facing role
目标——从数千用户到数百万,从单一市场到多市场。同时构建护城河——不是"我们用了 AI"这种被立刻复制的卖点,而是领域专长 × 用户数据 × 集成深度的复利。创始人的工作从产品内部转向公司外部——分析师简报、IPO 路演、企业级合同、监管与公关。
Goal: from thousands of users to millions, from a single market to many. At the same time, build a moat, not the "we use AI" pitch that is copied at once, but the compounding of domain expertise, user data, and integration depth. The founder's work shifts from inside the product to outside the company: analyst briefings, the IPO roadshow, enterprise contracts, regulation, and public relations.
退出条件——不再是单一里程碑而是阈值事件。三种典型形态:(一)可持续盈利无需外部资本;(二)IPO-ready,治理、合规、财务控制、战略叙事经得起公开市场审视;(三)被收购,且收购方愿意为护城河付溢价而非仅为团队。三种都要求增长系统化且可审计、产品护城河经得起 scrutiny、组织运营成熟到不再依赖创始人个人。
Exit condition: no longer a single milestone but a threshold event. Three typical shapes: (1) sustainable profitability with no outside capital; (2) IPO-ready, with governance, compliance, financial controls, and strategic narrative that withstand public-market scrutiny; (3) acquisition, where the acquirer pays a premium for the moat rather than for the team alone. All three require growth that is systematic and auditable, a product moat that survives scrutiny, and operations mature enough to no longer depend on the founder personally.
AI 时代陷阱——护城河错觉是 Scale 阶段最危险的失败:以为"我们用了 AI"就是差异化,但通用 AI 能力两年内会被全行业平价化。真护城河是领域专长 × 时间锁定的用户数据 × 集成深度——竞争对手即使有同样模型也无法复制。委托危机——创始人难以放手已经习惯的运营层,handoff 标准不清,结果系统不被信任、决策回流到创始人。GTM 真空——Idea/MVP/Launch 阶段的 founder-led selling 撞墙后必须建立正式 GTM 功能:市场分层、信息架构、分析师关系、销售剧本——多数技术创始人从未做过这些。扩张前规模化——还没准备好就进入新市场或新品类,把验证过的 PMF 稀释回未验证状态。
AI-era trap: the moat illusion is the most dangerous failure of the Scale stage: believing that "we use AI" is differentiation, when general AI capability will be commoditized across the whole industry within two years. The real moat is domain expertise, time-locked user data, and integration depth, which a competitor cannot copy even with the same model. The delegation crisis: the founder struggles to let go of the operating layer they have grown used to, handoff standards are unclear, and the result is a system no one trusts and decisions flowing back to the founder. The GTM vacuum: once the founder-led selling of the Idea, MVP, and Launch stages hits a wall, a formal go-to-market function must be built, with market segmentation, messaging architecture, analyst relations, and a sales playbook, none of which most technical founders have ever done. Expanding before you are ready: entering a new market or category before you are prepared dilutes a validated PMF back into an unvalidated state.
工具组合——Claude 把创始人的领域专长编码为产品专有知识(Skills、CLAUDE.md、Memory 系统的组合)——这是护城河的基础设施;Claude Code 构建企业级 infrastructure(公共 API、SDK、第三方集成、SLA-grade observability);Claude Cowork 接管 GTM 执行层(content pipelines、analyst briefings、CRM hygiene、PR cadence);最终的护城河不是 AI 本身——是 AI 与不可复制的领域知识的复合,时间越长越深。
Tool stack: Claude to encode the founder's domain expertise into product-proprietary knowledge (a combination of Skills, CLAUDE.md, and the Memory system), which is the infrastructure of the moat; Claude Code to build enterprise-grade infrastructure (public APIs, SDKs, third-party integrations, SLA-grade observability); Claude Cowork to take over the GTM execution layer (content pipelines, analyst briefings, CRM hygiene, PR cadence). The final moat is not AI itself; it is the compound of AI and domain knowledge no one can copy, deepening the longer it runs.
- CORE INSIGHT
- 创始人位置在每一阶段都向"系统设计者"上移
- The founder's position rises toward "system designer" at every stage
- WARNING
- "快速建造"不是 AI Native 的胜利——"系统化释放创始人"才是
- "Building fast" is not the AI Native win; "systematically freeing the founder" is
- COMMON FAILURE
- 在 Stage 1 跳过验证、在 Stage 2 跳过上下文、在 Stage 3 跳过释放、在 Stage 4 误判护城河
- Skipping validation in Stage 1, context in Stage 2, founder release in Stage 3, and misjudging the moat in Stage 4
把四个阶段叠在一起看,AI Native 创业的核心节奏不是"加速建造"——这是浅层的误读。核心节奏是"加速验证 + 持续积累上下文 + 系统化释放创始人 + 把专长编码为护城河"。每一阶段,创始人的位置都在向"系统设计者"上移:Idea 阶段从"建造者"上移到"验证设计者";MVP 阶段从"代码作者"上移到"上下文工程师";Launch 阶段从"决策者"上移到"系统设计者";Scale 阶段从"内部执行者"上移到"对外角色"。
Layer the four stages together and the core rhythm of AI Native founding is not "build faster"; that is the shallow misreading. The core rhythm is "validate faster, keep accumulating context, systematically free the founder, and encode expertise into a moat". At every stage the founder's position rises toward "system designer": in the Idea stage from "builder" to "validation designer"; in MVP from "code author" to "context engineer"; in Launch from "decision maker" to "system designer"; in Scale from "internal executor" to "external-facing role".
这条角色演化曲线,就是 AI Native 方法论的终极产物。它解释了为什么 Anysphere、Cognition、Replit 这样的公司能用数十到数百人创造十亿到数百亿美元估值——他们不只是"用 AI 的传统创业团队",他们的创始人在每一阶段都把更多执行交给系统、把更多判断留给自己。媒体报道中 Anthropic 的"Hive Mind"工作方式(90 天最长规划、Slack 长文替代会议、Project Vend 让 Claude 独立运营——前两条出自报道与访谈,未经独立验证;第三条是官方公开的负结果实验[R18])是同一逻辑的极端版本。AI Native 方法论的真正胜利不在工具——而在创始人本身的角色演化。如果你走完这四个阶段,发现自己还在做 Stage 1 时做的工作,那么方法论失败了,不论 ARR 多高。
That curve of role evolution is the ultimate product of the AI Native methodology. It explains why companies like Anysphere, Cognition, and Replit can create valuations of one to tens of billions of dollars with tens to hundreds of people. They are not just "traditional startup teams that use AI"; at every stage their founders hand more execution to the system and keep more judgment for themselves. Anthropic's "Hive Mind" way of working as reported in the press (planning horizons of up to 90 days, long Slack write-ups in place of meetings, and Project Vend letting Claude run a business on its own; the first two come from reporting and interviews and are not independently verified, while the third is an officially published negative-result experiment [R18]) is an extreme version of the same logic. The real win of the AI Native methodology is not in the tools; it is in the evolution of the founder's own role. If you finish these four stages and find yourself still doing the work you did in Stage 1, the methodology has failed, however high the ARR.
四阶段的操作者手册
The Four-Stage Operator's Handbook
上一张图回答"你在哪",这一张回答"明天早上做什么"。同样的四个阶段,换成操作者视角:每一段的目标、退出条件、专属陷阱,以及 Claude 三种形态(Chat / Cowork / Code)各自的岗位。底本是 Anthropic《The Founder's Playbook》(2026),放大到组织,而不只是创业。
The previous diagram answers "where are you"; this one answers "what to do tomorrow morning." The same four stages, seen from the operator's point of view: each stage's goal, exit condition, signature trap, and the role of each of Claude's three forms (Chat / Cowork / Code). The source text is Anthropic's The Founder's Playbook (2026), scaled up to the organization rather than just the startup.
第一性原理对齐 — First Principles Alignment
不要先选工具栈,先回答 5 个问题。(1) 你的工作流图(不是组织图)是什么?把组织视为流的网络而非角色的网络。(2) 哪些步骤 AI 可以完成?哪些必须人来?(3) 你的判断锚点(不可逆决策、声誉决策、价值观决策)在哪里?(4) 你的数据飞轮在哪里?组织行动如何反哺 Agent 训练?(5) 当 AI 出错,责任在谁?这一个月不写代码,不部署 Agent。所有早期失败的根因都是第一性原理含糊。
Don't pick the tool stack first; answer five questions first. (1) What is your workflow graph (not your org chart)? See the organization as a network of flows, not a network of roles. (2) Which steps can AI do, and which must a human do? (3) Where are your judgment anchors (irreversible decisions, reputational decisions, values decisions)? (4) Where is your data flywheel? How does organizational action feed back into agent training? (5) When the AI gets it wrong, who is responsible? This month you write no code and deploy no agents. The root cause of every early failure is vague first principles.
工作流代码化 — Workflow as Code
选 3 个最高频的工作流,用 Temporal / n8n / LangGraph 写出可执行版本。这一步是最痛但最关键的——它把组织流程从"人脑里"变成"代码里"。完成标准:3 个工作流可以由代码执行,可以版本化,可以被测试。没完成不要进入下一步——你还在用人脑跑流程,AI 加进来只会放大混乱。
Pick the three highest-frequency workflows and write executable versions with Temporal / n8n / LangGraph. This step is the most painful but the most important: it moves organizational process out of "people's heads" and into "code." Done when: the three workflows can be executed by code, versioned, and tested. Don't move to the next step before this is done: you are still running processes in people's heads, and adding AI will only amplify the chaos.
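The "done when" criteria (executable, versioned, testable) can be illustrated without committing to any engine. The sketch below is plain Python and deliberately uses none of the Temporal, n8n, or LangGraph APIs; `Step`, `Workflow`, and the weekly-report example are hypothetical names invented for illustration. A real engine would add durability, retries, and scheduling on top of the same shape.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, object]


@dataclass(frozen=True)
class Step:
    name: str
    run: Callable[[State], State]


@dataclass(frozen=True)
class Workflow:
    name: str
    version: str            # bump on every change so each run is attributable
    steps: tuple[Step, ...]

    def execute(self, state: State) -> State:
        trace = []
        for step in self.steps:
            # Pass a copy so a step cannot mutate the caller's input.
            state = step.run(dict(state))
            trace.append(step.name)
        return {**state, "_trace": trace, "_version": self.version}


# Hypothetical weekly-report workflow; the step bodies are placeholders.
def collect(state: State) -> State:
    state["rows"] = [3, 5, 8]
    return state


def summarize(state: State) -> State:
    state["total"] = sum(state["rows"])
    return state


weekly_report = Workflow("weekly-report", "1.0.0",
                         (Step("collect", collect), Step("summarize", summarize)))

result = weekly_report.execute({})
print(result["total"], result["_trace"], result["_version"])
```

Because the workflow is an ordinary value, it can live in version control and be exercised by unit tests before any agent is attached to a step.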
上下文层建设 — Context Layer
建立向量数据库 + 决策日志 + Agent 可读文档结构。所有重要会议产生 Agent 可检索的总结。所有客户互动被结构化捕获。这是组织复利积累的开始——3 个月之后,你的 Agent 会比同样模型的竞争对手 Agent 显著更对齐你的组织。完成标准:核心知识可被 Agent 在 < 5 秒内检索到正确上下文。
Build a vector database plus a decision log plus an agent-readable document structure. Every important meeting produces an agent-retrievable summary. Every customer interaction is captured in structured form. This is where the organization's compounding accumulation begins: three months on, your agents will be markedly better aligned to your organization than a competitor's agents running the same model. Done when: an agent can retrieve the correct context for core knowledge in under 5 seconds.
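A minimal sketch of the decision log and its retrieval check follows. It assumes a toy token-overlap ranking as a stand-in for a real vector database, and the names `DecisionRecord` and `ContextLayer` plus the sample records are hypothetical. The point is the shape: structured writes, agent-readable retrieval, and a latency measurement against the 5-second budget.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    summary: str
    source: str   # e.g. the meeting or customer interaction it came from


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


class ContextLayer:
    """Toy context store; token overlap stands in for embedding similarity."""

    def __init__(self) -> None:
        self._records: list[DecisionRecord] = []

    def write(self, record: DecisionRecord) -> None:
        self._records.append(record)

    def retrieve(self, query: str, top_k: int = 1) -> list[DecisionRecord]:
        query_tokens = tokens(query)
        scored = [(len(query_tokens & tokens(r.summary)), r) for r in self._records]
        matches = [pair for pair in scored if pair[0] > 0]
        ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
        return [record for _, record in ranked[:top_k]]


layer = ContextLayer()
layer.write(DecisionRecord("D-001", "pricing moved to annual contracts for enterprise customers", "pricing review"))
layer.write(DecisionRecord("D-002", "support tickets are triaged by an internal agent before human review", "ops retro"))

start = time.perf_counter()
hits = layer.retrieve("how are support tickets triaged")
elapsed = time.perf_counter() - start
within_budget = elapsed < 5.0   # the completion criterion from the text
```

Running the same measurement against the production store, with real queries, is what turns the 5-second criterion into something checkable.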
多模型架构 + 可观测性 — Multi-Model + Observability
配置至少两家模型供应商(Anthropic + OpenAI 或 + Google),建立 evaluation harness(quality regression)。部署 LangSmith / Helicone / Arize。在能看见之前不要扩规模。完成标准:所有 Agent 调用被记录、可追溯、可重放;模型切换是 1 周以内的工程任务而不是 3 个月的重构。
Configure at least two model vendors (Anthropic + OpenAI, or + Google) and build an evaluation harness (quality regression). Deploy LangSmith / Helicone / Arize. Don't scale before you can see. Done when: every agent call is logged, traceable, and replayable; switching models is an engineering task of under a week rather than a three-month rebuild.
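One way to keep model switching a short task is to put every call behind a thin router that logs it and to score each provider on the same regression cases. The sketch below assumes stub functions in place of real vendor SDK calls; `ModelRouter`, `regression_pass_rate`, and the substring-based pass check are illustrative assumptions, not the API of LangSmith, Helicone, or Arize.

```python
from dataclasses import dataclass, field
from typing import Callable

ModelFn = Callable[[str], str]


@dataclass
class ModelRouter:
    providers: dict[str, ModelFn]
    active: str
    call_log: list[dict[str, str]] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        output = self.providers[self.active](prompt)
        # Log every call so it is traceable and replayable.
        self.call_log.append({"provider": self.active, "prompt": prompt, "output": output})
        return output

    def replay(self, index: int, provider: str) -> str:
        """Re-run a logged prompt against another provider."""
        return self.providers[provider](self.call_log[index]["prompt"])


def regression_pass_rate(model: ModelFn, cases: list[tuple[str, str]]) -> float:
    # A case passes when the expected substring appears in the output.
    passed = sum(1 for prompt, expected in cases if expected in model(prompt))
    return passed / len(cases)


# Stub providers standing in for real vendor SDK calls (assumption).
def provider_a(prompt: str) -> str:
    return f"A: {prompt.upper()}"


def provider_b(prompt: str) -> str:
    return f"B: {prompt.upper()}"


router = ModelRouter({"a": provider_a, "b": provider_b}, active="a")
router.complete("refund policy")
cases = [("refund policy", "REFUND"), ("sla terms", "SLA")]
rates = {name: regression_pass_rate(fn, cases) for name, fn in router.providers.items()}
```

Switching vendors then reduces to changing `active` once the candidate's pass rate on the regression cases is acceptable.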
第一波 Agent 部署 — First Production Agents
从最低风险的工作流开始——内部知识检索、报告生成、代码审查、内部 ticket 分诊。不要从客户面开始(Klarna、Cursor "Sam"、Air Canada 都死在这)。设定明确的 human-in-the-loop 节点。每周 retro 一次。完成标准:至少 1 个 Agent 在生产环境稳定运行 4 周以上,错误率可量化、可改进。
Start with the lowest-risk workflows: internal knowledge retrieval, report generation, code review, internal ticket triage. Don't start customer-facing (Klarna, Cursor's "Sam," and Air Canada all stumbled there). Set explicit human-in-the-loop nodes. Run a retro once a week. Done when: at least one agent has run stably in production for more than 4 weeks, with an error rate that can be quantified and improved.
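An explicit human-in-the-loop node and a quantifiable error rate can be sketched as a confidence gate in front of an internal ticket-triage agent. The threshold, the class names, and the definition of error rate used here (reviewer corrections divided by total runs) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageAgentRun:
    ticket_id: str
    proposed_queue: str
    confidence: float   # the agent's self-reported confidence, 0 to 1


@dataclass
class HumanInTheLoopGate:
    threshold: float
    auto_applied: int = 0
    escalated: int = 0
    corrected: int = 0

    def route(self, run: TriageAgentRun) -> str:
        # Below the threshold, a person reviews before anything is applied.
        if run.confidence >= self.threshold:
            self.auto_applied += 1
            return "auto"
        self.escalated += 1
        return "human_review"

    def record_correction(self) -> None:
        """Call when a reviewer overrides the agent's proposal."""
        self.corrected += 1

    @property
    def error_rate(self) -> float:
        total = self.auto_applied + self.escalated
        return self.corrected / total if total else 0.0


gate = HumanInTheLoopGate(threshold=0.8)
runs = [
    TriageAgentRun("T-1", "billing", 0.95),
    TriageAgentRun("T-2", "billing", 0.60),
    TriageAgentRun("T-3", "outage", 0.90),
    TriageAgentRun("T-4", "general", 0.50),
]
routes = [gate.route(run) for run in runs]
gate.record_correction()   # suppose the reviewer overrode T-2
```

Tracking the counters week over week gives the retro a number to improve rather than an impression.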
节奏建立 — Establish Cadence
确立 90 天滚动规划周期(Anthropic 模式)。形成 Agent 部署 + 监控 + 改进的稳定循环。开始考虑哪些工作流可以从 human-in-the-loop 升级到 human-on-the-loop。这不是"完成 AI Native 转型"——AI Native 没有完成态,只有持续演化态。完成标准:组织能够在不增加员工的情况下,每月增加 1-3 个新 Agent 工作流到生产环境。
Establish a 90-day rolling planning cycle (the Anthropic model). Form a stable loop of agent deployment, monitoring, and improvement. Begin considering which workflows can be promoted from human-in-the-loop to human-on-the-loop. This is not "completing the AI Native transformation": AI Native has no finished state, only a state of continuous evolution. Done when: the organization can add 1 to 3 new agent workflows to production each month without adding headcount.
- RULE
- 没完成上一步不要进入下一步
- Don't move to the next step until the previous one is done
- SIGN OF FAILURE
- 第 6 月仍在演示而非生产
- Still demoing rather than in production by month 6
- RECOVERY
- 回到 Month 1 的 5 个问题
- Return to the five questions of Month 1
这个路径刻意做得最低限度——它不是关于"如何成为下一个 Anysphere",是关于"如何不在前 6 个月陷入 AI Theater"。多数 AI Native 转型在第 3 个月就崩盘了,原因不是技术问题,是第一性原理没清就开始堆工具栈。建立了原则、代码化了流程、有了上下文层和可观测性——剩下的是时间和复利的工作。没建立这些底层,再多的 Agent 也只是 Theater。
This path is deliberately kept minimal: it is not about "how to become the next Anysphere" but about "how not to fall into AI Theater in the first six months." Most AI Native transformations collapse by month 3, not for a technical reason but because they start stacking tool stacks before the first principles are clear. Once you have established the principles, turned process into code, and built the context layer and observability, what remains is the work of time and compounding. Without those foundations, any number of agents is still just Theater.
施工工具包
Operator's Toolkit
前十七张图纸讲"为什么"与"按什么顺序";从这一张开始,给图纸本身——可拷贝、可施工的模板。首件:把 M.01「组织即工作流图」从口号,变成你今天就能填的脚手架。这是一个会生长的工具箱,本版含四件:① 工作流图建模模板,② 可执行的 AI-Native 架构师 skill,③ 更简单入手的轻量工具(自测 / 诊断 / 画布 / 提示词),④ 对齐目的层的三张自检卡(人的尺度 / 判断分布定位 / 组织拓扑)。开源 · MIT。
The first seventeen blueprints cover "why" and "in what order"; from this one on, the toolkit gives you the blueprints themselves: copyable, buildable templates. First item: turning M.01, organization-as-workflow-graph, from a slogan into a scaffold you can fill in today. This is a toolbox that will grow; this edition ships four pieces: ① the workflow-graph template, ② the executable AI-Native Architect skill, ③ a lite tier of lower-barrier tools (self-test / diagnostic / canvas / prompt), and ④ three self-check cards for the purpose layer (the human measure / locating judgment / mapping topology). Open-source, MIT.
① 工作流图建模
① Workflow Graph Modeling
M.01 说"工作流图是真相"。但真相得能被画出来才可施工。建模法只有三类节点、四个标注——足够暴露"瓶颈在图的哪条边上"(这正是 SHEET 04 十六瓶颈反复指认的:吞吐是图的属性,不是节点的属性)。
M.01 says "the workflow graph is the truth." But a truth has to be drawable before it can be built on. The modeling method has only three node types and four annotations: enough to expose "which edge of the graph the bottleneck sits on" (exactly what SHEET 04's sixteen bottlenecks keep pointing at: throughput is a property of the graph, not of a node).
- agent · 执行:默认工种(M.02),近零边际成本生成/转换/执行。
- human · 判断锚:决定什么值得做、为后果担责(M.05)。
- policy · 门禁:不可逆动作前的自动门 + 例外上报(支柱 05)。
- agent · runs: the default worker (M.02); generates, transforms, and executes at near-zero marginal cost.
- human · judgment anchor: decides what is worth doing and owns the consequences (M.05).
- policy · gate: an automatic gate before irreversible actions, plus exception escalation (pillar 05).
四个标注:可并行扇出(拆 B.01 串行链)· 判断锚(人承担后果处)· 不可逆门禁(policy 必签)· 复利上下文写入(M.03/M.04,下游可检索)。
Four annotations: parallel fan-out (break B.01's serial chains) · judgment anchor (where a human carries the consequences) · irreversible gate (policy must sign off) · compounding-context write (M.03/M.04, retrievable downstream).
下面是可直接拷贝的骨架(完整可填写版 + 填写四步说明,见 templates/workflow-graph.md ↗)。这是 after 的 VC 流水线:
Below is a skeleton you can copy directly (the full fillable version plus the four-step guide is in templates/workflow-graph.md ↗). This is the after-state VC pipeline:
    # 三个 scan 并行扇出;人只在 thesis 选题与 ic 拍板两处
    # Three scans fan out in parallel; humans act only at thesis framing and the ic call
    workflow: VC-research-pipeline
    nodes:
      - { id: thesis,    type: human,  owner: "GP",                         parallelizable: false }
      - { id: scan_a,    type: agent,  owner: "",                           parallelizable: true }
      - { id: scan_b,    type: agent,  owner: "",                           parallelizable: true }
      - { id: scan_c,    type: agent,  owner: "",                           parallelizable: true }
      - { id: synth,     type: agent,  owner: "",                           parallelizable: false }
      - { id: ic_judge,  type: human,  owner: "投委会 / IC",                 parallelizable: false }
      - { id: term_gate, type: policy, owner: "合伙人会签 / partner sign-off", parallelizable: false }
    judgment_anchors: [thesis, ic_judge]   # 选题与拍板是人的判断 / framing and the call are human judgment
    policy_gates: [term_gate]              # 出 term sheet 必签 / issuing a term sheet requires sign-off
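Once a graph is filled in, its annotations can be checked mechanically. The sketch below mirrors the skeleton above as plain Python data (no YAML parser, to stay dependency-free) and verifies that every judgment anchor is a human node and every policy gate is a policy node; the `validate` function and its messages are hypothetical and not part of the published template.

```python
NODE_TYPES = {"agent", "human", "policy"}

# The VC pipeline skeleton, transcribed as Python data.
nodes = [
    {"id": "thesis", "type": "human", "parallelizable": False},
    {"id": "scan_a", "type": "agent", "parallelizable": True},
    {"id": "scan_b", "type": "agent", "parallelizable": True},
    {"id": "scan_c", "type": "agent", "parallelizable": True},
    {"id": "synth", "type": "agent", "parallelizable": False},
    {"id": "ic_judge", "type": "human", "parallelizable": False},
    {"id": "term_gate", "type": "policy", "parallelizable": False},
]
judgment_anchors = ["thesis", "ic_judge"]
policy_gates = ["term_gate"]


def validate(nodes: list[dict], anchors: list[str], gates: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the annotations are consistent."""
    by_id = {node["id"]: node for node in nodes}
    problems = []
    for node in nodes:
        if node["type"] not in NODE_TYPES:
            problems.append(f"{node['id']}: unknown node type {node['type']!r}")
    for anchor in anchors:
        if by_id.get(anchor, {}).get("type") != "human":
            problems.append(f"{anchor}: judgment anchor must be a human node")
    for gate in gates:
        if by_id.get(gate, {}).get("type") != "policy":
            problems.append(f"{gate}: policy gate must be a policy node")
    return problems


problems = validate(nodes, judgment_anchors, policy_gates)
fan_out = [node["id"] for node in nodes if node["parallelizable"]]
print(problems, fan_out)
```

A check like this is cheap to run on every edit of the graph, which keeps the annotations from drifting away from the nodes they describe.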
② AI-Native 架构师 · 可执行 skill
② The AI-Native Architect
前十七张图纸讲"画什么、按什么顺序";这一件替你把图画出来。给它一个业务、一个创业切入点、或一次组织重构的意图——它先过一道范围闸,分流到四条轨:绿地新建 / 增量"从零切出" / 仅"AI 赋能"(诚实判定为不属于本方法论的目标群体并说明,而非粉饰)/ 情感劳动边界(AI 辅助、不主导);再按 SHEET 03 的 T1 落出判断的分布与上下文的流动:重画工作流图、铺四层底座、按需展开九个深度模块(经济测算要算得拢、合规落到具体法律文书、追到"最后一个被伤害的人"、护城河双向赛跑……),最后收束到内核——当执行近乎免费、判断成为稀缺,这套架构如何沿模型曲线复利、引领而非追赶。
The first seventeen blueprints cover "what to draw and in what order"; this piece draws it for you. Give it a business, a startup wedge, or an intent to rebuild an organization, and it first routes the request through a scope gate into one of four tracks: greenfield, a from-zero carve-out, mere "AI-enablement" (judged honestly as out of scope and told so, not dressed up), or an emotional-labor boundary (AI assists, never leads). It then designs T1's two structures from SHEET 03, the distribution of judgment and the flow of context: it redraws the workflow graph, specifies the four-layer substrate, and opens only the depth modules a case demands (economics that tie out, compliance grounded in the actual legal instrument, tracing harm to the last human harmed, a two-sided moat race). It closes on the kernel: once execution is nearly free and judgment is the scarce factor, how the architecture compounds along the model curves, leading rather than catching up.
三类节点,沿用 M.01:agent · 执行;human · 判断锚;policy · 门禁
Three node types, carried from M.01: agent · runs; human · judgment; policy · gate
    # 在 Claude Code 里调用 / invoke inside Claude Code
    $ /skill ai-native-architect
    > "帮我把这家公司按 AI 重新设计:……" / "redesign this company around AI: ..."
    → 范围闸 · Track A / B / 出域 / 边界 (scope gate · Track A / B / out-of-scope / boundary)
    → T1 · 工作流图 · 四层底座 · 深度模块 · 内核 (T1 · workflow graph · four-layer substrate · depth modules · kernel)
    → 一份 AI-Native 架构蓝图 (one AI-Native Architecture Blueprint)
开源仓库 / Open-source: github.com/watterfall/ai-native-architect ↗
③ 更简单的入手 · 随取随用
③ Lower-barrier tools, ready to hand
不是每个人都用 Claude Code。这一组是纸笔、或任意聊天框就能上手的轻量工具:无需安装,由易到难,给想先摸到内核、还没准备好上 skill 的人。一个决定一切的问题:
Not everyone uses Claude Code. This set is low-barrier and needs nothing but pen and paper, or any chatbot: no install, simplest first, for people who want to touch the kernel before reaching for the skill. The one question that decides everything:
- 自测卡 · 你是 AI-Native 还是 AI 赋能?在哪条轨?(纸笔 · 约 2 分钟)
- 十六瓶颈诊断表 · 给组织打 0-16 分,读出区段。(纸笔 · 一页)
- T1 画布 · 判断的分布 × 上下文的流动,一页填空。(约 10 分钟)
- 可移植提示词 · 粘进任意聊天框(ChatGPT / Claude / Gemini),跑一份精简蓝图。(无需安装)
- Self-test · AI-Native or AI-enabled? Which track? (pen + paper, ~2 min)
- 16-Bottleneck Scorecard · score your org 0-16, read the band. (one page)
- T1 Canvas · distribution of judgment × flow of context, on one page. (fill-in, ~10 min)
- Portable Prompt · paste into any chatbot (ChatGPT / Claude / Gemini) for a lite blueprint. (no install)
四件随取随用 / All four, ready to hand: github.com/watterfall/ai-native-architect/tools ↗
④ 对齐目的层 · 三张卡
④ Three Cards for the Purpose Layer
前三件让组织跑得动,这三张卡让它跑得对——把内核那句"效率是手段、让人回归于人才是目的",连同判断分布与组织拓扑,变成今天就能填的自检。可直接拷走。
The first three pieces make the organization run; these three cards keep it running toward the right thing: they turn the kernel's "efficiency is the means; returning people to being human is the end," together with judgment distribution and org topology, into self-checks you can fill in today. Copy them as-is.
- 这一季,团队的判断权变多了,还是变少了?
- This quarter, did the team's judgment expand or shrink?
- AI 接走的是杂活,还是把人变成了喂料工?
- Did AI take the drudgery, or turn people into feeders for the machine?
- 有人因为更少琐事、而更投入、更愿意来吗?
- Is anyone more engaged, more willing to show up, because of less busywork?
- 若指标都在涨、人却越来越忙——警报:本末倒置。
- If every metric is up yet people are busier, that is the alarm: the inversion.
- 沿光谱标出你现在的位置:一人 → 小核心 → 科层 → 网络 → holacracy → 分布式自治。
- Mark where you sit on the spectrum: one-person → small core → hierarchy → network → holacracy → distributed-autonomous.
- 该往集中端(少数核,连贯快)还是分散端(开放网络,抗单点盲区)走?
- Move toward the concentrated end (few cores, fast and coherent) or the distributed end (open network, resistant to single-point blind spots)?
- 这一步赌的是什么:速度与连贯,还是多元与冗余?
- What is this move betting on: speed and coherence, or diversity and redundancy?
- 判断节点(人):列出 ___ 个,各自为哪张图担责。
- Judgment nodes (people): list ___, and which graph each is accountable for.
- agent 网:哪些执行可整体下放?预期 ___ 个 agent。
- Agent network: which execution can be handed over wholesale? ___ agents expected.
- 上下文层:你的"共享世界模型"在哪?谁还在靠人肉转译?那就是下一个要删的瓶颈。
- Context layer: where is your shared world model? Who still relays it by hand? That is the next bottleneck to delete.