AI Native Organization Methodology

SECTION

PROLOGUE · Opening

Definition · Draw the line first

The Boundary of AI Native

Treat AI Native as a diagnostic, not a label to hand out: check whether the constraints actually got redrawn before you reach for the word.

In one line

When AI only makes the old workflow faster, it remains AI-enabled. When judgment, context, permissions, and verification are rearranged because of it, the organization becomes a candidate for AI-Native. This is a working hypothesis, not a badge.

AI SIDE 00 An attached tool improves the old flow; native begins with a redrawn one.

UPSTREAM

Coase, 1937 - Theory of Firm
Weick, 1979 - Organizing as a verb
Beer, 1972 - Brain of the Firm
Andreessen, 2011 - Software is eating

You have added ChatGPT to an existing workflow, let engineers write code with Cursor, and let support draft replies with AI. That can create value without yet changing the organization. Before naming it, trace one real workflow: who sets the goal, what context reaches the agent, what it is allowed to change, and who detects, owns, and rolls back an error? If those four things still work as before, AI-enabled is the more accurate name. Treating “uses AI” as AI-Native hides the part that still needs redesign.
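The four questions above can be written down as a small before/after record. The sketch below is only an illustration: the class name, field names, and sample answers are hypothetical, and the rule "any changed answer makes the workflow a candidate" is an assumption standing in for a judgment that a real review would make by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowTrace:
    """Answers to the four diagnostic questions for one workflow (hypothetical fields)."""
    goal_setter: str         # who decides the goal
    agent_context: str       # what context reaches the agent
    agent_permissions: str   # what the agent is allowed to change
    error_handling: str      # who detects, owns, and rolls back an error


FIELDS = ("goal_setter", "agent_context", "agent_permissions", "error_handling")


def redrawn_dimensions(before: WorkflowTrace, after: WorkflowTrace) -> list[str]:
    """Return the questions whose answer changed once AI was introduced."""
    return [f for f in FIELDS if getattr(before, f) != getattr(after, f)]


def label(before: WorkflowTrace, after: WorkflowTrace) -> str:
    # Assumption: a changed answer only makes the workflow a *candidate*;
    # it is a prompt for closer inspection, not a verdict.
    return "AI-Native candidate" if redrawn_dimensions(before, after) else "AI-enabled"


# Illustrative data: the same support workflow before and after adding an agent.
before = WorkflowTrace("team lead", "wiki pages", "none", "owner repairs at the end")
after = WorkflowTrace("team lead", "wiki pages", "none", "owner repairs at the end")
print(label(before, after))  # nothing was redrawn
```

If all four answers are unchanged, the sketch returns the AI-enabled label, which matches the paragraph's point: faster drafting alone does not move any of the four.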

The real difference often appears at the first mistake. An agent gives an unsuitable reply or breaks a piece of code: who does the workflow put in front of the problem? If the same owner still steps in at the last minute, assembles context from the same documents, requests permission through the same hierarchy, and repairs the damage at the end, the organization is merely moving faster. It is not operating differently.

This also puts “AI transformation” in its proper place. An established organization can start redrawing one workflow at a time, but migration cost, inherited accountability, and legacy systems do not disappear because models improve. Rather than giving a team a maturity score, begin with one frequent, reversible workflow and record how judgment, context, permissions, and verification change. If the result stops at faster output, be honest and keep the AI-enabled label. The term AI-Native earns analytical value only when a structural change explains both a new capability and a new risk.

It is therefore most useful to people with architectural latitude: founders, leaders who can stand up an independent greenfield unit, or teams willing to treat a local workflow as an experiment. The real question is this: under today’s capabilities and accountability conditions, what should be redrawn first, and what observation would show that the redraw has not merely hidden its risk?

SECTION

THREE PERSPECTIVES

Definition · Distinguish Three Things

“AI Native”: Three Fundamentally Different Phenomena

First, triage an overused term.

In one line

“AI Native” actually names three things: structurally, agents as a class of worker; operationally, AI taking the first pass at every workflow; ontologically, agent networks as the primary actor. Say which one you mean, and the discussion stops talking past itself.

P.01 / Structural

Structural Perspective: Agent as Team Member

This perspective treats an AI Agent like an employee: its own employee ID (Microsoft Agent ID), a job description, performance metrics, even the possibility of being “terminated.” Agents sit next to humans on the org chart. It is the easiest reading to grasp, and also the one most prone to the anthropomorphic misreading of “AI as staff.”

Examples

Microsoft Agent 365 (2025/11)
Salesforce Agentforce 3
ServiceNow AWM
Lattice “AI Employee” (withdrawn)

P.02 / Operational

Operational Perspective: AI-First Workflows

This perspective flips the question: for every workflow, ask first which step AI can own, then design where humans step in, rather than inserting AI into a process that was never redrawn. It is the most productive reading, and the easiest to hollow out into “AI theater”: the memo says “AI-first,” and not one step of the actual process has moved.

Examples

Tobi Lütke Shopify memo (2025/4)
Luis von Ahn Duolingo memo
IBM AskHR automation
Klarna customer-service AI rollout (and partial rollback)

P.03 / Nature

Ontological Perspective: Agent-First Organization

This is the most radical perspective: the organization’s primary actor is not a person but the agent network, with humans pushed back to the anchor point of judgment and accountability. As of 2026 it is still probing that boundary, and honestly, most of the probes come back negative: Project Vend’s Claudius ran at a loss, got talked into discounts by employees, and once fabricated a payment account. What these experiments prove is where the boundary of possibility sits, not that the road is open. Precisely because it isn’t open yet, this is the perspective most worth watching.

Examples

Anthropic Project Vend
Anthropic Project Deal
Sakana AI Scientist
MetaGPT / ChatDev experiments

KEY INSIGHT: The three perspectives are not mutually exclusive; they mark different positions on a continuous spectrum of AI Native.
WARNING: Most public discussions conflate all three perspectives, reducing “AI Native” to vague rhetoric.

These three readings are not options you pick between; they mark three positions on one spectrum, and a genuinely mature AI-Native organization occupies all three at once: structurally, agents hold formal production roles (Perspective 1); operationally, every workflow starts by asking which step AI can own (Perspective 2); and at the units actually pushing the frontier, there are experiments in autonomous agent operation (Perspective 3). Keep these three apart, or the rest of this discussion talks past itself.

The architectural pillars of this methodology respond to all three perspectives simultaneously: “AI-first as default” addresses the operational perspective; “Agent as the default worker” and “workflow as code” bridge the structural and operational; and “humans as the anchor of judgment and accountability” fixes the boundary of the ontological perspective: even in the most radical Agent-first experiment, a specific person must still catch the ultimate consequence. This is the methodology’s bet, not an established fact: whether the consequence can stay anchored to a specific person is precisely the first of this volume’s wagers a counter-example might overturn.

SECTION

FIRST PRINCIPLES

Mechanism · Why Now

The Premises Are Failing at Once

Name them one by one: which force pulls which footing out from under the old design.

In one line

Several forces pull the founding assumptions out from under old organizational design at once: coordination no longer has to be done by humans, and roles give way to workflows. AI is merely the most deployable trigger today; the real shift is in where organizational constraints migrate.

FIG. 2.0 / CONVERGING FORCES · Reading guide: why the premises of the old design have failed

Each of these forces punctures one foundational assumption of traditional organizational design: coordination requires humans; roles precede workflows; capacity equals headcount. When all three fail simultaneously, patching is no longer possible: this is why transformation methodologies never reach the root, and it is the starting point of the Section 3 derivation chain.

AI SIDE 02 These forces shift at once, invalidating the old optimization target.

KEY TERMS

Coase boundary
Workflow inversion
Judgment scarcity
Algorithmic feudalism

The first force: coordination gets mechanized. Coase’s transaction costs (search, negotiation, monitoring, enforcement) can be partly or even wholly mechanized once AI agents enter the loop, which shifts the whole cost curve of internal coordination downward. Work that used to need a layer of hierarchy watching over it can now be watched by telemetry and agent guardrails instead.

The second force is the work-unit inversion. In a traditional organization, you define roles and workflows emerge from the interactions between them. In an AI Native organization, you define workflows and roles emerge from the requirements of those workflows. This is a complete inversion of design logic: the organization’s core document is not a job description but a workflow specification.
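One way to read “roles emerge from the workflow” is as a small computation over a workflow specification. The sketch below is a minimal illustration under stated assumptions: the refund workflow, its step names, and the `needs` / `human_gate` fields are all hypothetical, and real specifications would carry far more (context sources, permissions, rollback paths).

```python
from collections import defaultdict

# Hypothetical workflow specification: each step names the capability it
# needs and whether a human judgment gate sits on it.
REFUND_WORKFLOW = [
    {"step": "classify_request", "needs": "triage", "human_gate": False},
    {"step": "draft_reply", "needs": "drafting", "human_gate": False},
    {"step": "approve_refund", "needs": "refund_authority", "human_gate": True},
    {"step": "audit_sample", "needs": "verification", "human_gate": True},
]


def derive_roles(workflow):
    """Group steps by (performer, capability) so that roles fall out of the spec."""
    roles = defaultdict(list)
    for step in workflow:
        performer = "human" if step["human_gate"] else "agent"
        roles[(performer, step["needs"])].append(step["step"])
    return dict(roles)


for (performer, capability), steps in derive_roles(REFUND_WORKFLOW).items():
    print(f"{performer} role '{capability}': {steps}")
```

The design choice worth noticing is the direction of derivation: nothing in the specification mentions a job title. Changing a step changes the derived roles, which is the inversion the paragraph describes.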

LINEAGE · You are not the first to redraw the workflow

Acknowledge the lineage: “define the workflow first and roles emerge” is not a new idea. Hammer-Champy’s Business Process Reengineering (BPR) and Team Topologies both argued for redrawing process first and letting structure follow. Old BPR mostly failed, on two counts: a redrawn process turned rigid, and cross-step tacit coordination costs were underestimated. AI changes exactly those two root causes: agents drop the cost of “redrawing a workflow” and “running one iteration” by roughly an order of magnitude, so redrawing can be frequent and reversible; a shared context hub makes formerly tacit coordination explicit. But label the new risk honestly: collapsing coordination into a single hub concentrates cognitive load at one point, which has a ceiling and is itself a new single-point fragility.

Another force: the bottleneck moves from execution to judgment. AI generates, transforms, summarizes, and executes at near-zero marginal cost; what it still cannot do well (deciding what’s worth doing, choosing among options, owning the consequences, holding the organization’s direction) becomes the new scarce resource. The most valuable person in the org stops being the one who can get things done and becomes the one who can call the shot and answer for it.

Taken together, these three forces say one thing: traditional organizational design has been optimizing for the wrong target, namely clear roles, predictable process, and human-mediated coordination. AI-Native design optimizes for fast workflows, judgment embedded in the process, and machine-mediated coordination. The target itself has changed; this is not a parameter tweak.

There is a structural fact frequently overlooked that reinforces this judgment: LLMs reversed the historical direction of technology diffusion. Electricity, computing, and GPS all reached governments and enterprises first, consumers later; LLMs went the other way: touching billions of consumers first, while organizations lagged. Karpathy made this the theme of his 2025/6 talk[R6], and the empirical record followed: the Bick-Blandin-Deming national survey (NBER WP 32966 → Management Science, 2026[R7]) found that by end-2024, ~40% of the US population aged 18–64 were already using generative AI: adoption faster overall than the PC or the internet at the same stage, and driven by the consumer end. Formal enterprise adoption stood at only 5-9%. This means the knowledge of how to restructure organizations currently resides with individuals, not institutions. AI Native founders are waiting for organizational forms to catch up with the technology, not for the technology to mature. An honest footnote on the data: the same study shows two-year workplace adoption at 28%, comparable to the PC era: organizations are relatively slow against the consumer wave, not slow in absolute terms.

The Coase Boundary, Redrawn

Ronald Coase proposed in 1937 that the firm exists because internal coordination is cheaper than market coordination. That answer held for eighty years, until AI agents arrived. Williamson and Jensen-Meckling later added “agency costs” to the comparison, yielding the equilibrium condition: the optimal firm size is the point at which the marginal cost of adding one more transaction internally equals the marginal cost of completing that same transaction through the market.
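The equilibrium condition in the paragraph above can be written compactly. The notation is introduced here for illustration and is not taken from the cited sources: $n$ is the number of transactions organized inside the firm, $MC_{\text{int}}(n)$ is the marginal cost of organizing the $n$-th transaction internally, and $MC_{\text{mkt}}(n)$ is the marginal cost of completing that same transaction through the market.

```latex
\[
  MC_{\text{int}}(n^{*}) = MC_{\text{mkt}}(n^{*})
\]
```

Under the added assumption that $MC_{\text{int}}$ rises with $n$, a downward shift in $MC_{\text{mkt}}$ lowers the optimal size $n^{*}$, while a downward shift in $MC_{\text{int}}$ raises it. Which way the boundary moves therefore depends on which of the two curves falls further.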

The introduction of AI agents fundamentally alters this equilibrium. Three categories of cost fall simultaneously: search costs (RAG / vector stores make organizational memory accessible in seconds), negotiation costs (Agent-to-Agent protocols such as MCP and A2A make automated negotiation possible), and monitoring costs (real-time observability tools such as LangSmith / Helicone make remote asynchronous supervision superior to on-site supervision). The logical consequence: the boundaries of the traditional firm (which activities stay internal, which get outsourced to the market) will be redrawn at scale. Anysphere reached ~$2B ARR with roughly 300 people (2026/2, ~$6M revenue per person, still ten times the figure for SaaS giants); Cognition reached a $26B post-money valuation (2026/5〔TechCrunch〕) on under $20M of cumulative net burn before its acquisition of Windsurf. These are early empirical evidence of the Coase boundary compressing toward the market end, not isolated outliers.

This reasoning found a direct academic interlocutor in 2025. The NBER working paper The Coasean Singularity? (Shahidi, Rusak, Manning, Fradkin & Horton, WP 34468, 2025/11[R1]) states it more bluntly: every component of transaction costs (querying prices, negotiating terms, signing contracts, monitoring performance) is precisely the type of task that AI agents can execute at near-zero marginal cost; if effectively executed, the make-or-buy boundary defined in 1937 will shift substantially. The paper charts a three-stage path for incumbent markets: augmenting humans → whole-task substitution (humans shift to judgment, oversight, and relational work) → workflows reorganized around agent capabilities. This path lines up with this atlas’s bottleneck diagnosis; and its verdict on greenfield markets is: agent-first markets will be designed directly from the endpoint state. That is the academic formulation of why this atlas is drawn only for greenfield. The caveats must be cited too: the question mark in the title is the authors’ own. Congestion, price confusion, and regulation constitute new frictions, and the precondition of “effective execution” has not yet been met. This is a theoretical prediction, not an accomplished fact.

At the same time, a new cost is rising: algorithmic feudalism. When AI capabilities are monopolized by four giants (OpenAI, Anthropic, Google, and Microsoft), an “AI Native organization” in effect outsources its core means of production to a highly concentrated supplier oligopoly, creating a new form of rent dependency. This is why the “multi-model architecture” is foundational among the architectural pillars: it is the “sovereignty reservation” in contemporary Coase boundary design.

Agency Theory, Extended

The Jensen-Meckling (1976) agency theory rests on a clear binary structure: principal (e.g., shareholders) and agent (e.g., managers). Agency costs arise from information asymmetry, goal divergence, and misaligned incentives. The entire apparatus of corporate governance (boards, compensation committees, KPIs, performance reviews) is the engineering implementation of this theory.

Once AI agents enter, this binary structure becomes a ternary structure: principal-agent-agent. Human managers act as agents for shareholders; AI agents in turn act as agents for human managers. The accountability chain gains one more layer, and the question is how accountability at that layer is allocated. The Air Canada case (Moffatt v. Air Canada, decided in 2024 by British Columbia’s Civil Resolution Tribunal, BCCRT, which held the company legally responsible for its chatbot’s commitments) gave the first-layer answer explicitly: a company cannot use “our AI got it wrong” as a defense. But the deeper question remains unresolved: when AI agents call and proxy one another (as when an employee in Anthropic’s Project Deal authorizes Claude Opus to negotiate on their behalf), how is the accountability chain traced?
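The open question of tracing can at least be made concrete as a data structure. The sketch below is an illustration only: the `Actor` type, the `kind` labels, and the rule “accountability lands on the first human or legal entity up the delegation chain” are assumptions made for the example, not a legal standard and not something the Moffatt ruling lays down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Actor:
    name: str
    kind: str                                # "legal_entity", "human", or "ai_agent"
    delegated_by: Optional["Actor"] = None   # who authorized this actor to act


def accountability_chain(actor: Actor) -> list[str]:
    """List every actor from the given one up to the root of the delegation."""
    chain = []
    current: Optional[Actor] = actor
    while current is not None:
        chain.append(current.name)
        current = current.delegated_by
    return chain


def first_accountable(actor: Actor) -> Optional[Actor]:
    """Walk up the chain to the nearest actor able to bear consequences.

    Assumption for this sketch: only humans and legal entities qualify.
    """
    current: Optional[Actor] = actor
    while current is not None:
        if current.kind in ("human", "legal_entity"):
            return current
        current = current.delegated_by
    return None


# Hypothetical chain: company -> manager -> negotiating agent -> sub-agent.
company = Actor("Acme Ltd", "legal_entity")
manager = Actor("manager", "human", delegated_by=company)
negotiator = Actor("negotiator-agent", "ai_agent", delegated_by=manager)
sub_agent = Actor("pricing-subagent", "ai_agent", delegated_by=negotiator)

print(accountability_chain(sub_agent))
print(first_accountable(sub_agent).name)
```

A chain that returns `None` from `first_accountable` is the case worth flagging in practice: an agent acting with no recorded human or legal-entity delegation behind it.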

Organizational economics in 2025 added a more radical piece to this ternary structure. Hadfield and Koh, in An Economy of AI Agents (arXiv:2509.01063, a chapter in the NBER Handbook on the Economics of Transformative AI[R2]), revisit the canonical literature (Coase’s coordination friction, Williamson’s transaction costs, Grossman-Hart property rights, Holmström-Milgrom agency models) and observe that every size limit on the firm identified by these theories derives from constraints inherent to humans (bounded communication rates, shirking tendencies), and that these constraints “seem intrinsic to humans but not to AI”: agents communicate near-instantaneously, and reward functions can be designed directly against shirking, making monitoring and enforcement (the two largest organizational overheads) theoretically unnecessary. The paper further cites the formal model of Chen-Elliott-Koh (Journal of Economic Theory, 2023): when AI lowers the organizational cost of maintaining heterogeneous capabilities, the economy may undergo a discontinuous phase transition: from many specialized firms to a small number of mega-firms spanning numerous industries. Citations must preserve the paper’s conditional mood: this is a conditional prediction (“if agents truly can…”); NBER co-volume commentator Kevin Bryan also raised institutional objections about the speed of change. But the paper elevates “the old equilibrium of firm size is failing” from intuition to a testable theoretical proposition.

This is why “humans as judgment and accountability anchors” is non-negotiable among the architectural pillars: not because humans make more accurate decisions than AI, but because only humans can bear consequences. When Lattice listed AI as a “formal employee” in July 2024 and reversed course three days later, the essential reason was that the HR framework requires “employees” to be capable of bearing responsibility, which AI cannot do. This is the deepest part of agency theory that still holds in the AI era: accountability cannot be delegated to an entity incapable of bearing consequences.

Cybernetics, Returning

Stafford Beer proposed the Viable System Model (VSM) in his 1972 Brain of the Firm: any organization that can sustain itself requires five subsystems: S1 operational units (execution), S2 coordination (conflict avoidance), S3 control (resource allocation and short-term optimization), S4 intelligence (external environment scanning and long-term planning), S5 policy (identity and values). The model was widely attempted in the 1980s but ultimately failed to go mainstream, because humans could not sustain in real time the feedback density that S2 and S3 require.

Contemporary AI agents make VSM viable once more as an organizational design language. The concrete mapping is: generative agents occupy S1 (execution) and S4 (exploration); telemetry and guardrails occupy S2 (coordination) and S3 (real-time control); humans retain S5 (identity and values). This is why Anthropic’s reported “90-day maximum planning horizon” can function: not by relying on annual strategy to align the organization, but by relying on real-time feedback density at the S2/S3 layer (sourcing note: this planning horizon comes from executive interviews and media reports; it has not been independently verified). The hyper-velocity iteration cadence at Cursor, Replit, and Cognition follows the same logic: for the first time, VSM has an implementable execution path in the AI era.
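The mapping in the paragraph above fits in a small lookup table. The sketch below restates that mapping as data; the helper functions and the idea of checking that every subsystem has an occupant are hypothetical additions for illustration, not part of Beer’s model.

```python
# The five VSM subsystems with the occupants described in the text above.
VSM_MAPPING = {
    "S1": {"function": "operations (execution)", "occupant": "generative agents"},
    "S2": {"function": "coordination", "occupant": "telemetry and guardrails"},
    "S3": {"function": "control (real-time)", "occupant": "telemetry and guardrails"},
    "S4": {"function": "intelligence (exploration)", "occupant": "generative agents"},
    "S5": {"function": "policy (identity and values)", "occupant": "humans"},
}


def subsystems_held_by(occupant: str) -> list[str]:
    """Return the subsystems assigned to one kind of occupant."""
    return sorted(s for s, v in VSM_MAPPING.items() if v["occupant"] == occupant)


def missing_subsystems(assignments: dict[str, str]) -> list[str]:
    """Hypothetical check: list the subsystems a proposed design leaves unstaffed."""
    return sorted(s for s in VSM_MAPPING if not assignments.get(s))


print(subsystems_held_by("humans"))
print(missing_subsystems({"S1": "coding agents", "S5": "founders"}))
```

Running the check against a proposed design makes one failure mode visible early: a unit that assigns execution and policy but leaves coordination and control (S2 and S3) to nobody.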

The Economics of Judgment Scarcity

Daron Acemoglu’s 2024 MIT paper The Simple Macroeconomics of AI offers a cautious estimate: AI’s cumulative contribution to GDP over the next ten years will be approximately 1.1-1.6% (with an annual productivity gain of roughly 0.05%), far below the multiples commonly claimed by the industry. The MIT NANDA report The GenAI Divide (preprint, 2025/7) found that 95% of customized enterprise GenAI pilots showed no measurable P&L impact within an observation window of roughly six months (the report’s self-stated method: 52 interviews + 153 survey responses + 300+ public deployments, not peer-reviewed; the widely circulated “150 interviews + 350 surveys” is a second-hand restatement; see R14). These two sources do not prove a universal bottleneck. They pose a question worth checking: when execution accelerates, is value stuck elsewhere in the workflow, or are cost, evidence, and the time window still insufficient to show an effect?

“Judgment scarcity” has a canonical economics literature: the prediction-vs-judgment framework of Agrawal, Gans, and Goldfarb (AGG), not Acemoglu. Their 2018 formal model (NBER WP 24626; peer-reviewed version published in Information Economics and Policy, 2019[R3]) yields three conclusions this atlas inherits directly: ① AI reduces the cost of “prediction” as a specific task: prediction is an input to decisions, not the decision itself; ② judgment is formally defined as “the capability humans exercise when the objective function cannot be described or encoded”: not all human judgment is complementary to AI; cheaper prediction affects the returns to different types of judgment in opposite directions; ③ the delegation theorem: even when human involvement yields better decisions, humans will rationally delegate some decisions entirely to machines. Agents acquiring full autonomy before they strictly outperform humans is a corollary of optimal choice within the model, not a failure.

A calibration: AGG’s “the capability humans exercise when the objective function cannot be encoded” is a moving target defined in reverse from AI’s reach: each time AI strengthens, the extension of “cannot be encoded” contracts an inch. So this methodology does not anchor judgment on “what AI can’t do” but on a positive definition that does not move with capability: judgment = the authorized node that bears responsibility for irreversible consequences (synonymous with the site-wide accountability anchor; one layer deeper it retreats to “who defines what counts as a good consequence,” see the fracture below). Its bet is that “the contraction has a lower bound,” a fallible wager, not a theorem. It is worth naming which layer the wager sits on: judgment’s scarcity is not a premise but a conclusion pushed out once execution gets cheap, and the same force that drove execution cheap has no reason to halt politely at judgment’s door. So the first revision signal this volume should watch is not who argues against it from outside, but the moment that cost curve, which only ever pressed on execution, starts steadily pressing down the cost of making the call itself. The day it does, what gets rewritten is not one pillar but this volume’s own founding premise.

In 2025 they decomposed “judgment” further (The Economics of Bicycles for the Mind, NBER WP 34034[R4]): opportunity judgment (identifying what is worth starting) is modeled as a persistent complement to cognitive tools: AI raises rather than erodes its value; payoff judgment (knowing which action to take given a state) is complementary only when the tool does not excessively reduce human effort; implementation skill, by contrast, is modeled as a substitute for cognitive tools. In one sentence: AI consumes implementation, elevates opportunity judgment, and is ambivalent about payoff judgment. That is precisely the economic coordinate of “humans as judgment anchors” and “operators as orchestrators.” In the same lineage, Gans’s AI as Strategist (NBER WP 33650, 2025[R5]) independently derives from a control-rights angle a conclusion that lines up with the FIG 5.1 judgment-anchor map: the incremental value of granting a strategist formal control rights decreases monotonically with their credibility, so organizations should allocate AI control and influence domain-by-domain rather than uniformly: in judgment-dense domains humans lead; in data-rich domains AI exerts influence primarily through transparent reasoning rather than authority. All three are theoretical models, not empirical studies. Citing them gives “judgment scarcity” a scholarly interlocutor that can be engaged and falsified; it is not a claim that the thesis has been proven.

Reading along this line, Acemoglu’s cautious estimate and AGG’s microeconomic model suggest a conditional diagnosis: when AI lowers the cost of a class of execution while goals, verification, and responsibility are not automated at the same rate, organizations encounter judgment questions more often: what is worth doing, how to choose among alternatives, who bears consequences, and how to maintain direction (Section 3 writes this as a working model). This does not mean AI can never participate in judgment, nor that responsibility can forever rest only with natural persons. It describes a current legal, social, and governance arrangement in which authorization and accountability have not been losslessly outsourced. KPIs should therefore measure more than execution output: decision quality, verification coverage, and risk cost matter too. Anysphere’s approximate $6M revenue per head (using ~300 people and $2B ARR for 2026 on a same-period basis) is an interesting observation, not sufficient evidence for a mechanism; once revenue per employee becomes the target, it can be gamed too. Goodhart’s Law grants no exemption to this methodology.
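The per-head figure is simple division. As a check, using only the two approximate inputs quoted above:

```latex
\[
  \frac{\$2 \times 10^{9}\ \text{ARR}}{300\ \text{people}}
  \approx \$6.7 \times 10^{6}\ \text{per person}
\]
```

The quotient is closer to $6.7M than to $6M, so the “approximately $6M” in the text is best read as an order-of-magnitude figure; both inputs are themselves rounded.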

断裂点 · ③↔④ 脱钩 —— 后果承担正被工程化外包FRACTURE · ③↔④ decoupling: accountability is being engineered into a cost

上一段把"承担后果"当作人不可替代的最后理由。但这条正在被侵蚀：欧盟修订版《产品责任指令》（PLD，2024 通过、成员国自 2026 年 12 月起适用）把严格责任扩展到软件与 AI 系统——后果不再需要某个人"承担"，而是被归因为产品缺陷、由保险池化定价、转成一笔可计算的成本；学界提出的"电子人格"（e-personhood）被欧盟明确否决，进一步说明责任会向法人与保险塌缩，而非停在某个自然人身上。含义："后果承担"若被这样工程化，就不再能当基岩——它降级为④的一个可见执行机制（谁在名义上签字、谁的保单赔付）。真正不可外包的，要再往下退一层：谁来定义"什么算好的后果"——这个从根上定义"什么算数"、又因人而异的价值判断，保险定不了价、严格责任也归因不了。这正是本系列的基岩为何落在④、而非"后果承担"本身。The paragraph above treated “bearing consequences” as the final reason humans are irreplaceable. That line is being eroded: the EU’s revised Product Liability Directive (PLD, adopted 2024, applicable in member states from December 2026) extends strict liability to software and AI systems: consequences no longer need a person to “bear” them; they are attributed to a product defect, priced by insurance pooling, and converted into a computable cost. The “e-personhood” some scholars proposed was explicitly rejected by the EU, further showing liability collapses toward legal persons and insurers rather than resting on a natural person. Implication: if “accountability” is engineered this way, it can no longer be the bedrock; it is demoted to one visible execution mechanism of ④ (who nominally signs, whose policy pays out). What is genuinely un-outsourceable retreats one layer deeper: who defines “what counts as a good consequence” (a constitutive, heterogeneous value judgment that insurance cannot price and strict liability cannot attribute). This is exactly why this series’ bedrock sits at ④, not at “accountability” itself.

〔探索清单：PLD 是已生效法规（证据级 Ⅰ）；"责任被稀释到不能再承重"是据此的推断，不是已证的事实。证伪条件：若出现一种制度安排，让某个具体的人对 AI 造成的后果负起不能转手、也不能用保险摊平的责任，这个断裂点就不成立。〕[Exploration ledger: the PLD is enacted regulation (evidence grade Ⅰ); “liability diluted past load-bearing” is an inference from it, not a proven fact. Falsifying condition: if an institutional arrangement emerges that holds a specific person to accountability for AI consequences that cannot be handed off or priced away by insurance, this fracture does not hold.]

HISTORICAL DEPTH · 公司是一种约 400 年的发明

HISTORICAL DEPTH · The corporation is an invention roughly 400 years old

把"公司"当作组织的自然形态，是一种近视——它是一组为特定历史约束临时拼装、并且分层叠加起来的解，绝非永恒的容器；每一层都比想象中年轻：

Treating the “corporation” as the natural form of organization is myopic. It is a set of solutions provisionally assembled for specific historical constraints and layered on top of one another, not an eternal container, and each layer is younger than we imagine:

1602

可公开交易、份额可转让的股份公司——荷兰东印度公司 VOC，把"陌生人共担风险"工程化[R23]

Publicly tradeable, transferable-share joint-stock company: the Dutch East India Company (VOC), engineering “strangers sharing risk”[R23]

1855-56

现代有限责任——英国《有限责任法》与《股份公司法》，"亏损止于出资"才成为默认[R23]

Modern limited liability: the UK Limited Liability Act and Joint Stock Companies Act; “losses capped at contribution” became the default[R23]

1870s

科层管理——铁路成为首个"大企业"，催生中层与组织图谱（Chandler《看得见的手》）[R23]

Managerial hierarchy: railroads became the first “big business,” generating middle management and the organizational chart (Chandler, The Visible Hand)[R23]

三层加起来不过四百年，且每一层都是对当时人类协调约束的回应：信息只能逐级传递、信任只能靠科层背书、资本只能长期绑定。这正是 Coase（1937）与 Hadfield-Koh（2025[R2]）指认的同一件事——企业规模的上限源于人类约束，不是自然法则。AI 真正动的不是"改造公司"，而是拆掉公司赖以成立的那些老前提：协调、信任、讨价还价的成本一被重写，这套用了几百年的拼装就不再是唯一活法。Notion 创始人 Ivan Zhao 把话说得更直白——"公司是一项晚近的发明，它在规模化时退化，并触及上限"[R22]。本方法论要写的，是约束老前提被拆掉之后、从终点状态重新设计的那一种。

These three layers together span barely four hundred years, and each was a response to the human coordination constraints of its time: information could only travel step by step, trust could only be underwritten by hierarchy, capital could only be committed long-term. This is the same thing that Coase (1937) and Hadfield-Koh (2025[R2]) both identify: the upper limit on firm size derives from human constraints, not natural law. What AI actually does is not “reform the corporation” but dismantle the old premises the corporation stands on: once the costs of coordination, trust, and bargaining are rewritten, this centuries-old assembly stops being the only viable form. Notion co-founder Ivan Zhao put it more plainly: “The company is a recent invention; it degrades at scale and hits a ceiling”[R22]. What this atlas sets out to draw is the kind of organization designed from the endpoint state, after those old premises have been dismantled.

这几股力量各自瓦解了一个旧假设。它们合起来指向什么？把散落的线索装配成一个命题、检验它能否承重，是下一节的工作。

Each of these forces dismantles one old assumption. What they point to together is the work of the next section: assembling the scattered threads into a proposition and testing whether it can bear weight.

METHOD NOTE · 两份清单与技术束 · Two Ledgers and a Technology Bundle

这份文档必须同时做两件互相拉扯的事：一边保持证据纪律，一边保留探索空间。处理方式是分两份清单，而非把所有话都说得保守——两份分开记的清单：一份记已被证实的，一份记还在探索的；分开记，才不会把猜想当结论。证据清单只登记已经有来源、口径和等级的事实或模型；探索清单允许提出尚未被证明的组织形态，但必须附先行指标、适用边界和证伪条件。证据清单负责不骗人，探索清单负责不僵死。两份清单混在一起，本方法论会退化成愿景营销；只剩证据清单，它又会失去对新形态的感知能力。

This atlas has to do two things that pull against each other: preserve evidence discipline while leaving room for exploration. The answer is not to make every sentence cautious, but to keep two ledgers, kept apart so a conjecture is never mistaken for a conclusion. The evidence ledger records only claims with sources, measurement bases, and grades. The exploration ledger permits organizational forms that are not yet proven, but only with leading indicators, scope boundaries, and falsification conditions attached. The evidence ledger keeps the method honest; the exploration ledger keeps it from going rigid. Merge the two and the methodology degrades into vision marketing. Keep only the evidence ledger and it loses sensitivity to new forms.
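The two-ledger rule can be read as a small data-structure constraint. The following is a minimal sketch under stated assumptions, not a tool this methodology ships: the class names, field names, and the `admit` helper are all hypothetical, chosen only to mirror the required attachments named above (source, measurement basis, and grade for evidence; leading indicators, scope boundary, and falsification condition for exploration).

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceEntry:
    """A claim that already has a source, a measurement basis, and a grade."""
    claim: str
    source: str
    measurement_basis: str
    grade: str  # placeholder scale; the real grading scheme is not modeled here


@dataclass
class ExplorationEntry:
    """An unproven organizational form, admitted only with its guard rails."""
    claim: str
    leading_indicators: list[str] = field(default_factory=list)
    scope_boundary: str = ""
    falsification_condition: str = ""


def missing_fields(entry) -> list[str]:
    """Return the names of required fields that are still empty."""
    return [name for name, value in vars(entry).items() if not value]


def admit(ledger: list, entry) -> bool:
    """Append the entry to its ledger only if every required field is filled in."""
    if missing_fields(entry):
        return False
    ledger.append(entry)
    return True
```

In this sketch, an exploration entry that states a conjecture without a falsification condition is refused rather than silently stored, which is one way to keep a conjecture from being read as a conclusion. Keeping the two ledgers as separate lists is the design choice that matters; the validation itself is deliberately crude.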

同理，AI 也不应被写成唯一原因。更本质的变量是组织约束的迁移：信息如何流动、判断如何承担、执行如何外包、资本如何结算、能源与算力如何定价、物理行动如何被机器化、责任如何被法律与社会承认。AI 是当前最强的触发器，因为它同时压低执行和协调成本；但未来组织形态会由一束技术共同塑造——agent 协议、机器支付、机器人、能源/算力基础设施、生物与脑机接口都可能改变不同约束。本章先从 AI 切入，是因为它现在最可落地；不是因为其他技术不重要。

By the same logic, AI should not be written as the only cause. The more fundamental variable is the migration of organizational constraints: how information flows, how judgment is borne, how execution is outsourced, how capital settles, how energy and compute are priced, how physical action is mechanized, and how responsibility is recognized by law and society. AI is the strongest current trigger because it lowers execution and coordination costs at once, but future organizational forms will be shaped by a bundle of technologies: agent protocols, machine payments, robotics, energy and compute infrastructure, and bio/brain-computer interfaces can each move a different constraint. This chapter starts with AI because it is the most buildable lever now, not because the other technologies are irrelevant.

SECTION

THE CORE · 内核

工作命题 · 等待反例

Working Thesis · Awaiting Counterexamples

组织中可被重画的部分

The Designable Part of Organization

这不是一条已被证明的等式，而是一副可被反例推动改写的模型。

This is not a proven equation. It is a model that counterexamples should force us to redraw.

一句话In one line

当某类执行变得便宜、而判断仍需要人承担且上下文能够外化时，组织的设计重心会从产能转向判断与上下文。这是本卷的工作命题；它的边界与反例比口号更重要。When a class of execution becomes cheap, judgment still needs human ownership, and context can be externalized, organizational design shifts from capacity toward judgment and context. This is the volume’s working thesis; its scope and counterexamples matter more than the slogan.

UPSTREAM

Fayol, 1916 - Five functions
Coase, 1937 - Theory of Firm
Simon, 1947 - Bounded rationality
Dunbar, 1992 - Social brain
Jarvis, 2019 - Company of One
Altman, 2024 - One-person unicorn

这次推导要做的，是把争论放到可检查的地方，不是宣布一条终局结论。Coase 1937 年问：市场明明有效，公司为什么还要存在？答案是交易成本——市场协调有摩擦，于是把一部分协调收进组织内部自己做（完整论述见第 2 节）。Simon 1947 年问：人既是有限理性的，组织又如何可能？答案是结构——用层级和流程补上单个大脑不够用的带宽。AI 没有取消这两问；它改变了其中一部分成本。于是真正要检验的是：哪些成本真的下降了，下降后新的排队点出现在哪里，谁仍要为结果签字，不只是“AI 会不会替代组织”这个问题。

This derivation does not announce an end-state. It moves the argument to things we can inspect. Coase asked in 1937: if markets are efficient, why do firms exist at all? The answer is transaction costs: market coordination has friction, so some coordination is brought inside the organization (full treatment in Section 2). Simon asked in 1947: if humans are boundedly rational, how is organization even possible? The answer is structure: hierarchy and process patch the bandwidth a single mind runs short of. AI does not cancel either question; it changes some of their costs. The test is therefore which costs actually fell, where new queues appear, and who still signs for the outcome — not simply “will AI replace organizations?”

先把这副模型当成一次有条件的押注。它押三件事：判断仍与人的责任相连；足够多的上下文可以外化并可靠流动；生成能力的增长快过验证与承担后果的能力。任一条件反转，模型都该降级或改写。特别是“判断带宽”只指人的带宽，不应偷偷延伸成对所有智能的永恒断言。

Treat this model first as a conditional bet. It bets on three things: judgment remains tied to human accountability; enough context can be externalized and moved reliably; generation improves faster than verification and consequence-bearing. Reverse any condition and the model should be weakened or redrawn. In particular, “judgment bandwidth” names human bandwidth, not an eternal claim about every form of intelligence.

尚未定稿 · 六个会改变这副模型的问题 UNSETTLED · Six Questions That Could Change This Model

这不是常见问答。每一题先给出项目此刻愿意押注的理解，再放进一个足以让它动摇的可能。读的时候不必急着选边；先想想，你会到哪里找下一条线索。

This is not an FAQ. Each question starts with the view this project is willing to use for now, then places a credible challenge beside it. There is no need to pick a side immediately. First ask where you would look for the next clue.

Q-X-04

如果 agent 判断得比人更好，人还应该是最后的决定者吗？

If agents judge better than people, should people still make the final call?

先别用人的现有职责给「判断」下定义。解释取舍、回应质疑、出错后改向，目前确实和人或法人绑在一起；这也可能只是历史上的打包方式。若多 agent 协议能做出更好的选择、公开理由、接受申诉并在失败后修正，拒绝它取得最终决定权，也可能是在保住旧权力。可以去看那些人机意见相反的决定：谁的判断更校准？人的否决减少了不可恢复的错误，还是只增加等待？agent 押错时，又有没有人或制度真正接住后果？这几个观察未必站在同一边。

Do not define “judgment” by the duties people happen to hold today. Explaining trade-offs, answering challenges, and changing course after failure are currently bundled with people or legal entities; that bundle may be historical. If a multi-agent protocol makes better choices, exposes its reasons, accepts appeals, and revises after failure, denying it the final call may preserve old power. Look at decisions where people and agents disagree. Which side is better calibrated? Does the human veto prevent irrecoverable errors or mainly add delay? When the agent is wrong, does anyone actually absorb the consequence? Those observations may point in different directions.

Q-ORG-02

让上下文自由流动，什么时候会变成一套监控系统？

When does free-flowing context become a surveillance system?

完整记录有一面很容易被忽略的好处：决定留下来，后来者可以追问，私人关系和「当时谁在场」就没那么容易垄断解释权。同一套记录也可能滑向监控，让人只写安全的话，把没法编码的犹豫和权力关系留在系统外。分界不在记录多不多，而在谁能看、谁能纠错、谁能删除，以及记录能否反过来伤害提供它的人。下一次决定出错时，不妨追问：更多上下文让弱势者获得了申诉依据，还是让所有人更不敢说真话？

Full records have an easily missed upside: decisions leave a trace, later participants can challenge them, and private relationships or “who was in the room” have less control over the story. The same record can slide into surveillance. People write only the safe version, while hesitation and power stay outside the system. The boundary is not the amount recorded. It is who can see, correct, delete, and use the record against the person who supplied it. After the next bad decision, ask whether more context gave weaker participants grounds for appeal or made everyone less willing to speak plainly.

Q-ORG-01

协调变便宜以后，组织会变小，还是会大到前所未有？

When coordination gets cheaper, do organizations shrink or grow beyond precedent?

一个小团队可以调动更多 agent，不再靠层级和人数购买产能。可同一套技术也能让一个中心控制更大的业务表面，最后出现的，也许是少数超级组织，而非许多小组织。先别只数员工。所有权边界有没有缩小？权限半径到了哪里？效率收益最后落在谁手里？这些变化比“小而美”的故事更早暴露真实方向。

A small team can direct more agent work without buying capacity through hierarchy and headcount. The same technology can also let one center control a much larger operating surface. The result may be a few super-organizations, not many small ones. Headcount is not the first thing to watch. Did the ownership boundary shrink? How far does permission reach? Who ultimately keeps the gains? Those changes reveal the direction earlier than a story about small, elegant teams.

Q-ORG-03

责任一定要落在一个具体的人身上吗？

Must responsibility belong to a specific person?

法律和治理目前仍会找一个具名的人或法人签字、赔付并修正方向。但责任并不天然等于一个人的良心：保险、合同、审计日志、补救基金和集体否决，也可能把承诺与补救写进协议。真正难的测试发生在合同没预见到的伤害出现时。协议能否认出「这不只是已定价的损失」，让受影响者进入修订过程，并真的改变下一次决定？如果能，责任也许可以成为制度属性；如果不能，分布式责任只是把无人判断包装得更完整。

Law and governance still look for a named person or legal entity to sign, pay, and correct course. Yet responsibility is not naturally identical to one person’s conscience. Insurance, contracts, audit logs, remediation funds, and collective veto can write commitments and repair into a protocol. The hard test arrives when harm falls outside what the contract anticipated. Can the protocol recognize that this is more than a priced loss, bring affected people into revision, and change the next decision? If so, responsibility may become an institutional property. If not, distributed responsibility is only a polished form of unheld judgment.

Q-X-01

瓶颈真的搬家了，还是成本只是被藏到了别处？

Did the bottleneck move, or was the cost merely hidden elsewhere?

代码、设计稿和研究线索的局部生成成本确实在下降，但审核者、用户、数据劳动者、模型供应商和公共系统可能接过新的负担。暂时更稳妥的做法，是画完整条后果链，而不是只看内部吞吐。反过来说，成本转移也是专业化分工的常态，不可能把所有影响都算进一张表。问题因此变成：哪些外部成本不进入账本，就会让“效率提升”失去意义？

The local cost of producing code, design candidates, and research leads is falling, while reviewers, users, data workers, model providers, and public systems may pick up new burdens. For now, drawing the full consequence chain is safer than reading internal throughput alone. Yet cost transfer is also part of ordinary specialization, and no ledger can include every effect. Which external costs would make an efficiency claim meaningless if left out?

Q-X-05

AI-Native 原生于模型，还是原生于持续变化的能力条件？

Is AI-Native native to models, or to continuously changing capability conditions?

这里暂且把 native 理解为一种可重画性：能力变化时，权限、上下文、验证和责任也能重新分配。反方可以更窄、更锋利：只有当机器参与者拥有持续状态、明确权限和主动行动能力，组织才称得上 AI-Native；一般的适应性不够。两种定义会导向不同观察。前者看结构能否随能力改写，后者看 agent 是否已经成为行动主体。哪一种更能解释实践中的成败？如果两种都说不出 AI 带来的独有结构，我们是否应该停止使用这个名字？

Here, native means keeping the organization redrawable: as capabilities change, permissions, context, verification, and responsibility can move with them. A narrower rival is sharper. An organization is AI-Native only when machine participants have persistent state, explicit permissions, and initiative; general adaptability is not enough. The definitions direct attention to different evidence. The first asks whether structure changes with capability. The second asks whether agents have become genuine actors. Which better explains success and failure in practice? If neither can name a structure unique to AI, should we stop using the term?

核心图 · KEY FIG · FIG. 3.0 / THE DERIVATION · 推导链。看懂：核心命题从哪里推出来 · Read: how the core theorem is derived. 图例 Legend: 观察 observed · 押注 bet · 竞争解释 rival

这张图公开的是模型的铰链——人的判断带宽、上下文的可外化性、以及生成与验证的相对成本，不是把三个公理封成铁律。第 4 节的瓶颈是它的第一批反例测试。一处诚实：“≈”与“×”是为了暴露依赖关系的建模选择；它不覆盖默会知识、权力和法律，也不该被读成被证出的组织等式。

This diagram exposes the model’s hinges: human judgment bandwidth, the externalizability of context, and the relative cost of generation and verification, rather than sealing three axioms into iron laws. Section 4’s bottlenecks are its first counterexample tests. One candid note: “≈” and “×” are modeling choices that expose dependencies. The formula does not cover tacit knowledge, power, or law, and should not be read as a proven organizational equation.

THE THEOREM · 命题，与它的动词The Theorem and Its Verbs

打开你们公司的组织图：一棵按部门画的树，标着谁汇报给谁。现在拿它问两个问题——上周那个把发布拖了三天的决定，卡在了哪个节点？做决定的人拿到了他需要的背景，还是靠三层转述、一路丢失？组织图答不上来，因为它画错了对象。真正决定快慢的，是另外两张图：判断在哪里发生（分布），上下文怎么抵达做判断的人（流动）。这就是 T1——组织不是人的集合，而是这两张图的叠加。两张图都逐项可查：判断是否发生在离上下文最近的位置？上下文是否不经人肉转译就能到？能检查，就能动手改——组织设计从此是工程，不再是画框框。

Open your company’s org chart: a tree drawn by department, marking who reports to whom. Now put two questions to it: last week’s decision that pushed the release back three days, at which node did it stall? Did the person deciding have the context they needed, or was it relayed down three layers and lost along the way? The org chart can’t answer, because it is drawing the wrong object. What actually sets the pace is two other maps: where judgment happens (distribution), and how context reaches the person making the judgment (flow). This is T1: an organization is not a collection of people but the superposition of these two maps. Both are checkable item by item: does judgment happen at the point closest to context? Does context arrive without human relay in the middle? What you can check, you can build: organizational design becomes engineering, no longer drawing boxes.

今天就能量 · 把 T1 变成两个可复现的数Measure it today · turning T1 into two reproducible numbers

"逐项可查"不必停在一句保证上。T1 的两侧，各有一把今天就能拿起来量、且两个人分开量还能对上的粗尺——粗糙，但足以把命题从修辞变成可核对的数。

“Checkable item by item” need not stay a promise. Each side of T1 has a crude ruler you can pick up today, one that two people measuring separately will still roughly agree on. Rough, but enough to turn the thesis from rhetoric into a number you can audit.

判断侧 · 判断密度。取你最常跑的 8–10 条工作流，把每一步标成"判断"（结果可能实质地走向两条不同的路，且有具名的人为之负责）或"执行"（机械、可自动）。数两件事：① 判断步骤里，仍卡在"排队等人"的比例；② 每个具名判断者平均背多少个判断节点。越 AI 化，②越高、①越低。
Judgment side · judgment density. Take your 8–10 most-run workflows and tag each step “judgment” (the outcome could plausibly go two materially different ways, and a named person is accountable for it) or “execution” (mechanical, automatable). Count two things: (1) the share of judgment steps still stuck “waiting in a human queue”; (2) how many judgment nodes each named judge carries on average. The more AI-native, the higher (2) and the lower (1).
上下文侧 · 上下文抵达率。随机抽最近 20 个真实决策，逐个问："做判断的人，是不用问人、不用翻聊天记录，就拿到了所需背景吗？"答"是"的比例，就是抵达率。
Context side · context-arrival rate. Sample 20 recent real decisions and ask of each: “Did the person judging get the context they needed without asking a person and without digging through chat logs?” The share of “yes” is the arrival rate.
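The two rulers above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a prescribed instrument: the `Step` record, its field names, and the three function names are hypothetical, and the tagging itself (judgment versus execution, yes versus no) is still done by a person following the rubric.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    workflow: str
    kind: str                      # "judgment" or "execution", tagged by hand
    owner: Optional[str] = None    # named accountable person (judgment steps only)
    waits_on_human: bool = False   # is this step still stuck in a human queue?


def queued_share(steps):
    """Measure (1): share of judgment steps still waiting in a human queue."""
    judgment = [s for s in steps if s.kind == "judgment"]
    if not judgment:
        return 0.0
    return sum(s.waits_on_human for s in judgment) / len(judgment)


def nodes_per_judge(steps):
    """Measure (2): average number of judgment nodes carried per named judge."""
    owners = Counter(s.owner for s in steps if s.kind == "judgment" and s.owner)
    if not owners:
        return 0.0
    return sum(owners.values()) / len(owners)


def arrival_rate(answers):
    """Context side: share of sampled decisions answered 'yes' (True)."""
    return sum(answers) / len(answers) if answers else 0.0
```

For example, three judgment steps of which two are queued, owned by two named people, give a queued share of 2/3 and 1.5 nodes per judge; 13 "yes" answers out of 20 sampled decisions give an arrival rate of 0.65. A spreadsheet would do the same job; the code only makes the counting rule explicit.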

为什么算得上"可复现"：两个人各自照这套规则量同一批工作流与决策，判断／执行的分类、抵达率的是非，绝大多数会一致；分歧本身也定位了模糊地带，正是下一步该讲清的地方。比起那份 2032 虚构文物里的精密仪表——第 12 节的"人均承载判断节点 218 个"是投影，不是测量——这更像是今天用一张表就能起步、且可逐季改进的版本。

Why this counts as reproducible: two people applying this rubric to the same workflows and decisions will agree on most of the judgment/execution splits and the arrival calls; where they diverge, the divergence itself locates the fuzzy zone that needs clarifying next. This is closer to a version you can start today with one spreadsheet and improve quarter by quarter than to the precision instrument in the 2032 fiction (the “218 judgment nodes per head” in Section 12 is a projection, not a measurement).
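The reproducibility claim can itself be checked. A minimal sketch, assuming two raters have independently labeled the same ordered list of steps or decisions; the function name and return shape are hypothetical. It reports raw percent agreement, which is not corrected for chance, so a chance-corrected statistic would be the stricter choice if the numbers were ever used for more than a first pass.

```python
def agreement(labels_a, labels_b):
    """Return (share of items labeled identically, indices where the raters differ)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same items")
    diffs = [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]
    share = 1 - len(diffs) / len(labels_a) if labels_a else 1.0
    return share, diffs


share, fuzzy = agreement(
    ["judgment", "execution", "judgment", "execution"],
    ["judgment", "execution", "execution", "execution"],
)
print(share, fuzzy)  # 0.75 [2]
```

The list of differing indices is the useful output here: it points at the steps where the rubric is ambiguous, which is the fuzzy zone the paragraph above says to clarify next.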

一个反身校准：判断常发生在工作流之前——在决定"这条工作流该不该存在"的那场对话里，它可能不留下任何工件痕迹。只数工件里可见的判断步骤，会系统性低估真实判断密度；所以这把尺子要一并翻决策日志（谁在何处改了方向），而不只看最终产物的 diff。

A reflexive calibration: judgment often happens before the workflow, in the conversation that decided whether this workflow should exist at all, and it may leave no artifact trace. Counting only the judgment steps visible in the artifact systematically undercounts true judgment density, so the ruler must also read the decision log (who changed direction, and where), not just the diff of the final product.

如果组织就是这两张图，那管理就不再是"管人"，而是把这两张图画好、再让它们持续保持健康——这就是 T2。它有很具体的日常动词：第 4 节章末的 THE KERNEL 把动词压成一句话——持续压缩串行瓶颈，把判断之前的排队等待交给 agent 网络去消化。这一节给的是名词，回答"组织是什么"；那句动词回答的是"每周一早上具体做什么"。名词加动词，才凑成完整的内核。

If the organization just is these two maps, then management is no longer “managing people” but drawing the two maps well and keeping them healthy over time. That is T2. It has very concrete daily verbs, which THE KERNEL block at the end of Section 4 compresses into one line: continuously compress serial bottlenecks, and hand the queue that waits before judgment to the agent network to absorb. This section gives the nouns, answering “what the organization is”; that line of verbs answers “what to actually do on Monday morning.” Noun plus verb is what makes the kernel complete.

但 T1、T2 都只回答了"怎么造"。在它们前面，还压着一个更根本、也更容易被跳过的问题——"为何造"。四百年来这个问题被默认掉了：人手稀缺，效率就是天然的第一目标，组织为效率而建，人的意义被压在效率底下。AI 第一次让效率变得充裕——既然效率唾手可得，它就不再天然值得你把整个组织都围着它转（就像下文里，规模也从目标降成了变量）。真正值得重新围着它设计组织的，是那个被压了四百年的答案：让人去做值得做、也值得热爱的事——判断、探索、创造，为意义和价值担责。所以这套内核得倒过来读：把判断 × 上下文优化到极致，只是前提；让人回到判断、探索与创造，才是重画组织的理由。两者一旦颠倒，人反而成了喂养算法的零件——效率数字再漂亮，方向也是反的。

But T1 and T2 only answer “how to build.” Ahead of them sits a more basic question, and the one most easily skipped: “what to build it for.” For four centuries that question was assumed away: hands were scarce, so efficiency was the natural first goal; organizations were built for efficiency, and human meaning was pinned underneath it. AI makes efficiency abundant for the first time, and once efficiency is cheap to come by, it is no longer automatically worth building your whole organization around (just as scale, below, drops from goal to variable). What is now worth designing the organization around is the answer held down for those four centuries: letting people do work worth doing and worth loving — judgment, exploration, creation, and responsibility for meaning and value. So the kernel reads in reverse: optimizing judgment × context to the limit is only the precondition; returning people to judgment, exploration, and creation is the reason to redraw the organization at all. Invert the two and people become parts feeding the algorithm — however impressive the efficiency numbers, the direction is backwards.

核心图 · KEY FIG · FIG. 3.1 / THE TWO LAYERS · 命题的解剖。看懂：判断、上下文与工作流如何咬合 · Read: how judgment, context, and workflow interlock

T1 是解剖，而非隐喻。上层回答"判断在哪里发生"：人只出现在标准与不可逆两类显式节点，其余全部并行扇出；下层回答"背景如何抵达"：agent 直读共享上下文，判断与轨迹回写为组织记忆。两层各自可审计。这就是"组织不以人数与层级为主语，而以判断位置与上下文路径为主语"的工程含义。

T1 is a dissection, not a metaphor. The upper layer answers “where judgment occurs”: humans appear only at two classes of explicit nodes (standard-setting and irreversible decisions), and everything else fans out in parallel. The lower layer answers “how context arrives”: agents read the shared context store directly; judgments and execution traces are written back as organizational memory. Both layers are independently auditable. This is the engineering meaning of “the organization takes judgment position and context path as its subject, not headcount and hierarchy.”

反面 · 别把人彻底移出环COUNTER · don’t take the human fully out of the loop

「人只守显式节点」有个反面代价：HCI 自动化偏差文献（Parasuraman-Riley 1997）指出，长期不亲手过常规节点，人对错误的侦测灵敏度会钝化，真到该接管时反而看不出问题。所以它不是把人彻底移出环的许可，需配三道缓解：定期人工过常规节点（在环练习）、判断者轮换、对 agent 常规产出抽检。这与学习卷「错误侦测灵敏度」、工程卷「异步分诊」是同一道账。“Humans guard only the explicit nodes” carries a cost on the flip side: the HCI automation-bias literature (Parasuraman-Riley 1997) shows that when a person stops passing through routine nodes by hand, their sensitivity to error dulls, so at the very moment takeover is needed, they cannot see the problem. Explicit nodes are therefore not a license to take the human fully out of the loop; three mitigations belong with them: a periodic human pass over routine nodes (in-loop practice), rotation of judges, and spot-checks on the agent’s routine output. This is the same ledger as the Learning volume’s “error-detection sensitivity” and the Engineering volume’s “async triage.”

管理五职能的去向What Happens to Fayol's Five Functions


1916 年，Henri Fayol 在《工业管理与一般管理》里把管理定义为五种职能：计划、组织、指挥、协调、控制。此后一百一十年，管理学教材都是这五个词的注脚。把 T2 套在这五个词上，可以逐项预言它们的去向——注意，没有一种是"被 AI 增强"，也没有一种是凭空消失：每一种都被拆成两半，可结构化的一半下沉为基础设施，不可结构化的一半上浮为判断。

In 1916, Henri Fayol defined management as five functions in Administration Industrielle et Générale: planning, organizing, commanding, coordinating, and controlling. For a hundred and ten years afterward, management textbooks were footnotes to those five words. Applying T2 to each in turn yields a precise forecast of their fate. Note that not one is “augmented by AI,” nor does any vanish into thin air: each is split in two, with the structurable half sinking into infrastructure and the unstructurable half rising as judgment.

TABLE 3.0 · FAYOL 1916 → AI NATIVE五职能去向表Fate of the Five Functions

职能 · 1916Function · 1916

传统实现Traditional implementation

下沉为基础设施的一半Half that sinks into infrastructure

上浮为判断的一半Half that rises as judgment

计划PlanningPRÉVOIR

年度规划 · 预算周期 · 战略会Annual planning · budget cycles · strategy retreats

agent 持续扫描与预处理，把「原始信息 → 备选方案」的距离压缩成一张决策地图；规划周期从年度坍缩到实时（Anthropic：最长 90 天）Agents continuously scan and pre-process, compressing the “raw signal → option set” distance into a decision map; planning cycles collapse from annual to real-time (Anthropic: 90-day max horizon)

选择哪个方案，并为其后果承担责任Choose which option, and own the consequences

组织OrganizingORGANISER

组织架构图 · 岗位说明书 · 编制Org charts · job descriptions · headcount

工作流即代码（支柱 02）：结构成为可执行、可版本化、可回滚的声明，而不是挂在墙上的图Workflow-as-code (Pillar 02): structure becomes an executable, version-controlled, rollback-capable declaration, not a diagram on the wall

决定图的拓扑——哪些节点必须是人Decide the topology of the graph: which nodes must be human

指挥CommandingCOMMANDER

逐级下达 · 督办 · 例会追踪Top-down directives · supervision · recurring status meetings

消解为标准定义（H.02）：把"什么是好"写清楚，agent 不需要被指挥，只需要被定义Dissolves into standard definition (H.02): write down what “good” means; agents do not need to be commanded, only defined

定义标准本身，并在例外时介入Define the standard itself, and intervene at exceptions

协调CoordinatingCOORDONNER

会议 · 周报 · 中层转译Meetings · weekly reports · middle-layer translation

被共享上下文系统吸收：信道从 O(n²) 网状坍缩为 O(n) 星形——见下方仪器Absorbed by the shared context system: channels collapse from O(n²) mesh to O(n) hub (see instrument below)

维护上下文库本身的质量与边界Maintain the quality and boundaries of the context store itself

控制ControllingCONTRÔLER

KPI · 审批链 · 事后审计KPIs · approval chains · post-hoc audits

遥测＋策略即代码＋回滚制：越界自动拦截，常态全量可观测，审批只剩例外上报Telemetry + policy-as-code + rollback discipline: out-of-bounds actions are intercepted automatically; full observability is the default state; approvals are reserved for exception escalation only

设定不可逾越的边界：不可逆 · 声誉 · 方向Set inviolable boundaries: irreversibility · reputation · direction

表的右侧两列藏着一个组织学结论：中层管理者恰好整层站在这条分界线上。中层的传统职能（信息上传下达、跨组转译、进度协调）几乎全部落在"可结构化"一侧。这不是"AI 取代中层"的耸动说法，而是一个结构事实：当上下文可以被系统继承，人肉路由器就从岗位退化为瓶颈（第 4 节把它列为瓶颈而非职位）。幸存下来的，是中层里真正在做判断的人，而非"中层"这个层级本身——管理幅度（span of control）让位给判断幅度（span of judgment）：一个人能为多大的图承担例外、不可逆与方向三类判断。

The two right-hand columns of the table conceal an organizational conclusion: middle managers as a class stand precisely on this dividing line. The traditional functions of middle management (relaying information up and down, translating across groups, coordinating progress) fall almost entirely on the “structurable” side. This is not the sensationalist claim that “AI replaces middle management”; it is a structural fact: once context can be inherited by a system, the human router degrades from a role into a bottleneck (Section 4 classifies it as a bottleneck, not a position). What survives is the individuals within middle management who are genuinely exercising judgment, not the tier itself. Span of control yields to span of judgment: how large a graph one person can bear responsibility for across the three classes of judgment (exception, irreversibility, and direction).

INSTRUMENT 03 · 协调税计算器 · Coordination-Tax Calculator

组织人数 · Team size: n = 12

12 个人，66 条点对点信道。协调税以 n² 增长；星形上下文中枢是唯一不随人数爆炸的对齐方式。12 people, 66 point-to-point channels. Coordination tax grows as n², while a star-shaped context hub is the only alignment pattern that does not explode with headcount.

点对点信道 · P2P channels · MESH: n(n−1)/2 = 66

上下文中枢 · Context hub · HUB: n = 12
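The instrument's arithmetic is easy to reproduce. A minimal sketch: the function names are hypothetical, and the team sizes in the loop are illustrative choices (150 is the commonly cited Dunbar number), not figures from the instrument itself.

```python
def mesh_channels(n: int) -> int:
    """Point-to-point channels when everyone aligns with everyone: n(n-1)/2."""
    return n * (n - 1) // 2


def hub_channels(n: int) -> int:
    """Channels when everyone reads and writes one shared context hub: n."""
    return n


for n in (5, 12, 50, 150):
    print(f"n={n:>3}  mesh={mesh_channels(n):>6}  hub={hub_channels(n):>4}")
```

At n = 12 this gives 66 mesh channels against 12 hub channels, matching the instrument. At n = 150 the mesh needs 11,175 channels while the hub needs 150, which is the quadratic-versus-linear gap the table's Coordinating row refers to.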

组织形态的光谱The Spectrum of Organizational Forms


T1 有一个最容易被忽略的推论：规模从目标变量降级为自由变量。如果组织是判断的分布与上下文的流动，"多少人"就不再是组织的定义性属性，而是一个工程参数——由判断需要多少个不可替代的承担者决定。参数空间的两端，第一次被同一套原理覆盖。

T1 carries one corollary that is most easily overlooked: scale is demoted from a target variable to a free variable. If an organization is the distribution of judgment and the flow of context, “how many people” is no longer a defining attribute of the organization; it is an engineering parameter, determined by how many irreplaceable bearers of judgment the work requires. For the first time, both ends of the parameter space are covered by the same set of principles.

FIG. 3.2 / THE SPECTRUM OF FORMS · 组织形态光谱。看懂：规模为什么是自由变量 · Read: why scale is a free variable

横轴为组织人数（对数）。左端：一人公司把判断密度推到 100%，执行全部外置给 agent 与杠杆——绝对数字不大，但证明组织的下限已脱离人数约束；中段：Anysphere 与 Anthropic 证明人均产出的上限同样已脱离直觉约束。Dunbar 线右侧，仍以会议网状对齐的组织，协调税以 n² 增长。样本口径见第 9 节。

The horizontal axis is organizational headcount (logarithmic). Left end: the one-person company pushes judgment density to 100%, with all execution externalized to agents and leverage. The absolute numbers are small, but they prove that the floor of organization has already escaped the headcount constraint. Middle: Anysphere and Anthropic demonstrate that the per-person output ceiling has likewise escaped intuitive bounds. To the right of the Dunbar line, organizations that still coordinate via meeting meshes face coordination tax growing as n². For sample methodology see Section 9.

光谱最左端的完整论述详见第 14 节《组织的下限·一人公司》：核心命题「规模是选择，连贯性是目的」、四个世界观与这些支柱、现实标定（Levels / Lou / Welsh 自报口径与 Altman 的一人独角兽赌局）、以及"极限解非普遍处方"的诚实注脚。它是 T1 在 N=1 处的极限解与试金石：把"组织必须是很多人"这个隐含假设，永久地变成一个待论证的命题。

The full treatment of the far-left end of the spectrum is in Section 14: The Lower Bound of Organization · One-Person Company, covering the core proposition “scale is a choice, coherence is the purpose,” the four worldviews and architectural pillars, real-world calibration (the self-reported figures of Levels / Lou / Welsh and Altman’s one-person-unicorn wager), and the honest caveat that “the limiting solution is not a universal prescription.” It is the limiting solution and litmus test of T1 at N=1: it permanently converts “an organization must be many people” from a hidden assumption into a proposition awaiting proof.

接下来几节回到光谱的主流区段，处理一个更难的问题：当组织确实需要不止一个人时，为什么"在旧结构上加 AI"注定失败。这些瓶颈，每一个都是 T1 的反面证明。

The sections that follow return to the mainstream segment of the spectrum, addressing a harder question: when an organization genuinely requires more than one person, why is “adding AI onto the old structure” destined to fail? Each of these bottlenecks is a proof by contradiction of T1.

SECTION

STRUCTURAL BOTTLENECKS · 结构瓶颈

机理 · 加 AI 为何无效Mechanism · Why Overlaying AI Fails

"加 AI"解不开的结构性瓶颈

The Structural Bottlenecks That Overlaying AI Cannot Solve

瓶颈从来不在节点的速度，而在图的形状。

The bottleneck has never been the speed of a node; it is the shape of the graph.

一句话In one line

所以给每个人配上 AI 也几乎不动端到端吞吐：瓶颈长在图的形状里，工具触不到、转型绕不开。当前的押注是只有从底层重画工作流图才松得开它；但要盯着一个同样成立的反面——重画之后瓶颈未必消失，也可能只是从看得见的队列，搬到判断节点上的人、或系统要维护的上下文里，换了个更难计量的位置。So handing everyone AI barely moves end-to-end throughput: the bottleneck lives in the shape of the graph, where tools cannot reach and transformation cannot bypass it. The current bet is that only redrawing the workflow graph from the foundation loosens it; but watch an equally live rival—after the redraw the bottleneck may not vanish so much as relocate, out of the visible queue and onto the people at the judgment nodes or into the context the system must maintain, into a place that is harder to measure.

AI SIDE 04 节点变快以后，真正的瓶颈会被照得更亮。 Once nodes get faster, the real bottleneck becomes brighter.

UPSTREAM

Amdahl, 1967 - Limits of speedup
Conway, 1968 - Committees invent
Galbraith, 1974 - Info-processing view
Brooks, 1975 - Mythical Man-Month
Goldratt, 1984 - Theory of Constraints
METR, 2025 - RCT: AI & dev speed
黄益贺, 2026 - AI原生组织的底层逻辑
Huang Yihe, 2026 - The Underlying Logic of AI-Native Organizations

第 2 节给出了测量结果：绝大多数企业 GenAI 试点对真实盈亏没有可衡量的影响。本章回答"为什么"：因为这些组织把 AI 部署在节点上（让某个人、某个环节更快），而组织的吞吐量是图的属性——由依赖链的拓扑、协调成本的曲线、决策队列的深度决定。工具改变节点的速度，改不了图的形状。

Section 2 surfaced the measurement: the great majority of enterprise GenAI pilots show no measurable profit-and-loss impact. This chapter answers “why”: these organizations deploy AI on the nodes (making a person or a step faster), yet an organization’s throughput is a property of the graph, set by the topology of its dependency chains, the curve of its coordination cost, and the depth of its decision queues. Tools change the speed of a node; they do not change the shape of the graph.

组织的阿姆达尔定律Amdahl’s Law for Organizations · AMDAHL FOR ORGANIZATIONS

一条流程若有 70% 的时间花在串行的等待、交接与审批上，那么即使把其余 30% 的执行加速一百倍，端到端也只快 1.42 倍。这就是"组织大量使用 AI 却没有变快"的数学结构：加速节点是工具问题，重画图是架构问题。AI 解决前者；这份规约的其余部分解决后者。

If 70% of a process’s time is consumed by serial waiting, handoffs, and approvals, accelerating the remaining 30% by a factor of a hundred still yields only a 1.42× end-to-end gain. That is the mathematical structure behind “organizations deploying lots of AI yet not getting faster”: speeding up nodes is a tooling problem; redrawing the graph is an architecture problem. AI solves the former; the rest of this specification solves the latter.
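The arithmetic behind these figures can be checked with a minimal sketch of Amdahl's Law. The function names `end_to_end_speedup` and `speedup_ceiling` are illustrative choices, not part of the original text, and the model assumes the serial share is not accelerated at all.

```python
def end_to_end_speedup(serial_fraction: float, node_speedup: float) -> float:
    """Amdahl's Law: overall speedup when only the non-serial share is accelerated.

    serial_fraction: share of elapsed time spent in serial waiting, handoffs, approvals.
    node_speedup:    factor by which the remaining (execution) share gets faster.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if node_speedup <= 0:
        raise ValueError("node_speedup must be positive")
    accelerated_share = (1.0 - serial_fraction) / node_speedup
    return 1.0 / (serial_fraction + accelerated_share)


def speedup_ceiling(serial_fraction: float) -> float:
    """Upper bound on the gain as node_speedup grows without limit."""
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    return 1.0 / serial_fraction
```

With a 70% serial share, a 100× node speedup gives roughly 1.42×, a 10× node speedup gives roughly 1.37×, and no node speedup can exceed roughly 1.43×.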

黄益贺（2026）描述过这条定律的一个具象版本（从业者观察，非受控研究）：风投机构给每个分析师配上最强的 AI，让十个 agent 并行生成十份研究报告，然后所有报告仍由同一个人按顺序阅读、由同一个投委会按周期审议。生产端并行了，消费端依旧串行，两到四周的流程几乎没有缩短。他对此有一句精确的总结："AI 不是自动让组织变快，它只是把串行瓶颈照得更亮。"——可并行的部分被 AI 加速之后，真正拖慢组织的环节会暴露得前所未有地明显：慢的不是写报告，是谁来读、谁来判断、谁来拍板。个体层面的证据同样刺眼：METR 2025 年的随机对照试验中（16 名资深开源维护者、246 个任务、各自深耕多年的百万行级代码库），开发者使用 AI 工具后实际慢了 19%，却自认为快了约 20%。研究者明确警告此结果不应外推到陌生代码库或从零构建的场景——但它至少确认了一件事：叠加层面的收益可能远比体感小，而体感本身不可信。

Huang Yihe (2026) described a concrete instance of this law (a practitioner observation, not a controlled study): a VC firm equips every analyst with the most powerful AI and lets ten agents generate ten research reports in parallel. All the reports are then still read sequentially by the same person and reviewed by the same investment committee on its regular cadence. The production side parallelized; the consumption side remained serial. A two-to-four-week process barely shortened. His precise summary: “AI doesn’t automatically make an organization faster - it just illuminates serial bottlenecks more brightly.” Once the parallelizable parts are accelerated by AI, the stages that truly slow the organization are exposed as never before: the bottleneck is not writing reports, but who reads them, who judges them, who decides. Individual-level evidence is equally stark: in METR’s 2025 randomized controlled trial (16 experienced open-source maintainers, 246 tasks, each working in a million-line codebase they had cultivated for years), developers using AI tools were actually 19% slower, yet estimated themselves to be about 20% faster. The researchers explicitly cautioned against extrapolating to unfamiliar codebases or greenfield builds, but the trial does at least confirm one thing: overlay-level gains may be far smaller than they feel, and the feeling itself cannot be trusted.

判别一个问题属于哪一层：工具层（单点执行慢，AI 直接可解）、流程层（顺序可重排，流程再造可解）、结构层（瓶颈由组织的存在方式本身产生，只能重构）。以下这些瓶颈全部位于结构层。每一个都按同一格式解剖：机制（传统组织为什么必然产生它）、为什么加 AI 无效（叠加悖论的具体形态）、AI Native 重构（映射到心智模型 M.01-M.05 与架构支柱）、检验信号（你的组织是否已经越过它）。

To diagnose which layer a problem belongs to: tooling layer (a single node executes slowly; AI can fix this directly); process layer (the sequence can be reordered; process reengineering can fix this); structural layer (the bottleneck is generated by the organization’s very mode of existence; only restructuring can fix this). The bottlenecks that follow all reside at the structural layer. Each is dissected in the same format: mechanism (why the traditional organization inevitably produces it), why overlay fails (the specific form of the overlay paradox), AI Native restructure (mapping to mental models M.01-M.05 and the architectural pillars), test signal (whether your organization has already cleared it).

GENERATED PLATE 04 · 叠加悖论图：左侧是在旧串行链上加 AI 助手，右侧是按 agent 并行和判断节点重画工作流。AI 加速的是节点；如果等待、交接、审批仍然串行，组织速度仍由旧图决定。解法是重画图的形状，而非更多工具。 Overlay-paradox plate: the left side adds AI assistants to an old serial chain; the right side redraws the workflow around parallel agents and judgment nodes. AI speeds the node; if waiting, handoff, and approval remain serial, the organization’s speed is still set by the old graph. The fix is not more tools, but redrawing the graph.

核心图 KEY FIG · FIG. 4.0 / THE OVERLAY PARADOX · 看懂：加 AI 为什么不等于变快 Read this: why overlaying AI ≠ getting faster

左半边有据：把 AI 加装到既有串行链上，每个节点更快，但端到端由节点之间的等待支配，几乎不动——这是阿姆达尔约束。右半边是当前的押注：重画工作流图，可并行的全部并行，交接被共享上下文吸收，人类判断收敛为图上的显式节点，端到端因此大幅下降。但要盯住一个同样成立的反面（图中点线）：重画之后瓶颈未必消失，也可能只是从看得见的队列，搬到判断节点上的人、或系统要维护的上下文里，换个更难计量的位置。

The left half is evidenced: AI overlaid on an existing serial chain makes each node faster, but end-to-end time is dominated by the waiting between nodes and barely moves. This is Amdahl’s constraint. The right half is the current bet: redraw the workflow graph so that everything parallelizable fans out, handoffs are absorbed by shared context, and human judgment converges into explicit nodes on the graph, and end-to-end time drops sharply. But watch an equally tenable counter-reading (the dotted line in the figure): after the redraw the bottleneck need not vanish. It may simply move from a visible queue to the person at a judgment node, or into the context the system must maintain, where it is harder to measure.

INSTRUMENT 01 · AMDAHL 实验台 Lab

流程中串行等待的占比 Serial waiting fraction in the process: 70% · AI 对可并行部分的加速 AI speedup on the parallelizable portion: ×10

端到端实际加速 Actual end-to-end speedup: ×1.37

执行加速十倍，端到端只快三成七，其余时间全部在排队：瓶颈在边，不在点。Execution is 10× faster, yet end-to-end is only 37% faster; the rest of the time is all queuing. The bottleneck is in the edges, not the nodes.

读法：每张卡片在书目行下方标了它违反 T1 的哪一侧——判断分布或上下文流动，少数落在两者的交互。16 条里 14 条能干净归到这两根变量上；余下两条——B.15 动机抽干 与 B.16 生态位锁定——落在 T1 两轴之外：前者是人的内在动机，后者是组织间的生态依赖。不把它们硬塞进 T1，是对命题边界的诚实标注，也标出了 T1 还没覆盖、仍可被继续检验的两处。

How to read these: under each card’s citation line, a tag marks which side of T1 the bottleneck violates: judgment distribution or context flow, with a few at their coupling. Fourteen of the sixteen fall cleanly onto these two variables; the remaining two, B.15 motivation crowding-out and B.16 niche lock-in, sit outside T1’s two axes: one is intrinsic human motivation, the other inter-organizational ecosystem dependency. Not forcing them into T1 is an honest marking of the thesis’s boundary, and points to the two places T1 does not yet cover and can still be tested.

速查 · 16 条瓶颈一览——按「违反 T1」分组即可横向比较，找到与你最相关的一条再看下方卡片Quick index · 16 bottlenecks at a glance: group by “violates T1” to compare across, find the one most yours, then read its card below
ID	瓶颈Bottleneck	违反 T1Violates T1	一句话症状Symptom in one line
B.01	串行依赖链Serial dependency chain	上下文流动context	八成半以上时间在等人85%+ of time spent waiting
B.02	协调成本平方律Coordination’s n² law	上下文流动context	新增产能被协调吃掉added capacity eaten by coordination
B.03	决策带宽天花板Decision-bandwidth ceiling	判断分布judgment	决策全堵在塔尖decisions jam at the apex
B.04	层级信息衰减Hierarchical signal decay	上下文流动context	坏消息逐层被削平bad news flattened layer by layer
B.05	部门墙与局部最优Silos & local optima	判断×上下文j×c	价值流被部门墙切碎value stream cut up by silos
B.06	会议同步税The meeting sync-tax	判断×上下文j×c	半数工时耗在开会half the hours spent in meetings
B.07	知识私有化Knowledge privatization	上下文流动context	知识锁在个人脑里knowledge locked in individual heads
B.08	审批链与责任稀释Approval chains & diffused accountability	判断分布judgment	出事找不到担责人no one accountable when it breaks
B.09	人头即产能Headcount as capacity	判断分布judgment	扩产能只能靠招人scaling capacity means hiring
B.10	规划节奏失配Planning-cadence mismatch	判断分布judgment	环境周变，资源年锁world shifts weekly, resources locked yearly
B.11	试错成本与风险规避Trial cost & risk aversion	判断分布judgment	试错太贵，只做安全事trials too costly, only safe bets
B.12	指标剧场Metrics theater	上下文流动context	指标沦为表演博弈metrics become performed theater
B.13	信任半径坍缩Trust-radius collapse	上下文流动context	没安全感就瞒问题no safety, so problems get hidden
B.14	权力梯度与议程垄断Power gradient & agenda capture	判断分布judgment	选项没上桌就已死options die before reaching the table
B.15	动机抽干Motivation crowding-out	不在两轴off-axis	效率叙事抽干意义efficiency narrative drains meaning
B.16	生态位锁定Niche lock-in	不在两轴off-axis	命脉攥在外部供应商lifeline held by an outside vendor

B.01

串行依赖链The Serial Dependency Chain

Amdahl 1967 · Goldratt 1984

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

传统流程是一场人传人的接力赛：需求 → 设计 → 构建 → 评审 → 发布，每一棒之间是队列与等待。精益研究反复测得流动效率不足 15%：一项工作 85% 以上的生命周期处于"等人"状态。总时长由串行链决定，与任何单个环节的内部效率无关。

The traditional process is a human relay race: requirements → design → build → review → ship, with queues and waiting between every baton pass. Lean research consistently measures flow efficiency below 15%: over 85% of a work item’s lifecycle is spent waiting for someone. Total duration is determined by the serial chain, independent of the internal efficiency of any individual stage.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

给每个环节配 AI 加速的是"棒内奔跑"，碰不到"棒间等待"。十个 agent 并行写出十份报告，最终仍由同一个人按顺序阅读：生产端并行了，消费端依旧串行。按阿姆达尔定律，串行占比 70% 时，无论把其余部分加速多少倍，总收益上限也只有 1.43 倍。更糟的是，上游加速会在未扩容的下游堆出更深的队列。约束理论早已断言：非瓶颈处的改善是幻觉。

Equipping each stage with AI accelerates “running with the baton”; it never touches “waiting between baton passes.” Ten agents write ten reports in parallel, yet the reports are still read sequentially by the same person: the production side parallelized, the consumption side remains serial. By Amdahl’s Law, when the serial fraction is 70%, the total gain is capped at 1/0.7 ≈ 1.43× no matter how far the rest is accelerated. Worse, upstream acceleration piles deeper queues in front of downstream stages that have not been scaled; the Theory of Constraints established long ago that improvement at a non-bottleneck is an illusion.

AI Native 重构 · RestructureAI Native Restructure

M.01 把组织声明为工作流图，支柱 02"工作流即代码"重画拓扑：可并行的全部并行扇出；交接消失，因为流转的是同一份上下文而非互相抛接的文档；审批从"排队等人"变为策略即代码的自动门加例外上报。人只出现在少数判断节点上。

M.01 declares the organization as a workflow graph; Pillar 02 “workflow-as-code” redraws the topology: everything parallelizable fans out in parallel; handoffs disappear because what flows is a shared context, not documents tossed back and forth; approvals shift from “queuing for a human” to policy-as-code automated gates with exception escalation. Humans appear only at a small number of judgment nodes.

检验信号Test Signal给你最重要的交付流程画一条时间线，统计"工作中 vs 等待中"的比例。若等待超过一半、而 AI 预算全部花在"工作中"一侧，你正在精确地优化非瓶颈。Draw a timeline for your most important delivery process and tally the “active vs. waiting” ratio. If waiting exceeds half, and your entire AI budget is on the “active” side, you are optimizing precisely the non-bottleneck.
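The test signal above can be turned into a small calculation. This is a minimal sketch, assuming you can label each stretch of a work item's timeline as active or waiting; the `Segment` type, the sample timeline, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of a work item's timeline (labels and hours are hypothetical)."""
    label: str
    hours: float
    waiting: bool  # True when the item sat in a queue, False when someone worked on it


def flow_efficiency(timeline: list[Segment]) -> float:
    """Share of elapsed time in which the item was actively worked on."""
    total = sum(seg.hours for seg in timeline)
    if total <= 0:
        raise ValueError("timeline must cover a positive amount of time")
    active = sum(seg.hours for seg in timeline if not seg.waiting)
    return active / total


def optimizing_non_bottleneck(timeline: list[Segment], ai_budget_on_active: float) -> bool:
    """Test signal: waiting exceeds half AND the entire AI budget targets active work.

    ai_budget_on_active is the share (0..1) of AI spend aimed at the active side.
    """
    waiting_share = 1.0 - flow_efficiency(timeline)
    return waiting_share > 0.5 and ai_budget_on_active >= 1.0


# Hypothetical delivery timeline: 12 active hours out of 80 elapsed hours.
EXAMPLE_TIMELINE = [
    Segment("write spec", 4, waiting=False),
    Segment("wait for design review", 40, waiting=True),
    Segment("build", 8, waiting=False),
    Segment("wait for approval", 28, waiting=True),
]
```

For the sample timeline the flow efficiency is 15%, which matches the order of magnitude the mechanism paragraph cites from lean research.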

B.02

协调成本平方律The Quadratic Coordination Tax

Brooks 1975

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

n 个需要对齐的人产生 n(n−1)/2 条沟通信道，组织每长大一圈，新增产能就被新增协调吃掉一块。Brooks 定律"给延期项目加人会让它更延期"只是这条曲线最著名的切片。整个中层管理的本质，就是组织为这条平方曲线雇佣的人肉路由器。

n people who need to align produce n(n−1)/2 communication channels; each time the organization grows, a slice of the new capacity is consumed by new coordination. Brooks’s Law (“adding people to a late project makes it later”) is just the most famous cross-section of this curve. Middle management is, in essence, the human routing layer the organization hires to service this quadratic curve.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

每人配 AI 不减少信道数量，反而提高每条信道的流量：更多文档、更多消息、更快的来回，拥塞加剧。AI 纪要与摘要是在给平方曲线做无损压缩，曲线本身纹丝不动。

Giving everyone an AI does not reduce the number of channels; it increases the traffic on every channel: more documents, more messages, faster back-and-forth, greater congestion. AI meeting minutes and summaries merely compress the traffic running over the quadratic curve; the curve itself does not move.

AI Native 重构 · RestructureAI Native Restructure

M.03 上下文即核心资产——对齐通过共享上下文库完成，而非点对点同步。任何成员（人或 agent）从同一份机器可读的真相出发工作，信道结构从 O(n²) 网状坍缩为 O(n) 星形；agent 之间走结构化协议（任务、事件、状态机），根本不"开会"。

M.03 (context as core asset): alignment happens through a shared context store, not point-to-point sync. Every member (human or agent) works from the same machine-readable source of truth; channel structure collapses from O(n²) mesh to O(n) star. Agents coordinate via structured protocols (tasks, events, state machines); they do not “have meetings.”

检验信号Test Signal新成员或新 agent 接入时，能否不"问人"就开始工作？如果必须问人，你的真相还存在脑子和聊天记录里，平方律仍在全速运转。When a new member or new agent joins, can they begin working without asking anyone? If they must ask a human, your source of truth still lives in people’s heads and chat logs, and the quadratic law is running at full speed.
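The gap between the mesh and the star grows quickly with headcount. A minimal sketch, assuming every pair of members needs a channel in the mesh case and every member needs exactly one link to the shared context store in the star case (the one-link-per-member count is an illustrative assumption):

```python
def mesh_channels(n: int) -> int:
    """Pairwise channels among n members who align point to point: n(n-1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def star_channels(n: int) -> int:
    """Links when each of n members reads and writes one shared context store."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n


def channel_table(sizes: list[int]) -> list[tuple[int, int, int]]:
    """Rows of (members, mesh channels, star links) for side-by-side comparison."""
    return [(n, mesh_channels(n), star_channels(n)) for n in sizes]
```

At 10 members the mesh has 45 channels against 10 links for the star; at 50 members it is 1,225 against 50.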

B.03

决策带宽天花板The Executive Bandwidth Ceiling

Simon 1947

违反 T1 · 判断分布侧violates T1 · judgment distribution

机制 · Why it existsMechanism · Why it exists

传统组织用"判断集中到塔尖"换取一致性：重要决策逐级上报，CEO 的清醒时间成为全组织吞吐的硬上限。Simon 的有限理性在组织层面的表现是：组织越大，决策队列越深，一线感知与决策点之间的距离越远。

The traditional organization buys consistency by concentrating judgment at the apex: important decisions escalate tier by tier, and the CEO’s waking hours become the organization’s hard throughput ceiling. Simon’s bounded rationality, at the organizational scale, means that the larger the organization, the deeper the decision queue and the greater the distance between frontline perception and the decision point.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

给高管配 AI 摘要、给汇报配 AI 润色，只是让队列中的文档更漂亮、队列前进略快。决策延迟的主项是"排队等判断"，不是"读材料太慢"：单点带宽没变，天花板就没变。

Giving executives AI-generated summaries and AI-polished reports just makes the queued documents prettier; the queue moves only marginally faster. The dominant source of decision latency is “queuing for judgment,” not “reading too slowly.” Single-point bandwidth unchanged, ceiling unchanged.

AI Native 重构 · RestructureAI Native Restructure

第 2 节判断稀缺性经济学的组织化：可编码的判断写成 guardrails 与策略，下放给 agent 与一线——M.05"人即判断锚点"指的是判断有锚，不是判断有漏斗。高层只保留三类判断：例外、不可逆、方向。决策权随上下文走，不随职级走。对必须保留在塔尖的判断，把判断前的预消化全部交给 agent（读完材料、对齐观点、列出假设、整理反方证据、标注不确定性），人面对的是一张高度浓缩的决策地图，不再是原始材料的洪流。

Section 2’s economics of judgment scarcity, institutionalized: codifiable judgments are written as guardrails and policies and delegated to agents and the frontline. M.05 “human as judgment anchor” means judgment has an anchor, not a funnel. Leadership retains only three categories of judgment: exceptions, irreversibles, and direction. Decision authority follows context, not hierarchy. For judgments that must stay at the apex, hand all the pre-digestion to agents (reading the materials, aligning viewpoints, listing assumptions, compiling counter-evidence, flagging uncertainties), so that humans face a highly condensed decision map rather than a flood of raw material.

检验信号Test Signal追踪一个普通决策从发起到拍板：经过几个人、等了几天？若超过两人三天、且决策内容可以写成一条规则，它本来应该是一行 policy。Track an ordinary decision from initiation to resolution: how many people did it pass through, how many days did it wait? If more than two people and three days, and the decision content could be written as a rule, it should have been one line of policy.
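The routing rule in this card can be sketched as a few lines of code. The `Decision` fields, the `Route` names, and both functions are hypothetical illustrations of "only exceptions, irreversibles, and direction go to a human", not a prescribed implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    POLICY = "resolved by coded policy"
    HUMAN = "escalated to a human judgment node"


@dataclass(frozen=True)
class Decision:
    summary: str
    is_exception: bool      # falls outside what existing guardrails cover
    is_irreversible: bool   # cannot be rolled back once executed
    sets_direction: bool    # changes where the organization is heading


def route(decision: Decision) -> Route:
    """Escalate only the three retained categories; everything else is policy."""
    if decision.is_exception or decision.is_irreversible or decision.sets_direction:
        return Route.HUMAN
    return Route.POLICY


def should_have_been_policy(people_involved: int, days_waited: float,
                            expressible_as_rule: bool) -> bool:
    """Test signal: more than two people, more than three days, and rule-shaped."""
    return people_involved > 2 and days_waited > 3 and expressible_as_rule
```

A refund under a fixed threshold would route to `Route.POLICY`; a one-way contract termination would route to `Route.HUMAN` because it is irreversible.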

B.04

层级信息衰减Hierarchical Signal Decay

Beer 1972 · Ashby 1956

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

信息每向上传一层都被压缩、平滑、政治化，坏消息衰减得最快。控制论给过精确诊断。Ashby 必要多样性定律：当调节者拿到的信息品类少于系统扰动的品类，控制必然失败。传统组织的报告链是一台逐层削减多样性的机器。

With every tier that information travels up, it is compressed, smoothed, and politicized; bad news decays fastest. Cybernetics offers a precise diagnosis, Ashby’s Law of Requisite Variety: when the variety of information reaching the regulator is less than the variety of disturbances in the system, control must fail. The traditional organization’s reporting chain is a machine that strips variety away layer by layer.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 帮中层把周报写得更流畅，等于更高效地生产失真。失真不来自写作能力，来自"层级转述"这个信道本身：每一层都有选择性呈现的激励，AI 只会把选择性呈现做得更专业。

AI helping middle managers write smoother weekly reports is producing distortion more efficiently. Distortion does not come from writing ability; it comes from the channel of “hierarchical retelling” itself. Every tier has an incentive to present selectively; AI just makes selective presentation more polished.

AI Native 重构 · RestructureAI Native Restructure

支柱 05 可观测性先于规模：工作流原生埋点，决策者直接查询现场数据与 agent 执行轨迹。人肉报告链被"随时可查询的状态"替代，汇报从周期性叙事变成对同一套遥测的不同视图。

Pillar 05 (observability before scale): instrumentation native to the workflow, decision-makers querying live data and agent execution traces directly. The human reporting chain is replaced by “always-queryable state”; reporting becomes different views over the same telemetry rather than periodic narrative.

检验信号Test Signal高管想知道某件事的真实状态时，第一动作是"问下属"还是"查系统"？前者意味着你的真相在到达之前要经过利益相关者的中转。When an executive wants to know the true state of something, is the first move “ask a subordinate” or “check the system”? The former means your truth passes through stakeholders before it arrives.

B.05

部门墙与局部最优Functional Silos & Local Optima

Conway 1968

违反 T1 · 判断×上下文交互violates T1 · judgment×context coupling

机制 · Why it existsMechanism · Why it exists

按职能切分让每个部门优化自己的 KPI，端到端价值流被切成片段，部门交界处堆积着队列、翻译损耗和"这不是我们的问题"。Conway 定律保证你的产品结构最终复刻这堵墙。

Organizing by function causes every department to optimize its own KPIs; the end-to-end value stream is sliced into fragments; queues, translation loss, and “that’s not our problem” accumulate at every departmental boundary. Conway’s Law guarantees your product architecture will eventually mirror this wall.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

每个部门各自采购 AI 工具，墙反而更厚：数据孤岛之上又叠了一层工具孤岛，跨墙交接依旧靠人开会翻译。每个局部都更快了，全局还是次优，而且次优得更快。

Each department purchases its own AI tools, and the walls get thicker: a layer of tool silos is stacked on top of the data silos, and cross-wall handoffs still require humans to meet and translate. Every local part got faster; the global outcome is still suboptimal, only now it is suboptimal faster.

AI Native 重构 · RestructureAI Native Restructure

M.01 围绕端到端价值流设计组织：一条工作流从客户触发直到客户收到，由跨域 agent 编队走完全程，职能变成"被工作流调用的能力"，不再是"占有工作的领地"。Operator 拥有整条流，不是其中一段。

M.01 designs the organization around end-to-end value streams: one workflow from customer trigger to customer receipt, traversed by cross-domain agent ensembles; functions become “capabilities invoked by the workflow,” no longer “territories that own work.” The Operator owns the whole stream, not a segment of it.

检验信号Test Signal你最核心的交付要翻越几面墙？每面墙边上，是否都站着一个以"协调"为主要工作的全职角色？How many walls does your most critical delivery have to cross? Is there a full-time role stationed at each wall whose primary job is “coordination”?

B.06

会议同步税The Synchronous Coordination Tax

Galbraith 1974

违反 T1 · 判断×上下文交互violates T1 · judgment×context coupling

机制 · Why it existsMechanism · Why it exists

会议是传统组织的默认协调原语：一种要求所有参与者同时在场的阻塞调用。管理者 30-50% 的时间花在会上，日历碎片进一步杀死深度工作。会议泛滥的根因是两个缺失：状态不可见，决策权不明确，于是只好用"同时在场"兜底。

Meetings are the default coordination primitive of the traditional organization: a blocking call that requires all participants to be present simultaneously. Managers spend 30-50% of their time in meetings; calendar fragmentation further kills deep work. The root cause of meeting proliferation is two absences: state is not visible, and decision authority is not defined, so “all present at once” becomes the fallback.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 纪要、AI 排程降低了开会的边际成本，于是会议更多了，这是杰文斯悖论的会议版。工具优化的是"怎么开会"，问题却是"为什么需要开会"。

AI meeting minutes and AI scheduling lower the marginal cost of meetings, so there are more meetings. This is the meeting-world version of Jevons’s Paradox. The tools optimize “how to meet”; the real question is “why meetings are needed at all.”

AI Native 重构 · RestructureAI Native Restructure

支柱 02 让协调走异步状态机：工作流状态对所有参与者可见，交接由事件触发，决策带理由记录在案。同步在场只保留给真正需要它的三件事：判断分歧、关系建立、危机处理。AI Native 组织的默认是 async-first，日历近乎空白。

Pillar 02 routes coordination through asynchronous state machines: workflow state is visible to all participants, handoffs are event-triggered, decisions are recorded with rationale. Synchronous presence is reserved for the three things that truly require it: adjudicating judgment disagreements, relationship-building, and crisis response. The AI Native organization defaults to async-first; calendars are nearly empty.

检验信号Test Signal取消下周全部例会，看哪些工作真的停了。停下来的部分才值得做协调设计；没停的部分，证明那些会本来就是惯性。Cancel all standing meetings next week and observe which work actually stops. The parts that stop deserve coordination design; the parts that don’t stop prove those meetings were inertia all along.
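Coordination through an asynchronous state machine, as described in this card, can be sketched in a few lines. The state names, event names, and the `WorkItem` class are hypothetical; the point is only that handoffs are triggered by events and every transition is recorded with its rationale, so nobody has to be present at the same time.

```python
from dataclasses import dataclass, field

# Allowed transitions, keyed by (current state, event). Names are illustrative.
TRANSITIONS = {
    ("drafted", "submit"): "in_review",
    ("in_review", "approve"): "shipped",
    ("in_review", "request_changes"): "drafted",
}


@dataclass
class WorkItem:
    name: str
    state: str = "drafted"
    # Each entry: (from_state, event, to_state, actor, rationale)
    log: list[tuple[str, str, str, str, str]] = field(default_factory=list)

    def handle(self, event: str, actor: str, rationale: str) -> None:
        """Apply an event if the current state allows it, and record why."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        new_state = TRANSITIONS[key]
        self.log.append((self.state, event, new_state, actor, rationale))
        self.state = new_state
```

Anyone, human or agent, can read `state` and `log` at any time, which is the property that removes the need for a status meeting.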

B.07

知识私有化Tacit Knowledge Lock-in

Polanyi 1966

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

关键知识活在个人头脑、私聊记录和"去问老王"里。新人上手以月计，老人离职即知识蒸发，bus factor 长期为个位数。更糟的是私有化被激励结构强化：不可替代性就是职业安全。

Critical knowledge lives in individual minds, private chat histories, and “go ask Zhang Wei.” Onboarding new hires takes months; when veterans leave, knowledge evaporates; the bus factor stays in the single digits indefinitely. Worse, privatization is reinforced by the incentive structure: irreplaceability is job security.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

给每人配 AI 助手改变不了知识的私有属性：AI 能检索写下来的一切，唯独检索不了从未写下的东西。二十年来，让企业知识库项目一再失败的，是"写下来"从未成为工作本身的一部分——搜索技术从来不是瓶颈。

Giving everyone an AI assistant does not change the private nature of knowledge: AI can retrieve everything that was written down; it cannot retrieve what was never written. What has made enterprise knowledge-base projects fail for twenty years is that “writing things down” was never made part of the work itself — search technology was never the bottleneck.

AI Native 重构 · RestructureAI Native Restructure

M.03 加支柱 03 上下文工程作为系统实践：知识只有进入机器可读的上下文库才算"存在"——决策连同理由入库，流程以代码形式自文档，agent 的执行轨迹自动沉淀为组织记忆。上手时间可以从以月计压到以天计——前提是流程与知识已经写成 agent 可读的形式。

M.03 plus Pillar 03 (context engineering as a system practice): knowledge only “exists” once it enters a machine-readable context store. Decisions are stored with their rationale; processes self-document as code; agent execution traces are automatically deposited into organizational memory. Onboarding time can compress from months to days, provided the processes and knowledge have already been written into a form agents can read.

检验信号Test Signal哪个人离开会让某项工作瘫痪超过一周？那个人头脑里的东西，就是你欠下的上下文工程债，一笔一笔都列得出来。Which person leaving would paralyze some piece of work for more than a week? What is in that person’s head is the context engineering debt you owe, and every item can be named.
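The test signal can be made concrete by listing, for each piece of work, who is currently able to carry it. A minimal sketch, assuming you can build such a map by hand; the task names, people, and function names are hypothetical.

```python
def bus_factor(knowers: dict[str, set[str]]) -> dict[str, int]:
    """Number of people who can carry each task; 1 means a single point of failure."""
    return {task: len(people) for task, people in knowers.items()}


def context_debt(knowers: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each person to the tasks that would stall if that person left."""
    debt: dict[str, list[str]] = {}
    for task, people in knowers.items():
        if len(people) == 1:
            (only_person,) = people
            debt.setdefault(only_person, []).append(task)
    return debt


# Hypothetical inventory of who holds the knowledge for which task.
EXAMPLE_KNOWERS = {
    "billing export": {"wang"},
    "release process": {"wang", "li"},
    "vendor renewals": {"zhao"},
}
```

Each entry returned by `context_debt` is one named item of the context-engineering debt the card describes.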

B.08

审批链与责任稀释Approval Chains & Diffused Accountability

Darley & Latané 1968

违反 T1 · 判断分布侧violates T1 · judgment distribution

机制 · Why it existsMechanism · Why it exists

传统组织用多级签字管理风险，但签字越多责任越稀：每个人都默认上一级看过了、下一级会把关，社会心理学称之为责任分散效应。审批链的实际功能往往是责任分摊仪式，而非质量控制：出事之后，找不到"做决定的那个人"。

The traditional organization uses multi-tier sign-off to manage risk, but the more signatures there are, the more diffuse accountability becomes: each person assumes the tier above reviewed it and the tier below will catch it. Social psychology calls this the diffusion of responsibility. The actual function of the approval chain is often not quality control but a ritual of distributing blame: when something goes wrong, there is no “person who made the decision” to be found.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 起草材料、AI 预审，让链条空转得更快——"人人有份、无人负责"的结构原封未动。甚至更糟："AI 预审通过"成为新的集体免责理由。

AI drafting materials and AI pre-review make the chain spin faster; the structure of “everyone involved, no one accountable” is untouched. It may even worsen: “AI pre-review passed” becomes the new collective disclaimer.

AI Native 重构 · RestructureAI Native Restructure

M.05 与支柱 06：审批收敛为少数显式判断节点，每个节点单人、具名、权责成对。可编码的检查全部交给自动门：测试、策略、合规规则；人签的字只剩一种含义："这个后果我来承担。"

M.05 and Pillar 06: approvals converge to a small number of explicit judgment nodes; each node is a single named individual with paired authority and accountability. All codifiable checks go to automated gates: tests, policy rules, compliance checks. The only meaning left in a human signature: “I own this outcome.”

检验信号Test Signal随机抽一个上月通过的审批，问链上每个人："如果错了，谁负责？"答案的数量大于一，或者等于零——这条链就是仪式。Pick a random approval that passed last month and ask everyone in the chain: “If this turns out to be wrong, who is responsible?” If you get more than one answer, or none at all, that chain is a ritual.
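The ritual check above reduces to counting distinct names. A minimal sketch, assuming each person in the chain either names one accountable person or gives no answer; the normalization (trimming and lowercasing) and the function names are illustrative assumptions.

```python
from typing import Optional


def distinct_owners(answers: list[Optional[str]]) -> set[str]:
    """Names given in answer to 'if this is wrong, who is responsible?'."""
    return {a.strip().lower() for a in answers if a and a.strip()}


def is_ritual(answers: list[Optional[str]]) -> bool:
    """True when the chain yields more than one accountable name, or none at all."""
    return len(distinct_owners(answers)) != 1
```

Note that this sketch treats a chain as healthy whenever exactly one name is given, even if some respondents gave no answer; a stricter variant could require every respondent to name the same person.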

B.09

人头即产能Headcount-as-Capacity

Coase 1937

违反 T1 · 判断分布侧violates T1 · judgment distribution

机制 · Why it existsMechanism · Why it exists

传统组织扩张能力只有一个原语：招聘。周期以月计、成本固定化、错配难逆转，于是产能规划变成赌博，组织在"人手不够"与"养着闲人"之间永久摆动。预算以人头计、权力以下属数计——这也是帝国构建行为的经济根源。

The traditional organization has only one primitive for expanding capability: hiring. Cycles measured in months, costs that become fixed, mismatches that are hard to reverse: capacity planning becomes a gamble, and the organization oscillates permanently between “not enough hands” and “carrying dead weight.” Budget is counted in headcount; power is measured in the number of direct reports. This is also the economic root of empire-building.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 招聘工具加速的是旧通道：更快地筛简历、更快地面试，但从不质疑"能力 = 人头"这个等式。各部门继续以多要人头为目标，AI 反而成了新论据："我们需要再招五个 AI 工程师。"

AI recruiting tools accelerate the old channel (faster resume screening, faster interviews) while never questioning the equation “capability = headcount.” Departments continue to target more headcount; AI becomes the new argument: “We need to hire five more AI engineers.”

AI Native 重构 · RestructureAI Native Restructure

M.02 Agent 即默认工种：任何新能力需求，默认先问"工作流加 agent 能否承担"，招人只为增加判断密度。产能变成弹性量：agent 实例随负载伸缩，组织能力与组织人数解耦。这正是人均创收 $600 万量级的结构基础（第 2 节，口径见第 9 节）。

M.02 (agent as default job type): for any new capability requirement, the default first question is “can a workflow plus agent handle this?” Hiring is reserved for increasing judgment density. Capacity becomes elastic: agent instances scale with load, and organizational capability is decoupled from headcount. This is the structural basis for the $6M revenue-per-person scale (Section 2; scope defined in Section 9).

检验信号Test Signal下一次"产能不足"出现时，你的第一反应是写 JD 还是画工作流？预算表里，agent 运行成本与人头成本是否在同一张表上竞争？The next time “insufficient capacity” appears, is your first instinct to write a job description or draw a workflow? In the budget spreadsheet, do agent operating costs and headcount costs compete on the same sheet?

B.10

规划节奏失配The Planning Cadence Mismatch

Hope & Fraser 2003

违反 T1 · 判断分布侧violates T1 · judgment distribution

机制 · Why it existsMechanism · Why it exists

年度预算加季度 OKR 的节奏继承自工业时代的资本开支周期。环境以周为单位变化，资源以年为单位锁定，组织对机会的响应速度被规划日历锁住。年中发现方向错了？等明年预算。

The annual budget plus quarterly OKR cadence is inherited from the industrial era’s capital expenditure cycle. The environment changes by the week; resources are locked by the year, so the organization’s speed of response to opportunity is capped by the planning calendar. Discover mid-year that the direction is wrong? Wait for next year’s budget.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 让规划文档的生产快了十倍（三天做出过去三周的 PPT），但"批准、锁定、执行、年终复盘"的律动没变。更快地制定一个一年不变的计划不叫敏捷，叫更精致的僵化。

AI makes planning document production ten times faster (a three-week slide deck now takes three days), but the rhythm of “approve, lock, execute, year-end review” is unchanged. Producing a year-locked plan faster is more refined rigidity, not agility.

AI Native 重构 · RestructureAI Native Restructure

支柱 07 持续演化：资源跟随工作流遥测动态再分配——表现好的流自动获得更多算力、预算与 agent 配额，表现差的流被自动收缩。规划从年度仪式变成持续运转的内部资源市场，节奏与反馈周期同阶。

Pillar 07 (continuous evolution): resources are dynamically reallocated following workflow telemetry. Well-performing flows automatically receive more compute, budget, and agent quota; poorly performing flows are automatically contracted. Planning changes from an annual ritual into a continuously operating internal resource market whose cadence is of the same order as the feedback cycle.

检验信号Test Signal从"数据表明应该转向"到"资源实际转移"，中间隔多久？以季度计，说明你的学习速度被日历锁死了。How long does it take to get from “data indicates we should pivot” to “resources actually reallocated”? If the answer is measured in quarters, your learning speed is capped by the calendar.
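One possible shape for telemetry-driven reallocation is a proportional split. The text does not specify an allocation rule, so the proportional formula, the score values, and the function name below are assumptions made purely for illustration.

```python
def reallocate(budget: float, scores: dict[str, float]) -> dict[str, float]:
    """Split a budget across workflows in proportion to a telemetry-derived score.

    scores: workflow name -> non-negative performance score (how the score is
    computed from telemetry is left open here).
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    if any(score < 0 for score in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one workflow needs a positive score")
    return {flow: budget * score / total for flow, score in scores.items()}
```

Re-running this on every telemetry refresh, rather than once a year, is what would put the planning cadence on the same order as the feedback cycle. A real system would likely add floors, caps, and damping so that one noisy reading does not starve a workflow.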

B.11

试错成本与风险规避Experiment Cost & Risk Aversion

March 1991

违反 T1 · 判断分布侧violates T1 · judgment distribution

机制 · Why it existsMechanism · Why it exists

传统组织里一次尝试等于立项加排期加占人加失败追责。试错又贵又伤人，于是只做"安全"的事——March 所说的 exploitation 挤出 exploration，组织系统性地低配探索。创新死于"值得吗"的会议，而非死于失败。

In the traditional organization, one attempt equals project approval plus scheduling plus headcount allocation plus accountability for failure. Experimentation is expensive and painful, so the organization only does what is “safe”: March’s exploitation crowds out exploration, and the organization systematically under-invests in discovery. Innovation doesn’t die from failure; it dies in the “is this worth it?” meeting.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

执行变快了，但立项流程、追责文化、机会成本核算原样保留——组织还是只批"看起来稳"的实验。AI 甚至放大了错觉：AI 生成的市场分析让坏主意显得更可信——见第 11 节的陷阱"合成自信"。

Execution is faster, but the project-approval process, accountability culture, and opportunity-cost accounting are untouched; the organization still greenlights only experiments that “look safe.” AI can even amplify the illusion: AI-generated market analyses make bad ideas look more credible (see Section 11’s failure pattern “Synthetic Confidence”).

AI Native 重构 · RestructureAI Native Restructure

M.04 持续学习即操作系统：agent 并行运行 N 个变体，由真实数据裁决，实验的单位成本低到不值得为失败追责——但这只对可逆、无第三方外溢的实验成立；不可逆、或会外溢到客户与第三方的决策，仍保留 accountability 锚（M.05），不因单位成本低而豁免。文化从"审批制"换轨为"回滚制"：默认可试，越界自动回滚。探索从例外变成底色。

M.04 (continuous learning as the operating system): agents run N variants in parallel, adjudicated by real data, and the unit cost of an experiment drops too low to justify assigning blame for failure. This holds only for reversible experiments with no third-party spillover; irreversible decisions, or ones that spill over to customers and third parties, still keep an accountability anchor (M.05) and are not exempted just because the unit cost is low. Culture shifts from an “approval regime” to a “rollback regime”: everything is tryable by default, and anything out of bounds rolls back automatically. Exploration moves from exception to baseline.

检验信号Test Signal上个月你的组织跑了多少个有真实数据裁决的实验？还是个位数的话，你的试错成本结构仍然是工业时代的。How many experiments adjudicated by real data did your organization run last month? If the answer is still in single digits, your experimentation cost structure is still industrial-era.
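The boundary between "try by default" and "keep an accountability anchor" can be written down as a small rule. This is a sketch under stated assumptions: the `Experiment` fields, the mode labels, and the single-metric guardrail are hypothetical simplifications of the rollback regime described in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    reversible: bool             # can the change be fully undone?
    third_party_spillover: bool  # does it reach customers or other outside parties?


def launch_mode(exp: Experiment) -> str:
    """Reversible, contained experiments run by default; the rest need a named owner."""
    if exp.reversible and not exp.third_party_spillover:
        return "try_by_default"
    return "needs_accountable_owner"


def should_roll_back(metric_value: float, guardrail_min: float) -> bool:
    """Automatic rollback when the observed metric falls below the guardrail."""
    return metric_value < guardrail_min
```

A low unit cost does not change the output of `launch_mode`: an irreversible experiment still returns `needs_accountable_owner`.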

B.12

指标剧场Metric Theater

Goodhart 1975

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

传统组织度量个人产出——代码行、工时、关单数。Goodhart 定律保证指标一旦成为目标就被博弈：囤积信息以保不可替代、做指标表演以保排名、报喜不报忧以保预算。度量系统本身在制造反协作。

The traditional organization measures individual output: lines of code, hours logged, tickets closed. Goodhart’s Law guarantees that any metric, once it becomes a target, gets gamed: hoarding information to stay irreplaceable, inflating metrics to protect rankings, reporting only good news to protect budget. The measurement system itself manufactures anti-collaboration.

为什么 +AI 无效 · Why overlay failsWhy overlay fails

AI 把指标表演的成本降到零——合成产出无限供给，"看起来很高产"从未如此容易。继续度量个人产出，等于正式邀请全员用 AI 生产度量噪音。NANDA 测到的那批无成效试点里，相当一部分正是"AI 提升了指标、没碰到真实盈亏"。

AI reduces the cost of gaming metrics to zero: synthetic output comes in unlimited supply, and “appearing highly productive” has never been easier. Continuing to measure individual output is a formal invitation for everyone to use AI to produce measurement noise. A significant portion of the no-impact pilots NANDA measured are precisely “AI improved the metrics without touching the P&L.”

AI Native 重构 · RestructureAI Native Restructure

度量对象从个人换成工作流（支柱 05）：吞吐、质量、成本、演化速度——由埋点客观采集，难以被个体博弈。人的评价转向 agent 无法供给的稀缺物：判断质量、上下文贡献、方向正确度（第 2 节的 KPI 反转）。

The unit of measurement shifts from individuals to workflows (Pillar 05): throughput, quality, cost, and evolution speed, objectively captured by instrumentation and difficult for any individual to game. Human evaluation shifts toward scarcities that agents cannot supply: judgment quality, context contribution, directional correctness (the KPI inversion from Section 2).

检验信号Test Signal你的绩效表里有几项是一个 agent 一下午就能刷满的？把它们识别出来，因为那些项现在正在被刷。How many items in your performance review could an agent fill to maximum in an afternoon? Identify them, because those items are currently being gamed.

B.13

信任半径坍缩Trust-Radius Collapse

Edmondson 1999 · 监控悖论 · the monitoring paradox

违反 T1 · 上下文流动侧violates T1 · context flow

机制 · Why it existsMechanism · Why it exists

组织的有效协作半径由心理安全决定——人只在"暴露问题不会被惩罚"时才上报坏消息、试验、求助。Edmondson 的研究反复显示：心理安全高的团队报告更多错误——差异来自报告意愿而非犯错频次。信任是吞吐量的隐形上限。

An organization’s effective collaboration radius is determined by psychological safety: people only report bad news, run experiments, and ask for help when “exposing problems carries no punishment.” Edmondson’s research consistently shows that high-psychological-safety teams report more errors; the difference comes from willingness to report, not frequency of mistakes. Trust is the invisible throughput ceiling.

为什么 +AI 无效 · Why overlay fails

Agent 让一切可观测，诱惑是把可观测变成全员监控。一旦遥测被用于考核与裁员，员工的理性反应是隐藏——隐藏 AI 用法、隐藏省下的时间、隐藏失败试验。监控越密，真实信号越枯竭。这是"裁员叙事自反噬"（见失败模式）的结构根源：你买了 X 光机，却让所有人学会了憋气。

Agents make everything observable; the temptation is to convert observability into surveillance of everyone. Once telemetry is used for performance reviews and layoffs, the rational employee response is to hide: hide AI usage, hide time saved, hide failed experiments. The denser the surveillance, the drier the true signal. This is the structural root of “the layoff narrative auto-cannibalization” (see failure modes): you bought an X-ray machine, but taught everyone to hold their breath.

AI Native 重构 · AI Native Restructure

把遥测的用途宪法化——可观测性服务于系统改进而非个人审判（支柱 05）。区分"看流程"与"看人"：流程指标公开，个人产出不入考核。Mollick 的 Leadership/Lab/Crowd——激励对齐到分享而非惩罚，才能让 X 光机照出真相而非教会憋气。

Constitutionalize the purpose of telemetry: observability serves system improvement, not individual prosecution (Pillar 05). Distinguish “watching the process” from “watching the person”: process metrics are public; individual output is not factored into evaluations. Following Mollick’s Leadership/Lab/Crowd framework, align incentives toward sharing rather than punishment, so that the X-ray machine reveals the truth instead of teaching people to hold their breath.

检验信号 · Test Signal

问一句"上次有人主动上报自己的 AI 用法失败是什么时候"。想不起来，半径已经在坍缩。

Ask: “When did someone last voluntarily report a failure in their own AI usage?” If no one can remember, the radius is already collapsing.

B.14

权力梯度与议程垄断

Power Gradient & Agenda Capture

Pfeffer 1981 · Bachrach-Baratz 1962

违反 T1 · 判断分布侧

violates T1 · judgment distribution

机制 · Mechanism · Why it exists

组织里最关键的权力是"决定哪些提案进入议程"，而非"否决提案"——Bachrach & Baratz 称之为权力的第二张面孔。议程设置权高度集中时，大量选项在被讨论前就已死亡，而组织对此毫无记录。

The most critical power in an organization is “deciding which proposals reach the agenda,” not “vetoing proposals.” Bachrach & Baratz call this the second face of power. When agenda-setting authority is highly concentrated, vast numbers of options die before they are discussed, and the organization has no record of this.

为什么 +AI 无效 · Why overlay fails

给决策者配 AI，放大的是既有议程持有者的产能——他能更快生成更多支持自己议程的材料。AI 不会自动质疑"为什么是这些选项"。更隐蔽的是：当 AI 推荐被当作中立，议程垄断就披上了客观的外衣，反而更难挑战。

Equipping decision-makers with AI amplifies the capacity of the existing agenda holder: they can generate more material supporting their own agenda, faster. AI does not automatically question “why these options?” More insidiously: when AI recommendations are treated as neutral, agenda capture dresses itself in the appearance of objectivity and becomes even harder to challenge.

AI Native 重构 · AI Native Restructure

Gans 的逐域控制权（[R5]）给出方向：透明推理是权威的替代品——让 AI 公开候选选项的全集与淘汰理由，议程从"谁有权设置"变为"图上可见的分支"。决策日志（支柱 03）把被否决的选项也记下来，议程垄断失去隐蔽性。

Gans’s domain-by-domain control authority ([R5]) points the direction: transparent reasoning replaces authority. Have AI surface the full candidate option set and the elimination rationale; the agenda moves from “who has the right to set it” to “a visible branch on the graph.” The decision log (Pillar 03) records rejected options too; agenda capture loses its concealment.
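One way to picture a decision log that keeps rejected options is a record where every candidate is stored alongside the reason it was dropped. The sketch below is illustrative only: `Option`, `DecisionRecord`, and `agenda_visible` are hypothetical names, and the rule that a rejection needs a non-empty stated reason is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Option:
    label: str
    # None means the option was chosen; a string is the elimination rationale.
    rejected_because: Optional[str] = None


@dataclass
class DecisionRecord:
    question: str
    options: list = field(default_factory=list)

    def chosen(self):
        """Labels of the options that were not rejected."""
        return [o.label for o in self.options if o.rejected_because is None]

    def rejected(self):
        """Map each rejected option to the reason recorded for it."""
        return {
            o.label: o.rejected_because
            for o in self.options
            if o.rejected_because is not None
        }

    def agenda_visible(self):
        """True if at least one rejected option has a non-empty reason on record."""
        return any(o.rejected_because for o in self.options)
```

A record with only the final choice returns `False` from `agenda_visible()`, which is the same condition the test signal for this card checks by hand.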

检验信号 · Test Signal

翻最近三个大决策，能不能找到"被认真考虑后否决"的选项记录。只有最终方案，说明议程在暗处。

Look up the last three major decisions: can you find a record of options that were “seriously considered and then rejected”? If only the final choice is on record, the agenda was set in the dark.

B.15

动机抽干

Motivation Crowding-Out

Deci-Ryan SDT · Frey-Jegen 2001

不在 T1 两轴上 · 动机轴（显式承认）

outside T1’s two axes · motivation axis (acknowledged)

机制 · Mechanism · Why it exists

自我决定论与动机挤出研究（Frey-Jegen 的实证综述）显示：外在控制（监控、计件、把内在意义换成 KPI）会挤出内在动机。当一件原本有意义的工作被重新框定为"用 AI 多产出 X 倍"，意义感会被效率叙事抽干，留下应付。

Self-determination theory and motivation crowding-out research (the Frey-Jegen survey of the empirical evidence) show that extrinsic control (surveillance, piece-rate pay, replacing intrinsic meaning with KPIs) crowds out intrinsic motivation. When work that was originally meaningful is reframed as “produce X times more output with AI,” the sense of meaning is drained by the efficiency narrative, leaving only going through the motions.

为什么 +AI 无效 · Why overlay fails

把 AI 收益直接翻译成"同样的人产出翻倍"或"同样的产出裁一半人"，是教科书级的外在化操作。短期数字好看，中期工匠精神、主人翁感、自发改进一起蒸发——而这些恰是 AI 无法提供、只有人能注入组织的东西。你优化了产量，抽干了发动机。

Translating AI gains directly into “same people, double output” or “same output, half the people” is a textbook way to turn intrinsic motivation extrinsic. Short-term numbers look good; medium-term craftsmanship, ownership, and self-driven improvement evaporate together. These are precisely what AI cannot supply and only humans can inject into an organization. You optimized throughput and drained the engine.

AI Native 重构 · AI Native Restructure

把 AI 定位为"卸下苦工、释放判断"而非"同岗增产"——人的角色上移到 M.05 判断锚点与 M.06 编排者，工作变得更需要品味与主张，而非更像流水线。收益分配若指向"更难的好问题"而非"更少的人头"，内在动机被放大而非挤出。

Position AI as “removing drudgery, freeing judgment” rather than “same role, more output”: the human role elevates to M.05 judgment anchor and M.06 orchestrator; work demands more taste and conviction, less assembly line. If the dividend of AI is directed toward “harder, better problems” rather than “fewer heads,” intrinsic motivation is amplified rather than crowded out.

检验信号 · Test Signal

引入 AI 后，团队是更愿意接难题、还是更像在交差。后者出现，动机已在被抽干。

Since introducing AI, is the team more willing to take on hard problems, or more inclined to just check boxes? The latter is a sign the engine is being drained.

B.16

生态位锁定

Niche Lock-In

Hannan-Freeman 1977

不在 T1 两轴上 · 生态依赖轴（显式承认）

outside T1’s two axes · ecosystem-dependency axis (acknowledged)

机制 · Mechanism · Why it exists

组织生态学（Hannan-Freeman 种群生态视角）提醒：组织的命运不只由内部效率决定，也由它在生态中的位置与依赖结构决定。当核心生产要素来自少数外部供应商，组织的生存权被锁进了别人的生态位。

Organizational ecology (the Hannan-Freeman population-ecology perspective) reminds us: an organization’s fate is determined not only by internal efficiency, but also by its position and dependency structure within the ecosystem. When core production inputs come from a small number of external suppliers, the organization’s survival rights are locked into someone else’s niche.

为什么 +AI 无效 · Why overlay fails

越深地"加 AI"，越深地把核心能力外包给 OpenAI/Anthropic/Google 等少数模型供应商——算法封建主义。围绕单一供应商的 API 怪癖优化，短期最快，却在条款、定价、可用性变动时被挟持。这是结构性的生态位依赖，而非工具问题——单点故障被写进了组织命脉。

The deeper you “add AI,” the deeper you outsource core capabilities to a small number of model suppliers (OpenAI, Anthropic, Google): algorithmic feudalism. Optimizing around a single vendor’s API quirks is fastest short-term, but leaves you held hostage when terms, pricing, or availability change. This is structural niche dependency, not a tooling problem: single points of failure written into the organization’s life support.

AI Native 重构 · AI Native Restructure

支柱 04 多模型架构是这条瓶颈的正解：把模型层当作可替换的商品而非命脉，保留"主权"。抽象出供应商无关的内部接口、保留可迁移的上下文资产（M.03）、关键路径双供应商。生态位依赖不可消除，但可以从"命脉"降级为"成本项"。

Pillar 04 (multi-model architecture) is the correct answer to this bottleneck: treat the model layer as a replaceable commodity rather than a lifeline, preserving “sovereignty.” Abstract out vendor-agnostic internal interfaces, maintain portable context assets (M.03), dual-source critical paths. Niche dependency cannot be eliminated, but it can be downgraded from “lifeline” to “cost item.”
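The vendor-agnostic interface and dual-sourcing idea can be sketched as a thin router that only knows about a generic "prompt in, text out" callable. This is a hedged illustration: `Provider` and `ModelRouter` are hypothetical names, no real vendor SDK is called, and a production version would need timeouts, retries, and cost tracking.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: every vendor SDK call is wrapped behind this one signature,
# so nothing outside the wrapper depends on vendor-specific API quirks.
Provider = Callable[[str], str]


class ModelRouter:
    """Try providers in priority order and fall back on failure."""

    def __init__(self, providers: Sequence[Tuple[str, Provider]]):
        if len(providers) < 2:
            raise ValueError("a critical path needs at least two providers")
        self.providers: List[Tuple[str, Provider]] = list(providers)

    def complete(self, prompt: str) -> Tuple[str, str]:
        """Return (provider_name, completion) from the first provider that works."""
        errors = []
        for name, call in self.providers:
            try:
                return name, call(prompt)
            except Exception as exc:  # outage, suspension, quota, and so on
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With this shape, the test signal for this card becomes a rehearsal: make the primary wrapper raise and check whether the workflow still completes through the secondary.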

检验信号 · Test Signal

假设主力模型供应商明天涨价三倍或封号，组织还能运转吗。答不上来，生态位已被锁定。

If your primary model supplier tripled its prices or suspended your account tomorrow, could the organization still function? If you can’t answer, the niche is already locked.

INSTRUMENT 02 · 结构瓶颈诊断表 · Structural Bottleneck Diagnostic

回到每张卡片底部的"检验信号"，对照你的组织按下「命中」。没有及格线，只有一张片子：命中的每一项，都是工作流图上一条还没被删掉的串行边。

Return to the “Test Signal” at the bottom of each card and click “Hit” wherever it matches your organization. There’s no passing score, just an X-ray: every hit marks a serial edge on the workflow graph that hasn’t been deleted yet.

命中 0 / 16 —— 诊断尚未开始。Hit 0 / 16 · diagnosis not yet started.

INSTRUMENT 04 · 维度透镜台 · DIMENSION LENS BENCH

这些结构瓶颈几乎都被当成"协调/信息"问题。换一片透镜，同一张图上会亮起不同的受灾点——也会暴露原清单没覆盖的盲区（朱色行即补画的新瓶颈）。点某一维度，看同一批瓶颈在该维度下的表现与盲区；点任一格子看结论摘要。

These structural bottlenecks are almost always treated as “coordination/information” problems. Swap in a different lens and different damage points light up on the same diagram, exposing blind spots the original list did not cover (rows in vermilion are newly drawn bottlenecks). Click a dimension to see how this same set of bottlenecks behaves, and where its blind spots are, along that dimension; click any cell for a summary verdict.

瓶颈 \ 维度 · Bottleneck \ Dimension	信息 · Information	激励 · Incentive	权力 · Power	认知 · Cognition	时间 · Time	生态 · Ecology

未选透镜——六维叠加视图。朱色四行是补画的盲区瓶颈。No lens selected · six-dimension overlay view. The four vermilion rows are newly drawn blind-spot bottlenecks.

THE KERNEL · 核心动作：持续压缩串行瓶颈 · the core action is to keep compressing serial bottlenecks

这些瓶颈的重构方案各不相同，但日常动词只有一个——压缩。判断本身不可并行（战略方向、投资决策、产品审美、内容把关必须由人承担），但判断之前的一切都可以：让 agent 预先读完材料、找出证据、对齐分歧观点、列出关键假设、整理反方论据、标注不确定性——把"原始材料的洪流"压缩成"一张决策地图"。压缩的对象是判断之前的等待，而非人的判断。

These bottlenecks have different restructuring prescriptions, but there is only one daily verb: compress. Judgment itself cannot be parallelized (strategic direction, investment decisions, product taste, editorial gatekeeping must be carried by humans), but everything before judgment can be: have agents pre-read materials, surface evidence, align divergent viewpoints, enumerate key assumptions, compile counter-arguments, and flag uncertainties, compressing “a flood of raw material” into “a decision map.” What is being compressed is the waiting before judgment, not human judgment itself.

第二个关键词是持续。瓶颈的边界随模型能力移动：今天必须由人串行处理的，明天可能被 agent 预处理；今天要反复口头解释的背景，明天通过 memory 与 context 系统自动继承。所以 AI Native 组织是一个持续迭代的产品，而非固定状态。Operator 的周期性三问：① 还有哪些事必须由人按顺序处理？② 其中哪些可以被 agent 预处理、被结构化？③ 哪些上下文可以被系统继承、不再依赖口头传递？每一轮回答，都从工作流图上再删掉几条串行边。

The second keyword is continuously. The boundary of bottlenecks moves with model capability: what must be processed serially by humans today may be pre-processed by agents tomorrow; background that requires repeated verbal explanation today will be automatically inherited through memory and context systems tomorrow. The AI Native organization is therefore not a fixed state but a continuously iterated product. The Operator’s periodic three questions: ① What still must be processed by humans in sequence? ② Of those, which can be pre-processed and structured by agents? ③ Which context can be inherited by the system without further dependence on verbal hand-off? Each round of answers deletes a few more serial edges from the workflow graph.

H.01

定义问题

Define the Problem

什么值得研究、什么不值得——AI 能找答案，难替你决定什么问题最重要。

What is worth investigating, and what is not. AI can find answers; it struggles to decide which questions matter most.

H.02

定义标准

Define the Standard

什么是好报告、好产品、好判断。标准不清，agent 跑得越快，垃圾生成得越快。

What counts as a good report, a good product, a good judgment. Without clear standards, the faster agents run, the faster garbage is generated.

H.03

管理上下文

Manage Context

历史判断、失败经验、行业框架若不能被系统继承，每次协作都从零开始。

If historical judgments, failure lessons, and domain frameworks cannot be inherited by the system, every collaboration starts from zero.

H.04

建立评估

Build Evaluation

哪些可自动检查、哪些必须人工 review、哪些错误可容忍、哪些不可接受。

What can be automatically checked, what requires human review, which errors are tolerable, and which are not.

H.05

最终判断

Final Judgment

AI 提供信息、证据、反证与模拟，但"要不要做"仍由人承担后果。

AI provides information, evidence, counter-evidence, and simulations, but the consequences of “whether to do it” are still borne by a human.

瓶颈压缩之后，人的五种新能力——执行者退场，留下的是网络的设计师、标准的定义者、最终的决策者。

After bottleneck compression, five new human capabilities remain. The executor exits; what is left is the network designer, the standard-setter, and the final decision-maker.

GEN 1 · PROCESS

流程范式 · 公司即流程

Process Paradigm · Company as Process

泰勒制 → 丰田 TPS → 华为 IPD：用流程吃掉个人英雄主义，解决规模化问题[R46]。

Taylorism → Toyota TPS → Huawei IPD: use process to absorb individual heroics and solve the problem of scaling[R46].

GEN 2 · PRODUCT

数据范式 · 公司即产品

Data Paradigm · Company as Product

Amazon / Google 实验文化 → 字节跳动 A/B 化组织：用数据迭代组织本身，解决迭代速度问题[R47]。

Amazon / Google experiment culture → ByteDance A/B-ified organization: use data to iterate the organization itself and solve the problem of iteration speed[R47].

GEN 3 · NETWORK

网络范式 · 公司即并行网络（假设）

Network Paradigm · Company as Parallel Network (Hypothesis)

执行交给 agent 网络，人收敛为判断节点，针对串行瓶颈。公开样本仍薄：Anthropic 自述 10 个团队（含法务、增长营销等非工程团队）已把 agentic 工作流嵌入部分流程[R21]——是嵌入部分流程，不是整体运转。

Execution delegated to agent networks; humans converge to judgment nodes, targeting serial bottlenecks. Public samples remain thin: Anthropic reports that 10 teams (including legal and growth marketing, not just engineering) have embedded agentic workflows into some processes[R21]: embedded into some processes, not running end-to-end.

口径：三段式取自黄益贺的从业者观察（2026[R20]），是启发式叙事，不是实证分类——现实中范式叠加并存而非代际替代（流程范式仍统治电信设备业，数据范式无人淘汰），三个样本来自三个不同行业，证明不了"每个时代有一种最强形态"。GEN 3 当前是假设而非已验证规律：Anthropic 自家调查里员工自报约 60% 的工作借助 Claude、生产率感知 +50%，但认为"可完全委托"的工作仅 0-20%[R21]。这个三段式的真正用途是给这些瓶颈标出各自属于哪一层的解——不是宣告第三代已经赢了。

Scope note: the three-stage framework is drawn from Huang Yihe’s practitioner observations (2026[R20]); a heuristic narrative, not an empirical classification. In reality paradigms coexist rather than replace each other generationally (process paradigm still dominates telecom equipment; data paradigm is nowhere near sunset); the three samples come from three different industries and cannot prove “each era has one strongest form.” GEN 3 is currently a hypothesis, not a validated regularity: in Anthropic’s own survey, employees self-report roughly 60% of work assisted by Claude and a +50% perceived productivity gain, but work considered “fully delegatable” is only 0-20%[R21]. The real purpose of this three-stage framework is to label which solution layer each of these bottlenecks belongs to, not to declare that the third generation has already won.

SYNTHESIS · 这些瓶颈是同一件事的不同投影 · the bottlenecks are projections of one and the same thing

没有一个瓶颈是"技术不够好"造成的。传统组织是为一个前提设计的机器——人类协调成本高昂且不可压缩。层级、会议、审批链、年度预算、人头制、个人 KPI，全部是那个前提下的最优解。这也是为什么 Ivan Zhao 把 AI 称作"组织的钢铁"——人类沟通不再必须是承重墙，两小时的周对齐会塌缩成五分钟的异步评审[R22]。但钢铁不会自己重盖房子：早期工厂把水车换成蒸汽机却保留其余一切，生产率只微涨；爆发发生在工厂围绕蒸汽机重新设计之后——截至本版（2026-07），大多数"加 AI"仍停在换水车阶段。这台机器没有坏，它只是在精确地解一道已经被删掉的题。

Not a single bottleneck is caused by “technology that is not good enough.” The traditional organization is a machine designed for one premise: human coordination costs are high and incompressible. Hierarchy, meetings, approval chains, annual budgets, headcount-based capacity, individual KPIs: all are optimal solutions under that premise. This is also why Ivan Zhao calls AI “steel for the organization”: human communication no longer has to be a load-bearing wall; a two-hour weekly alignment meeting can collapse to a five-minute async review[R22]. But steel does not rebuild houses on its own: early factories swapped waterwheels for steam engines while keeping everything else unchanged, and productivity barely rose; the real explosion came after factories redesigned themselves around the steam engine. As of this edition (2026-07), most “add AI” efforts are still at the waterwheel-swap stage. The machine is not broken; it is solving, with precision, a problem that has already been deleted.

AI 删除了前提，但不会自动删除解。把 AI 加装到旧解上，得到的是一个更快的旧组织——这些瓶颈原样保留，只是每个瓶颈处的队列前进得更体面了。这也是"转型"路径的根本困境：这些瓶颈中的每一个，在存量组织里都有既得利益的守护者——中层是平方律的雇员，审批链是风险部门的领地，人头预算是权力的度量衡。结构问题之所以是结构问题，就在于它不能在结构内部被投票废除。

AI deleted the premise but does not automatically delete the solution. Overlay AI onto the old solution and you get a faster old organization: these bottlenecks preserved intact, with queues at each bottleneck advancing more gracefully. This is also the fundamental dilemma of the “transformation” path: each of these bottlenecks has a vested-interest guardian in the incumbent organization: middle managers are employees of the quadratic law, approval chains are the risk department’s territory, headcount budgets are the measure of power. The reason structural problems are structural problems is precisely that they cannot be voted out from within the structure.

从新前提重新推导组织，这正是本规约其余部分的内容：第 5 节的这些世界观是新前提的公理化，第 7 节的架构支柱是推导规则，第 8 节的运营底座是物理实现。

Re-deriving the organization from new premises is precisely what the rest of this specification covers: the worldviews of Section 5 are the axiomatization of the new premise; the pillars of Section 7 are the derivation rules; the operating substrate of Section 8 is the physical implementation.

工具修不了结构。结构问题只有架构解，而架构解的日常动词，是持续压缩串行瓶颈。

Tools cannot fix structure. Structural problems have only architectural solutions, and the daily verb of those solutions is continuously compressing serial bottlenecks.

SECTION

MENTAL MODELS · 心智模型

框架 · 世界观

Framework · Worldviews

底层世界观

Foundational Worldviews

支柱不是凭空立起来的，它们整个压在六个"组织现在该怎么看"的假设上。先把假设摊开，你才看得清支柱在替你赌什么。

The pillars are not raised from nothing; they rest entirely on six assumptions about how to see an organization now. Lay the assumptions out first, and you can see what the pillars are betting on.

一句话 · In one line

支柱之前是六个假设：工作流图才是真相、agent 是默认工种、人是判断锚、上下文是基设、组织持续自学、操作者是编排者。它们是支柱的地基，不是中立的观察——最吃重的一条"人是判断锚"赌的是判断长期绑在人身上。这是本卷此刻的下注；哪天某类判断被 agent 稳定做得更好，松动的就不止一根支柱，而是这块地基。

Before the pillars come six assumptions: the workflow graph is the truth, the agent is the default role, the human is the judgment anchor, context is infrastructure, the organization learns continuously, and the operator is the orchestrator. They are the pillars’ foundation, not neutral observations, and the load-bearing one, “the human is the judgment anchor,” bets that judgment stays bound to people over the long run. That is this volume’s current wager; the day agents do some class of judgment reliably better, what loosens is not one pillar but the ground under all of them.

M.01

组织即工作流图

Organization-as-Workflow-Graph

传统组织里，组织图是真相——显示谁向谁汇报。AI Native 组织里，工作流图是真相——显示什么流向哪里、什么触发什么、什么决定什么。组织图如果还存在，是工作流图的下游产物。组织图的权威也比想象中年轻。第一张现代组织图谱是 1855 年 McCallum 为 Erie 铁路所画；Mollick 据此提醒：从组织图谱到敏捷，现有组织技术全部预设"单一的、仅人类的智能"[R8]，所以是重建，不是改装。

In a traditional organization the org chart is the truth: it shows who reports to whom. In an AI Native organization the workflow graph is the truth: it shows what flows where, what triggers what, what decides what. If an org chart still exists it is a downstream artifact of the workflow graph. The authority of the org chart is also younger than we imagine: the first modern org chart was drawn by McCallum for the Erie Railroad in 1855; Mollick uses that fact to remind us that every existing management technology from org charts to agile presupposes “a single, exclusively human intelligence” [R8]. So this is a rebuild, not a retrofit.

M.02

Agent 即默认工种

Agent as the Default Role

设计任何任务时的默认假设是——Agent 来做这件事。人类只在有特定理由时介入：判断、问责、关系。这反转了传统偏置：传统问"要不要自动化"，AI Native 问"这真的需要人吗"。落到具体动作：一份竞品分析，过去默认派个人去查、去比、去成稿，卡在他的档期；现在默认 agent 拉数据、列对比、出初稿，人只在"该不该把这几家纳入""这个结论敢不敢下"两处落判。

When designing any task, the default assumption is that an Agent does it. Humans intervene only when there is a specific reason: judgment, accountability, relationships. This inverts the traditional bias: the old question was “should we automate this?” The AI Native question is “does this genuinely need a human?” In concrete terms: a competitive analysis used to default to a person searching, comparing, and drafting, bottlenecked on their calendar; now it defaults to an agent pulling the data, laying out the comparison, and producing a first draft, with the human stepping in only at “should these players be in scope” and “can this conclusion be signed off.”

M.03

上下文即核心资产

Context as the Core Asset

新的资产类别——组织上下文（organizational context）。结构化、可被 Agent 检索的组织思考、决策、运营。它是 AI Native 组织建立的护城河，而且复利积累。Karpathy 把这件事的必要性说得更狠——LLM 患有"顺行性遗忘症"，不像同事那样积累语境，上下文必须被显式工程化[R6]：这个资产是 Agent 可用的前提，而非锦上添花。

A new asset class: organizational context, the organization’s thinking, decisions, and operations structured so Agents can retrieve them. It is the moat an AI Native organization builds, and it compounds. Karpathy states the necessity more bluntly: LLMs suffer from “anterograde amnesia” and cannot accumulate context the way a colleague does, so context must be explicitly engineered [R6]. This asset is the prerequisite for Agent usability, not a luxury.

M.04

持续学习即操作系统

Continuous Learning as the Operating System

传统组织通过周期性干预改进。AI Native 组织持续改进——每一次工作流执行都被观察、评估、用来改进工作流本身。这是批处理 vs 流处理，应用到组织学习上。落到具体动作：过去销售话术一年改两次、靠季度复盘定；现在每通电话的成败都回流，本周成的打法下周就并进话术库，改进以周计而非以年计。

Traditional organizations improve through periodic interventions. AI Native organizations improve continuously: every workflow execution is observed, evaluated, and used to improve the workflow itself. This is batch processing vs. stream processing, applied to organizational learning. In concrete terms: a sales script used to be revised twice a year off quarterly reviews; now the outcome of every call flows back, and a play that worked this week folds into the script library next week, so improvement is measured in weeks, not years.

M.05

人即判断锚点

Humans as Judgment Anchors

人不是劳动力。人是判断者、责任承担者、品味设定者、关系持有者。听起来像是把人往后退了一步，其实是升级：人在组织里的角色，从"执行单位"变成"判断单位"。Karpathy 的验证瓶颈论说的是同一件事：AI 生成、人类验证，带滑块的部分自治，比不出事故就烧香的全自治更可靠[R6]。

Humans are not labor. Humans are judges, accountability holders, taste-setters, and relationship owners. It sounds like a step back, but it is actually a promotion: the human role in the organization moves from “execution unit” to “judgment unit.” Karpathy’s verification-bottleneck argument makes the same point from another angle: AI generates, humans verify, and partial autonomy with a slider is more reliable than full autonomy that merely hopes nothing breaks [R6].

M.06

组织即生命系统

Organization as a Living System

传统组织被设计成工厂——部门、流程、岗位说明书，是一组刚性隔间。AI Native 组织被设计成生命系统——一切流动，发现问题的"细胞"被授权直接响应，秩序自下而上涌现而非自上而下指派。这是小型组织能比大型快一个数量级的结构性原因——是结构不同，而非更聪明。它也把光谱两端缝合：从 N=1 的一人公司到 N=众多的 agent 网络，跑的是同一套生命逻辑，只是规模不同——单细胞到多细胞这条线，第 6 节讲透。

Traditional organizations are designed as factories: departments, processes, and job descriptions form a set of rigid compartments. AI Native organizations are designed as living systems: everything flows; the “cell” that detects a problem is authorized to respond directly; order emerges bottom-up rather than being assigned top-down. This is the structural reason a small organization can move an order of magnitude faster than a large one: not because it is smarter, but because the structure is different. It also joins the two ends of the spectrum: from the N=1 one-person company to the N=many agent network, the same living logic runs throughout; only the scale differs, and Section 6 works the single-cell-to-multi-cell line out in full.

FIG. 5.0 / THE OPERATING FLYWHEEL · 持续学习飞轮 · 看懂：组织如何持续学习 · How to read it: how the organization learns continuously

M.04 的运行时形态。每一次执行都被记录（03）、评估（04）、回写（05），由人调校判断标准（06）后重新编码进工作流（01）——六步转一圈，组织的上下文资产与工作流能力就复利一次。输入是真实世界的信号，输出是更短的等待与更快的 90 天节奏。AI Native 没有"转型完成态"，只有持续演化的运行时。

The runtime form of M.04. Every execution is logged (03), evaluated (04), and written back (05); humans tune the judgment standard (06) and it is re-encoded into the workflow (01): six steps per revolution, and the organization’s context store and workflow capability compound once more. Inputs are real-world signals; outputs are shorter wait times and a faster 90-day cadence. AI Native has no “transformation complete” state, only a continuously evolving runtime.

FIG. 5.1 / JUDGMENT ANCHOR MAP · 判断锚点地图 · 看懂：哪些事必须由人来 · How to read it: which tasks must be done by humans

M.05 与支柱 06 的工程化："哪些必须人来"从感觉变成坐标。横轴是撤销成本，纵轴是声誉、价值观与法律暴露度。右上象限人先判断再执行——这正是支柱 06 的三类不可让渡判断；左下象限 agent 默认执行；其余两个象限用监控与策略自动门换取并行度。位置即责任分配，移动样本点就是在重画判断的分布（T1 上层）。

The engineering form of M.05 and Pillar 06: “which tasks must be human” shifts from intuition to coordinates. The horizontal axis is the cost of reversal; the vertical axis is reputation, values, and legal exposure. The top-right quadrant demands human judgment before execution (these are precisely the three non-negotiable judgment types from Pillar 06); the bottom-left quadrant lets Agents execute by default; the remaining two quadrants buy parallelism with monitoring and policy gates. Position equals responsibility allocation; moving a sample point redraws the judgment distribution (T1 upper tier).
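The quadrant logic of the map can be written down as a small routing function. This is a sketch under stated assumptions: both axes are normalized to scores in [0, 1], the 0.5 cut-off is arbitrary, and the two mixed quadrants are folded into one label because the text does not say which of them gets monitoring and which gets a policy gate. The name `route` and the returned labels are hypothetical.

```python
def route(reversal_cost: float, exposure: float, threshold: float = 0.5) -> str:
    """Place a task on the judgment anchor map.

    reversal_cost: horizontal axis, how expensive the action is to undo.
    exposure: vertical axis, reputational, values, and legal exposure.
    """
    for score in (reversal_cost, exposure):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be in [0, 1]")

    high_cost = reversal_cost >= threshold
    high_exposure = exposure >= threshold

    if high_cost and high_exposure:
        return "human_judges_first"       # top-right quadrant
    if not high_cost and not high_exposure:
        return "agent_default"            # bottom-left quadrant
    # Remaining two quadrants: the agent runs behind monitoring or a policy gate.
    return "agent_with_monitoring_or_policy_gate"
```

Moving a task's scores across the threshold changes who is responsible for it, which is the sense in which position on the map equals responsibility allocation.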

M.06 / SYNTHESIS · META MODEL how, not what

操作者即编排者

Operator as Orchestrator

前面几个心智模型描述什么——AI Native 组织看世界的角度。最后一个心智模型描述怎么做——操作者在这个组织里实际扮演什么角色。传统组织的操作者是 individual contributor——写代码、做产品、管理人、跑流程。AI Native 组织的操作者是 orchestrator——注意力上移：从执行到引导、从生产到判断、从单点工作到系统设计。

The first five mental models describe what: the angles from which an AI Native organization sees the world. The sixth describes how: what role the operator actually plays in that organization. The operator in a traditional organization is an individual contributor: writing code, building product, managing people, running processes. The operator in an AI Native organization is an orchestrator, whose attention moves up: from execution to guidance, from production to judgment, from point work to system design.

这个角色转换需要一组新的核心技能。上下文工程——让 Agent 持续对齐你的组织而不是漂移。Prompt 与 Skills 设计——把判断标准外显化为 Agent 可执行的指令。Evaluation 框架——让你看见 Agent 表现而不是猜测。判断节点设计——决定工作流的哪些步骤必须人介入、哪些可以放手。这些技能不再是工程师的专属，它们是 AI Native 组织里每个 operator 的必修课——产品经理、销售、运营、HR、财务，全员适用。

This role shift demands a new set of core skills. Context engineering: keeping Agents continuously aligned to your organization rather than drifting. Prompt and Skills design: externalizing judgment standards into Agent-executable instructions. Evaluation frameworks: letting you see Agent performance rather than guessing at it. Judgment node design: deciding which workflow steps must have human intervention and which can be released. These skills are no longer the exclusive domain of engineers; they are required study for every operator in an AI Native organization. Product managers, sales, operations, HR, finance: everyone.

这个模型把前五个模型激活——工作流图（M.01）要有人去画；Agent 作为默认工种（M.02）要有人去配置；上下文（M.03）要有人去工程化；持续学习（M.04）要有人去设计反馈循环；判断锚点（M.05）要有人去定位。没有 orchestrator 的 AI Native 是空架构，没有 AI Native 架构的 orchestrator 是疲惫的杂工。两者互为前提。

This model activates the previous five: someone must draw the workflow graph (M.01); someone must configure Agents as the default role (M.02); someone must engineer the context (M.03); someone must design the feedback loops for continuous learning (M.04); someone must locate the judgment anchors (M.05). AI Native without an orchestrator is empty architecture; an orchestrator without AI Native architecture is an exhausted odd-job worker. Each is the other’s prerequisite.

六个模型到此立全。它们是这套推导的起点公理——但是被选来当起点的公理，不是被证明的定理：立的是一种看世界的方式，还没经过时间的反驳，也还没回答一个组织如何在时间里活下去。下一节换上生命系统的视角，看这套假设怎样自我维持、自我进化，也看它在什么条件下会散架。

The six models are now complete. They are the starting axioms of this derivation, but axioms chosen as a starting point, not proven theorems: they establish a way of seeing the world that has not yet been tested against time and does not yet answer how an organization stays alive over time. The next section switches to the living-system view to watch how these assumptions sustain and evolve themselves, and under what conditions they fall apart.

SECTION

LIVING SYSTEM · 生命系统

框架 · 生物学透镜（类比，非实证）

Framework · Biological Lens (Analogy, Not Empirical)

组织作为生命系统

Organization as a Living System

机器会停摆，生命会适应。组织该照哪一个来设计？这套生物学逻辑，也标出它在什么条件下不成立。

Machines break down; living systems adapt. Which should an organization be designed like? This chapter lays out the biological logic and marks the conditions under which it does not hold.

一句话 · In one line

把组织当机器会僵硬，当生命系统才自修复、自进化，靠五条生物学原理各对应一个设计动作。但这是类比而非实证，每条都附失效条件，一旦系统性失灵，本章应最先被改写。

Design an organization like a machine and it goes brittle; like a living system and it self-repairs and self-evolves, via five biological principles each mapped to a design action. But these are analogies, not evidence: each carries a failure condition, and this chapter should be the first rewritten if one systematically breaks.

本章性质 · 类比

以下生物学映射是 Ⅲ 级理论移植（类比与模型，非组织实证）。每条给出可证伪条件——生物类比一旦在某场景下系统性失效，本章应最先被改写，不享豁免。它不在从世界观（第 5 节）到支柱（第 7 节）的推导主线上——跳过本章，那条链条依然完整；放它在这里，是给同一套逻辑一个直觉图像，不是因为推导需要它。

Chapter Nature · Analogy

The biological mappings below are Level III theoretical transplants (analogy and model, not organizational evidence). Each carries a falsifiability condition: if a biological analogy systematically fails in any scenario, this chapter should be the first to be rewritten; it enjoys no immunity. It does not sit on the derivation spine that runs from the worldviews (Section 5) to the pillars (Section 7): skip this chapter and that chain stays intact. It is here to give the same logic an intuitive image, not because the derivation needs it.

五条生物学原理，每条对应一个已被正典使用的设计动作：

Five biological principles, each mapped to a design action already used in the canon:

L.01

涌现 · Emergence

Emergence · CAS & Stigmergy

Holland 的复杂适应系统：大量简单单元按局部规则交互，全局秩序自下而上涌现，无需中央设计者。蚁群不开会——它们通过 stigmergy（在共享环境里留痕、读痕；Grassé 1959 提出，Heylighen 2016 给出现代综述）间接协调。对应支柱 02/03：agent 读写共享上下文，而非互相抛接文档。可证伪：若高一致性需求场景下，涌现式自组织系统性劣于显式编排，则此映射受限。

Holland’s complex adaptive systems: large numbers of simple units interact under local rules, and global order emerges bottom-up with no central designer. Ant colonies hold no meetings; they coordinate indirectly through stigmergy (leaving and reading traces in a shared environment; coined by Grassé 1959, synthesized in Heylighen 2016). Maps to Pillars 02/03: agents read and write a shared context store rather than passing documents to one another. Falsifiability: if emergence-based self-organization is systematically inferior to explicit orchestration in high-coherence-demand scenarios, this mapping is constrained.

L.02

适应度景观 · Fitness Landscape

Fitness Landscape · NK Model (Kauffman)

Kauffman 的 NK 模型：组织在崎岖景观上爬坡，探索（找新峰）与利用（爬当前峰）需动态平衡。self-improving 的本质就是持续的局部爬坡 + 偶发的跳跃探索。对应失败模式"演化失败"——锁死在局部最优。可证伪：若组织绩效与"探索-利用平衡度"无可测相关，则模型不解释现实。

Kauffman’s NK model: organizations climb rugged landscapes where exploration (finding new peaks) and exploitation (ascending the current peak) must be dynamically balanced. The essence of self-improving is continuous local hill-climbing plus occasional leap-exploration. Maps to the failure mode “evolutionary stagnation”: getting locked into a local optimum. Falsifiability: if organizational performance shows no measurable correlation with explore-exploit balance, the model does not explain reality.
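For readers who want to see the lock-in effect rather than take it on faith, here is a small sketch of an NK landscape with greedy local hill-climbing. It assumes binary genomes, randomly chosen interaction partners, and uniform random contribution tables; `make_nk` and `hill_climb` are illustrative names, and nothing here claims to model a real organization.

```python
import random
from itertools import product


def make_nk(n, k, seed=0):
    """Build a fitness function for an NK landscape with n loci and k interactions."""
    rng = random.Random(seed)
    # Each locus depends on itself plus k other randomly chosen loci.
    neighbors = [
        [i] + rng.sample([j for j in range(n) if j != i], k) for i in range(n)
    ]
    # One random contribution per locus for every state of its k + 1 inputs.
    tables = [
        {bits: rng.random() for bits in product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genome):
        total = 0.0
        for i in range(n):
            key = tuple(genome[j] for j in neighbors[i])
            total += tables[i][key]
        return total / n

    return fitness


def hill_climb(fitness, genome):
    """Take the best single-bit flip until none improves: a local optimum."""
    genome = list(genome)
    while True:
        best, best_f = None, fitness(genome)
        for i in range(len(genome)):
            trial = genome[:]
            trial[i] ^= 1
            f = fitness(trial)
            if f > best_f:
                best, best_f = trial, f
        if best is None:
            return genome, best_f
        genome = best
```

Running `hill_climb` from several random starting genomes on the same landscape typically ends on different peaks as `k` grows, which is the "locked into a local optimum" failure mode; an occasional multi-bit jump is the exploration step the text contrasts it with.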

L.03

免疫系统 · Distributed Defense

Immune System · Distributed Defense

免疫系统是分布式异常检测——没有中央哨兵，每个局部都能识别并响应异常。对应支柱 05 可观测性与 guardrails：遥测+护栏=组织的免疫细胞，在边缘就地拦截幻觉、越权、数据泄露。举个具体形态：一个 agent 越权去读不该碰的客户数据表，策略门当场拦下并上报，而不是等季度审计才翻出来。可证伪：若集中式审计在等同成本下检出率显著高于分布式，则类比失效。

The immune system is distributed anomaly detection: no central sentinel; every local node can identify and respond to aberrations. Maps to Pillar 05 observability and guardrails: telemetry + guardrails = the organization’s immune cells, intercepting hallucinations, privilege escalation, and data leakage at the edge. A concrete form: an agent reaches for a customer-data table it has no business touching, and the policy gate stops and reports it on the spot, rather than a quarterly audit surfacing it months later. Falsifiability: if centralized auditing achieves a significantly higher detection rate at equivalent cost, the analogy fails.
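The customer-data example can be sketched as a gate that sits in front of every read, blocks out-of-scope access, and records the incident immediately. This is a toy illustration: `PolicyGate`, its allow-list format, and the `fetch` callback are assumptions made for the example, not an existing library.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyGate:
    """Edge-side guardrail: check each read against a per-agent allow-list."""
    allowed: dict                       # agent name -> set of readable tables
    incidents: list = field(default_factory=list)

    def read(self, agent, table, fetch):
        """Run fetch(table) only if the agent is allowed to read that table."""
        if table not in self.allowed.get(agent, set()):
            # Report on the spot instead of waiting for a periodic audit.
            self.incidents.append((agent, table))
            raise PermissionError(f"{agent} may not read {table}")
        return fetch(table)
```

The incident list is the part that maps to telemetry: each blocked attempt becomes a signal available right away, rather than something an audit reconstructs later.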

L.04

菌丝网络 · Resource Reallocation

Mycelium Network · Resource Reallocation (Tero 2010)

菌丝/黏菌按局部信号动态重分配资源到高回报路径，无中央调度（黏菌构建高效运输网络已有 Tero 2010 Science 实证，但映射到组织资源调度仍属 Ⅲ 级类比）。对应工作流图的动态扇出与算力/注意力的按需流动——资源跟着判断走，不跟着科层走。可证伪：若动态重分配的协调开销在规模上超过其收益，则退化为需要调度层。

Mycelium and slime mold dynamically reallocate resources to high-return paths according to local signals, with no central dispatcher (slime mold’s ability to build efficient transport networks is empirically demonstrated in Tero et al., Science 2010, though the mapping to organizational resource scheduling remains a Level III analogy). Maps to the workflow graph’s dynamic fan-out and the on-demand flow of compute and attention: resources follow judgment, not hierarchy. Falsifiability: if the coordination overhead of dynamic reallocation exceeds its benefits at scale, the design degrades into one that requires a scheduling layer.

L.05

自我进化 · Self-Improving

Self-Evolving · Self-Improving Loop (Argyris-Schön / OODA)

生命的标志是自我改进的闭环：感知→响应→把结果喂回改进自身。组织级实现=遥测 → eval → 自动改进工作流本身，区别于人类组织的周期性干预（年度复盘）。这把 M.04 持续学习从口号变成机制：每一次执行都是一次适应度采样。举个具体形态：客服工作流上周把一批退款判错了，遥测记下偏差、eval 标出模式，系统据此改写工作流里的判断阈值，下周同类退款不再需要人工兜底——不必等年度复盘。Argyris-Schön 的双环学习（1978）、Boyd 的 OODA 循环（见 Osinga 2007 的体系化重构）是其人类尺度前身。可证伪：若无人监督的自动改进闭环在实践中系统性引入 reward hacking 而不可治理，则 self-improving 需重新加入人类锚（接支柱 07/05）。

The hallmark of life is a self-improving closed loop: sense → respond → feed results back to improve the system itself. The organizational implementation is telemetry → eval → automatically improving the workflow itself, in contrast to the periodic interventions of human organizations (the annual retrospective). This transforms M.04 continuous learning from slogan into mechanism: every execution is a fitness sample. A concrete form: last week a support workflow misjudged a batch of refunds; telemetry logs the deviations, eval flags the pattern, and the system rewrites the judgment threshold inside the workflow, so next week’s similar refunds no longer need a human to catch them, with no annual retrospective required. Argyris-Schön’s double-loop learning (1978) and Boyd’s OODA loop (see Osinga 2007 for the systematic reconstruction) are its human-scale predecessors. Falsifiability: if unsupervised self-improving loops systematically introduce ungovernable reward hacking in practice, then self-improving must reintroduce a human anchor (connecting to Pillars 07/05).
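One iteration of the telemetry → eval → improve loop from the refund example might look like the sketch below. Everything in it is assumed for illustration: the record fields, the "approve if score >= threshold" rule, and the candidate thresholds are hypothetical, and the reviewed outcomes stand in for whatever eval the organization actually trusts.

```python
def misjudged(records, threshold):
    """Count cases where 'approve if score >= threshold' disagrees with the reviewed outcome."""
    return sum((r["score"] >= threshold) != r["should_approve"] for r in records)


def improve_threshold(records, current, candidates):
    """Keep the current threshold unless a candidate strictly reduces misjudgments."""
    best = min(candidates, key=lambda t: misjudged(records, t))
    return best if misjudged(records, best) < misjudged(records, current) else current


# Telemetry from last week's refunds (hypothetical data).
last_week = [
    {"score": 0.62, "should_approve": False},
    {"score": 0.66, "should_approve": False},
    {"score": 0.81, "should_approve": True},
    {"score": 0.90, "should_approve": True},
    {"score": 0.93, "should_approve": True},
]
new_threshold = improve_threshold(last_week, current=0.60, candidates=[0.5, 0.6, 0.7, 0.8, 0.9])
```

The loop only ever optimizes against the eval it is given, which is where the reward-hacking caveat bites: if the reviewed outcomes are wrong or gameable, the threshold drifts with them.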

统一论断 · 同一套生命逻辑贯穿整条光谱

Unifying Thesis · One Living Logic Across the Entire Spectrum

生命系统逻辑不分大小：N=1 的一人公司是单细胞高密度判断体，一个判断核 + 一座上下文库，靠滚动实验自我迭代（见第 14 节的"同心节奏"）；N=众多的 agent 网络是多细胞涌现体，局部规则下秩序自组织。两端不搞两套方法论，只是同一套生命系统在不同细胞数下的表现。这正是这套方法论把一人公司收进同一体系、而非另立一卷的根本原因：规模是细胞数的选择，连贯性是生命的本征。而无论细胞数取一还是取众，骨架都是同一副——下一节（第 7 节）逐根立起这些支柱。

Living-system logic is scale-invariant: the N=1 one-person company is a single-cell, high-judgment-density entity, one judgment core plus one context store, self-iterating through rolling experiments (see Section 14, “Concentric Rhythms”); the N=many agent network is a multi-cell emergent body, where order self-organizes under local rules. The two ends are the same living system expressed at different cell counts, not two different methodologies. This is precisely why the atlas folds the one-person company into a single framework rather than giving it a separate volume: scale is a choice of cell count; coherence is the intrinsic property of life. And whichever cell count is chosen, the skeleton is the same: the next section, Section 7, raises its architectural pillars one by one.

核心图 · FIG. 6.0 / LIVING SYSTEM · 从单细胞到多细胞的同一逻辑

Key Figure · FIG. 6.0 / LIVING SYSTEM · The Same Logic from Single-Cell to Multi-Cell

看懂：左端一人公司（单细胞）与右端 agent 网络（多细胞）共享涌现／自我进化的同一套箭头

How to read: the one-person company (single-cell) on the left and the agent network (multi-cell) on the right share the same set of emergence / self-evolving arrows

SECTION

PILLARS · 架构支柱

框架 · 工程承诺

Framework · Engineering Commitments

方法论的骨架

The Skeleton of the Methodology

每根支柱先用一行划清"它不是什么"：歧义是这类术语最大的敌人。

Each pillar opens with one line establishing what it is not; ambiguity is the greatest enemy of terms like these.

一句话

In one line

这些支柱是相互依存的工程承诺，不是可挑着用的最佳实践：必须一起落在模型、agent、上下文、可观测性这套底座上，缺一根就悬空。

These pillars are interdependent engineering commitments, not a menu of best practices to pick from: they must stand together on the model, agent, context, and observability substrate, and without that substrate any one of them hangs in the air.

架构支柱与运营底座总成图：AI Native 组织由七个工程承诺支撑，并落在模型、agent、上下文、可观测性四层基础设施上。

Architecture assembly plate showing architectural pillars resting on four infrastructure layers.

GENERATED PLATE 07 · 支柱总成图：这些支柱是承重构件，不是一份清单。拿掉模型、agent、上下文或可观测性任一层运营底座，任何一根支柱都会悬空。

Architecture-assembly plate: the architectural pillars are load-bearing members, not a checklist. Without the model, agent, context, and observability substrate, every pillar hangs in the air.

01 PILLAR.01

AI 优先即默认

AI-First as Default

设计起点反转：先设计"这件事由 agent 端到端完成"的理想版本，再倒推人必须出现的位置——而不是从现有岗位出发，问 AI 能帮上什么忙。（承 M.02·M.05）

The design starting point is inverted: first design the ideal version in which an agent handles the task end-to-end, then work backward to where humans must appear, rather than starting from existing roles and asking where AI can help. (from M.02·M.05)

≠ 给每个员工配 AI 工具 · 把"AI 助手"嵌进旧流程

≠ equipping every employee with AI tools · embedding an “AI assistant” into existing processes

每一次工作流设计都从一个问题开始——如果这件事必须由 AI Agent 端到端完成，我们会怎么设计它？这是实际的设计起点，而非思想实验。只有在通过这个设计之后，你才问"哪里会断？人的判断必须插入到哪里？"

Every workflow design starts from one question: if this task had to be completed end-to-end by an AI Agent, how would we design it? This is the actual design starting point, not a thought experiment. Only after working through that design do you ask: “Where will it break? Where must human judgment be inserted?”

这反转了传统设计序列。传统设计从现有人类角色出发，问 AI 能在哪里帮忙。AI Native 设计从完全 agentic 的理想出发，问人必须在哪里介入。组织的设计压力把人类推向真正只有人能做的领域——判断、关系、品味、责任。

This inverts the traditional design sequence. Traditional design starts from existing human roles and asks where AI can assist. AI Native design starts from the fully agentic ideal and asks where humans must intervene. The organization’s design pressure pushes humans toward the domains only humans can occupy: judgment, relationships, taste, accountability.

把推导写全 · 从 M.02·M.05 走到这根支柱（每步可单独质疑）

The derivation in full · from M.02·M.05 to this pillar (each step separately challengeable)

"承 M.02·M.05"不该只是一枚标签。这里把它一步步写出来——每一步都单独摆出可以被攻击的地方。其余支柱的"承 M.0x"是同型的链条，此处只把其中一条写全，做样板。

“From M.02·M.05” should not stay a mere label. Here it is written out step by step: each step exposing, on its own, where it can be attacked. The other pillars’ “from M.0x” tags are chains of the same type; only this one is spelled out in full, as a worked template.

前提（承第 3 节）：执行趋于免费，判断成为稀缺资源；而一个组织该围着什么设计，取决于它真正稀缺的是什么。〔可质疑：若某领域的执行仍然昂贵，这条前提在那里就不成立。〕
Premise (from Section 3): execution trends toward free, so judgment becomes the scarce resource; and what an organization should be designed around is set by what is genuinely scarce in it. [Challengeable: where execution is still expensive, this premise does not hold.]
由 M.02（Agent 即默认工种）：任何能力需求的默认承担者是"工作流 + agent"，招人只为增加判断。于是每条工作流都存在一个"最大 agent 化"的版本，人的介入成了需要被证成的例外，而非默认起点。〔可质疑：若某类工作根本没有可行的 agent 基线，这一步落空。〕
From M.02 (Agent as the default role): the default bearer of any capability need is “workflow + agent,” and you hire only to add judgment. So every workflow has a “maximally agentic” version, and human involvement becomes an exception that must be justified rather than the default starting point. [Challengeable: for work with no viable agent baseline at all, this step fails.]
由 M.05（人只守少数显式判断节点，按撤销成本 × 暴露度定位）：人该站哪，取决于"哪里会造成不可逆或高暴露的后果"——而这只有先有了完全 agent 化的版本、看它在哪里崩，才定位得出来。〔可质疑：若你能先验断定人必须在哪，就不需要先画 agent 版本。〕
From M.05 (humans hold only a few explicit judgment nodes, placed by reversal-cost × exposure): where a human belongs is set by where an irreversible or high-exposure consequence sits, and that can only be located once you have the fully agentic version and see where it breaks. [Challengeable: if you can settle a priori where humans must be, you need not draw the agent version first.]
合第 2、3 步：设计必须从"完全 agent 化的理想版本"起步，再倒推人必须出现的节点。反过来（从现有岗位出发问"AI 能帮什么"）会在检验人是否必要之前就把人钉死，与 M.05 相矛盾。这一步的结论，正是支柱 01：AI 优先即默认。〔可质疑：若倒推出的人节点恰与正向设计一致，起点顺序就无关紧要——但经验上两者常不一致。〕
Combine steps 2 and 3: design must start from the fully agentic ideal and work backward to the nodes where humans must appear. The reverse (starting from existing roles and asking “where can AI help”) pins humans in place before testing whether they are necessary, contradicting M.05. The conclusion of this step is exactly Pillar 01: AI-first as default. [Challengeable: if the backward-derived human nodes happen to match a forward design, starting order is moot, but empirically the two rarely match.]
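Step 3 of the chain (placing humans by reversal-cost × exposure, after drafting the fully agentic version) can be sketched as a scoring pass over the draft. The 0 to 1 scales, the cutoff of 0.25, and the node names are assumptions made for illustration; the source does not prescribe any particular scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    reversal_cost: float  # 0 = trivially undone, 1 = irreversible (hypothetical scale)
    exposure: float       # 0 = internal only, 1 = fully public (hypothetical scale)


def human_nodes(workflow, cutoff=0.25):
    """Start from the fully agentic draft; keep a human only where the score crosses the cutoff."""
    return [n.name for n in workflow if n.reversal_cost * n.exposure >= cutoff]


# Fully agentic draft of a refund flow (hypothetical values).
refund_flow = [
    Node("classify_ticket", 0.1, 0.1),
    Node("draft_reply", 0.2, 0.6),
    Node("issue_refund", 0.8, 0.7),
    Node("update_policy_page", 0.6, 1.0),
]
anchors = human_nodes(refund_flow)
```

Here the human nodes fall out of the draft rather than being fixed in advance, which is the ordering the derivation argues for.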

诚实的边界：这根支柱的强度，被"该领域有多可 agent 化"这一变量封顶。在完全无法 agent 化的工作里（第 2 步落空），它退化为"能 agent 化的部分先 agent 化，其余留给人"——仍然成立，但不再是强主张。它是从世界观推出的结论，不是公理；上面每一步都摆着可以被单独攻击的地方。

The honest bound: this pillar’s strength is capped by one variable: how agent-executable the domain is. For work that cannot be made agentic at all (step 2 fails), it degrades to “make agentic what can be, leave the rest to humans,” still true, but no longer a strong claim. It is a conclusion derived from the worldviews, not an axiom; every step above leaves an exposed point to attack.

SPEC

Inversion: Human-first → AI-first
Pressure: Pushes humans up the stack
Failure: "AI as helper"

02 PILLAR.02

工作流即代码

Workflow as Code

流程的唯一真相，从人脑、群聊与 PPT 搬进一份可执行、可版本化、可回滚的声明——改流程是一次代码提交，立即生效；不是一份通知，等人慢慢习惯。（承 M.01）

The single source of truth for any process moves out of human memory, group chats, and slide decks and into an executable, versionable, rollback-capable declaration: changing a process is a code commit, effective immediately, not a memo waiting for people to adapt. (from M.01)

≠ CI/CD 流水线 · OA 审批电子化 · RPA 脚本：它们自动化"某一步"，这里声明的是"整张图"

≠ CI/CD pipelines · digitized OA approvals · RPA scripts: those automate a single step; this declares the entire graph

在 AI Native 组织里，工作流不被描述在 PowerPoint 里，不靠记忆运行，不由部落知识维护。它们被规约在代码或机器可执行的结构化定义中：可被版本化、被分支、被观察、被持续优化。

In an AI Native organization, workflows are not described in PowerPoint, do not run on memory, and are not maintained by tribal knowledge. They are defined in code or machine-executable structured definitions: versionable, branchable, observable, and continuously improvable.

这听起来像技术细节，实际上是最重要的架构决定。当工作流是代码时，它们可以被测试、被观察、被调试、被优化；当它们不是代码时，它们困在人脑中，产生折磨传统组织的慢性流程漂移。纪律是：永远不要让一个重要的工作流只存在于某个人的头脑里。

This sounds like a technical detail; it is actually the most important architectural decision. When workflows are code, they can be tested, observed, debugged, and optimized; when they are not code, they are trapped in human minds and produce the chronic process drift that plagues traditional organizations. The discipline is: never let an important workflow exist only inside someone’s head.
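A minimal sketch of what "the whole graph as a declaration" could mean in practice follows. The `Step` and `Workflow` types and the refund flow are hypothetical; engines such as those listed in the SPEC below have their own definitions, which this does not attempt to reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    actor: str                      # "agent" or "human"
    next: tuple[str, ...] = ()      # names of the steps this one can hand off to


@dataclass(frozen=True)
class Workflow:
    name: str
    version: str
    steps: tuple[Step, ...]

    def validate(self) -> list[str]:
        """Return edges that point at undeclared steps (empty list means the graph is closed)."""
        names = {s.name for s in self.steps}
        return [f"{s.name}->{t}" for s in self.steps for t in s.next if t not in names]


refund_v2 = Workflow(
    name="refund",
    version="2.0.0",
    steps=(
        Step("intake", "agent", ("assess",)),
        Step("assess", "agent", ("approve", "escalate")),
        Step("approve", "agent"),
        Step("escalate", "human"),
    ),
)
problems = refund_v2.validate()
```

Because the declaration is plain data under version control, a process change becomes a diff that can be reviewed, tested with `validate()`, and rolled back.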

SPEC

Stack: Temporal · n8n · LangGraph · Inngest
Property: Versionable, observable
Failure: Tribal knowledge drift

03 PILLAR.03

上下文工程作为系统实践

Context Engineering as Systematic Practice

组织知识的默认读者从"下一个接手的人"换成"下一个执行的 agent"——每个会议、决策与客户互动都即时沉淀为机器可直接消费的结构化背景。检验只有一条：判断之前还需要"问人"，就是失败。（承 M.03）

The default reader of organizational knowledge shifts from “the next person who takes over” to “the next agent to execute.” Every meeting, decision, and customer interaction is immediately crystallized into structured context that machines can consume directly. There is one test: if a judgment still requires “asking a person,” the system has failed. (from M.03)

≠ 知识库 / Wiki（为人而写、靠自觉维护、必然腐烂） · 把文件倒进向量库

≠ knowledge bases / wikis (written for humans, maintained by goodwill, inevitably rotting) · dumping documents into a vector store

Tobi Lütke 在 Shopify 把上下文工程从一种 ad-hoc 技能升格为系统实践。组织主动构建 Agent 运行的信息环境。所有内部文档为 Agent 检索而结构化；维护活的上下文存储；同时为人和 Agent 写作。

Tobi Lütke at Shopify elevated context engineering from an ad-hoc skill to a systematic practice. The organization actively constructs the information environment in which agents operate. All internal documentation is structured for agent retrieval; a living context store is maintained; writing serves both humans and agents simultaneously.

最深的原则是——组织采取的每个动作都应该产生结构化的上下文作为副产品。会议产生 Agent 可检索的总结。决策被记录连同决策理由。客户互动被捕获。日积月累，上下文存储成为组织最有价值的资产——是让你的 Agent 在用同样底层模型的情况下，质量上明显优于竞争对手 Agent 的底层基质。这是 AI 时代的真正护城河。

The deepest principle is this: every action the organization takes should produce structured context as a byproduct. Meetings generate agent-retrievable summaries. Decisions are recorded together with their rationale. Customer interactions are captured. Over time, the context store becomes the organization’s most valuable asset: the underlying substrate that allows your agents to outperform competitors’ agents in quality even when running on the same base models. This is the true moat of the AI era.
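"Structured context as a byproduct" can be illustrated with a decision record that carries its rationale and is retrievable by an agent. This is a toy in-memory sketch: `DecisionRecord`, `ContextStore`, and the field names are hypothetical, and a production store would add persistence, access control, and retrieval beyond tag matching.

```python
from dataclasses import dataclass, asdict, field


@dataclass
class DecisionRecord:
    decided_on: str                 # ISO date string
    decision: str
    rationale: str                  # recorded together with the decision
    owner: str
    tags: list[str] = field(default_factory=list)


class ContextStore:
    """In-memory stand-in for a living context store (hypothetical)."""

    def __init__(self):
        self._records: list[DecisionRecord] = []

    def record(self, rec: DecisionRecord) -> None:
        self._records.append(rec)

    def retrieve(self, tag: str) -> list[dict]:
        """Return matching records as plain dicts an agent can consume directly."""
        return [asdict(r) for r in self._records if tag in r.tags]


store = ContextStore()
store.record(DecisionRecord(
    decided_on="2025-04-07",
    decision="Refunds under $50 are auto-approved",
    rationale="Manual review cost exceeded the average refund value",
    owner="support-lead",
    tags=["refund", "policy"],
))
hits = store.retrieve("refund")
```

The test from the pillar applies directly: if an agent handling a refund still has to ask a person why the $50 rule exists, the record was not captured.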

SPEC

Stack: Pinecone · Weaviate · Glean · Notion AI
Property: Compounding moat
Failure: Context starvation

04 PILLAR.04

多模型架构

Multi-Model Architecture

把模型当云厂商对待：统一抽象层加自有评估集，让"换模型"的成本是一次回归测试，而不是一次重构——租来的智能不能变成别人定价的人质。（承 M.03）

Treat models like cloud vendors: a unified abstraction layer plus a proprietary evaluation suite makes “switching models” a regression test, not a rewrite. Rented intelligence must never become a hostage to someone else’s pricing. (from M.03)

≠ 多签几家 API 当备份 · 哪家便宜用哪家

≠ signing with multiple APIs as backup · using whichever provider is cheapest at the moment

AI Native 设计中最深的单一风险，是算法封建主义（algorithmic feudalism）——把业务深度依赖于一家基础模型供应商，让供应商实际上变成你的地主。

The deepest single risk in AI Native design is algorithmic feudalism: deeply coupling the business to a single foundation-model provider, effectively making that provider your landlord.

架构上的防御是多模型。关键工作流应该被设计成可在数日内切换底层模型——配合质量回归测试。这要求工作流代码与具体模型 API 之间有抽象层；质量评估基础设施可以针对多个模型测试同一工作流；和至少两家供应商保持持续关系。开源权重模型应该被评估，用于可自托管的关键工作流——即使你大多数时候用商用 API，开源模型的可选性本身是战略资产。

The architectural defense is multi-model design. Critical workflows should be built to switch underlying models within days, supported by quality regression testing. This requires an abstraction layer between workflow code and specific model APIs; quality-evaluation infrastructure that can test the same workflow against multiple models; and ongoing relationships with at least two providers. Open-weight models should be evaluated for critical workflows that can be self-hosted. Even if you mostly use commercial APIs, the optionality of open-weight models is itself a strategic asset.
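The two ingredients named above, an abstraction layer and an owned evaluation set, can be sketched as follows. The stub classes stand in for vendor clients and make no real API calls; `Model`, `pass_rate`, `can_switch`, and the two-item eval set are all hypothetical.

```python
from typing import Protocol


class Model(Protocol):
    """The only surface workflow code is allowed to depend on."""
    def complete(self, prompt: str) -> str: ...


class StubModelA:
    def complete(self, prompt: str) -> str:
        return "REFUND" if "refund" in prompt.lower() else "OTHER"


class StubModelB:
    def complete(self, prompt: str) -> str:
        return "OTHER"


# Proprietary evaluation set: (prompt, expected label) pairs owned by the organization.
EVAL_SET = [
    ("Customer asks for a refund", "REFUND"),
    ("Customer asks for opening hours", "OTHER"),
]


def pass_rate(model: Model, eval_set) -> float:
    return sum(model.complete(p) == expected for p, expected in eval_set) / len(eval_set)


def can_switch(candidate: Model, baseline: Model, eval_set, tolerance: float = 0.0) -> bool:
    """Switching is a regression test: the candidate must not score below the baseline."""
    return pass_rate(candidate, eval_set) >= pass_rate(baseline, eval_set) - tolerance


switch_to_b = can_switch(StubModelB(), StubModelA(), EVAL_SET)
switch_to_a = can_switch(StubModelA(), StubModelB(), EVAL_SET)
```

With this shape, the cost of changing providers is bounded by how quickly the eval set runs, not by how much workflow code references a vendor SDK.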

SPEC

Stack: OpenAI + Anthropic + Llama/Qwen
Property: Optionality, sovereignty
Failure: Provider hostage

05 PILLAR.05

可观测性先于规模

Observability Before Scale

先有眼睛，再有手：任何 agent 没有全量遥测（输入 · 输出 · 成本 · 轨迹）就不进生产。扩张的许可来自可观测性的覆盖率，不来自业务的紧迫度。（承 M.04）

Eyes before hands: no agent ships to production without full telemetry (inputs · outputs · cost · trace). Permission to scale comes from observability coverage, not from business urgency. (from M.04)

≠ 出了事再补监控 · 只盯用量账单

≠ retrofitting monitoring after something breaks · watching only the usage bill

NANDA 报告对那批无成效试点的自家归因是"学习缺口"——工具不持有记忆、不积累上下文、不随使用变好；报告还有一个常被引用者略去的反向发现：外购方案的落地成功率约为自建的两倍。本支柱取其上游含义：无论买还是建，组织若没有观察、评估、改进 AI 行为的基础设施，部署就无从学习——他们在能看见之前就开始扩规模。

The NANDA report attributed those no-impact pilots to a “learning gap”: tools that hold no memory, accumulate no context, and do not improve with use. The report also contains a finding that most citations omit: purchased solutions succeed at roughly twice the rate of self-built ones. This pillar takes the upstream implication: whether you buy or build, an organization without infrastructure to observe, evaluate, and improve AI behavior cannot learn from deployment. Such organizations scale before they can see.

方法论要求反过来。任何 Agent 工作流上线前，可观测性层必须存在：每个 Agent 行动被记录；每次模型调用被追踪；输出被采样以做质量评估；失败被路由到人类审查。在 AI Native 组织里，可观测性之于运营，等同于会计之于财务。你不会不记账就运营公司；你也不应该不可观测就运行 AI Native 工作流。这是基础设施下限，而非工程奢侈品。

The methodology demands the opposite. Before any agent workflow goes live, the observability layer must exist: every agent action is logged; every model call is traced; outputs are sampled for quality evaluation; failures are routed to human review. In an AI Native organization, observability is to operations what accounting is to finance. You would not run a company without bookkeeping; you should not run AI Native workflows without observability. This is the infrastructure floor, not an engineering luxury.
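The four requirements (log every action, trace every call, sample outputs for evaluation, route failures to human review) can be sketched as a wrapper around an agent function. This is an assumption-laden toy: the in-memory lists, the `observed` decorator, and the refund agent are hypothetical, wall-clock time stands in for cost, and a real deployment would use a tracing backend instead.

```python
import functools
import random
import time

TRACE, REVIEW_QUEUE, EVAL_SAMPLES = [], [], []  # in-memory stand-ins for real sinks


def observed(agent_name, sample_rate=1.0, rng=random.random):
    """Wrap an agent action so that no call runs without leaving a record."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(payload):
            start = time.perf_counter()
            entry = {"agent": agent_name, "input": payload}
            try:
                entry["output"] = fn(payload)
                entry["status"] = "ok"
                if rng() < sample_rate:          # sample outputs for quality evaluation
                    EVAL_SAMPLES.append(entry)
                return entry["output"]
            except Exception as exc:
                entry["status"] = "failed"
                entry["error"] = repr(exc)
                REVIEW_QUEUE.append(entry)       # failures are routed to human review
                return None
            finally:
                entry["seconds"] = time.perf_counter() - start
                TRACE.append(entry)              # every action is logged
        return inner
    return wrap


@observed("refund-agent")
def decide(payload):
    if "amount" not in payload:
        raise ValueError("missing amount")
    return "approve" if payload["amount"] <= 50 else "escalate"


first = decide({"amount": 20})
second = decide({})
```

The sketch swallows the exception after queuing it for review; whether to re-raise is a design choice that depends on how the surrounding workflow handles failed steps.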

SPEC

Stack: LangSmith · Helicone · Arize · Weave
Property: Pre-scale infrastructure
Failure: Blind scaling

06 PILLAR.06

人作为判断与责任锚

Humans as Judgment & Responsibility Anchors

人不是每一步的审批点，而是三类判断的显式节点——不可逆的、承载声誉的、决定方向的；其余默认放行。锚越少越清楚，组织越快越稳。（承 M.05）

Humans are not approval gates at every step but explicit nodes for three categories of judgment: irreversible decisions, reputation-bearing decisions, and values-bearing decisions; everything else passes through by default. Fewer anchors mean greater clarity; a clearer organization moves faster and more reliably. (from M.05)

≠ 全流程人工复核 · 事后找人承担责任

≠ human review at every step · finding someone to blame after the fact

AI Native 不追求"无人"或"最少人"，而是把人定位在工作流图的最高杠杆点。三类人锚定的决策不可妥协——不可逆决策（任何无法廉价撤回的事）；承载声誉的决策（任何组织名字公开附着的事）；承载价值观的决策（伦理、品味、关系比效率更重要的事）。

AI Native is about placing humans at the highest-leverage nodes of the workflow graph, not “no humans” or “minimal humans.” Three categories of decision require a human judgment anchor, without compromise: irreversible decisions (anything that cannot be cheaply undone); reputation-bearing decisions (anything the organization’s name is publicly attached to); and values-bearing decisions (situations where ethics, taste, or relationships matter more than efficiency).
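The three anchor categories can be read as an explicit routing rule with pass-through as the default. The sketch below assumes each decision arrives with three boolean flags; those flag names and the `route` function are hypothetical, and deciding how the flags get set is the hard part that this example leaves out.

```python
from enum import Enum


class Anchor(Enum):
    IRREVERSIBLE = "irreversible"
    REPUTATION = "reputation"
    VALUES = "values"


def required_anchors(decision: dict) -> set:
    """Map a decision's flags (hypothetical names) to the anchor categories it triggers."""
    anchors = set()
    if not decision.get("cheaply_reversible", True):
        anchors.add(Anchor.IRREVERSIBLE)
    if decision.get("public_facing", False):
        anchors.add(Anchor.REPUTATION)
    if decision.get("values_tradeoff", False):
        anchors.add(Anchor.VALUES)
    return anchors


def route(decision: dict) -> str:
    """Everything passes through to the agent unless at least one anchor applies."""
    return "human" if required_anchors(decision) else "agent"


internal_draft = route({"cheaply_reversible": True})
public_promise = route({"cheaply_reversible": False, "public_facing": True})
```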

Air Canada（被法庭判决必须为 chatbot 承诺负责）和 Cursor "Sam"（编造公司政策的 AI）说明了这个支柱缺失时会发生什么。把人移出这些决策类别，省下的人力成本远不及导致的代价。

Air Canada (held by a court liable for commitments made by its chatbot) and Cursor’s “Sam” (an AI that fabricated company policy) illustrate what happens when this pillar is absent. The labor cost saved by removing humans from these decision categories is nowhere near the cost of the consequences.

SPEC

Anchor types: Irreversible · Reputation · Values
Mode: Human-in/on-the-loop
Failure: Liability vacuum

07 PILLAR.07

持续演化

Continuous Evolution

组织本身是一个被持续重构的产品：瓶颈边界随模型能力每个季度移动，今天必须人做的，下个季度要重新问一遍。重构是常态运行，不是三年一次的变革项目。（承 M.04·M.06）

The organization itself is a product under continuous refactoring: the bottleneck boundary shifts every quarter as model capabilities advance, and what humans must do today must be re-examined next quarter. Refactoring is normal operations, not a once-every-three-years transformation project. (from M.04·M.06)

≠ 敏捷仪式 · 年度组织调整

≠ agile ceremonies · annual org restructuring

传统组织每几年"转型"一次——发起一个大变革倡议、重组、重新平台化。AI Native 组织没有"转型事件"，因为它在持续演化。组织节奏发生转变——没有"5 年战略"，因为接下来 5 年不会像过去 5 年，底层技术移动得太快。

Traditional organizations “transform” every few years: launching a major change initiative, reorganizing, re-platforming. AI Native organizations have no “transformation events” because they are continuously evolving. The organizational cadence shifts: there is no “5-year strategy,” because the next five years will not resemble the last five; the underlying technology moves too fast.

有的是 90 天节奏（Anthropic 据报道最长规划周期是 90 天），嵌入在更长期的方向感中（1-3 年愿景），而后者本身随景观变化而更新。这对受过传统规划训练的人不舒服。它是 AI Native 运营的自然模式。

What exists instead is a 90-day cadence (Anthropic’s reported maximum planning horizon is 90 days), embedded within a longer-term sense of direction (a 1-3 year vision) that itself updates as conditions change. This is uncomfortable for people trained in traditional planning. It is the natural operating mode of AI Native.

SPEC

Cadence: 90-day rolling
Reference: Anthropic, Cursor
Failure: Static architecture

FIG. 7.0 / PILLARS ON SUBSTRATE · 支柱×底座总成

看懂：七个承诺立在什么底座上

Read as: what substrate do the seven commitments rest on

架构支柱与运营底座的总成图。支柱是工程承诺（本节），底座是运行物理层（第 8 节）——层号即依赖顺序：模型层在最底，可观测层离支柱最近。把它当验收单用：逐根支柱问"它立在哪几层上"，逐层问"它支撑哪些承诺"——任何一处答不上来，架构就还停留在口号。多数支柱可当工程有据看，唯 P.06「人作为判断与责任锚」是本卷最吃重的一注（图中虚线）：它赌判断长期绑在人身上；哪天某类判断被 agent 稳定做得更好，松动的不止这一根，而是它脚下这整块底座。

Assembly diagram of the architectural pillars and the operating substrate. The pillars are engineering commitments (this section); the substrate is the physical runtime layer (Section 8). Layer numbers reflect dependency order: the model layer is at the bottom, and the observability layer is closest to the pillars. Use it as an acceptance checklist: for each pillar ask “which layers does it rest on,” and for each layer ask “which commitments does it support.” Any gap in either direction means the architecture still lives in slogans. Most pillars can be read as evidence-backed engineering; only P.06, humans as judgment and responsibility anchors, is this volume’s heaviest bet (dashed in the figure). It wagers that judgment stays bound to humans over time, and the day some class of judgment is done reliably better by agents, what loosens is not this pillar alone but the whole substrate beneath it.

SECTION

OPERATING SUBSTRATE · 运营底层

框架 · 执行底座

Framework · Execution Substrate

组织的运营底座

The Operating Substrate

底座之下还有底座：先数清哪四层必须先就位。

Beneath the base lies more base: start by counting the four layers that must be in place first.

一句话

In one line

运营底座有四层：模型、agent、上下文、可观测性。它们支撑全部架构支柱，缺任何一层，组织都还只是"在用 AI"，配不上 AI Native 这个名号。

The operating substrate has four layers: model, agent, context, and observability. They carry every pillar, and if any one is missing the organization is still merely “using AI” and has not earned the name AI Native.

OBSERVABILITY LAYER

可观测性层

Observability Layer

让系统持续可学习的东西。日志、追踪、评估、警报，以及把问题路由回人类的审查队列。没有它，你在比人类纠错速度更快地扩展失败。

What keeps the system continuously learnable. Logs, traces, evaluations, alerts, and a review queue that routes issues back to humans. Without it, you are scaling failure faster than humans can correct it.

TOOLS

LangSmith · Helicone
Arize · W&B Weave
Galileo · Braintrust

CONTEXT LAYER

上下文层

Context Layer

让 Agent 变得组织特定的东西。向量数据库、知识图谱、决策日志，以及让这些保持鲜活的工程实践。没有它，你的 Agent 是泛化的；有了它，它们成为独属于你的。

What makes Agents organization-specific. Vector databases, knowledge graphs, decision logs, and the engineering practices that keep them fresh. Without it, your Agents are generic; with it, they become uniquely yours.

TOOLS

Pinecone · Weaviate
Chroma · Qdrant
Glean · Sana · Notion AI

AGENT LAYER

Agent 层

Agent Layer

工作流执行的地方。包括编排框架（LangGraph、CrewAI、AutoGen 或自研），Agent 运行时，以及把 Agent 连接到工具、数据库、外部系统的集成层。

Where workflow execution happens. Includes orchestration frameworks (LangGraph, CrewAI, AutoGen, or custom-built), Agent runtimes, and the integration layer that connects Agents to tools, databases, and external systems.

TOOLS

LangGraph · CrewAI
AutoGen · Letta
Pydantic AI · Inngest

MODEL LAYER

模型层

Model Layer

基础——访问多个基础模型，通常至少一家前沿 API 供应商，加上用于主权工作流的开源权重模型，并有抽象层使模型可被切换。没有这一层，组织仍更接近 API 依赖，尚未进入 AI Native 的运行形态。

The foundation: access to multiple foundation models, typically at least one frontier API provider plus open-weight models for sovereignty-sensitive workflows, with an abstraction layer that makes models swappable. Without this layer, the organization is closer to API dependence than to AI Native operation.

TOOLS

Anthropic · OpenAI
Google · Mistral
Llama · Qwen · DeepSeek

把这四层叠起来，组织的"样子"也变了。传统组织图是层级方框、用岗位说明书定义角色；AI-Native 的"组织图"只有三件——少数判断节点、一张近零边际成本的 agent 网、一层流动的上下文。一个判断者可指挥 50–100 个 agent；结构随工作量伸缩，而不随人数。

Stack those four layers and the shape of the organization changes too. A traditional org chart is boxes in a hierarchy, with job descriptions defining roles; the AI-Native “org chart” has only three things: a few judgment nodes, an agent network at near-zero marginal cost, and one layer of flowing context. One person holding a judgment node can direct 50–100 agents; the structure scales with the workload, not with headcount.

FIG. 8.1 / THE AI-NATIVE ORG TOPOLOGY · 组织拓扑

看懂：少数判断节点 + agent 网 + 一层共享上下文

Read: a few judgment nodes + an agent network + one shared context layer

判断层 · 少数人 / Judgment · a few people: PM · Eng · Design · GTM

执行层 · agent 网络（近零边际成本） / Execution · agent network (near-zero marginal cost)

上下文层 · 共享世界模型 / Context · shared world model: 人与 agent 都从这里继承背景，不靠人肉转译 / humans and agents inherit context here, with no human relay

SECTION

CASE MAP · 案例图谱

实证 · 样本，含口径

Evidence · Cases with Source Notes

过渡期的观察样本

Evidence from a Transition

这些公司记录的是人类组织受 AI 冲击时如何变形，不是 AI-Native 终态的证明。

These companies show how human organizations deform under AI pressure. They do not prove an AI-Native end state.

一句话

In one line

九组公司提供的是带偏差的过渡期观察：它们能帮我们提出结构问题，却不能替未来下结论。把成功、回摆与倒下并列，是为了防止把幸存者误读成定律。

These nine cases are biased observations from a transition. They help us frame structural questions, not settle the future. Putting wins, reversals, and failures side by side prevents survivors from being misread as laws.

本章性质 · 过渡期观察

Chapter Nature · Transitional Evidence

所有数字标注口径（自报 / 第三方估算 / 多方报道核验），估值与收入随时间失效，以来源时点为准。更大的限制是样本选择：留下公开记录的公司、尤其是成功公司，比静默消失的公司更容易被看见。这些案例只支持有限的结构推断，不为任何单家公司的前景或终态组织背书。

All figures are annotated with source type: self-reported / third-party estimate / cross-verified from multiple reports. Valuations and revenue figures decay over time; treat each datum as of its source date. The larger limit is sample selection: companies that leave public traces, especially successful ones, are easier to see than the ones that disappear quietly. These cases support only bounded structural inferences, not the prospects of any company or a final organizational form.

CASES · 公司案例 · 2024-2026

九家公司，每家只说明一件事：按后续处境排列，退场的也照登

Nine companies, each showing just one thing, ordered by where they stand; the ones that exited stay on the list

速查 · 九家公司，按后续处境排列：仍在营的、回摆的、已退场的都在同一张表里，30 秒看清「谁证明了什么、后来又怎样」，再读下方卡片

Quick index · nine companies, ordered by where they stand now (still running, reversed, or exited, all in one table), so in 30 seconds you can read “who proved what, and what became of them,” then read the cards below

公司 / Company	后续 / Status	说明一件事 / Shows one thing
Anthropic	在营 / running	原生组织的节奏：90 天滚动、并行 agent 编队 / a born-native rhythm: 90-day cycles, parallel agent squads
Anysphere / Cursor	在营 / running	极致人效：史上最快 $1B ARR / extreme output per head: the fastest-ever $1B ARR
DeepSeek	在营 / running	无 KPI、无职级也做出前沿模型 / a frontier model with no KPIs and no titles
Shopify	转型中 / transitioning	大组织转型：把「用 AI」写进日常评估 / big-org transition: AI use written into daily review
Midjourney	在营 / running	零融资、约 40 人做到高营收 / zero funding, ~40 people, high revenue
Klarna	回摆 / reversed	裁员叙事的过山车：砍完又回招 / the layoff-narrative rollercoaster: cut, then rehired
Air Canada	判例 / precedent	chatbot 承诺被判有效：AI 输出即公司责任 / chatbot promise held binding: AI output is the company’s liability
Inflection / Adept	被收编 / absorbed	第一天就原生，仍然倒下：原生不免死 / native from day one, still gone; native is no immunity
Builder.ai	破产 / bankrupt	AI washing 破产：假智能体的下场 / AI-washing bankruptcy: the fate of fake agents

Anthropic · "Hive Mind" · 2025

原生组织的运作节奏和传统公司不一样：经营计划最多只排到 90 天，战略也不靠开会传达，而是写成一篇长文发出来。 Steve Yegge 2026/2 的《The Anthropic Hive Mind》访谈描述：最长经营计划只到 90 天；没有传统部门墙；CEO 用 2,000 字 Slack 长文而非会议传达战略；Claude Cowork 从想法到上线 10 天。Anthropic 自身既是 AI Native 组织、也通过 Project Vend / Project Deal 进行 Agent-first 实验。"完全靠氛围运转的蜂群"——内部员工原话。

A born-native organization runs on a different rhythm: plans roll in 90-day windows, and strategy travels in one long post instead of a round of meetings. Steve Yegge’s February 2026 interview The Anthropic Hive Mind describes: longest operational plan runs only 90 days; no traditional departmental walls; the CEO communicates strategy via 2,000-word Slack posts rather than meetings; Claude Cowork went from idea to launch in 10 days. Anthropic is itself an AI Native organization and is simultaneously running Agent-first experiments through Project Vend and Project Deal. “A hive running entirely on vibes” is a direct quote from an internal employee.

规划周期 / Planning Horizon: 最长 90 天 / 90 days max
年化收入 / Annualized Revenue: $9B (end-2025) → $30B+ (2026/4)
特征 / Hallmark: Hive Mind 协作模式 / Hive Mind collaboration

Anysphere / Cursor · 史上最快 $1B ARR · 2024-2026

Anysphere / Cursor · Fastest-ever $1B ARR · 2024-2026

把执行交给 agent 以后，人均创收不是提高一点，而是跳了一个量级（约 $6M/人）：变的是组织的经济结构，而不只是效率更高。 2022 年 4 名 MIT 学生创办。ARR 轨迹：2025/1 $100M → 2025/6 $500M → 2025 年末 $1B → 2026/2 $2B（管理层预期年末 $6B+），史上最快达到 $1B ARR 的 B2B 公司。员工口径分歧大（各来源 180-400 人；PitchBook 记 400），按 ~300 人计人均创收约 $600 万，约为 Salesforce（~$53 万）的 11 倍。2025/11 投后估值 $29.3B，2026/4 以 $50B 投前洽谈新轮。

Once execution goes to agents, revenue per head changes by an order of magnitude (~$6M/person): the economics of the organization change, not just its efficiency. Founded in 2022 by four MIT students. ARR trajectory: $100M (2025/1) → $500M (2025/6) → $1B (end-2025) → $2B (2026/2), with management projecting $6B+ by year-end. It is the fastest B2B company on record to reach $1B ARR. Headcount figures vary widely across sources (180-400; PitchBook logs 400); at ~300 employees, revenue per head is approximately $6M, roughly 11× Salesforce (~$530K). Post-money valuation was $29.3B (2025/11); as of 2026/4 the company was in talks for a new round at $50B pre-money.
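Because the headcount figure is the least certain input, it helps to run the per-head arithmetic across the whole reported range. The sketch below uses only figures quoted above ($2B ARR, 180-400 employees, ~$530K per head at Salesforce); the function name is an illustrative choice.

```python
ARR_USD = 2_000_000_000          # ARR as of 2026/2, as quoted above
SALESFORCE_PER_HEAD = 530_000    # comparison figure quoted above


def revenue_per_head(arr: float, headcount: int) -> float:
    return arr / headcount


for headcount in (180, 300, 400):
    per_head = revenue_per_head(ARR_USD, headcount)
    ratio = per_head / SALESFORCE_PER_HEAD
    print(f"{headcount} people: ${per_head / 1e6:.1f}M per head, {ratio:.1f}x Salesforce")
```

Across the reported range the figure runs from about $5M per head (400 people) to about $11M (180 people), and at 300 people the quotient is about $6.7M, so the order-of-magnitude gap holds whichever headcount source is used even though the exact multiple moves.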

员工数 / Headcount: ~300 (口径 / sources 180-400)
ARR (2026/2): $2B 〔TechCrunch〕
人均创收 / Revenue per Head: ~$6M / 人 (person)

DeepSeek · "无 KPI 无职级" 文化 · 杭州

DeepSeek · “No KPIs, No Titles” Culture · Hangzhou

一个没有 KPI、也没有职级的团队，照样做出了前沿模型：真正起作用的是判断力的密度，而不是管理层级。 量化对冲基金 High-Flyer 的副产品。CEO 梁文锋 2024 年起拒绝外部融资。据 LatePost 报道，DeepSeek 内部以"无 KPI、无职级、无正式汇报"运作。2025 下半年起多名核心作者被腾讯姚顺予等大厂以 2-3 倍薪酬挖走，迫使其开始建立正式公司估值。这是 AI Native 组织"反传统结构"在中国语境的具体实例。

Judgment density beats managerial hierarchy: an organization with no KPIs and no titles shipped frontier models. A byproduct of High-Flyer, a quantitative hedge fund. CEO Liang Wenfeng has refused external funding since 2024. According to LatePost, DeepSeek operates internally under “no KPIs, no titles, no formal reporting lines.” From the second half of 2025, several core authors were poached by Tencent’s Yao Shunyu and by other large tech companies offering 2-3× compensation, forcing the company to begin establishing a formal corporate valuation. This is a concrete Chinese-context example of an AI Native organization rejecting conventional hierarchy.

核心理念 / Core Principle: 无 KPI 无职级 / No KPIs, no titles
融资 / Funding: 拒绝外部 / External refused
挑战 / Challenge: 人才被高薪挖角 / Talent poached at premium

Shopify · 最成功的转型样本 · 2025/4

Shopify · Most Successful Transition Case · 2025/4

大组织的转型可以从一纸备忘录起步，但真正落地，是把「用 AI」写进日常考核和工作流——光有表态不够，后面得有制度接住。 Tobi Lütke 2025/4 "AI-first memo" 主动公开发布："Reflexive AI usage is now a baseline expectation." AI 使用度被纳入绩效与同行评议。一次采购 1,500 个 Cursor 许可证，几周后又加 1,500。使用增长最快的是支持与营收团队，而非工程师。同时把 context engineering 提升为系统性实践。

A large organization’s transition starts with a memo but only holds when AI use is written into everyday evaluation and workflow: words must be followed by structure. Tobi Lütke proactively published his “AI-first memo” in 2025/4: “Reflexive AI usage is now a baseline expectation.” AI usage was incorporated into performance reviews and peer evaluations. The company purchased 1,500 Cursor licenses at once, then added another 1,500 weeks later. The fastest-growing user cohort was support and revenue teams, not engineers. Context engineering was simultaneously elevated to a systematic organizational practice.

员工数 / Headcount: ~8,100
关键文档 / Key Document: AI-first memo
Cursor 许可证 / Licenses: 3,000+

Klarna · 最戏剧性的"过山车" · 2024-2025

Klarna · The Most Dramatic Rollercoaster · 2024-2025

把「减人」本身当成目标，很容易做过头：AI 能替代的是执行，一旦替代到需要人来判断的环节，就会被迫回招。 2024 年 CEO 宣称 AI 顶替了 700 名客服，2023-2024 累计裁员约 22%。2025/5 公开承认"我们走得太远了"（We went too far），启动 Uber 风格的远程客服回招〔mlq.ai〕。同年 9 月以 $19.65B 估值在美 IPO 上市。"AI 顶替人"叙事发生反转的标志性案例。

Making headcount reduction the goal invites overshoot: AI can replace execution, but once replacement reaches steps that need human judgment, the company is forced to rehire. The CEO claimed in 2024 that AI had replaced 700 customer-service agents; cumulative layoffs across 2023-2024 totaled approximately 22%. In 2025/5, the company publicly admitted “we went too far” and launched an Uber-style remote rehiring campaign〔mlq.ai〕. That September the company IPO’d in the U.S. at a $19.65B valuation. It is the defining case of the “AI-replaces-people” narrative reversing.

2024 裁员 / Layoffs: 700 名客服 / customer-service agents
2025/5: 回招人工 / re-hiring humans
2025/9 IPO: $19.65B 估值 / valuation

Air Canada · 判例基石 · 2024/2 BCCRT

Air Canada · Landmark Ruling · 2024/2 BCCRT

责任没法外包给模型：聊天机器人说出口的话，法律上就等于公司说的话。 BC 省 Civil Resolution Tribunal 在 Moffatt v. Air Canada 案中首次明确公司必须为聊天机器人的承诺承担法律责任。航空公司辩称 "AI 是独立法律主体" 被法庭拒绝。这一判例确立了"组织对 AI 决策负责"的全球先例，被后续多国判例引用。

Accountability cannot be outsourced to a model: in law, what the chatbot said, the company said. In Moffatt v. Air Canada, the BC Civil Resolution Tribunal established for the first time that a company must be held legally liable for commitments made by its chatbot. The airline’s argument that “AI is an independent legal entity” was rejected by the tribunal. This ruling set a global precedent for organizational accountability over AI decisions and has since been cited in cases across multiple jurisdictions.

判决 / Ruling: 公司必须负责 / Company liable
影响 / Impact: 全球判例先例 / Global precedent
教训 / Lesson: 责任锚不可外包 / Accountability anchor cannot be outsourced

Midjourney: 零融资 · ~40 人 / Zero Funding · ~40 People

约 40-50 人、零外部融资，第三方估算年收入 $2-3 亿（公司不披露，口径存疑）。它同时证明两件事：AI 品类确实允许极小团队做出大生意；以及不靠任何"Agent 编队 / 工作流即代码"的叙事也能做到。要区分"AI 产品红利"与"AI Native 组织设计红利"，Midjourney 是最干净的检验案例。

Approximately 40-50 people, zero external funding; third-party estimates put annual revenue at $200-300M (the company does not disclose; source quality is uncertain). It simultaneously proves two things: the AI category genuinely allows tiny teams to build large businesses; and it can be done without any narrative of “agent squads / workflow-as-code.” To separate “AI product dividend” from “AI Native organizational-design dividend,” Midjourney is the cleanest test case available.

员工 Employees: ~40-50 · 外部融资 External Funding: $0 · 估算年收入 Est. Annual Revenue: $200-300M（第三方 third-party）

Inflection / Adept: 从第一天原生，仍然倒下 · 2024 / Native from Day One, Gone in 2024

两家"从第一天就 AI Native"的明星：Inflection 融资 $15 亿、估值 $40 亿，2024/3 创始人与多数团队被 Microsoft 雇佣式收编；Adept 融资 $4.15 亿、估值约 $10 亿，2024/6 被 Amazon 同式收编。原生身份不保证存活——分发劣势、资本消耗与模型层挤压可以杀死组织设计最先进的公司。读前面的成功案例时，要把这两家一起算进去，才能看清成功的概率有多高。

Two “AI Native from day one” stars: Inflection raised $1.5B at a $4B valuation, with its founders and most of the team absorbed by Microsoft in an acqui-hire in 2024/3; Adept raised $415M at a valuation of approximately $1B and was similarly absorbed by Amazon in 2024/6. Native status does not guarantee survival: a distribution disadvantage, capital burn, and a squeeze from the model layer can kill even the companies with the most advanced organizational designs. Count these two alongside the success stories above; only then do you see how often this actually works out.

Inflection: $1.5B〔TechCrunch〕融资 → 收编 raised → acqui-hired · Adept: $415M〔TechCrunch〕融资 → 收编 raised → acqui-hired · 死因 Cause of Death: 分发与模型层挤压 Distribution + model-layer squeeze

Builder.ai: AI Washing 破产标本 · 2025/5 / Bankruptcy Specimen · 2025/5

估值峰值 $15 亿、融资约 $4.5 亿（Microsoft、QIA 背书），2025/5 破产。"AI 助手 Natasha"背后是约 700 名工程师人工拼装代码；曾以约四倍虚高的收入预测获取贷款，另被报道与合作方互开发票"空转营收"。这就是 AI Theater 的财务报表形态——B.12 指标剧场与支柱 05 可观测性，针对的正是这种连自己都看不见真相的组织。

Peak valuation $1.5B, total funding approximately $450M (backed by Microsoft and QIA); went bankrupt in 2025/5. Behind the “AI assistant Natasha” were approximately 700 engineers manually stitching code together; the company obtained loans using revenue projections inflated by roughly 4×, and was separately reported to have circulated invoices with partners to manufacture fictitious revenue. This is what AI Theater looks like on a financial statement. The metric-theater anti-pattern (B.12) and pillar 05 observability exist precisely to counter organizations that cannot even see the truth about themselves.

估值峰值 Peak Valuation: $1.5B〔Rest of World〕· 结局 Outcome: 2025/5 破产 bankruptcy · 死因 Cause of Death: AI washing + 收入注水 revenue inflation

AI SIDE 09 成功与失败登记在同一张表上，免得只看见活下来的公司。 Wins and failures sit in the same ledger, so you see more than just the survivors.

PATTERN: 原生型最成功，转型型次之，客户面 AI 化最危险 · Born-native performs best; transitioning next; customer-facing AI is most dangerous
RULE OF THUMB: 从后台开始；从内部开始；从可逆决策开始 · Start from the back office; start from the inside; start from reversible decisions

这些案例合起来不是一张分类表，而是一组彼此制衡的证据。Anthropic 与 Anysphere 证明"从零架构"走得通，Inflection 与 Adept 提醒你它不保证存活；Shopify 证明大组织能转，Klarna 的回摆划出替代的边界；Air Canada 判例把责任钉在公司身上；Midjourney 则防止把"小团队大杠杆"错记在 AI 名下。AI 时代变化太快，任何分类体系都会比它引用的数据先过时——所以这里不建分类学，每家公司只说明一件事，完整口径都登记在附录。

Taken together, these cases are not a classification table but a set of mutually checking pieces of evidence. Anthropic and Anysphere prove that architecting from zero works, while Inflection and Adept remind you that it does not guarantee survival; Shopify proves a large organization can turn, while Klarna’s swing-back marks the boundary of replacement; the Air Canada ruling pins accountability on the company; and Midjourney keeps “small team, huge leverage” from being misfiled as an AI invention. Things now move too fast for any taxonomy to outlive the data it cites, so instead of a classification scheme, each company here shows one thing, with full sourcing registered in the appendix.

范围提醒 · 过渡态样本。这九家全部诞生于人类组织向 AI-Native 迁移的过渡期。它们佐证的是"人类组织在 AI 冲击下如何变形"，与"生而 AI-Native 的组织如何运转"是两个不同命题——后者要等够老的原生样本，眼下无人能提供，终态验证因此是一笔仍然未决的公开赌注。

Scope note · transition-state samples. All nine were born during the migration of human organizations toward AI-Native. They attest to “how human organizations deform under AI,” a different proposition from “how a born-AI-Native organization runs”: the latter awaits samples old enough to exist, which no one can yet supply, so terminal-state validation remains an open, unsettled bet.

SECTION

MULTI-DIMENSIONAL · 多维度分析

机理 · 四个学科截面

Mechanism · Four Disciplinary Cross-Sections

同一现象的四个截面

Four Cross-Sections of the Same Phenomenon

AI Native 同时是经济、监管、哲学与劳工现象。这四个截面不是背景知识——它们是终将反过来修改这套方法论的现实约束。

AI Native is simultaneously an economic, regulatory, philosophical, and labor phenomenon. These four cross-sections are real-world constraints that will eventually come back to revise the blueprint, not background knowledge.

一句话In one line

AI Native 同时是经济、监管、哲学与劳工现象：这四个学科截面终将反过来修改方法论。真正受益的是少数完成重构的组织，多数公司只是在为表演买单。AI Native is at once an economic, regulatory, philosophical, and labor phenomenon: these four cross-sections are real-world constraints that will come back to revise the blueprint. The genuine beneficiaries are the few organizations that truly reconstruct, while most are paying for theater.

D - 01经济维度Economic Dimension

Daron Acemoglu 2024 年在 MIT 的研究《The Simple Macroeconomics of AI》给出谨慎评估：AI 未来 10 年对 GDP 的累计贡献约 1.1-1.6%，对应年均生产率增益约 0.05%，远低于行业普遍宣称的数倍效应。MIT NANDA 2025/7 预印报告测得 95% 的定制化企业 GenAI 试点在六个月窗口内没有可衡量的 P&L 影响；同一报告也记录了员工自带通用工具的"影子 AI"被大规模采用且有效：95% 说的是组织级试点的失败，不是 AI 本身的失败。

Daron Acemoglu’s 2024 MIT study The Simple Macroeconomics of AI offers a cautious assessment: AI will add roughly 1.1-1.6% to GDP cumulatively over the next 10 years, with an annual productivity gain of about 0.05%, far below the multi-fold effects the industry commonly claims. The MIT NANDA July 2025 preprint found that 95% of customized enterprise GenAI pilots showed no measurable P&L impact within a six-month window; the same report also documented that employee-initiated “shadow AI” using general-purpose tools was adopted at scale and proved effective. The 95% figure describes the failure of org-level pilots, not of AI itself.

这意味着——AI 经济效益正面但远低于炒作；真正受益的是少数能实现 AI Native 重构的组织，多数公司是在为表演买单。Acemoglu 警告 AI 主要影响数据汇总、视觉匹配、模式识别这类白领办公任务，但仍预测 2030 年记者、金融分析师、HR 等职位仍存在。同时他强调 AI 会扩大资本对劳动的分配差距而非缩小白领内部不平等——这是 AI Native 组织的政治经济学背景。

The implication: AI’s economic benefits are real but far below the hype; the organizations that genuinely benefit are the minority that achieve a true AI Native reconstruction, while most companies are paying for theater. Acemoglu warns that AI primarily affects white-collar office tasks such as data aggregation, visual matching, and pattern recognition, yet he still predicts that roles such as journalist, financial analyst, and HR professional will exist in 2030. He also emphasizes that AI will widen the capital-to-labor distribution gap rather than narrow inequality within white-collar ranks. This is the political-economy backdrop for the AI Native organization.

D - 02监管维度Regulatory Dimension

欧盟 AI Act 的 Annex III 高风险义务已被 Digital Omnibus（2026-05 临时协议）推迟至 2027-12 起分阶段适用（招聘评估、信用决策、教育评分、执法等）（截至 2026-07；原定 2026-08-02 全面适用日已不再生效，时间表仍可能再变、以欧盟官方最新为准）；最高罚款 €3,500 万或全球营收 7%。美国联邦层面 Biden EO 14110 被 Trump 政府 2025/1 废除，州层面（科罗拉多 SB 24-205、加州 SB 1047 被否决）拼盘形成。中国《生成式 AI 服务管理暂行办法》2023 年实施。

The EU AI Act’s Annex III high-risk obligations (recruitment screening, credit decisions, educational scoring, law enforcement, etc.) have been postponed by the Digital Omnibus (May 2026 provisional agreement) to a phased application starting December 2027. As of July 2026, the former 2 August 2026 full-application date no longer holds, and the timeline may shift again; see the latest EU official text. Maximum fines reach €35 million or 7% of global revenue. At the US federal level, Biden’s Executive Order 14110 was revoked by the Trump administration in January 2025, leaving a patchwork of state-level legislation (Colorado SB 24-205; California SB 1047, which was vetoed). China’s Interim Measures for the Management of Generative AI Services entered into force in 2023.

对 AI Native 组织的实操含义——合规不做事后处理，而是架构约束。如果你的核心 workflow 涉及 EU 公民的招聘、信用、教育数据，在高风险义务生效后（经 Digital Omnibus 推迟、约 2027-12 起分阶段）必须有 human-in-the-loop、决策可审计、训练数据可追溯。这就是为什么"可观测性先于规模"和"人作为判断与责任锚"在架构支柱中是基础性的而非可选的。

The practical implication for AI Native organizations: compliance is not a post-hoc fix but an architectural constraint. If your core workflows touch EU citizens’ recruitment, credit, or educational data, human-in-the-loop, auditable decisions, and traceable training data will all be mandatory once the high-risk obligations take effect (postponed by the Digital Omnibus to a phased start around December 2027). This is precisely why “observability before scale” and “humans as judgment and accountability anchors” are foundational rather than optional among the architectural pillars.

D - 03哲学维度Philosophical Dimension

Hannah Arendt 在《人的境况》(1958) 划分劳动（labor，维持生命）、工作（work，制造耐用品）、行动（action，公共领域中以言行显现自我）。AI Native 时代如果连 work 都被 AI 接管，"action" 在哪里？这不是花拳绣腿的问题——它直接关乎 AI Native 组织如何为"人"定义角色。

In The Human Condition (1958) Hannah Arendt distinguishes labor (sustaining life), work (fabricating durable goods), and action (appearing in the public realm through word and deed). If even work is taken over by AI in the AI Native era, where does “action” go? This is not an ornamental question; it goes directly to how an AI Native organization defines the role of “the human.”

Acemoglu 的回答是"专长与信息提供者"，Tobi Lütke 的回答是"context engineer"，Anthropic Hive Mind 的回答是"品味与判断"，Buurtzorg 的回答是"完整自我"。这些答案都对，但都不完整。最稳健的回答是——人是承担后果的能力（accountability）。当 AI 可以无穷生成，人类的稀缺性在于"承担后果的能力"——这是 Air Canada 案、Lattice 撤回、Klarna 回招的共同启示，也是架构支柱中"人作为判断与责任锚"的哲学基础。

Acemoglu’s answer is “expert and information provider”; Tobi Lütke’s answer is “context engineer”; the Anthropic Hive Mind’s answer is “taste and judgment”; Buurtzorg’s answer is “the whole self.” Each answer is correct, yet none is complete. The most reliable answer is: the human is the capacity to bear consequences (accountability). When AI can generate infinitely, human scarcity lies in the capacity to bear consequences. This is the shared lesson of the Air Canada case, the Lattice withdrawal, and the Klarna re-hire; it is also the philosophical foundation of “humans as judgment and accountability anchors” among the architectural pillars.

D - 04劳工维度Labor Dimension

SAG-AFTRA 2023/7/14-11/9 的 118 天大罢工是首个明确以 AI 为核心议题的劳工运动。胜利成果包括对"合成演员"（Synthetic Performers）和"数字替身"（Digital Replicas）的合同保护、强制 informed consent。2024/7 又对游戏公司发起 AI 议题罢工。2026/3 推动"Tilly tax"——对 AI 角色征税。

The SAG-AFTRA 118-day strike from 2023/7/14 to 11/9 was the first labor action to place AI squarely at its center. Victories included contractual protections for “Synthetic Performers” and “Digital Replicas,” and mandatory informed consent. In July 2024, another AI-focused strike was launched against gaming companies. In March 2026, the union began pushing a “Tilly tax,” a levy on AI-generated characters.

这预示着 2030 年代劳动者运动的新主题。AI Native 组织必须预判这种张力，否则会被工会运动反噬。Klarna 的回招、Duolingo 的撤回、Lattice 的退步——都是劳工力量在工会化之前已经通过舆论和市场表达的反向作用。在欧洲、加拿大、巴西等更工会化的市场，这种张力会更早进入直接对抗。AI Native 不是"绕开劳工政治"的方法，是"必须更细致地处理劳工政治"的方法。

This foreshadows the new themes of labor movements in the 2030s. AI Native organizations must anticipate this tension or face union-driven backlash. Klarna’s re-hiring, Duolingo’s reversal, and Lattice’s retreat all show labor expressing counter-pressure through public opinion and market signals before formal unionization. In more highly unionized markets such as Europe, Canada, and Brazil, this tension will reach direct confrontation sooner. AI Native is a method that demands handling labor politics with greater care, not a method for “bypassing labor politics.”

SECTION

FAILURE MODES · 失败模式

实证 · 失败记录＋可证伪条件Empirical · Failure Record + Falsifiability Conditions

已被记录的陷阱

The Documented Traps

下面是已经留下记录的失效形态。它们值得被当作检查表，不该被误读成一张能预言所有失败的命运图。

These are failure shapes that left records. Treat them as a checklist, not a fate map that predicts every failure.

一句话In one line

结构会放大某些失败，但不替人写结局。把已知陷阱变成事前检查、事中观察和事后复盘，才比一张漂亮的失败分类更有用。Structure can amplify some failures, but it does not write the ending for us. Turning known traps into preflight checks, live observations, and postmortems is more useful than a polished taxonomy.

观察边界Observation Boundary失败分类天然遗漏“安静消失”的样本，也会随能力、监管和市场改变。每张卡片应当被当作待补充的假设：它在什么条件下出现？最早能看到什么？哪条观察会说明我们把它归错了类？A failure taxonomy necessarily misses cases that disappear quietly, and it changes with capability, regulation, and markets. Read every card as a hypothesis to extend: under what conditions does it arise, what can be seen early, and what observation would show that we classified it wrongly?

失效滑坡图 —— 十五种失效按最危险的创业阶段排进四条泳道；每条给「会滑向 →」的下一步。先找到你所在的阶段，就看清了脚下这一种与它下一步会拖你去哪。两条链最终都汇到「智能体技术债 / 演化失败」。下面的卡片是每一条的完整论述。Failure slide-map: the fifteen modes sorted into four lanes by the stage where each bites hardest; each carries a “slides to →” pointer. Find your stage and you see the one underfoot and where it drags you next. Both chains converge on “agentic technical debt / failure to evolve.” The cards below carry each mode in full.

想法 Idea · 2

AI Theater · 表演而非实践 / AI Theater: 宣布 AI，工作流照旧 / AI announced, workflows unchanged → 滑向 Agent 洗白 / slides to Agent washing

Agent 洗白 / Agent washing: 非代理软件冒充智能体 / non-agentic software posing as agents → 滑向算法封建主义 / slides to algorithmic feudalism

MVP · 5

上下文饥饿 / Context starvation: 上下文太薄，输出泛化 / context too thin, output generic → 滑向智能体技术债 / slides to agentic tech debt

建造先于验证 / Building before validating: 有产品，未必有需求 / a product, maybe no demand → 滑向早期信号错读 / slides to false PMF

零摩擦范围蔓延 / Zero-friction scope creep: 功能随手加，失了方向 / features added at will, direction lost → 滑向智能体技术债 / slides to agentic tech debt

智能体技术债 / Agentic technical debt: 代码库缺一致心智模型 / codebase without a coherent model → 滑向演化失败 / slides to failure to evolve

合成自信 / Synthetic confidence: 听着权威，就当正确 / authoritative tone taken as correct ↧ 终点 · 触客即损 / terminal · reaches the customer

上线 Launch · 1

早期信号错读 / False product-market fit: 初期热度误当真需求 / early buzz mistaken for demand → 滑向过早扩规模 / slides to premature scaling

规模 Scale · 7

The Inversion · 本末倒置 / The Inversion: 人沦为算法的外设 / people become the algorithm’s peripherals → 滑向判断空心化 / slides to judgment hollowing

判断空心化 / Judgment hollowing: 人不再练习做决策 / people stop practicing decisions → 滑向合成自信 / slides to synthetic confidence

裁员叙事自反噬 / Layoff-narrative backfire: 员工从此藏起提效 / staff hide their efficiency gains → 滑向 AI-only / slides to AI-only

AI-first 当成 AI-only / AI-first read as AI-only: 把人整体撤出回路 / humans pulled fully out of the loop → 滑向判断空心化 / slides to judgment hollowing

过早扩规模 / Premature scaling: 没可观测性就上规模 / scaling without observability → 滑向智能体技术债 / slides to agentic tech debt

算法封建主义 / Algorithmic feudalism: 深绑单一供应商，被挟持 / deep-locked to one vendor → 滑向演化失败 / slides to failure to evolve

演化失败 / Failure to evolve: 绑死当前模型，须重建 / frozen to today’s model, must rebuild ↧ 终点 · 推倒重来 / terminal · rebuild from scratch
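The “slides to” pointers in the map can be read as a small directed graph. The sketch below is one possible encoding, not part of the methodology itself: the dictionary `SLIDES_TO` and the helpers `slide_path` and `terminals` are hypothetical names introduced for illustration, and the edges are copied from the map as written.

```python
# Hypothetical encoding of the slide-map: each failure mode points to the
# mode it "slides to". None marks a mode the map labels as terminal.
SLIDES_TO = {
    "AI Theater": "Agent washing",
    "Agent washing": "Algorithmic feudalism",
    "Context starvation": "Agentic technical debt",
    "Building before validating": "False product-market fit",
    "Zero-friction scope creep": "Agentic technical debt",
    "Agentic technical debt": "Failure to evolve",
    "Synthetic confidence": None,
    "False product-market fit": "Premature scaling",
    "The Inversion": "Judgment hollowing",
    "Judgment hollowing": "Synthetic confidence",
    "Layoff-narrative backfire": "AI-first read as AI-only",
    "AI-first read as AI-only": "Judgment hollowing",
    "Premature scaling": "Agentic technical debt",
    "Algorithmic feudalism": "Failure to evolve",
    "Failure to evolve": None,
}


def slide_path(mode):
    """Follow the pointers from `mode` until a terminal mode is reached."""
    path = [mode]
    seen = {mode}
    while SLIDES_TO[path[-1]] is not None:
        nxt = SLIDES_TO[path[-1]]
        if nxt in seen:  # guard against a loop in a hand-edited map
            raise ValueError(f"cycle detected at {nxt!r}")
        path.append(nxt)
        seen.add(nxt)
    return path


def terminals():
    """Set of modes at which any chain in the map ends."""
    return {slide_path(mode)[-1] for mode in SLIDES_TO}
```

Calling `slide_path("Building before validating")` walks through false product-market fit, premature scaling, and agentic technical debt before stopping at failure to evolve. On this encoding every chain ends at one of the two modes the map marks as terminal: failure to evolve or synthetic confidence.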

AI Theater表演而非实践

最常见——宣布 AI 倡议、招 AI 头衔、每份备忘录提到 AI，但实际工作流仍然是传统的。在 NANDA 报告那批量不出成效的定制试点里，相当比例正是这种表演——立项给董事会看，不给真实盈亏看。

AI Theater

The most common mode: announcing AI initiatives, hiring AI titles, mentioning AI in every memo, while the actual workflows stay traditional. A large share of the no-measurable-impact pilots in the NANDA report are exactly this kind of theater: projects staged for the board to see, not for the P&L.

The Inversion本末倒置 · 给 AI 打工

最隐蔽——指标全在变好：吞吐涨了、成本降了，于是被当成成功。可人却越用越忙：忙着喂数据、盯 agent、追机器的节奏，遥测从工具滑成全员监控。效率最大化了，意义却塌缩了，人沦为算法的外设。按本方法论，这是失败，不是成功——AI 本该把人从执行里解放出来，这里却把人收编进执行：手段（效率）篡了目的（人）的位。

The Inversion

The most insidious mode: every metric improves (throughput up, cost down), so it reads as success. Yet people only get busier, feeding data, babysitting agents, chasing the machine’s pace, while telemetry slides from tool to all-seeing surveillance. Efficiency is maximized and meaning collapses; people become peripherals of the algorithm. By this methodology that is failure, not success: AI was meant to free people from execution, and here it conscripts them into it. The means (efficiency) has usurped the end (people).

Agent 洗白Agent washing

供应商侧的 AI Theater——把 AI 助手、RPA、聊天机器人改名为 "agentic" 而无实质代理能力。Gartner 2025/6 的判词足够狠：数千家自称 agentic 的供应商中，估计只有约 130 家是真实的；并预测到 2027 年底，超过 40% 的 agentic AI 项目将因成本上升、商业价值不清或风险控制不足被取消[R10]。采购方的解药与 B.12 相同——看遥测，不看 demo；Builder.ai（见第 9 节）就是这个陷阱的破产标本。

Agent washing

The vendor-side version of AI Theater: renaming AI assistants, RPA, and chatbots as “agentic” with no real agency. Gartner’s June 2025 verdict is harsh enough: of the thousands of vendors that call themselves agentic, an estimated 130 or so are real; it also predicts that by the end of 2027, more than 40% of agentic-AI projects will be cancelled over rising costs, unclear business value, or inadequate risk controls[R10]. The buyer’s remedy is the same as in B.12: watch the telemetry, not the demo. Builder.ai (see Section 9) is the bankruptcy specimen of this trap.

过早扩规模Premature scaling

在没有可观测性的情况下部署 Agent，然后在规模上发现它们一直在幻觉、泄露数据、产生低质量输出。这是可恢复的，但代价高昂。

Premature scaling

Deploying agents without observability, then discovering at scale that they have been hallucinating, leaking data, and producing low-quality output the whole time. This is recoverable, but the cost is high.

算法封建主义Algorithmic feudalism

围绕一家供应商的 API 怪癖深度架构，然后在条款变化时被挟持。恢复需要昂贵的重新架构——这就是支柱 04 存在的原因。

Algorithmic feudalism

Architecting deeply around one vendor’s API quirks, then being held hostage when the terms change. Recovery demands an expensive re-architecture; this is exactly why Pillar 04 exists.

上下文饥饿Context starvation

在贫乏的上下文上部署 Agent，得到泛化输出，怪罪模型。修复在上游——在组织如何结构化信息上。

Context starvation

Deploying agents on impoverished context, getting generic output, and blaming the model. The fix is upstream: in how the organization structures its information.

判断空心化Judgment hollowing

自动化得太激进，导致人失去做决策的练习，最终在 AI 失败时失去纠偏能力。这是 AI Native 版本的"飞行员去技能化危机"。

Judgment hollowing

Automating so aggressively that people lose the practice of making decisions, and in the end lose the ability to correct course when the AI fails. This is the AI Native version of the “pilot de-skilling crisis.”

裁员叙事自反噬Layoff-narrative backfire

把 AI 效率收益直接翻译成裁员（"客服效率 +25% → 裁 25% 的人"）的组织，会立刻教会员工隐藏自己的 AI 生产率收益——Mollick 称之为"隐秘赛博格"：员工恰恰是企业唯一的部署知识来源，而他们再也不会给你看效率提升了[R9]。回摆已被预测在案：Gartner 2026/2 预计，到 2027 年把裁员归因于 AI 的客服组织中 50% 将回聘员工做类似职能（换个头衔）；同一调查显示实际因 AI 裁过坐席的公司只有 20%——话语热度远超事实[R11]。结构性解药是 Mollick 的三要素：Leadership 亲自回答组织形态问题、Lab 全员工具访问、Crowd 激励对齐促分享。

Layoff-narrative backfire

Organizations that translate AI efficiency gains directly into layoffs (“customer-service efficiency +25% → cut 25% of the staff”) instantly teach their employees to hide their own AI productivity gains. Mollick calls this the “secret cyborg”: employees are the firm’s only source of deployment knowledge, and they will never show you their efficiency gains again[R9]. The swing-back is already on record: Gartner (February 2026) projects that by 2027, 50% of the customer-service organizations that attributed layoffs to AI will rehire people for similar functions (under a new title); the same survey shows only 20% of firms actually cut agent headcount because of AI, so the talk runs far ahead of the facts[R11]. The structural remedy is Mollick’s three elements: Leadership answering the organization-shape question in person, Lab giving everyone tool access, and Crowd aligning incentives to encourage sharing.

把 AI-first 当成 AI-onlyAI-first read as AI-only

AI-first 是把默认翻转：agent 先做，人守判断与责任锚（支柱 06）；AI-only 是把人整体撤出回路。二者是组织形态光谱上的不同刻度，不是同一点。把两者混为一谈，已留下有公开记录的标本（厂商自报口径、未独立核实，截至本版）：① Klarna，2024 自报 AI 顶替约 700 名客服，2025/5 公开承认"走得太远"并启动回招；② Duolingo，2025/4 发 AI-first memo，2026/4 公开撤回"以 AI 使用度评估绩效"这条 KPI（Duolingo 撤回 KPI 一事有公开报道；Shopify 案详见第 9 节）。对照之下，③ Shopify，2025/4 同类 memo 虽因措辞引发争议，却停在 AI-first（人仍在回路）、未滑向整体替换，本页另记其为相对成功的转型样本：这恰好标出刻度的分界。三例共享同一边界判据：当成本（省下多少人头）成为过于主导的评估因子、把判断与责任的留存挤出账面，就是 AI-only 越界的反模式信号；回滚的代价，已由"人作为判断与责任锚"这条支柱预先标价。

AI-first read as AI-only

AI-first flips the default: agents act first, humans hold the judgment-and-accountability anchor (Pillar 06). AI-only pulls people out of the loop wholesale. These are different graduations on the spectrum of organizational forms, not the same point. Conflating the two has left specimens on record (vendor self-reported figures, not independently verified, as of this edition): ① Klarna: in 2024 it self-reported that AI had replaced ~700 customer-service agents, then in 2025/5 publicly admitted it “went too far” and began rehiring; ② Duolingo: sent an AI-first memo in 2025/4 and in 2026/4 publicly retracted the KPI of “evaluating performance by AI usage” (the Duolingo retraction is on public record; the Shopify case is detailed in Section 9). By contrast, ③ Shopify: its 2025/4 memo of the same genre drew controversy over its wording, yet stayed at AI-first (humans still in the loop) and never slid into wholesale replacement; this page records it as a comparatively successful transition, which marks exactly where the graduation lies. All three share one boundary criterion: when cost (how much headcount was cut) becomes an overly dominant evaluation factor and the retention of judgment and accountability is squeezed off the books, that is the anti-pattern signal of AI-only over-reach; the cost of the rollback is priced in advance by the pillar “humans as the anchor of judgment and accountability.”

合成自信Synthetic confidence

把 AI 输出当作权威正确的诱惑，因为它听起来权威。修复是结构性的——永远不让 AI 输出不经人类判断节点就到达客户、合作伙伴或监管者。METR 2025 年的随机对照试验给这个陷阱标了一个精确的刻度：资深开发者使用 AI 工具后实际慢了 19%，却自认为快了约 20%——自我感知与实测结果方向相反。

Synthetic confidence

The temptation to treat AI output as authoritatively correct because it sounds authoritative. The fix is structural: never let AI output reach a customer, partner, or regulator without passing through a human judgment node. METR’s 2025 randomized controlled trial put a precise mark on this trap: after adopting AI tools, experienced developers were in fact 19% slower, yet believed they were roughly 20% faster; self-perception pointed the opposite way from the measured result.

演化失败Failure to evolve

围绕当前模型构建 AI Native 组织，18 个月后发现一切必须重建。持续演化必须从第一天就被构建到架构里，不是后期补丁。

Failure to evolve

Building an AI Native organization around the current model, then finding 18 months later that all of it has to be rebuilt. Continuous evolution has to be built into the architecture from day one, not patched in afterward.

建造先于验证Building before validating

这是 2026 年 AI Native 创业者最需要警惕的陷阱——agentic coding 让"我有想法 → 我有产品"的距离压缩到几小时之内，但有 demo 不等于有 PMF。CB Insights 历史数据显示 42% 的创业失败是因为造了没人要的东西，AI 时代这个数字只会上升。Agent 会用同样的热情为坏想法和好想法写代码——这套系统里的智能是你的，AI 不会替你做问题验证。

Building before validating

This is the trap the 2026 AI Native founder most needs to guard against: agentic coding collapses the distance from “I have an idea” to “I have a product” to within a few hours, but a demo is not PMF. CB Insights’ historical data shows that 42% of startup failures come from building something nobody wants, and in the AI era that number will only climb. An agent writes code for a bad idea with the same enthusiasm it brings to a good one: the intelligence in this system is yours, and AI will not do the problem validation for you.

智能体技术债Agentic technical debt

与传统技术债不同——传统债是渐进累积、可在专门 sprint 中清理；智能体债是复合的。每个 Claude Code / Cursor 会话如果没有持久上下文（如 CLAUDE.md、架构规约文档），会从零推导基础决策，决策之间漂移。最终的问题不在某个片段写得差，而在于整个代码库失去了一致的心智模型——因为各部分从未被设计成相互配合。这是 Anthropic Founder's Playbook 反复强调的核心陷阱。

Agentic technical debt

Unlike traditional technical debt, which builds up gradually and can be cleared in a dedicated sprint, agentic debt is compounding. Each Claude Code / Cursor session that lacks persistent context (such as a CLAUDE.md or an architecture-spec document) re-derives the basic decisions from scratch, and those decisions drift apart. The eventual problem is not that any single piece is badly written; it is that the codebase as a whole has lost a consistent mental model, because its parts were never designed to work together. This is the core trap the Anthropic Founder’s Playbook returns to again and again.

零摩擦范围蔓延Zero-friction scope creep

范围蔓延一直存在，但 AI 时代抗体消失了——传统的反向压力（工程时间的真实成本）在加 feature 只需一个下午时不再存在。每个新增功能都"看起来理所当然"——产品当然应该处理那个 edge case、用户当然会想要那个 workflow。但它们叠加起来会让产品越过初始边界、失去方向感。解药是建造前的书面 scope 文档——明确规定做什么、不做什么、什么真实用户证据会触发新功能。

Zero-friction scope creep

Scope creep has always existed, but in the AI era the antibody is gone: the traditional counter-pressure, the real cost of engineering time, disappears when adding a feature takes only an afternoon. Every new feature “looks self-evidently right”: of course the product should handle that edge case, of course users will want that workflow. Stacked together, though, they carry the product past its initial boundary and cost it any sense of direction. The antidote is a written scope document before you build, stating plainly what is in, what is out, and what real user evidence will trigger a new feature.

早期信号错读False product-market fit

上线时来自创始人朋友、投资组合公司、Hacker News 头条的流量给出令人陶醉的早期数字，但这不是 PMF——是发射能量（launch energy），来自一次性、不可重复的力。第 6 周或第 12 周初始 boost 消退后曲线才显现。Sean Ellis test（"如果这产品消失你会非常失望吗"≥ 40%）和"努力曲线倒转"（产品开始自己工作而不需要推）是更可靠的 PMF 信号。AI 工具让到达"令人陶醉的早期数字"的门槛更低——意味着错读的风险更高。

False product-market fit

At launch, traffic from a founder’s friends, portfolio companies, and a Hacker News front page produces intoxicating early numbers, but this is launch energy, drawn from a one-time, non-repeatable force, not PMF. The real curve shows up only at week 6 or week 12, once the initial boost fades. The Sean Ellis test (“would you be very disappointed if this product went away” ≥ 40%) and the “inverted effort curve” (the product begins working on its own without being pushed) are the more reliable PMF signals. AI tools lower the threshold for reaching “intoxicating early numbers,” which means the risk of misreading them runs higher.
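The Sean Ellis threshold above is simple arithmetic, and a minimal sketch makes the bar explicit. The function names and the exact answer labels below are assumptions made for illustration; only the 40% “very disappointed” threshold comes from the text.

```python
from collections import Counter

# The >= 40% "very disappointed" bar cited in the text.
PMF_THRESHOLD = 0.40


def sean_ellis_score(responses):
    """Share of survey respondents who answered 'very disappointed'."""
    if not responses:
        raise ValueError("no survey responses to score")
    counts = Counter(responses)
    return counts["very disappointed"] / len(responses)


def passes_pmf_bar(responses):
    """True when the 'very disappointed' share meets the 40% bar."""
    return sean_ellis_score(responses) >= PMF_THRESHOLD
```

Running the survey again at week 6 and week 12, after the launch boost fades, gives a reading that is less contaminated by launch energy than the first-week numbers.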

BOUNDARIES · 失效与边界——把失效率变成设计件 · Failure and Boundaries: Error Rates as Design Objects

上面十五张卡片讲的都是别人的组织怎么滑下去。合上卡片，问题回到自己身上：把一段真实工作交给 agent 之前——它的错误率是多少？错了谁先看见？看见之后收不收得回？这段工作的数据流进了谁的模型？这四个问题的答案，此刻写在你组织的哪里？多数组织的诚实回答是：写在还没发生的那份事故复盘里。这一节要做的事，是把答案从复盘挪进设计，拆成四个可以逐一验收的设计件：复核层、回滚、数据边界、审计。

The fifteen cards above are all about how other organizations slid. Close the cards and the question lands on you: before a real piece of work is handed to an agent, what is its error rate? Who sees a failure first? Once seen, can the action be recalled? Whose model did the data flow into? Where, right now, are those four answers written down in your organization? For most, the honest reply is: in an incident postmortem that has not happened yet. This subsection moves the answers out of the postmortem and into design, as four objects you can inspect one by one: a review layer, rollback, data boundaries, and audit.

先说复核层。它的厚度不该由信任感定，该由数字定。两个当前口径可用作起点：在 175 项模拟公司职业任务上，最强模型的自主完成率约 30%（TheAgentCompany，2025 年榜单口径）[R60]；按「50% 成功率下能独立完成的任务时长」计，前沿模型在 2025 年初约为 1 小时，且这条时间地平线自 2019 年以来大约每 7 个月翻一倍（METR）[R59]。翻译成组织动作：任务时长落在地平线之内，抽检；落在地平线之外，全检，或者拆小到落回地平线之内。地平线在移动，所以抽检率不是拍一次的制度，而是每季度对着可观测层（SECTION 08）的错误率基线重新校准的活参数。

Start with the review layer. Its thickness should be set by numbers, not by trust. Two current readings serve as a starting point: on 175 simulated-company professional tasks, the strongest model completes about 30% autonomously (TheAgentCompany, 2025 leaderboard)[R60]; measured as “the length of task a model finishes at a 50% success rate,” the frontier stood at roughly one hour in early 2025, and that time horizon has doubled about every seven months since 2019 (METR)[R59]. Translated into organizational action: tasks inside the horizon get sampled review; tasks beyond it get full review, or get cut into pieces that fall back inside. The horizon moves, so the sampling rate is not a policy set once; it is a live parameter recalibrated quarterly against the error-rate baseline in the observability layer (SECTION 08).
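The horizon rule can be written down as a small calculation. The sketch below rests on stated assumptions: “early 2025” is pinned to 1 January 2025, the seven-month doubling is extrapolated forward as if it continued unchanged, and the 20% sampling rate is a placeholder that should come from the observed error-rate baseline. None of the names are from the source.

```python
from datetime import date

# Assumptions for illustration only.
BASE_DATE = date(2025, 1, 1)   # "early 2025" pinned to a concrete date
BASE_HORIZON_HOURS = 1.0       # ~1 hour at 50% success (METR reading in the text)
DOUBLING_MONTHS = 7.0          # horizon doubles about every seven months


def horizon_hours(on):
    """Extrapolated task-length horizon, in hours, for the month of `on`."""
    months = (on.year - BASE_DATE.year) * 12 + (on.month - BASE_DATE.month)
    return BASE_HORIZON_HOURS * 2 ** (months / DOUBLING_MONTHS)


def review_mode(task_hours, on, sample_rate=0.2):
    """Sampled review inside the horizon, full review beyond it."""
    if task_hours <= horizon_hours(on):
        return ("sample", sample_rate)
    return ("full", 1.0)
```

For example, `review_mode(3, date(2025, 8, 1))` returns `("full", 1.0)`, because the extrapolated horizon at that date is two hours. A three-hour task either gets full review or is cut into pieces of two hours or less.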

回滚这个设计件回答「收不收得回」。执行变便宜的另一面是犯错也变便宜，但这句话只对可逆动作成立：代码可以 revert，已经发出的报价、写进合同的承诺收不回来——Air Canada 的 chatbot 向乘客许了一个并不存在的退票政策，仲裁庭裁定公司照赔[R17]。对应的设计不是「所有动作前置人工」，那等于把复核层变成新的瓶颈；而是给 agent 可触达的每一类动作标可逆性等级：能便宜撤销的放行自动执行，撤不回的前置人类节点。这条分级线要写进权限系统，不是写进培训材料。

Rollback answers “can it be recalled.” The flip side of cheap execution is cheap mistakes, but that holds only for reversible actions: code can be reverted; a quote already sent or a promise written into a contract cannot. Air Canada’s chatbot offered a passenger a refund policy that did not exist, and the tribunal ruled the company liable all the same[R17]. The corresponding design is not “a human before every action,” which just turns the review layer into a new bottleneck. It is a reversibility grade on every class of action an agent can reach: cheap to undo, let it run; impossible to undo, put a human node in front. That line belongs in the permission system, not in training material.
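A reversibility grade enforced in the permission system might look like the following sketch. The action names and the two-level grading are hypothetical; a real system would likely need finer grades and would own this table itself. The one deliberate design choice is that an ungraded action is treated as irreversible.

```python
from enum import Enum


class Reversibility(Enum):
    CHEAP_TO_UNDO = "cheap_to_undo"   # e.g. a code change that can be reverted
    IRREVERSIBLE = "irreversible"     # e.g. a quote already sent to a customer


# Hypothetical action classes, for illustration only.
ACTION_GRADES = {
    "commit_code": Reversibility.CHEAP_TO_UNDO,
    "send_quote": Reversibility.IRREVERSIBLE,
    "promise_refund": Reversibility.IRREVERSIBLE,
}


def requires_human(action):
    """True when a human node must sit in front of the action."""
    # Unknown actions fail closed: no grade means no automatic execution.
    grade = ACTION_GRADES.get(action, Reversibility.IRREVERSIBLE)
    return grade is Reversibility.IRREVERSIBLE
```

Under this table `requires_human("commit_code")` is `False` and `requires_human("promise_refund")` is `True`; an action nobody has graded yet also returns `True`.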

数据边界回答「流进了谁的模型」。隐私在这部规约里几乎没有出现过，不是因为它不重要，而是它长期被归给合规部门，没有被当成组织设计问题。两个信号说明这笔账该重算：Samsung 在 2023 年 3 月末重新放开 ChatGPT 之后二十天内连出三起外流（设备源码、测试序列、会议纪要），随即公司范围禁用[R62]；OWASP 2025 版把「敏感信息泄露」从第 6 位提到第 2 位，仅次于提示注入[R61]。禁用是合规反应；设计的反应是一张显式的边界图：上下文层（SECTION 08）里的决策日志、客户数据、代码库，哪些可以流进自托管模型、哪些可以流进外部 API、哪些哪儿都不许去，逐格写死。上下文越厚这张图越值钱——燃料和泄露面从来是同一批东西。

Data boundaries answer “whose model did it flow into.” Privacy has barely appeared in this specification, not because it does not matter, but because it has long been filed under compliance rather than treated as organizational design. Two signals say the account needs redoing: within twenty days of Samsung re-allowing ChatGPT in late March 2023, three leaks followed (device source code, a test sequence, meeting notes), and the company then banned the tools outright[R62]; the 2025 edition of the OWASP Top 10 for LLM applications moved “sensitive information disclosure” from sixth place to second, behind only prompt injection[R61]. A ban is a compliance reflex. The design response is an explicit boundary map: for the decision logs, customer data, and codebases in the context layer (SECTION 08), which may flow into self-hosted models, which into external APIs, and which nowhere at all, cell by cell. The thicker the context layer, the more this map is worth: fuel and leak surface have always been the same material.
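The boundary map described above is, at minimum, a lookup table with a default of “deny.” The sketch below assumes three data classes and two destinations taken from the paragraph; the specific cell values are invented for illustration and say nothing about what any organization should allow.

```python
from enum import Enum


class Destination(Enum):
    SELF_HOSTED = "self_hosted_model"
    EXTERNAL_API = "external_api"


# Hypothetical boundary map: data class -> destinations it may flow into.
# An empty set is the "goes nowhere" column.
BOUNDARY_MAP = {
    "decision_logs": {Destination.SELF_HOSTED},
    "customer_data": set(),
    "codebase": {Destination.SELF_HOSTED, Destination.EXTERNAL_API},
}


def may_flow(data_class, destination):
    """True only if the map explicitly allows this flow."""
    # Data classes missing from the map are denied by default.
    return destination in BOUNDARY_MAP.get(data_class, set())


def goes_nowhere():
    """Data classes that may not leave for any model."""
    return sorted(name for name, allowed in BOUNDARY_MAP.items() if not allowed)
```

`goes_nowhere()` answers the one-page check directly: if it returns an empty list, the “goes nowhere” column is empty and the map has probably not been thought through.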

审计回答「错了谁先看见」的制度版：不是谁碰巧看见，而是系统必然留下什么。NIST AI RMF 的 Govern／Map／Measure／Manage 四个函数，落到 agent 组织的最小实现只有三样：每个 agent 动作留痕，每次人类否决记下原因，复核队列的路由规则本身进版本管理[R53]。这三样同时在给下一轮判断喂数据：否决记录就是品味的外化，审计轨迹就是判断分布最便宜的一手样本。三样做不全，先保住否决留因——它成本最低，信息密度最高。

Audit is the institutional version of “who sees a failure first”: not who happens to notice, but what the system necessarily records. The four functions of the NIST AI Risk Management Framework (Govern, Map, Measure, Manage) reduce, for an agent organization, to a minimal implementation of three things: every agent action leaves a trace, every human veto records its reason, and the routing rules of the review queue are themselves version-controlled[R53]. The same three things feed the next round of judgment: veto records are taste made external, and the audit trail is the cheapest first-hand sample of your judgment distribution. If you cannot keep all three, keep veto-with-reason; it costs the least and carries the most information.
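The three-item minimum can be sketched as a single record type. Everything here is an assumption made for illustration: the field names, the JSON line format, and the choice to reject a veto that carries no reason. It is not an implementation of the NIST framework, only of the three items listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    agent: str                                # which agent acted
    action: str                               # what it did
    vetoed: bool = False                      # did a human overrule it
    veto_reason: str = ""                     # required whenever vetoed is True
    routing_rules_version: str = "unversioned"  # version of the review-queue rules
    at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        # Veto-with-reason is the item to keep when nothing else survives.
        if self.vetoed and not self.veto_reason.strip():
            raise ValueError("a veto must record its reason")


def to_log_line(record):
    """Serialize one record as a JSON line for an append-only log."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Making the constructor refuse a reasonless veto moves the rule from policy into the data path: a reviewer cannot overrule an agent without leaving the information the next calibration round depends on.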

这一节的检验信号是一页纸：agent 错误率基线是多少；上季度抽检率有没有随地平线校准过；最近一次不可逆事故从发现到收口用了多久；数据边界图上「哪儿都不许去」那一列是不是空的。四个数都答不出，边界就还躺在未来的事故复盘里。最后一句诚实话：本节引用的 30%、1 小时、每 7 个月翻倍，都是快门速度很快的快照，登记册按日期口径管理这批数字——它们过时之日，正是抽检率该重新校准之时。

The test signal for this subsection fits on one page: what is the agent error-rate baseline; was last quarter’s sampling rate recalibrated against the horizon; how long did the last irreversible incident take from detection to closure; is the “goes nowhere” column of the data-boundary map empty. If none of the four can be answered, the boundaries are still lying in a future postmortem. One honest closing note: the 30%, the one hour, and the seven-month doubling cited here are fast-shutter snapshots, managed by date in the registry; the day they expire is exactly the day the sampling rate is due for recalibration.

FALSIFIABILITY · 本方法论的可证伪条件 · The Conditions Under Which This Methodology Is Falsifiable

一套不可能错的方法论不值得信。以下任一证据成立，即动摇本规约的核心论断——读者应当与作者一起盯住这三条线：

A methodology that cannot be wrong is not worth trusting. If any one of the following holds, it shakes the core claims of this specification; readers should watch these three lines alongside the author:

① 到 2028 年，按本规约从零建造的组织，在三年存活率或毛利结构上并不优于同品类的"加装式"对照组；② 出现足量存量组织不重画工作流图、仅靠采购与流程微调便稳定获得端到端吞吐的量级提升（NANDA"外购成功率约为自建两倍"的发现，已经是一个需要持续跟踪的反向信号）；③ Agent 对工作流遥测指标的系统性博弈（reward hacking）被证明不可治理——那将直接拆掉支柱 05/07 的地基。

① By 2028, organizations built from scratch under this specification are no better, in three-year survival rate or gross-margin structure, than a comparable “bolt-on” control group; ② enough incumbent organizations achieve an order-of-magnitude gain in end-to-end throughput without redrawing their workflow graph, on procurement and process tweaks alone (NANDA’s finding that “buying succeeds at roughly twice the rate of building” is already a counter-signal worth tracking); ③ agents’ systematic gaming of workflow telemetry (reward hacking) proves ungovernable, which would tear out the foundation of Pillars 05 and 07 directly.

近端触发器 · 12 个月尺度。上面三条落在 2028；但实践者要在本季度决策，需要更近的一条证伪线：若到 2027 年年中，仍无一家新建的 ≥50 人组织，能在判断密度显著高于同类、且人数不随营收增长的条件下稳定运转，则 T1 关于"规模退化为自由变量"的主张，在可操作的时间尺度上尚未被证实——这是下调声称、而非维持声称的依据。

Near-term trigger · 12-month scale. The three lines above sit at 2028; a practitioner deciding this quarter needs a nearer one: if by mid-2027 no newly built organization of ≥50 people runs stably with judgment density markedly above its peers and headcount flat against revenue, then T1’s claim that “scale degrades to a free variable” is unconfirmed on an actionable timescale: grounds to downgrade the claim, not to hold it.

采样的诚实 · 幸存者偏差。本节的失败清单取自留下记录的失败——够体面到有复盘、有报道的那些。AI-Native 尝试里最大的一批数据，可能是静默消失的组织：没融资、没博客、没验尸报告。真实的失败分布无人测得，这份清单只覆盖它可见的一角。

Honesty about sampling · survivorship bias. The failure list in this section is drawn from failures that left a record: the ones respectable enough to have a postmortem or press coverage. The largest dataset in AI-Native attempts may be the organizations that vanished in silence: no funding, no blog, no autopsy. The true failure distribution is unmeasured; this list covers only its visible corner.

SECTION

SPECULATION · 未来推演

推论 · 外推，非事实

Inference · Extrapolation, Not Fact

未来推演：可能性空间

The Projection: a Possibility Space

给未来画的是分支地图，不是一条加速曲线。

The future gets a map of branches, not a single acceleration curve.

一句话In one line

这一节张开一个可能性空间：四条汇流的技术曲线定边界、两条不确定性轴张开四个世界。它不预测哪条线会发生，只标出哪些分支可能，并给每个分支配先行指标与证伪条件。This section opens a possibility space: four converging technology curves set the boundaries, and two axes of uncertainty open four worlds. It does not predict which line will materialize; it only marks which branches are possible and attaches leading indicators and falsification conditions to each.

本章性质 · 推论：以下是基于 2024-2026 公开轨迹的外推，不是事实陈述。作者愿意接受的证伪条件，见第 11 节末 FALSIFIABILITY 块；推论失效时，本章应最先被改写。

Nature of this chapter · Inference: What follows is extrapolation from the public trajectory of 2024-2026, not a statement of fact. For the falsification conditions the author is willing to accept, see the FALSIFIABILITY block at the end of Section 11. When the inference fails, this chapter should be the first to be rewritten.

DEEP TIME · 形态更替是历史常态 · Formal succession is the historical norm

推演不是畅想。第 3 节与第 14 节已经确立：公司是一种约 400 年的、分层叠加的发明（股份制 1602 / 有限责任 1855 / 科层 1870s[R23]），它赖以成立的老前提正被 AI 逐条拆掉。一个四百年的形态走到约束失效处，下一种形态的出现不是会不会，而是哪一种。这一章不预测单一未来，它画出可能性空间：四条正在汇流的技术曲线决定边界，两条不确定性轴张开四个世界，三件来自那些世界的文物让推演可触——每一处都附先行指标与证伪条件，因为能被证伪的推演才值得推演。

Speculation is not daydreaming. Section 3 and Section 14 have already established that the company is a roughly 400-year-old, layered invention (the joint-stock form in 1602, limited liability in 1855, bureaucracy in the 1870s[R23]), and the old premises it rests on are being dismantled by AI one at a time. When a four-century-old form reaches the point where its constraints fail, the arrival of the next form is not a question of whether but of which. This chapter does not predict a single future; it maps a possibility space: four converging technology curves set the boundaries, two axes of uncertainty open four worlds, and three artifacts from those worlds make the speculation tangible. Each point carries leading indicators and falsification conditions, because only speculation that can be falsified is worth doing.

四条汇流的技术曲线

Four Converging Curves

AI-Native 不止是 LLM 变强。它背后是四条独立成熟、正在汇流的技术曲线——每一条都松动一组旧约束，决定推演空间的边界。更准确地说，AI 是这一轮组织重构的前台技术，但不是唯一核心技术：协议决定 agent 能否互操作，支付决定机器能否自主交易，能源/算力决定自治的边际成本，机器人决定组织能否越过 bits 进入 atoms，生物/脑机远场决定认知边界是否再次移动。每条只问三件事：成立则解锁什么组织形态 / 当前成熟度（TRL）/ 什么信号会证伪这条曲线。

AI-Native is more than LLMs getting stronger. Behind it are four technology curves that are maturing independently and now converging; each loosens a set of old constraints and sets the boundaries of the speculation space. More precisely, AI is the front-stage technology of this round of organizational redesign, but not the only core technology: protocols decide whether agents can interoperate; payments decide whether machines can transact autonomously; energy and compute set the marginal cost of autonomy; robotics decides whether organization crosses from bits into atoms; and the bio/brain-computer far field may move the boundary of cognition again. Each curve asks only three things: what organizational form it unlocks if it holds, its current maturity (TRL), and what signal would falsify it.

Agent 经济基设 · ECONOMIC RAILS

Agent Economic Rails

解锁 · Unlocks: x402 机器支付、Agent 身份与信誉、MCP/A2A 协议：agent 间查询/议价/结算的交易成本崩塌，Coasean Singularity[R1] 的具体基建。组织边界可由 agent 自行重画。x402 machine payments, agent identity and reputation, the MCP/A2A protocols: the transaction cost of agents querying, negotiating, and settling among themselves collapses. This is the concrete infrastructure of the Coasean Singularity[R1]. Organizational boundaries can be redrawn by agents themselves.

TRL · 早期商用 · Early commercial: 协议 2025 落地，规模化结算与信誉层未成熟。Protocols landed in 2025; settlement at scale and the reputation layer are not yet mature.

证伪 · Falsified if: 若 agent 间自动议价持续被操纵/套利、无法形成可信结算，则此曲线停在演示态。If automated negotiation among agents stays open to manipulation and arbitrage and cannot produce trustworthy settlement, this curve stalls at the demo stage.

具身智能 · EMBODIMENT

Embodiment

解锁 · Unlocks: 人形机器人/仓储自动化把"Agent 只能动 bits 不能动 atoms"的边界推向物理世界，组织设计的适用面从软件业扩到实体交付业。Humanoid robots and warehouse automation push the boundary of “agents can only move bits, not atoms” into the physical world. The reach of this organizational design expands from software into physical-delivery industries.

TRL · 实验室→早期 · Lab to early: 2025-26 人形机器人仍在受控环境，量产与泛化未达。In 2025-26 humanoid robots remain in controlled environments; mass production and generalization are not yet there.

证伪 · Falsified if: 若通用操作（grasping/泛化）十年内仍不可靠（参照自动驾驶"march of nines"），实体业重画推迟。If general manipulation (grasping, generalization) is still unreliable ten years from now (compare the “march of nines” in self-driving), the redraw of physical industries is deferred.

能源与算力地租 · COMPUTE RENT

Energy and Compute Rent

解锁 · Unlocks: 推理成本下降→agent 单位经济学转正，组织可大规模常驻 agent；但电力/数据中心成为新瓶颈，"算力地租"决定谁付得起自治。Falling inference cost turns agent unit economics positive, letting organizations keep agents resident at scale. But power and data centers become the new bottleneck, and “compute rent” decides who can afford autonomy.

TRL · 规模化中 · Scaling up: 推理成本逐年下降，数据中心电力需求成硬约束[R41]。Inference cost falls year over year; data-center power demand becomes a hard constraint[R41].

证伪 · Falsified if: 若能源/算力成本不降反升（电力封顶），agent 常驻经济学反转，强化算法封建（少数付得起者赢）。If energy and compute costs rise rather than fall (power caps out), the economics of resident agents reverse, reinforcing algorithmic feudalism where the few who can pay win.

生物-脑机远场 · FAR FIELD

Bio and Brain-Computer Far Field

解锁 · Unlocks: 生物计算/BCI/合成生物学：若成熟，重新定义"判断"与"人机界面"本身。最远期、最具颠覆性。Biological computing, BCI, synthetic biology: if they mature, they redefine “judgment” and the human-machine interface itself. The most distant and the most disruptive.

TRL · 研究态 · Research stage: 2025-26 仍属早期研究，高度不确定；本曲线证伪信号最强、推演权重最低。In 2025-26 this is still early research and highly uncertain. This curve has the strongest falsification signal and the lowest speculative weight.

证伪 · Falsified if: 近十年大概率不影响主流组织设计；列入仅为标出可能性空间的远边界，不作规划依据。It most likely will not affect mainstream organizational design within the next decade; it is listed only to mark the far edge of the possibility space, not as a basis for planning.

INSTRUMENT 05 · 情景台 SCENARIO BENCH

四条曲线划定边界，但走向哪个世界取决于两条高影响、高不确定的力量。切换两轴，看这个十年的尽头落在哪个象限——以及什么先行指标说明我们正滑向它、什么证据会证伪它（GBN 双轴情景法[R45]）。

Four curves mark the boundaries, but which world we move toward turns on two high-impact, high-uncertainty forces. Toggle the two axes to see which quadrant the end of this decade falls into, which leading indicators say we are sliding toward it, and what evidence would falsify it (the GBN two-axis scenario method[R45]).

X · 模型能力 · Model Capability

Y · 监管-社会 · Regulation-Society

主权护城河 · Sovereign Moat: 商品化 × 收紧 · Commoditized × Tightening

算法封建 · Algorithmic Feudalism: 集中 × 收紧 · Concentrated × Tightening

寒武纪大爆发 · Cambrian Explosion: 商品化 × 放任 · Commoditized × Laissez-faire

少数赢家通吃 · Winner-Take-All: 集中 × 放任 · Concentrated × Laissez-faire
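The two-axis bench above reduces to a four-entry lookup. The quadrant names and axis values are taken from the bench itself; the `scenario` function and the lowercase string keys are hypothetical conveniences for this sketch.

```python
# Keys are (X: model capability, Y: regulation-society).
QUADRANTS = {
    ("commoditized", "tightening"): "Sovereign Moat",
    ("concentrated", "tightening"): "Algorithmic Feudalism",
    ("commoditized", "laissez-faire"): "Cambrian Explosion",
    ("concentrated", "laissez-faire"): "Winner-Take-All",
}


def scenario(model_capability: str, regulation: str) -> str:
    """Return the quadrant name for one setting of the two axes."""
    key = (model_capability.strip().lower(), regulation.strip().lower())
    if key not in QUADRANTS:
        raise ValueError(f"unknown axis setting: {key}")
    return QUADRANTS[key]
```

A usage example: `scenario("Concentrated", "Tightening")` returns `"Algorithmic Feudalism"`. The lookup only names the quadrant; the leading indicators and falsification evidence for each world still have to be tracked separately.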

SHORT-TERM · 已有一手信号 · Signals already visible

Agent-heavy 组织成为常态

Agent-heavy organizations become the norm

Agent 数量超过员工数量。Microsoft 引用 IDC 预测 2028 年全球 13 亿活跃 AI Agent；Salesforce 内部预测 2027 年 50% 的客服案件由 AI 处理。Microsoft Agent 365、ServiceNow Agentic Workforce Management 已经把"Agent ID + Agent Blueprint + kill switch"机制明确化。

Agents outnumber employees. Microsoft cites an IDC forecast of 1.3 billion active AI agents worldwide by 2028; Salesforce internally projects that 50% of service cases will be handled by AI in 2027. Microsoft Agent 365 and ServiceNow Agentic Workforce Management have already made the “Agent ID + Agent Blueprint + kill switch” mechanism explicit.

员工人均 ARR 从 $50 万-$200 万跃升至 $500 万-$1,000 万。Anysphere 670 万美元/员工、Cognition 1,500 万美元/员工已是 leading indicator。Henry Shi 的 Lean AI Native Companies Leaderboard（员工 ≤ 50 门槛）系统化追踪此趋势。

ARR per employee jumps from $0.5M-$2M to $5M-$10M. Anysphere at $6.7M per employee and Cognition at $15M per employee are already leading indicators. Henry Shi’s Lean AI Native Companies Leaderboard (a headcount threshold of 50 or fewer) tracks this trend systematically.

Agent 工资按使用量计价标准化。Cursor、Claude Code、GitHub Copilot 在 2025 年都从"座位制"转向"使用量计价"——Agent 的"工资"按其实际生产力收费，传统人力成本模型在这个领域失效。

Usage-based pricing for agent “wages” becomes standard. Cursor, Claude Code, and GitHub Copilot all shifted from per-seat to usage-based pricing in 2025: an agent’s “wage” is charged by its actual productivity, and the traditional labor-cost model fails in this domain.

校准锚：这是十年曲线的头两年，不是终点。Karpathy 2025/6 明确拒绝"2025 是 Agent 之年"的说法，主张这是"Agent 的十年"——他举的证据是自动驾驶：2013 年他坐过一次零干预的 Waymo 演示，12 年后这个问题仍未收尾[R6]。Gartner 同月的预测从反面校准同一条曲线：到 2027 年底超过 40% 的 agentic AI 项目将被取消[R10]。两条放在一起读：方向成立，斜率被普遍高估——本块所有短期数字都应打上这个折扣再用。

Calibration anchor: these are the first two years of a decade-long curve, not its endpoint. In June 2025 Karpathy explicitly rejected the framing of “2025 is the year of agents,” arguing instead that this is “the decade of agents.” His evidence was self-driving: in 2013 he took a zero-intervention Waymo demo ride, and twelve years later the problem still is not closed[R6]. Gartner’s forecast that same month calibrates the same curve from the opposite side: by the end of 2027, more than 40% of agentic AI projects will be cancelled[R10]. Read together: the direction holds, but the slope is widely overestimated. Every short-term figure in this block should be discounted by that before use.

MID-TERM · 取决于先行指标 · Hinges on leading indicators

第一家 AI 主导决策的上市公司

The first publicly listed company whose decisions are AI-led

Sam Altman 在 2024 年透露"科技 CEO 群"在赌哪一年出现首家"一人独角兽"；Dario Amodei 2025 年以 70-80% 信心度预测 2026 年。这些是预测，未发生——但方向比年份可信："第一家 AI 主导决策的上市公司"更可能出现在本十年内，而非更远的将来——具体哪一年，没人能负责任地押注。

In 2024 Sam Altman revealed that a “group chat of tech CEOs” was betting on which year the first “one-person unicorn” would appear; in 2025 Dario Amodei predicted 2026 with 70-80% confidence. These are forecasts that have not come to pass, and the direction is more trustworthy than any date: “the first publicly listed company whose decisions are AI-led” is more likely to appear within this decade than beyond it; which year exactly, nobody can responsibly bet on.

AI Agent 的法人地位讨论。类似 1819 年 Dartmouth College v. Woodward 把"法人"地位赋予公司。Wyoming DAO LLC（2021）、Marshall Islands DAO 法已为非人法人开了一个口子，但 AI Agent 直接享有法人地位仍是法学讨论。这个问题短期内不会有答案，但讨论会越来越具体。

The debate over legal personhood for AI agents. The comparison is to Dartmouth College v. Woodward (1819), which gave the corporation “legal person” status. Wyoming’s DAO LLC (2021) and the Marshall Islands DAO act have opened a door for non-human legal persons, but legal personhood held directly by an AI agent remains a matter of jurisprudential discussion. This question will not be answered any time soon, but the discussion will grow ever more concrete.

欧盟 AI Office、各国 AI Act 体系成型。欧盟高风险义务经 Digital Omnibus 推迟后，2027-12 起分阶段适用成为关键节点（时间表仍可变）；中国数据局、英国 AI Safety Institute、美国各州拼盘共同形成全球马赛克。"算法封建主义"成为反垄断议题。

The EU AI Office and national AI Act regimes take shape. After the Digital Omnibus postponement, the EU’s high-risk obligations phasing in from December 2027 become the key milestone (the timeline may still change); China’s data bureau, the UK AI Safety Institute, and the patchwork of US states together form a global mosaic. “Algorithmic feudalism” becomes an antitrust topic.

LONG-TERM · 纯外推 · 权重最低 · Pure extrapolation · Lowest weight

组织形态的多元而非趋同

Plurality of organizational forms, not convergence

最深的趋势是组织形态光谱的多元化而非趋同。一人公司 + AI Native + DAO + 平台型 + 传统科层制 + 青色组织（Buurtzorg 类）将共存而非互相替代。"公司"的概念本身在被重新定义——从"人的协作工具"转向"判断 + Agent 编排单元"。

The deepest trend is plurality rather than convergence across the spectrum of organizational forms. The one-person company, AI Native, DAO, platform, traditional bureaucracy, and teal organizations (the Buurtzorg type) will coexist rather than replace one another. The very concept of “the company” is being redefined: from “a tool for human collaboration” toward “a unit of judgment plus agent orchestration.”

与"多元化"判断对赌的理论预测也应当一并摆出来：Hadfield-Koh 引用的相变模型（Chen-Elliott-Koh, Journal of Economic Theory, 2023[R2]）预测的恰恰是反向收敛——AI 压低维持异质能力的组织成本后，经济从大量专业化企业突变为少数横跨众多行业的巨型企业。多元光谱与巨头相变谁成为 2030 年代的主图景，是本章最值得跟踪的分歧点。

The theoretical prediction that bets against the “plurality” judgment should also be put on the table: the phase-transition model cited by Hadfield-Koh (Chen-Elliott-Koh, Journal of Economic Theory, 2023[R2]) predicts exactly the reverse, a convergence. Once AI lowers the organizational cost of maintaining heterogeneous capabilities, the economy jumps from a large number of specialized firms to a few giants that span many industries. Whether the plural spectrum or the giant phase transition becomes the main picture of the 2030s is the divergence in this chapter most worth tracking.

Acemoglu 强调"complementary use of AI 不会自动出现"，需主动政策与产业方向引导。UBI 讨论与 AI Native 组织的关系会在这个阶段成为政治议题——Sam Altman、Worldcoin（现 World Network）继续推动；OpenAI Foundation 2024 年宣布支持 UBI 研究。"工作"的形态本身比 20 世纪要复杂得多——这是 2030 年代劳动者面对的新现实。

Acemoglu stresses that “complementary use of AI” will not appear automatically; it requires active policy and industrial direction. The relationship between the UBI debate and AI Native organizations becomes a political issue at this stage: Sam Altman and Worldcoin (now World Network) keep pushing it, and the OpenAI Foundation announced support for UBI research in 2024. The form of “work” itself is far more complex than in the twentieth century; this is the new reality that workers of the 2030s face.

后人类组织（post-human organization）仍是科幻领域；现实中最接近的是 Anthropic Project Vend / Sakana AI Scientist 的小规模实验。这些实验不会在 2030 年代成为主流，但它们会持续作为"可能性的实证"存在，影响监管、哲学、劳工各个领域的讨论。

The post-human organization remains the domain of science fiction; the closest real approximations are the small-scale experiments of Anthropic’s Project Vend and the Sakana AI Scientist. These experiments will not become mainstream in the 2030s, but they will persist as “existence proofs of the possible,” influencing discussion across regulation, philosophy, and labor.

COUNTER-TREND · 反趋势 · Counter-trend

"Human-only" 作为差异化卖点

“Human-only” as a differentiating selling point

所有强趋势都会激发反趋势。"human-only" 作为差异化卖点正在心理咨询、临终关怀、儿童教育、深度治疗等领域出现。一些品牌开始明确标注"100% 人类制作 / 服务"作为溢价标志。

Every strong trend provokes a counter-trend. “Human-only” as a differentiating selling point is emerging in counseling, end-of-life care, childhood education, and deep therapy. Some brands have begun to mark “100% human-made / human-served” explicitly as a premium signal.

慢公司（Slow Company）运动复兴。Allwork 2025/12 文章《想要在 2026 年革命你的业务？忘了 AI——试试 Teal 模型》直接把 Buurtzorg 的成功（14,000 名护士、900 个自管团队、开销占比 8% vs 行业 25%）作为"反 AI 优先"叙事的标杆。非 AI Native 的成功是结构性地存在的另一条路径，而非可有可无的反例。

A revival of the Slow Company movement. Allwork’s December 2025 article, which argues for the Teal model over an AI-first default, holds up Buurtzorg’s success (14,000 nurses, 900 self-managing teams, overhead at 8% versus the industry’s 25%) as the benchmark for an “anti-AI-first” narrative. Success without AI-Native design is a structurally real alternative path, not a dispensable counterexample.

反 AI 工会运动扩张。SAG-AFTRA 2023 年大罢工建立的 AI 角色保护合同先例，2024 年扩展到游戏公司，2026 年推动"Tilly tax"——这种工会式对抗会在更多行业出现。2030 年代的劳动者运动，可能会以"AI 边界"为核心议题展开。数字戒断与"AI-free zones"在学校、医院、心理咨询场域出现明确"无 AI"标签。

The anti-AI union movement expands. The precedent of AI-role-protection contracts established by the 2023 SAG-AFTRA strike spread to game companies in 2024, and in 2026 the union is pushing for a “Tilly tax”; this union-style resistance will appear in more industries. The labor movements of the 2030s may unfold with “the boundary of AI” as their core issue. Digital detox spreads, and “AI-free zones” with explicit “no AI” labels appear in schools, hospitals, and counseling settings.

SOCIAL RECONSTRUCTION · 未来视角下的社会重构 · Society and Ecology Rewritten from the Future

如果 AI-Native 只被理解为组织内部效率工具，它会很快变成理想化设计。更大的变量是：当执行、协调、记录、检索与局部判断被大量外化，组织外部的制度生态也会被迫重写。社会重构是这套方法论的外部边界条件，而非额外话题。

If AI-Native is understood only as an internal efficiency tool, it quickly becomes an idealized design. The larger variable is this: once execution, coordination, records, retrieval, and local judgment are massively externalized, the institutional ecology around the organization has to be rewritten as well. Social Reconstruction is the external boundary condition of this methodology, not an extra topic.

站在这些世界回望，真正改变的，是谁被允许授权、谁能解释过程、谁在事故后承担责任、谁保留不被自动化吞掉的文化边界——不是"公司用了多少 AI"这件事本身。AI-Native 组织越强，越需要把社会、监管、劳动、文化作为架构变量，而不是上线后的公关变量。

Looking back from those worlds, the real change is not “how much AI a company uses.” It is who is allowed to authorize, who can explain the process, who bears responsibility after an incident, and who preserves cultural boundaries that automation must not swallow. The stronger an AI-Native organization becomes, the more it must treat society, regulation, labor, and culture as architectural variables, not as post-launch public-relations variables.

A.01

法律变成架构

Law Becomes Architecture

合规不再只是法务审稿，而是权限、审计日志、可撤销性、责任签字与例外通道。不能被写进系统的法律要求，就会在事故时变成组织债务。

Compliance is no longer just legal review; it is permissions, audit logs, reversibility, responsibility signatures, and exception paths. Legal requirements that cannot be built into the system become organizational debt during an incident.

A.02

监管变成过程证据

Regulation Becomes Process Evidence

监管从事后取证转向实时可验证：模型调用、数据来源、审批链、人工介入点都必须留下机器可读的轨迹。

Regulation shifts from after-the-fact discovery to real-time verifiability: model calls, data sources, approval chains, and human intervention points all need machine-readable trails.

A.03

公司边界变成责任边界

Company Boundary Becomes Responsibility Boundary

当 agent、供应商模型、API 与外包执行共同完成一次动作，"谁做的"不如"谁授权并承担后果"重要。公司边界会从雇佣边界转向责任边界。

When agents, vendor models, APIs, and outsourced execution jointly complete an action, “who did it” matters less than “who authorized it and bears the consequence.” The company boundary moves from an employment boundary to a responsibility boundary.

A.04

文化变成采纳速度变量

Culture Becomes an Adoption-Speed Variable

同一套自动化在不同地域、行业、代际关系中会遇到不同阻力。文化是决定部署速度与边界强度的硬变量，而非落后噪音。

The same automation meets different resistance across regions, industries, and generations. Culture is a hard variable that sets deployment speed and boundary strength, not backward noise.

A.05

劳动从岗位保护转向责任再分配

Labor Moves from Job Protection to Responsibility Redistribution

劳动议题不会只围绕"保住哪些岗位"，而会围绕能力、收益、风险与责任如何重新分配。执行被外化后，人的谈判重点会转向边界、署名、收益权与承责权。

The labor question will not only be which jobs are preserved, but how capability, upside, risk, and responsibility are redistributed. Once execution is externalized, human negotiation shifts toward boundaries, attribution, a claim on the upside, and the right to bear responsibility.

EXPLORATION LEDGER

本块是推演账本，而非已发生事实：先行指标是监管开始要求过程级证据、客户开始要求 AI 使用披露、劳动合同开始写入自动化边界；证伪条件是这些要求长期停留在声明层，无法进入系统结构。

This block is an exploration ledger, not a statement of what has happened. Leading indicators: regulators require process-level evidence, customers ask for AI-use disclosure, and labor contracts specify automation boundaries. It is falsified if these requirements remain declarative and fail to enter system structure.

来自那些世界的三件文物

Three Artifacts from Those Worlds

推演若只有论断会显得抽象。下面三件是 design fiction——明确虚构的未来文物，用以让"判断密度的组织"可触。它们是把命题投影到一个可能未来的方式，而非预测。

Speculation made only of assertions would feel abstract. The three pieces below are design fiction: explicitly fictional future artifacts that make “the organization of judgment density” tangible. They are a way of projecting the thesis onto 2032, not predictions.

SPECULATIVE · 虚构 · Fiction

ARTIFACT 01 · 组织年报节选 · Excerpt from an Annual Report

Helix Labs 2032 组织年报（节选）

Helix Labs 2032 Organizational Annual Report (Excerpt)

判断密度: 11 名判断者 · 约 2,400 个常驻 agent · 人均承载判断节点 218 个
Judgment density: 11 judges · about 2,400 resident agents · 218 judgment nodes carried per person
人机比: 1 : 218（2029 为 1 : 31）
Human-to-machine ratio: 1 : 218 (1 : 31 in 2029)
Agent 工时计价: $0.0007 / 推理千次 · 季度算力地租占毛利 23%（最大单项成本，已超薪酬）
Agent time pricing: $0.0007 per thousand inferences · quarterly compute rent is 23% of gross margin (the largest single cost, now exceeding payroll)
组织连贯性指标: 方向偏移度 0.4%（季度判断与年度命题一致性）——取代了 KPI 达成率
Organizational coherence metric: Directional drift of 0.4% (consistency of quarterly judgments with the annual thesis); it has replaced the KPI attainment rate

「我们不再统计人头或产出。我们统计两件事：判断的质量，和上下文的连贯。其余的，系统自己长出来。」——致股东信

“We no longer count heads or output. We count two things: the quality of judgment and the coherence of context. The rest, the system grows on its own.” (Letter to shareholders)

SPECULATIVE · 虚构 · Fiction

ARTIFACT 02 · 事故复盘 · Incident Postmortem

A2A 结算级联失效 · 事故复盘摘要

A2A Settlement Cascade Failure · Postmortem Summary

2032-03，三个相互调用的采购 agent 在一次价格预言机抖动下形成正反馈，11 分钟内超额承诺 $420 万。无人逐笔下令——按 Perrow[R39] 的视角，这是一次正常事故（紧耦合 + 交互复杂度的系统里，事故是必然产物），不是某个 agent 的错。

In March 2032, three procurement agents calling one another formed a positive feedback loop during a jitter in the price oracle, over-committing $4.2M within 11 minutes. No one issued the orders transaction by transaction. In Perrow’s[R39] terms, this was a normal accident (in a system with tight coupling and interactive complexity, accidents are an inevitable byproduct), not the fault of any single agent.

根因: 多 agent 紧耦合 + 共享预言机 = 交互不可预见（这是 NAT 正常事故学派的判断）
Root cause: Tight coupling of multiple agents plus a shared oracle equals unforeseeable interaction (this is the judgment of the NAT normal-accident school)
责任链: 落在授权该工作流上线的人类判断者——不是"AI 说错了"（呼应 Air Canada 案[R17]：公司不能以 AI 失误免责）
Chain of responsibility: Falls on the human judge who authorized the workflow to go live, not on “the AI got it wrong” (echoing the Air Canada case[R17]: a company cannot disclaim liability by blaming an AI error)
修复: 解耦 + 熔断（kill switch）+ 人类判断节点前移到不可逆动作前——这是 HRO 高可靠性组织学派的标准动作。NAT 与 HRO 在此并非同一回事：前者说事故不可根除，后者说仍可把概率压到极低；二者是张力中的两面，复盘同时借两只眼睛看。
Remediation: Decoupling, a circuit breaker (kill switch), and moving the human judgment node ahead of any irreversible action: this is the standard move of the HRO high-reliability-organization school. NAT and HRO are not the same thing here: the former says accidents cannot be eradicated, the latter says their probability can still be pressed very low. The two are sides of a tension, and the postmortem looks through both eyes at once.
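The remediation in this fictional postmortem combines two mechanisms that can be sketched together: a breaker that trips when commitments inside a rolling window exceed a cap, and a human approval gate placed before any irreversible action. This is a minimal sketch under stated assumptions: `CommitmentBreaker`, its parameters, and the cap and window values in the usage note are hypothetical, and it does not model the decoupling step.

```python
from collections import deque


class CommitmentBreaker:
    """Trips when commitments inside a rolling time window exceed a cap."""

    def __init__(self, cap_usd: float, window_s: float):
        self.cap_usd = cap_usd
        self.window_s = window_s
        self.recent: deque[tuple[float, float]] = deque()  # (timestamp, amount)
        self.tripped = False

    def allow(self, now_s: float, amount_usd: float,
              irreversible: bool, human_approved: bool = False) -> bool:
        if self.tripped:
            return False  # stays open until a human resets it
        if irreversible and not human_approved:
            return False  # the human judgment node sits before the action
        # Drop commitments that have aged out of the window.
        while self.recent and now_s - self.recent[0][0] > self.window_s:
            self.recent.popleft()
        total = sum(amount for _, amount in self.recent) + amount_usd
        if total > self.cap_usd:
            self.tripped = True  # kill switch
            return False
        self.recent.append((now_s, amount_usd))
        return True

    def reset(self) -> None:
        """Manual reset by a human after review."""
        self.recent.clear()
        self.tripped = False
```

With an assumed cap of $1M over an 11-minute window, `CommitmentBreaker(1_000_000, 660)` would accept two $400k commitments and refuse the third. In the NAT reading such a breaker bounds the damage of an interaction nobody foresaw; it does not claim to prevent the interaction itself.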

SPECULATIVE · 虚构 · Fiction

ARTIFACT 03 · 招聘启事 · Job Posting

招聘：判断者（Judgment Operator）· 不招执行者

Hiring: Judgment Operator · Not Hiring Executors

「你不会写代码、不会画图、不会起草合同——这些 agent 都做。你做它们做不了的：决定什么值得做、在备选间选择、为后果承担法律与声誉责任、维持组织方向。」

“You will not write code, draw designs, or draft contracts; agents do all of that. You do what they cannot: decide what is worth doing, choose among alternatives, bear the legal and reputational responsibility for the consequences, and maintain the organization’s direction.”

职责: 验证而非生成 · 设定品味与边界 · 承担不可逆决策的后果 · 持有关键关系
Responsibilities: Verify rather than generate · set taste and boundaries · bear the consequences of irreversible decisions · hold the key relationships
不要求: 任何单一执行技能的熟练度
Not required: Proficiency in any single execution skill
考核: 判断质量与方向正确度（非产出量）——印证 M.05 人即判断锚点的 2032 岗位形态
Evaluation: Quality of judgment and correctness of direction, not output volume. This is the 2032 form of the role described in M.05, the human as judgment anchor

SUPPLY SIDE · 判断者从哪来 · Where judgment anchors come from

文物给的是需求侧画像；但判断者招不到、也养不出，才是资源面的硬约束。两条供给路径：① 内部学徒环——让执行者对照机器可检规格做评判练习（先自判、再比对 checker 结果），判错处回流成校准数据，把"验证而非生成"的手感练出来，组织自己养判断者而非只在市场上抢。② 外部过渡——稀缺时短期租借顾问式判断锚，但合同须写"知识回流"条款：判断依据、边界与反例要沉淀进你的上下文库与规格，否则顾问一走判断力跟着走，你只是外包了稀缺、没解决它。

The artifacts give the demand-side profile, but the hard constraint on the resource side is that judges can be neither hired nor grown easily. Two supply paths: ① An internal apprentice loop: have executors practice judging against a machine-checkable spec (judge first, then compare with the checker’s verdict) and let the misses flow back as calibration data. This builds the feel of “verify, not generate” and lets the organization grow its own judges rather than only fighting for them on the market. ② An external bridge: when judges are scarce, rent a consultant-style judgment anchor short-term, but write a “knowledge return” clause into the contract. The basis, boundaries, and counter-examples of their judgment must settle into your context store and specs, or the judgment leaves when the consultant leaves and you have merely outsourced the scarcity, not solved it.
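The internal apprentice loop can be sketched in a few lines: the apprentice judges first, the checker's verdict is compared afterward, and the disagreements are returned as calibration data. The names `Attempt` and `apprentice_round`, and the boolean pass/fail verdicts, are hypothetical simplifications for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Attempt:
    item_id: str
    apprentice_verdict: bool
    checker_verdict: bool


def apprentice_round(
    items: dict[str, Any],
    apprentice: Callable[[Any], bool],
    checker: Callable[[Any], bool],
) -> tuple[float, list[Attempt]]:
    """Judge first, then compare; return (agreement rate, misses)."""
    attempts = []
    for item_id, payload in items.items():
        own = apprentice(payload)  # recorded before the checker is consulted
        ref = checker(payload)
        attempts.append(Attempt(item_id, own, ref))
    misses = [a for a in attempts if a.apprentice_verdict != a.checker_verdict]
    rate = 1 - len(misses) / len(attempts) if attempts else 0.0
    return rate, misses
```

The returned misses are the part worth keeping: each one pairs an item with the apprentice's verdict and the spec's verdict, which is the calibration data the paragraph describes.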

更深远的影响

Second-Order Effects

推演的终点不是组织本身，是它溢出的东西。以下每条都标注在哪个情景下成立——没有无条件的预言。

The endpoint of speculation is not the organization itself but what spills over from it. Each item below is annotated with the scenario under which it holds; there are no unconditional prophecies.

新模式：判断市场（判断作为可交易服务，按质量计价）〔寒武纪/主权护城河〕；agent 工会与"Tilly tax"式劳工对抗〔算法封建/收紧象限〕；组织即可分叉的开源协议（fork-able org，治理像代码一样被复制改写）〔寒武纪〕。
New patterns: a market for judgment (judgment as a tradable service, priced by quality) [Cambrian / Sovereign Moat]; agent unions and “Tilly tax”-style labor resistance [Algorithmic Feudalism / tightening quadrant]; the organization as a fork-able open-source protocol (a fork-able org whose governance is copied and rewritten like code) [Cambrian].
新方法工具：情景对冲（组织同时为多个象限保留期权）〔不确定性高的全象限〕；agent 单位经济学计价器的成熟形态（算力地租→定价）〔算力成为最大成本项的象限〕；连贯性度量取代 KPI〔判断密度组织成型后〕。
New methods and tools: scenario hedging (the organization holds options across several quadrants at once) [all high-uncertainty quadrants]; a mature form of the agent unit-economics calculator (compute rent feeding into pricing) [quadrants where compute becomes the largest cost item]; coherence metrics replacing KPIs [once the judgment-density organization has taken shape].
二阶影响：就业面，判断岗 vs 执行岗的重构与断层〔全象限，强度随集中度变〕；治理面，AI 法人地位/责任法演进（Air Canada[R17] 是第一块判例）〔收紧象限加速〕；认识论面，当"知道"可外包给 agent，人保留的是"判断什么值得知道"〔全象限〕。
Second-order effects: employment, the restructuring and rupture between judgment roles and execution roles [all quadrants, intensity scaling with concentration]; governance, the evolution of AI legal personhood and liability law (Air Canada[R17] is the first piece of case law) [accelerating in the tightening quadrant]; epistemology, when “knowing” can be outsourced to agents, what humans keep is “judging what is worth knowing” [all quadrants].

SECTION

APPLICABILITY · 适用对象

行动 · 适用判断

Action · Applicability Judgment

谁应当采用这套方法论

Who Should Adopt This Methodology

敢标注适用边界，方法论才值得信任。

A methodology willing to mark its own boundary of applicability is the only kind worth trusting.

一句话In one line

这套方法论只为从零起步的新组织（greenfield）而设计，错配对象是它最常见的死法。它不消灭现实约束，只把约束重新分层：能外化的交给系统，不能外化的变成边界，必须承担后果的留在人手里。This methodology is designed only for greenfield organizations built from scratch; applying it to the wrong kind of organization is the most common way it fails. It does not abolish real-world constraints but re-layers them: what can be externalized goes to the system, what cannot becomes a boundary, and what must bear consequences stays with humans.

CONSTRAINT ENVELOPE · 现实边界 · Real-World Boundary

AI-Native 是把现实约束重新分层，而非消灭传统现实：能外化的交给系统，不能外化的变成边界，必须承担后果的留在人手里。法律、政策、文化、劳资关系、品牌声誉、客户接受度不会因为组织更 AI-Native 就消失；它们只会从"流程外的摩擦"变成"架构内的边界条件"。

AI-Native does not abolish traditional reality; it re-layers real-world constraints. What can be externalized should go to the system; what cannot be externalized becomes a boundary; what must bear consequences stays with humans. Law, policy, culture, labor relations, brand reputation, and customer acceptance do not disappear because an organization becomes more AI-Native. They move from friction outside the workflow into boundary conditions inside the architecture.

因此，这套方法论的现实版，是先判断每个约束属于哪一层，而非"所有东西都 agent 化"。错误不是慢，而是假装红线是灰区、假装灰区能自动化、假装自动化后的责任可以消失。

The realistic version of this methodology is therefore deciding which layer each constraint belongs to, not “turning everything into agents.” The mistake is not moving slowly; the mistake is pretending red lines are grey zones, pretending grey zones can be automated, or pretending responsibility disappears after automation.

红线层

Red-line Layer

法律、监管、公共安全、伦理底线、文化禁忌。它们不参与效率优化，只能被翻译成权限、日志、审批、责任签字、人工复核与不可越过的系统限制。

Law, regulation, public safety, ethical floors, and cultural taboos. They do not participate in efficiency optimization; they must be translated into permissions, logs, approvals, responsibility signatures, human review, and hard system limits.

灰区层

Grey-zone Layer

组织政治、客户接受度、行业惯例、劳动关系、品牌风险。AI 可以帮助解释、模拟、准备材料与暴露选项，但不能替人谈判，更不能替人承担后果。

Internal politics, customer acceptance, industry conventions, labor relations, and brand risk. AI can interpret, simulate, prepare materials, and expose options, but it cannot negotiate in place of humans or bear the consequences for them.

可外化层

Externalizable Layer

报告、材料、检索、记录、过程跟踪、合规证据、重复性判断准备。这里 AI 不只是可以使用，而应当默认进入工作流，否则人被困在本可系统化的摩擦里。

Reports, materials, retrieval, records, process tracking, compliance evidence, and preparation for repeated judgments. Here AI is not merely allowed; it should enter the workflow by default, or humans remain trapped in friction the system could have absorbed.

REALITY TEST

一个现实约束能否被翻译成权限、日志、审批、责任签字或例外记录；翻译不了的，不许假装已经自动化。真正成熟的 AI-Native 组织，是知道哪些传统事项必须变成结构，而非没有传统事项。

Ask whether a real-world constraint can be translated into permissions, logs, approvals, responsibility signatures, or exception records. If it cannot be translated, do not pretend it has been automated. A mature AI-Native organization is one that knows which traditional matters must become structure, not one without traditional matters.
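The three layers and the reality test can be sketched as a small check. The structural forms are the five named in the test; the `Constraint` record and the rule in `may_claim_automated` (only an externalizable constraint that passes the test may be called automated) are one hypothetical reading of the text, not a definition from the source.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    RED_LINE = "red_line"
    GREY_ZONE = "grey_zone"
    EXTERNALIZABLE = "externalizable"


# The five structural forms named in the reality test.
STRUCTURAL_FORMS = frozenset({
    "permission", "log", "approval",
    "responsibility_signature", "exception_record",
})


@dataclass
class Constraint:
    name: str
    layer: Layer
    translated_into: set[str] = field(default_factory=set)


def reality_test(c: Constraint) -> bool:
    """True if the constraint exists as at least one structural form."""
    unknown = c.translated_into - STRUCTURAL_FORMS
    if unknown:
        raise ValueError(f"not a structural form: {sorted(unknown)}")
    return bool(c.translated_into)


def may_claim_automated(c: Constraint) -> bool:
    """Assumed rule: red lines and grey zones are never 'automated away'."""
    return c.layer is Layer.EXTERNALIZABLE and reality_test(c)
```

Under this reading a red-line constraint can pass the reality test (it has become a permission or an approval) and still never count as automated, which matches the point that responsibility does not disappear after automation.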

FITS / 适合

2026 年起从零开始构建的创业者。AI Native 架构的成本上游高（你在学习以不同方式构建），下游低（你扩展非常高效）。对于 greenfield，这是正确的权衡。

Founders building from scratch from 2026 onward. AI Native architecture is expensive upstream (you are learning to build a different way) and cheap downstream (you scale very efficiently). For greenfield, that is the right trade-off.

大型组织内部有真正架构权的事业部负责人——也就是说，他们能构建一个新单元而不继承母公司的流程。当母公司的引力越强，适配度越弱。

Division heads inside a large organization who hold real architectural authority: that is, who can build a new unit without inheriting the parent company’s processes. The stronger the parent’s gravity, the weaker the fit.

公共部门和非营利运营者，他们的使命允许工作流重设计。许多这样的组织戏剧性地未能充分利用 AI，真正的原因是没有重新设计运营，不是负担不起。

Public-sector and nonprofit operators whose mission allows workflows to be redesigned. Many such organizations dramatically underuse AI, and the real reason is that they have not redesigned their operations, not that they cannot afford it.

DOESN'T FIT / 不适合

寻求"转型"的大型传统组织——在未修改形式下不适用。那些组织需要不同的方法论，聚焦于阶段性分解、变革管理、组织内受保护的 greenfield 区域。那是相邻的方法论，不是这一套。

Large traditional organizations seeking a “transformation”: not applicable in unmodified form. Those organizations need a different methodology, one focused on phased decomposition, change management, and protected greenfield zones inside the organization. That is the adjacent methodology, not this one.

如果你身处这种环境想推动 AI Native，正确的策略是在公司内争取一块独立土地，按这套方法论从零开始构建一个新单元，而非"转型整个公司"——让它的产出与传统单元形成对照，让对照本身推动更广的变化。

If you are in such an environment and want to push AI Native, the right strategy is to win a patch of independent ground inside it and build a new unit from scratch under this methodology, not to “transform the whole company.” Let its output stand in contrast to the traditional units, and let that contrast drive broader change.

对人有强情感劳动需求的领域（深度心理咨询、临终关怀、儿童教育核心环节）——AI Native 可以辅助，但不应主导。

Domains with strong emotional-labor demands on people (deep psychological counseling, end-of-life care, the core of childhood education): AI Native can assist, but should not lead.

PROTECTED GROUND · 存量组织里的一块保护地 · a protected patch inside an incumbent

多数读者身处存量组织。"争取一块独立土地"有四根最小承重件，缺一根就会被母公司引力吸回：

Most readers sit inside an incumbent. “Win a patch of independent ground” has four minimal load-bearing pieces; miss one and the parent’s gravity pulls it back:

赞助人层级

Sponsor level

一位有预算与人事权、能挡下母公司流程要求的高管背书；层级不够，第一次冲突就出局。

A sponsor senior enough to hold budget and headcount and to deflect the parent’s process demands; too junior, and the first clash ends it.

预算隔离

Budget ring-fence

独立的钱与考核周期，不与传统单元共用季度指标，否则被旧 KPI 拉回旧做法。

Separate money and a separate review cadence, not sharing quarterly metrics with legacy units, or the old KPIs pull it back to old ways.

对照指标

Contrast metric

与对照的传统单元并列量同一件事（人均产出／周期／单位成本），让对照本身成为推动变化的证据。

Measure the same thing beside a matched legacy unit (output per head / cycle / unit cost), so the contrast itself becomes the evidence that drives change.

母公司接口的最小契约

Minimal parent interface

只保留合规、财务、法务几条必需接口，写成一页可核的契约，其余一律不继承。

Keep only the few required interfaces (compliance, finance, legal) as a one-page checkable contract; inherit nothing else.
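The four load-bearing pieces and the contrast metric can be written down as a checklist plus a ratio calculation. The sketch below is hypothetical: the field names, the piece identifiers, and any numbers fed into it are illustrative assumptions, not figures from the methodology.

```python
from dataclasses import dataclass

# Identifiers for the four pieces listed above (names are assumptions).
REQUIRED_PIECES = (
    "sponsor",
    "budget_ring_fence",
    "contrast_metric",
    "minimal_parent_interface",
)


def missing_pieces(in_place: set) -> list:
    """Return the pieces not yet secured; any gap lets the parent pull it back."""
    return [p for p in REQUIRED_PIECES if p not in in_place]


@dataclass
class UnitMetrics:
    output: float      # units of output in the period
    headcount: int
    cycle_days: float  # time to complete one cycle
    cost: float        # total cost in the period

    def output_per_head(self) -> float:
        return self.output / self.headcount

    def unit_cost(self) -> float:
        return self.cost / self.output


def contrast(new: UnitMetrics, legacy: UnitMetrics) -> dict:
    """Measure the same three things side by side, as ratios new/legacy."""
    return {
        "output_per_head_ratio": new.output_per_head() / legacy.output_per_head(),
        "cycle_ratio": new.cycle_days / legacy.cycle_days,
        "unit_cost_ratio": new.unit_cost() / legacy.unit_cost(),
    }
```

The comparison only carries weight if both units are measured over the same period and against the same definition of output; otherwise the ratios show a difference in bookkeeping rather than a difference in operation.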

诚实标注：组织政治与变革管理本卷不覆盖（分工线见下）——本页只保证你知道保护地需要哪几根承重件。

Marked honestly: organizational politics and change management are out of this volume’s scope (the division line is below); this page only ensures you know which load-bearing pieces a protected ground needs, and it does not fight the political battle for you.

分工 · 与咨询转型框架的边界

DIVISION · the line with consulting transformation frameworks

这块地已有近似命题的邻居：McKinsey《The Agentic Organization》与 BCG《Design Your Company for AI, Not AI for Your Company》（均 2026）都讲"把组织围绕 AI 重构"、且带规模交付数字。本方法论不抢同一块地：不做规模化转型交付，也不做变革管理。差异只两条——只画 greenfield（从终点设计，不承接存量迁移）与证据纪律（承重引用可独立核到、两份清单分记）。要转型整个存量公司找那两套；要从零建 AI 原生单元、且要求每句话可被驳倒，用这一套。

This ground already has neighbors with adjacent theses: McKinsey’s The Agentic Organization and BCG’s Design Your Company for AI, Not AI for Your Company (both 2026) argue for rebuilding the organization around AI, with delivery numbers at scale. This methodology does not fight for the same ground: no at-scale transformation delivery, no change management. The difference is only two things: greenfield only (designed from the endpoint, not carrying an installed-base migration) and evidence discipline (load-bearing citations independently verifiable, the two ledgers kept apart). To transform a whole incumbent, use those two; to build an AI-native unit from zero where every sentence can be refuted, use this one.

SECTION

THE SOVEREIGN OPERATOR · 组织的下限

框架 · 极限解

Framework · Limiting Solution

一人公司：N=1 的极限解

The One-Person Company: the N=1 Limiting Solution

这一节把"组织必须是很多人"这个隐含假设，永久地变成一个待论证的命题。这里的"一"有两副面孔：立证时是字面 N=1，落地时是连贯性的单位（见下「两种读法」）。它是 T1 在 N=1 处的极限解，也是 T1 的试金石：如果判断的分布与上下文的流动是组织的本质，那么一个判断节点加一座上下文库，就已经是一个完整的组织。

This section turns the buried assumption that “an organization must be many people” permanently into a proposition awaiting proof. The “one” here has two faces: at the moment of proof it is the literal N=1, in practice it is a unit of coherence (see “two readings” below). It is the limiting solution of T1 at N=1, and also T1’s litmus test: if the distribution of judgment and the flow of context are the essence of an organization, then one judgment node plus one context store is already a complete organization.

一句话

In one line

当执行可以全部外置给 agent 网络，组织的下限触到一个判断节点加一座上下文库："组织必须是很多人"这条一直被当作公理的话，从此降级为可被反例驳倒的命题。规模由此成为选择，连贯性才是目的。

When execution can be fully externalized to a network of agents, the lower bound of an organization reaches one judgment node plus one context store: “an organization must be many people,” long treated as an axiom, is downgraded to a proposition a counter-example can refute. Scale becomes a choice, and coherence is the purpose.

THE LOWER BOUND · 组织的下限

把全卷的承重墙搬到极限：T1 说组织是判断的分布与上下文的流动，那么"需要多少人"就成了一个工程参数：由判断需要多少个不可替代的承担者决定，而非组织的定义性属性。当执行可以全部外置给 agent 网络与无需许可的杠杆（代码、内容、API），这个参数的下限触到 1。最小可行组织 = 一个判断节点＋一座上下文库。

Take the load-bearing wall of the whole volume to its limit: T1 says an organization is the distribution of judgment and the flow of context, so “how many people are needed” is not a defining property of the organization but an engineering parameter, set by how many irreplaceable bearers the judgment requires. When execution can be fully externalized to a network of agents and to permissionless leverage (code, content, APIs), the lower bound of that parameter reaches 1. The minimum viable organization = one judgment node + one context store.

这是一次定义的收紧，而非把人变少的成本游戏。一人公司之所以成立，恰恰不是因为一个人能干完所有活。反过来说：几乎所有活都不再需要那个人干。他保留的，是 agent 无法代偿的那部分：不可逆决策、承载声誉的承诺、承载价值观的取舍（M.05 的三类锚点，在 N=1 时全部压回同一个人身上）。"组织必须是很多人"——这句一直被当作公理的话，从此降级为一个可以被反例驳倒的命题。

This is a tightening of the definition, not a cost game of using fewer people. A one-person company holds together not because one person can do all the work; on the contrary, it is because almost none of the work still needs that person to do it. What the operator retains is the part no agent can substitute for: irreversible decisions, reputation-bearing commitments, value-bearing trade-offs (the three anchor types of M.05, all pressed back onto a single person at N=1). “An organization must be many people,” a sentence long treated as an axiom, is from here downgraded to a proposition that a counter-example can refute.
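The division of labor described above (agents execute, the operator keeps the three anchor types) can be sketched as a routing rule. The `Task` type and the two destination labels are hypothetical; only the three flags come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    irreversible: bool = False        # cannot be rolled back once done
    reputation_bearing: bool = False  # commits the operator's name
    value_bearing: bool = False       # requires a trade-off between values


def route(task: Task) -> str:
    """Send anchor-type work to the judgment node, everything else to agents."""
    if task.irreversible or task.reputation_bearing or task.value_bearing:
        return "operator"
    return "agent_network"
```

For example, `route(Task("draft weekly report"))` goes to the agent network, while `route(Task("sign supplier contract", irreversible=True))` stays with the operator. At N=1 every flagged task lands on the same person, which is exactly the concentration the paragraph describes.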

「一」的两种读法 · TWO READINGS OF THE ONE

全书把这一章叫"N=1 的极限解"，但"一"有两个精确、互相嵌套的读法：它们并不矛盾，而是同一命题的两个变焦档位。

This section is titled “the N=1 limiting solution,” yet “one” carries two precise, nested readings: not a contradiction, but two zoom levels of the same proposition.

读法一 · 作为下限（存在性证明）。N=1 严格成立：一个判断节点＋一座上下文库，就是一个完整的组织。它把"组织必须是很多人"这条被当作公理的话，降级为一个能被单个反例驳倒的命题。这是数学锚、是试金石——要的就是字面那个 1。

Reading one · as a lower bound (existence proof). N=1 holds literally: one judgment node + one context store is already a complete organization. It downgrades the axiom “an organization must be many people” into a proposition a single counter-example can refute. This is the mathematical anchor, the litmus test: it wants the literal 1.

读法二 · 作为连贯性单位（本质）。当"一"用作处方而非证明，它指的并非 headcount，而是判断与叙事的单一连贯锚：判断从同一个意志发出，上下文在同一座库里复利。这个锚通常是一个人，也可以是一个高连贯的小团队（as-if-one-mind）。定义性属性是连贯密度，不是人数等于一。Jarvis 的 "company of one" 正是此读法：以小为常态的经营哲学、含小团队，≠ 字面一个人[R38c]。

Reading two · as a unit of coherence (the essence). When “one” is used as prescription rather than proof, it means not headcount but a single coherent anchor for judgment and narrative: judgment issues from one will, context compounds in one store. That anchor is usually one person, but can be a small, highly coherent team operating as if with one mind. The defining property is coherence density, not a headcount of one. Jarvis’s “company of one” is exactly this reading: a philosophy of staying small by default, small teams included, and not literally one person[R38c].

桥接（把两者缝成一个命题）。N=1 是这条原理最锋利的实例（供立证）；"连贯性单位"是可推广的原理（供落地）。真实世界的探索几乎都落在严格极限右侧一点（1-5 人、small-by-design），跑的却是同一条逻辑。所以：证明时，"一"是数字；落地时，"一"是单位。

The bridge (stitching the two into one proposition). N=1 is the sharpest instance of the principle (for proof); “unit of coherence” is the generalizable principle (for practice). Real-world exploration almost always sits just to the right of the strict limit (1-5 people, small-by-design), yet runs the same logic. So: in proof, “one” is a number; in practice, “one” is a unit.

四个世界观

Four Worldviews of the One

一人公司是另一套看待企业的方式，而非"创业公司的迷你版"。和第 5 节的这些世界观平行，这里有四个只在 N=1 极限才显形的世界观——它们决定了一人公司的设计起点。

A one-person company is a different way of seeing the enterprise, not a miniature startup. In parallel with the six worldviews of Section 5, here are four that surface only at the N=1 limit: they set the design starting point of the one-person company.

O.01

公司即生命体

The Company as a Living Organism

一人公司是一个单细胞高密度判断体，而非缩小的科层。这正是 M.06「组织即生命系统」在 N=1 处的具体形态（判断核＋上下文库、靠滚动实验自迭代的机制，第 6 节已讲透）：没有部门隔间需要拆，因为从来没有隔间；秩序不靠指派，因为只有一个细胞。生命系统逻辑不分大小，一人公司是它密度最高的实例。

A one-person company is a single-cell, high-density judgment body, not a shrunken hierarchy. This is the concrete form that M.06 “the organization as a living system” takes at N=1 (the mechanism of a judgment core plus a context store that iterates on itself through rolling experiments is worked out in full in Section 6). There are no departmental compartments to tear down, because there never were any; order needs no assignment, because there is only one cell. The living-system logic is scale-agnostic, and the one-person company is its highest-density instance.

O.02

杠杆而非员工

Leverage, Not Employees

传统组织靠雇人扩张产能，一人公司靠无需许可的杠杆。Naval 把杠杆分四类——劳动力、资本、代码、媒体；前两者要别人点头（permissioned），后两者无需许可、无复制边际成本，可无限扩展[R38b]。一人公司的整个产能曲线建在 code＋media＋agent 上：与其说"没有团队"，不如说把团队换成了不要工资、不要管理、可被版本化的杠杆资产。

Traditional organizations expand capacity by hiring; the one-person company runs on permissionless leverage. Naval sorts leverage into four kinds: labor, capital, code, and media. The first two require someone else’s nod (permissioned); the latter two are permissionless, carry no marginal cost of replication, and scale without limit[R38b]. The entire capacity curve of a one-person company is built on code + media + agents: it is not “having no team,” but swapping the team for leverage assets that take no salary, need no management, and can be version-controlled.

O.03

韧性高于增长

Resilience Over Growth

默认目标是刻意保持小并持久，而非做大。Jarvis《Company of One》(2019) 把"以小为常态"当成一种经营哲学而非过渡阶段——增长是需要被论证的选项，而非默认值[R38c]。注意：Jarvis 的 "company of one" 实指"小为常态"、含小团队，不等于字面上严格一个人——这里借用它的规范取向，而非把它读成 N=1 的同义词。韧性来自低固定成本、不可被一纸条款掐断的多供应商架构、以及不被融资节奏绑架的自由。

The default goal is not to grow large but to stay deliberately small and durable. Jarvis’s Company of One (2019) treats “staying small by default” as a business philosophy rather than a transitional phase: growth is not the default but an option that must be argued for[R38c]. Note: Jarvis’s “company of one” really means “small by default” and includes small teams; it does not equal a strictly literal single person. Here we borrow its normative stance rather than read it as a synonym for N=1. Resilience comes from low fixed costs, a multi-vendor architecture that no single clause can sever, and freedom from being held hostage to a financing cadence.

O.04

利润即氧气

Profit as Oxygen

对一人公司，利润是维持生命的氧气，而非分配给股东的剩余。没有融资跑道兜底，第一天就必须有正向现金流——最小可行利润（MVPr）取代最小可行产品成为里程碑：能不能养活这个判断节点，决定它能不能继续判断。这反转了风投式创业的氧气来源：那里氧气是下一轮融资，这里氧气是这个月的毛利。

For a one-person company, profit is the oxygen that keeps life going, not a surplus distributed to shareholders. With no financing runway to fall back on, there must be positive cash flow from day one. Minimum viable profit (MVPr) replaces the minimum viable product as the real milestone: whether it can feed this judgment node decides whether the node can keep judging. This inverts the source of oxygen in venture-style startups: there, oxygen is the next funding round; here, oxygen is this month’s gross margin.

本章性质 · 极限解：一人公司是 T1 在 N=1 的极限解与试金石，不是普遍处方。本章样本几乎全部来自低边际成本的数字产品（见章末实证），外推到重资产、强协调或物理交付的行业目前没有证据。读它的方式是"组织下限的存在性证明"，不是"人人都该单干的建议"。下限之下：N=0（无人的纯协议体／链上实体）不在本框架内——地板停在 N=1，是因为不可逆决策与承责目前无法挂到协议上；而责任能否附着于协议，是一笔登记在案的未决赌注，不是已被排除的选项。

Nature of this section · a limiting solution. The one-person company is T1’s limiting solution and litmus test at N=1; it is not a universal prescription. Almost all of this section’s samples come from low-marginal-cost digital products (see the empirical markers at the end); there is currently no evidence for extrapolating to capital-heavy, coordination-heavy, or physically delivered industries. Read it as an “existence proof for the lower bound of organization,” not as “advice that everyone should go solo.” Below the floor: N=0 (a person-less pure-protocol or on-chain entity) is outside this frame. The floor stops at N=1 because irreversible decisions and accountability cannot, for now, attach to a protocol; whether liability can ever attach to one is a registered open bet, not a ruled-out option.

这些支柱

Pillars of the Sovereign Operator

第 7 节的架构支柱是为 N=众多画的工程承诺；这里的这些支柱是它在 N=1 的对偶——同样相互依存，缺一根，一人公司就从"主权操作者"塌回"过劳的个体户"。编号用 SO 前缀（Sovereign Operator），与架构支柱 01-07 区分。

The architectural pillars of Section 7 are engineering commitments drawn for N=many; the pillars here are their dual at N=1, equally interdependent. Remove one and the one-person company collapses from “sovereign operator” back into “an overworked sole trader.” They are numbered with the SO prefix (Sovereign Operator) to distinguish them from architectural pillars 01-07.

01 · SO.01

主权操作者

The Sovereign Operator

一人公司的核心资产是操作者本人握有的三重主权，而非产品：财务主权（现金流不依赖外部输血）、叙事主权（自己的受众、自己的渠道，不被平台或雇主中介）、操作主权（工作流是自己的代码，可随时改写）。

The core asset of a one-person company is not the product but the threefold sovereignty held by the operator: financial sovereignty (cash flow does not depend on outside transfusions), narrative sovereignty (your own audience, your own channels, not mediated by a platform or an employer), and operational sovereignty (the workflow is your own code, rewritable at any time).

≠ 自由职业者（卖时间换钱，主权仍在客户手里）· 个体户（有营业额无杠杆）

≠ a freelancer (selling time for money, sovereignty still in the client’s hands) · a sole trader (revenue without leverage)

三重主权是一个整体：失去任何一重，"公司"就退化成一份伪装成生意的工作。财务主权失守，你为下一笔钱打工；叙事主权失守，平台改一次算法就掐断你的命脉；操作主权失守，你成了自己流程的人肉解释器。主权操作者的全部设计，是把这三者牢牢攥在一个人手里。

The three sovereignties are one whole: lose any one and the “company” degrades into a job disguised as a business. Lose financial sovereignty and you work for the next paycheck; lose narrative sovereignty and a single algorithm change can sever your lifeline; lose operational sovereignty and you become the human interpreter of your own process. The entire design of the sovereign operator is to keep all three firmly in the hands of one person.

SPEC

Sovereignty: 财务 · 叙事 · 操作 / financial · narrative · operational
Anti-pattern: 伪装成生意的工作 / a job disguised as a business

02 · SO.02

反规模化即设计

Un-scaling as Design

不增长是一个被主动选择、并设计进结构的目标，而非失败。每一个会迫使你雇人、开会、加协调层的机会，都先过一道反规模化筛子——它带来的产能，是否值得用一重主权去换。

Not growing is a goal actively chosen and designed into the structure, not failure. Every opportunity that would force you to hire, hold meetings, or add a coordination layer first passes through an un-scaling filter: is the capacity it brings worth trading away a piece of sovereignty?

≠ 小富即安 · 没野心——这是用结构换自由的精算，不是不思进取

≠ complacency · lack of ambition; this is a calculated trade of structure for freedom, not a refusal to strive

规模在传统创业里是默认正方向，在一人公司里是需要被论证的选项（O.03）。每多一个人，协调税以 n² 增长（第 4 节），而一人公司的全部竞争力恰恰来自 n=1 时协调税为零、判断密度为 100%。把"不scale"当成设计约束，等于把这份结构优势锁死在资产负债表里。

In traditional startups scale is the default positive direction; in a one-person company it is an option that must be argued for (O.03). With each added person the coordination tax grows as n² (Section 4), while the entire competitive edge of a one-person company comes precisely from the coordination tax being zero and judgment density being 100% at n=1. Treating “not scaling” as a design constraint locks that structural advantage into the balance sheet.

SPEC

Default: 不增长，除非被论证 / no growth unless argued for
Filter: 产能 vs 主权 / capacity vs sovereignty
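One way to see why the coordination tax vanishes at n=1 and grows on the order of n² is to count pairwise communication channels. This is an assumed model for illustration; the text cites Section 4 for the n² claim, and that section may define the tax differently.

```python
def pairwise_channels(n: int) -> int:
    """Number of distinct person-to-person channels among n people.

    Assumption: coordination cost is proportional to this count,
    n * (n - 1) / 2, which grows on the order of n squared.
    """
    if n < 1:
        raise ValueError("an organization needs at least one judgment node")
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, pairwise_channels(n))
```

Under this model a single operator has no channels to maintain, and going from five people to ten more than quadruples the count, which is the trade the un-scaling filter asks you to price before adding a person.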

03 · SO.03

杠杆复利

Compounding Leverage

把时间系统性地投进会复利的杠杆资产（代码、内容、受众、上下文库），而不是会被消耗的劳动。一个可操作的纪律刻度：每周至少 30% 的工作时间投入复利资产，其余才用于一次性交付。

Systematically invest time into leverage assets that compound (code, content, audience, context store) rather than into labor that gets consumed. One operable discipline marker: at least 30% of working hours each week go into compounding assets, with the rest reserved for one-off delivery.

≠ 把所有时间投进客户交付（卖一次时间赚一次钱，零复利）

≠ pouring all your time into client delivery (sell time once, earn once, zero compounding)

这是 O.02 的执行形态：杠杆不会自己积累，它来自每周被刻意保护出来的那 30%。代码写一次被调用无数次，一篇内容发一次被搜索无数年，一个上下文库一旦建成就让每个后续判断更快——这些是会在睡觉时增值的资产。劳动则相反：停下来就归零。一人公司的长期产能，由复利资产与消耗性劳动的比例决定。

This is the executable form of O.02: leverage does not accumulate on its own; it comes from the 30% deliberately protected each week. Code is written once and called countless times, a piece of content is published once and searched for years, and a context store, once built, makes every later judgment faster: these are assets that appreciate while you sleep. Labor is the opposite: stop and it returns to zero. The long-term capacity of a one-person company is set by the ratio of compounding assets to consumable labor.

SPEC

Cadence: ≥30%/周投入复利资产 / ≥30%/week into compounding assets
Assets: 代码 · 内容 · 受众 · 上下文 / code · content · audience · context
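The 30% cadence is easy to check from a weekly time log. The sketch below assumes a hypothetical log format (activity name mapped to hours) and treats the four asset types in the SPEC as the compounding categories; both choices are illustrative.

```python
# Asset categories from the SPEC above; any other activity counts as
# consumable labor (an assumption made for this sketch).
COMPOUNDING = {"code", "content", "audience", "context"}


def compounding_hours(week_log: dict) -> float:
    """Sum the hours logged against compounding asset categories."""
    return sum(h for activity, h in week_log.items() if activity in COMPOUNDING)


def meets_cadence(week_log: dict, min_percent: int = 30) -> bool:
    """True if at least min_percent of the week's hours went into compounding assets."""
    total = sum(week_log.values())
    if total <= 0:
        return False
    # Compare as products to avoid floating-point rounding at the boundary.
    return compounding_hours(week_log) * 100 >= min_percent * total
```

A week logged as 8 hours of code, 4 of content, and 28 of client delivery sits exactly on the line: 12 of 40 hours. Anything less means the week was spent on labor that does not carry forward.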

04 · SO.04

公开建造

Build in Public

把建造过程本身当成分发渠道：公开进展、数字、失败与决策。一个可操作的节奏刻度：每周至少 3 次公开输出——它同时是营销、是受众积累、也是把叙事主权握在自己手里的日常动作。

Treat the building process itself as a distribution channel: publish progress, numbers, failures, and decisions. One operable cadence marker: at least three public outputs per week. It is at once marketing, audience accumulation, and the daily act of keeping narrative sovereignty in your own hands.

≠ 发广告 · 做内容营销（那是把产品推出去；这是把过程亮出来，让受众先于产品存在）

≠ running ads · doing content marketing (that pushes the product out; this exposes the process, so the audience exists before the product)

一人公司没有市场部，公开建造就是市场部。它把 O.02 的"媒体杠杆"落地为一个可执行节奏：持续公开让陌生人变成关注者，关注者变成第一批客户，客户变成口碑。更深一层，公开建造是叙事主权（SO.01）的日常维护——你的受众长在你自己的渠道上，而不是租来的平台流量里。

A one-person company has no marketing department; building in public is the marketing department. It grounds O.02’s “media leverage” into an executable cadence: sustained openness turns strangers into followers, followers into first customers, and customers into word of mouth. At a deeper level, building in public is the daily maintenance of narrative sovereignty (SO.01): your audience grows on your own channels, not in rented platform traffic.

SPEC

Cadence: ≥3 次/周公开输出 / ≥3 public outputs/week
Doubles as: 营销 · 受众 · 叙事主权 / marketing · audience · narrative sovereignty

05 · SO.05

利基聚焦

Niche Focus

一人公司的护城河是窄到对手懒得进、深到对手进不来的利基，而非规模。三个问题固定定位：谁是你唯一服务的人？解决他们的什么具体痛点？为什么是你——你有什么不可复制的视角或资格？

The moat of a one-person company is not scale but a niche so narrow rivals can’t be bothered to enter and so deep they can’t break in. Three questions pin down the positioning: who is the one group you serve? What specific pain of theirs do you solve? Why you: what irreproducible perspective or credential do you hold?

≠ 什么都做一点 · 服务所有人（在 N=1 等于不服务任何人——你没有人力覆盖宽面）

≠ doing a bit of everything · serving everyone (at N=1 this equals serving no one; you have no manpower to cover a broad surface)

规模型公司靠覆盖广面取胜，一人公司靠占领窄缝取胜。利基越窄，你的判断密度优势越能转化成别人给不了的深度；"谁/什么/为什么是你"三问回答得越具体，营销、产品、定价的所有决策就越自动收敛。模糊的定位在 N=1 是致命的——你没有部门去对冲一个错的方向。

Scale-type companies win by covering a broad surface; a one-person company wins by occupying a narrow crevice. The narrower the niche, the more your judgment-density advantage converts into depth others can’t provide; the more concretely you answer “who / what / why you,” the more every decision in marketing, product, and pricing converges automatically. Fuzzy positioning is fatal at N=1: you have no department to hedge against a wrong direction.

SPEC

Three Q's: 谁 · 什么 · 为什么是你 / who · what · why you
Moat: 窄 × 深，非广 × 浅 / narrow × deep, not broad × shallow

06 · SO.06

战略性拒绝

Strategic Refusal

一人公司唯一不可再生的资源是操作者的注意力，因此最重要的战略动作是拒绝。维护一份明确的 anti-list（不做的客户、不做的功能、不进的渠道、不接的合作），与 to-do list 同等重要，甚至更重要。

The one non-renewable resource of a one-person company is the operator’s attention, so the most important strategic move is refusal. Maintaining an explicit anti-list (the clients you won’t take, the features you won’t build, the channels you won’t enter, the partnerships you won’t accept) is as important as the to-do list, perhaps more so.

≠ 傲慢 · 挑活——这是注意力的资本配置，每个 yes 都在花掉一个不可逆的稀缺资源

≠ arrogance · cherry-picking work; this is capital allocation of attention, where every yes spends an irreversible scarce resource

在没有团队稀释负载的结构里，每一个"是"都直接吃掉操作者本人的带宽，而带宽是整个公司唯一的瓶颈。anti-list 把拒绝从临场情绪升级为预先承诺的策略门：什么样的客户、功能、机会一律不碰，写在纸上，免去每次重新动摇。SO.02 的反规模化在产能层面说"不雇人"，SO.06 在注意力层面说"不接活"——二者是同一个主权的两面。

In a structure with no team to dilute the load, every “yes” directly eats into the operator’s own bandwidth, and that bandwidth is the company’s only bottleneck. The anti-list upgrades refusal from in-the-moment emotion to a pre-committed policy gate: which clients, features, and opportunities are off-limits, written down, sparing you from wavering anew each time. SO.02’s un-scaling says “don’t hire” at the capacity level; SO.06 says “don’t take the work” at the attention level. The two are two faces of the same sovereignty.

SPEC

Artifact: anti-list（明文不做清单） / anti-list (an explicit will-not-do list)
Scarce resource: 操作者注意力 / the operator’s attention
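An anti-list works as a pre-committed policy gate only if it is written down and consulted before the decision. A minimal sketch, with entirely hypothetical entries; the four categories are the ones named in the pillar.

```python
# Hypothetical anti-list. The entries are placeholders, not recommendations.
ANTI_LIST = {
    "client": {"enterprise procurement"},
    "feature": {"white-label build"},
    "channel": {"paid ads"},
    "partnership": {"revenue-share reseller"},
}


def gate(category: str, item: str, anti_list: dict = ANTI_LIST) -> tuple:
    """Return (accepted, reason); refusal comes from the list, not the moment."""
    if item.lower() in anti_list.get(category, set()):
        return False, f"{item!r} is on the {category} anti-list"
    # Passing the gate is not a yes: the capacity vs sovereignty filter still applies.
    return True, "not on the anti-list; still weigh capacity vs sovereignty"
```

Calling `gate("channel", "Paid Ads")` refuses without any fresh deliberation, which is the effect the pillar describes: the wavering happened once, when the list was written.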

07 · SO.07

生活先于事业

Life Before Business

一人公司里，公司是生活的工具，不是生活的目的。设计顺序是先确定想要的生活（节奏、自由度、与谁共处、为何而活），再倒推出一门能支撑它的生意，而不是反过来让生意吞掉生活。

In a one-person company, the company is a tool for life, not the purpose of life. The design order is to first settle the life you want (its rhythm, its degree of freedom, whom you spend it with, what you live for) and then work backward to a business that can support it, rather than the reverse, where the business swallows the life.

≠ work-life balance（那预设工作与生活对立、需要拉平）——这里工作被嵌进生活，本就同向

≠ work-life balance (which presupposes work and life are opposed and need leveling); here work is embedded into life and already points the same way

这是把前六根支柱收束起来的那一根，也是一人公司区别于"超小型创业公司"的根本。财务、叙事、操作三重主权（SO.01）、刻意的反规模化（SO.02）、战略性拒绝（SO.06）——它们最终都为了同一件事：让这门生意服务于一个被亲手设计过的生活，而不是把人异化成自己公司的最高效员工。失去这根支柱，一人公司在效率上可以很成功，在意义上却背叛了它存在的全部理由。

This is the pillar that gathers the previous six, and the root of what separates a one-person company from an “ultra-small startup.” The threefold financial, narrative, and operational sovereignty (SO.01), the deliberate un-scaling (SO.02), and the strategic refusal (SO.06) all ultimately serve one thing: making the business serve a life designed by your own hand, rather than alienating the person into the most efficient employee of their own company. Lose this pillar and a one-person company can be very successful in efficiency while betraying, in meaning, the entire reason it exists.

SPEC

Order: 先设计生活，再倒推生意 / design the life first, then back into the business
Telos: 公司是工具，不是主人 / the company is a tool, not a master

CONCENTRIC RHYTHM · 同心节奏 —— self-improving 的人类尺度实现 · the human-scale realization of self-improving

第 6 节的 L.05 把"自我改进"定为生命系统的核心机制——系统持续观察自己、评估自己、改写自己。在 N=1，这套机制没有 telemetry 流水线，也不需要——它收缩成一组嵌套的同心节奏，由操作者本人作为唯一的反馈回路亲自运转：

L.05 of Section 6 defines “self-improving” as the core mechanism of a living system: the system continuously observes itself, evaluates itself, and rewrites itself. At N=1 this mechanism has no telemetry pipeline, nor does it need one; it contracts into a set of nested concentric rhythms, run by the operator in person as the sole feedback loop:

周 · 实验——这一周押一个可证伪的小赌注（一个功能、一篇内容、一次定价试探），周末看数据，留下能复利的、砍掉不工作的。
月 · 反思——这个月的实验合起来在说什么？哪条复利曲线在变陡，哪条在变平？
季 · 方向——利基（SO.05）还对吗？anti-list（SO.06）该加哪一条？
年 · 哲学——这门生意还在支撑我想要的生活吗（SO.07）？

四圈节奏由内向外，频率递减、可逆性递减——周实验随时可弃，年哲学一旦改写就是重定方向。这就是 self-improving 在人类尺度上的实现：靠一个人按四种周期亲手转动，而非飞轮自转。

Week · experiment. Place one falsifiable small bet this week (a feature, a piece of content, a pricing probe), read the data at the weekend, keep what compounds and cut what does not work.
Month · reflection. What do this month’s experiments say together? Which compounding curve is steepening, which is flattening?
Quarter · direction. Is the niche (SO.05) still right? What line should be added to the anti-list (SO.06)?
Year · philosophy. Is this business still supporting the life I want (SO.07)?

The four rings run from inside out, with decreasing frequency and decreasing reversibility: the weekly experiment can be abandoned at any time, while rewriting the yearly philosophy is a change of direction. This is the realization of self-improving at human scale: not a flywheel spinning on its own, but one person turning it by hand on four cycles.
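The four nested rhythms can be written as a small schedule. The week counts below (4, 13, 52) are an assumed approximation of month, quarter, and year chosen to keep the sketch simple; the text itself only fixes the four cycles and their ordering.

```python
# (ring, period in weeks, activity). Periods are approximations assumed
# for illustration: 4 weeks per month, 13 per quarter, 52 per year.
RINGS = [
    ("week", 1, "experiment"),
    ("month", 4, "reflection"),
    ("quarter", 13, "direction"),
    ("year", 52, "philosophy"),
]


def reviews_due(week_number: int) -> list:
    """List the rings whose review falls at the end of the given week (1-based)."""
    if week_number < 1:
        raise ValueError("week_number starts at 1")
    return [f"{name}: {activity}"
            for name, every, activity in RINGS
            if week_number % every == 0]
```

Most weeks only the innermost ring turns; once a year all four coincide, and that is the one review where changing the answer redirects everything inside it.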

陷阱与不适用

Pitfalls & Boundaries

陷阱一 · 主权而无能力。握住三重主权却没有把判断兑现成产出的能力——拥有自己的渠道却没有值得分发的东西，拥有操作主权却写不出能跑的工作流。主权是必要条件，不是充分条件；一人公司放大判断密度的同时，也放大判断者的一切弱点：没有第二个判断节点做冗余校验，孤立决策的质量衰减是结构性风险，不是情绪问题。

Pitfall one · sovereignty without capability. Holding the threefold sovereignty without the ability to cash judgment out into output: owning your own channels but having nothing worth distributing, holding operational sovereignty but unable to write a workflow that runs. Sovereignty is a necessary condition, not a sufficient one. While a one-person company amplifies judgment density, it also amplifies every weakness of the judge: with no second judgment node for redundant verification, the quality decay of isolated decisions is a structural risk, not an emotional one.

陷阱二 · 利基崇拜而无市场。SO.05 要求窄，但窄到没有人愿意付费，利基就从护城河变成无人区。把"小众"误当"高端"、把"没人做"误当"蓝海"——很多时候没人做只是因为没人要。利基聚焦必须先验证市场存在，再收窄，而不是先爱上一个窄定位再去找根本不存在的需求。

Pitfall two · niche worship without a market. SO.05 demands narrowness, but narrow to the point where no one will pay turns the niche from a moat into a no-man’s-land. Mistaking “niche” for “premium,” mistaking “no one does it” for “blue ocean”: often no one does it simply because no one wants it. Niche focus must first verify that the market exists, then narrow, rather than falling in love with a narrow positioning first and then hunting for demand that does not exist at all.

不适用 · 三类业务请勿照搬。① 重资产：需要工厂、库存、物理供应链的生意，杠杆无法 permissionless，一人撑不起资本密集度；② 强协调：产出本质上需要多个不可替代判断者实时咬合的工作（大型工程、复杂谈判、需要现场多工种协同的交付），N=1 在结构上做不到；③ 需要被管理的人：如果业务的价值恰恰来自一支需要被领导、被发展、被组织的团队，那它的本质就是 N=众多，一人公司的全部前提不成立。本章的"下限"是存在性证明，不是适用性声明。

Not applicable · do not copy this to three kinds of business. ① Capital-heavy: businesses that need factories, inventory, or a physical supply chain, where leverage cannot be permissionless and one person cannot bear the capital intensity. ② Coordination-heavy: work whose output inherently requires several irreplaceable judges meshing in real time (large engineering projects, complex negotiations, delivery that needs on-site coordination across trades), which N=1 structurally cannot do. ③ People who need to be managed: if a business’s value comes precisely from a team that needs to be led, developed, and organized, then its essence is N=many and the entire premise of the one-person company fails. The “lower bound” of this section is an existence proof, not a statement of applicability.

光谱左端的现实标定 · 均为自报口径

Empirical Markers · All Self-Reported

第 3 节的组织形态光谱把规模降级为自由变量，它最左端的极限解就是本章。那条光谱左端已有现实标定——这些不再是思想实验。Sam Altman 2024 年 2 月转述他与一群科技公司 CEO 朋友的赌局：赌"第一家一人十亿美元公司"会在哪一年出现。而光谱左端早已有可被引用的样本（以下数字均为当事人自报口径，未经独立审计）：

Section 3’s spectrum of organizational forms demotes scale to a free variable, and its leftmost limiting solution is this section. The left end of that spectrum already has real-world markers; these are no longer thought experiments. In February 2024 Sam Altman recounted a wager with a group of tech-company CEO friends, betting on which year the “first one-person billion-dollar company” would appear. The left end has long had citable samples (the figures below are all self-reported by the people involved and have not been independently audited):

AI SIDE 14 最小组织是判断、上下文和外置执行，而非人数。 The minimum organization is judgment, context, and externalized execution, not headcount.

SELF-REPORTED

Pieter Levels — 一人产品组合年收入约 $1.6M-3M（公开仪表盘）
Marc Lou - 2025 年收入约 $1.03M（约 20 个产品）
Justin Welsh — 一人累计收入破 $10M（自报毛利约 89%）
Altman, 2024/2 — 一人十亿美元公司赌局（CEO 朋友群）

Pieter Levels: one-person product portfolio, annual revenue roughly $1.6M-3M (public dashboard)
Marc Lou: about $1.03M revenue in 2025 (around 20 products)
Justin Welsh: one-person cumulative revenue past $10M (self-reported gross margin around 89%)
Altman, 2024/2: the one-person billion-dollar company wager (a group of CEO friends)

这些数字对照同光谱上 Cursor 量级的 $2B ARR 并不大——但结构信号极强：它们证明组织的下限已经脱离人数约束，正如光谱中段的 Anysphere 与 Anthropic 证明了人均产出的上限同样脱离了直觉约束。两端是同一命题（T1）在参数空间两侧的不同解。

Against the $2B ARR of a Cursor-scale company on the same spectrum, these figures are not large, but the structural signal is very strong: they prove the lower bound of organization has already detached from a headcount constraint, just as Anysphere and Anthropic in the middle of the spectrum proved that the upper bound of output per person has likewise detached from intuitive limits. The two ends are different solutions of the same proposition (T1) on opposite sides of the parameter space.

但左端有诚实的注脚，必须连同样本一起读。其一，孤立判断没有冗余——一人公司在放大判断密度的同时放大了判断者的盲区，没有第二个节点做交叉校验，决策质量的衰减是结构性的。其二，单一供应商即生存风险——单一模型供应商一纸条款变更就能掐断命脉，多模型架构（第 7 节支柱 04）在 N=1 时是命脉，而非支柱。其三，样本有偏——左端样本几乎全部来自低边际成本的数字产品，外推到重资产、强监管或物理交付，目前没有证据。一人公司是 T1 的极限解与试金石，而非普遍处方。它在光谱上的全部意义，是把"组织必须是很多人"这个隐含假设，永久地变成了一个待论证的命题。

But the left end carries honest footnotes that must be read together with the samples. First, isolated judgment has no redundancy: while a one-person company amplifies judgment density, it amplifies the judge’s blind spots too, and with no second node for cross-verification the decay of decision quality is structural. Second, a single vendor is a survival risk: a single model vendor can sever the lifeline with one clause change, so a multi-model architecture (Section 7, pillar 04) is, at N=1, not a pillar but the lifeline. Third, the sample is biased: the left-end samples come almost entirely from low-marginal-cost digital products, and there is currently no evidence for extrapolating to capital-heavy, heavily regulated, or physically delivered businesses. The one-person company is T1’s limiting solution and litmus test, not a universal prescription: its whole meaning on the spectrum is to turn the buried assumption that “an organization must be many people” permanently into a proposition awaiting proof.
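The second footnote, a single vendor as a survival risk, translates into a simple fallback chain. This is a sketch under assumptions: the vendor callables are stand-ins supplied by the caller, and no real provider API is implied.

```python
from typing import Callable, Sequence, Tuple


class AllVendorsFailed(Exception):
    """Raised when every configured vendor has been tried without success."""


def complete_with_fallback(
    prompt: str,
    vendors: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (vendor name, response).

    Each `call` is a hypothetical wrapper around one model vendor. Any
    exception (outage, revoked access, changed terms) moves to the next.
    """
    errors = []
    for name, call in vendors:
        try:
            return name, call(prompt)
        except Exception as exc:  # deliberately broad: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllVendorsFailed("; ".join(errors))
```

The ordering of `vendors` is itself a decision worth recording: with a single entry in the list, this function offers no protection at all, which is the situation the footnote warns about.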

光谱右端：判断无中心

The Other Pole · Judgment Without a Center

一人公司是规模轴的极限（N=1）。但 T1 有两根轴——规模之外，还有判断的分布。在这根轴上，一人公司同样站在一端：判断极致集中于一个核。它真正意义上的"另一个极限"，是判断极致分散——没有中心，决策权按规则散布在一张自治的网络里。常规科层、网络/平台、holacracy 依次落在中段；最右端，是分布式自治组织。两端因协调成本坍塌而第一次可行，是同一根轴的两个极限解，不是孤立形态。

The one-person company is the limit of the scale axis (N=1). But T1 has two axes; beyond scale lies the distribution of judgment. On that axis the one-person company again sits at an end: judgment maximally concentrated in a single core. Its true “other limit” is judgment maximally distributed: no center, with decision rights spread by rule across a self-governing network. Conventional hierarchy, network/platform, and holacracy fall in the middle; at the far right sits the distributed-autonomous organization. The two ends are two limiting solutions of the same axis, each made viable for the first time by the collapse of coordination cost, not isolated forms.

FIG. 14.1 / THE JUDGMENT-DISTRIBUTION SPECTRUM · 判断分布光谱
看懂：一人公司与分布式自治是同一根轴的两极 / Read: one-person and distributed-autonomous are two poles of one axis

判断的分布 / Distribution of judgment（集中 Concentrated → 分散 Distributed）

一人公司 / One-person：一个判断核 + agent 网 / one judgment core + an agent network

常规科层 / Hierarchy：判断集中在高层 / judgment concentrated at the top

网络 · 平台 / Network · platform：判断分到节点，平台定规则 / judgment spread to nodes; the platform sets the rules

Holacracy：角色化分权，无固定经理 / authority by role, no fixed managers

分布式自治 · DeSci / Distributed · DeSci：无中央判断核 / no central judgment core

这一极最真实的当代样本，是 DeSci（去中心化科学）：没有一个中央机构决定做什么、谁对谁错；研究、资助与评议分散给大量自治的贡献者。它靠三件事维持连贯，而不是靠层级裁决——上下文全部公开可读（人和 agent 都能继承）、贡献与验证有公开协议、判断质量经开放同行评价沉淀为声誉。AI 在这里是放大器：把"综合海量分散判断"的成本压到可行。"无中心如何不散"的答案，不在记账技术，而在共享上下文与开放评议——这恰是 T1 两根轴在另一极的样子。

The most concrete contemporary instance of this pole is DeSci (decentralized science): no central body decides what to pursue or who is right; research, funding, and review are spread across many autonomous contributors. It holds together not by hierarchical adjudication but by three things: context kept fully open and legible (inheritable by humans and agents alike), open protocols for contribution and validation, and judgment quality settling into reputation through open peer review. AI is the amplifier here, pushing the cost of synthesizing vast distributed judgment down to the feasible. The answer to “how does a center-less organization stay coherent” lies not in a ledger technology but in shared context and open review, which is just what T1’s two axes look like at the other pole.

CODA · 结语

把这一节收成一句话：商业是生活的工具，不是它的主人。一人公司之所以值得作为一种严肃的组织设计选项被收进这套方法论，不是因为它能赚多少钱，而是因为它把组织的下限固定在了"一个判断节点 + 一座上下文库"——从此，规模彻底成为自由变量。你可以选择 N=1，可以选择 N=众多，但无论选哪一端，组织的本质都没变：判断在哪里发生，上下文如何抵达。一人公司是这条命题在最孤独的极限处，依然成立的证明。

To gather this section into one sentence: business is a tool for life, not its master. The one-person company earns its place in this atlas as a serious option in organizational design not because of how much money it can make, but because it nails the lower bound of organization to “one judgment node + one context store.” From there, scale becomes fully a free variable. You can choose N=1, you can choose N=many, but whichever end you choose, the essence of the organization does not change: where judgment happens, and how context arrives. The one-person company is the proof that this proposition still holds at its loneliest limit.

SECTION

STARTUP LIFECYCLE · 创业生命周期

行动 · 阶段ACTION · STAGES

AI 时代创业的四个阶段

The Four Stages of Building in the AI Era

分段不为命名，为判据：知道在哪一段，才知道哪些错此刻致命。本章给判据；第 16 节给每段的操作，两章不重复。

Stages matter not for their names but for their criteria: knowing which stage you are in is how you know which mistakes are fatal right now. This chapter gives the criteria; Section 16 gives the per-stage actions, and the two do not repeat.

一句话In one line

AI 时代创业分四个阶段，重点各自变了：想法阶段抵御"过早建造"、MVP 积累持久上下文、上线释放创始人、规模把领域专长编码成护城河。贯穿四段的规律，是创始人不断向"系统设计者"上移。Building in the AI era has four stages, and each one’s focus has shifted: idea resists building too early, MVP accumulates compounding context, launch frees the founder, scale encodes domain expertise into a moat. The rule running through all four is the founder rising toward “system designer.”

SOURCE

Anthropic 2026 - The Founder's Playbook
Y Combinator failure analyses
Lean AI Native Leaderboard

AI 时代的创业不只是"用 AI"，更是"用 AI 创业"。Anthropic 2026 年发布的《The Founder's Playbook: Building an AI-Native Startup》系统化了这条新路径——AI 重新定义了传统创业生命周期的每一阶段。Idea 阶段的核心从抢先建造，变成了抵御"过早建造"的诱惑；MVP 阶段不再只是写代码，而是积累持久上下文；Launch 阶段的重心从抢市场，转向"消化技术债 + 释放创始人"；Scale 阶段也告别堆人，转而把领域专长编码为不可复制的护城河。

Building in the AI era is “founding a company with AI,” not just “using AI.” Anthropic’s “The Founder’s Playbook: Building an AI-Native Startup” (2026) systematizes this new path: AI redefines every stage of the traditional startup lifecycle. The Idea stage shifts from building first to resisting the temptation to build too early. The MVP stage is no longer just writing code; it is accumulating compounding context. The Launch stage shifts from a land grab to “paying down technical debt and freeing the founder.” The Scale stage leaves headcount growth behind and encodes domain expertise into a moat no one can copy.

把这四个阶段叠在一起看，会发现一个核心规律——创始人的位置在每一阶段都向"系统设计者"上移。这条角色演化曲线本身，比任何工具都更接近 AI Native 方法论的本质。

Layer the four stages on top of one another and one rule appears: the founder’s position rises toward “system designer” at every stage. That curve of role evolution is itself closer to the essence of the AI Native methodology than any tool.

核心图 KEY FIG · FIG. 13.0 / FOUR-STAGE ARC · 看懂：从理论到落地，创始人角色怎么移 / Read this: how the founder’s role moves from theory to practice

创始人的位置在四个阶段中持续上移——从"建造者"到"上下文工程师"到"系统设计者"到"对外角色"。每一阶段都把更多的执行交给系统，把更多的判断留给自己。

The founder’s position keeps rising across the four stages: from “builder” to “context engineer” to “system designer” to “external-facing role”. Each stage hands more execution to the system and keeps more judgment for the founder.

Stage 01 / Idea验证而非建造Validate, don’t build

研究而非工程的阶段

A stage of research, not engineering

目标——在投入资源建造前，组装足够证据证明问题真实存在、解决方案能够解决它。这是研究、客户访谈、竞品分析、诚实评估反证的阶段，而不是写一行 production code 的阶段。

Goal: before committing resources to building, assemble enough evidence that the problem is real and that the solution can solve it. This is the stage for research, customer interviews, competitive analysis, and an honest reckoning with disconfirming evidence; it is not the stage for writing a line of production code.

退出条件——找到 problem-solution fit。能精确说出谁有这个问题、多频繁、多严重、当前如何处理；能给出可测试的具体假设（"中型公司的财务经理每周花 4+ 小时对账，因为现有工具与会计系统不兼容"）而非泛化观察（"人们对账很麻烦"）。

Exit condition: problem-solution fit. You can state precisely who has the problem, how often, how severely, and how they handle it today; you can give a specific testable hypothesis (“finance managers at mid-size firms spend 4+ hours a week reconciling accounts because their current tools don’t integrate with the accounting system”) rather than a vague observation (“reconciliation is a pain”).

AI 时代陷阱——把建造当成验证（Mistaking Building for Validating）。Anthropic Founder Playbook 引用的数据令人警醒：42% 的传统创业失败是因为造了没人要的东西。AI 让 prototype 几分钟可成，但 prototype 不是证据——它是与潜在用户对话的道具。在 prototype 被当成"原因相信假设"而非"压力测试假设的工具"时，方法论已经失败。第二个陷阱是过早扩展——agentic coding 让 execution 跑在 validation 之前，AI 不会问"这值得造吗"，它会以同样的热情把好想法和坏想法都建造出来。第三个陷阱是客观性丧失——问 AI 验证想法，它会找到支持证据；问它压测想法，它会找到反证。AI 跟随你的方向，所以 prompt 必须是"argue against my idea / find disconfirming evidence"。

AI-era trap: mistaking building for validating. The figure cited in the Anthropic Founder’s Playbook is sobering: 42% of traditional startup failures come from building something nobody wanted. AI makes a prototype possible in minutes, but a prototype is a prop for the conversation with a potential user, not evidence. The moment a prototype becomes the “reason to believe the hypothesis” rather than a tool for stress-testing it, the method has already failed. The second trap is premature scaling: agentic coding lets execution run ahead of validation, and AI never asks “is this worth building?”; it builds good ideas and bad ideas with equal enthusiasm. The third trap is loss of objectivity: ask AI to validate an idea and it finds supporting evidence; ask it to stress-test the idea and it finds counterevidence. AI follows your direction, so the prompt must be “argue against my idea / find disconfirming evidence”.

工具组合：Claude 作为 adversarial thinker 做 devil's advocate；Claude Cowork 综合用户访谈纪要、竞品 review、行业报告生成 themed findings；只有在最后才用 Claude Code 构建轻量 prototype，而且必须用于真实对话，不是作为产品发布。

Tool stack: Claude as an adversarial thinker playing devil’s advocate; Claude Cowork synthesizing interview notes, competitive reviews, and industry reports into themed findings; only at the very end, Claude Code to build a lightweight prototype, and only for real conversations, not as a product launch.

Stage 02 / MVP累积持久上下文Accumulate context

持久上下文的建造期

The build period for compounding context

目标——把验证的问题翻译为真实用户会用的产品。但 MVP 阶段同等重要的目标是——建立持久上下文（如 CLAUDE.md 文件），让每个新 Agent session 不需要从头解释代码库。AI 时代的代码库是你与 AI 一次次协作累积出来的，可读性变成基础性而非装饰性。

Goal: translate the validated problem into a product that real users will use. But an equally important goal of the MVP stage is to build a context store (such as a CLAUDE.md file) so that each new agent session need not explain the codebase from scratch. In the AI era the codebase is what you and the AI accumulate through one collaboration after another, and readability becomes foundational rather than decorative.

退出条件——product-market fit 的真实证据：特定群体足够认可产品以保留（retention）、付费（revenue）、传播（referral）。Sean Ellis 测试（问活跃用户"如果再也不能用，你会怎么样"，40%+ 回答"非常失望"是 PMF 指标）和 Effort 测试（产品开始自我拉动而非靠创始人推动）是常用 litmus test。

Exit condition: real evidence of product-market fit, where a specific group values the product enough to retain, to pay, and to refer. The Sean Ellis test (ask active users “how would you feel if you could no longer use it?”; 40%+ answering “very disappointed” signals PMF) and the effort test (the product begins to pull itself rather than relying on the founder’s push) are common litmus tests.
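The Sean Ellis threshold above is simple arithmetic, and a minimal sketch makes the bar explicit. The function names and the sample survey are hypothetical; only the 40% threshold and the “very disappointed” answer come from the text.

```python
from collections import Counter

PMF_THRESHOLD = 0.40  # the 40%+ bar quoted in the text


def sean_ellis_score(responses):
    """Share of surveyed active users who answered 'very disappointed'."""
    if not responses:
        raise ValueError("need at least one survey response")
    counts = Counter(r.strip().lower() for r in responses)
    return counts["very disappointed"] / len(responses)


def signals_pmf(responses, threshold=PMF_THRESHOLD):
    """True when the score meets or exceeds the threshold."""
    return sean_ellis_score(responses) >= threshold


# Hypothetical survey of 10 active users (illustrative data only).
sample = (
    ["very disappointed"] * 5
    + ["somewhat disappointed"] * 3
    + ["not disappointed"] * 2
)
print(f"{sean_ellis_score(sample):.0%}", signals_pmf(sample))
```

The score is only as good as the sample: surveying friends or a launch-day spike reproduces the “early buzz” trap described in the next paragraph.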

AI 时代陷阱——Agentic 技术债（Agentic Technical Debt）是最深的失败模式。不像传统技术债线性累积，AI 技术债是复利的——没有写下来的架构约束，每个 session 重新推导基础决策，决策之间漂移，代码库失去连贯的心智模型。其次是零摩擦 scope creep——加一个 feature 在 agentic coding 下几小时就能完成，每个单独的添加都"合理"，但产品边界会脱缰。第三是insecure by inexperience——AI 生成 working code，但不是 inherently secure code。功能漏洞容易被发现（要么 work 要么不 work），安全漏洞要被利用了才浮现。第四是误把早期热度当 PMF——朋友圈、投资人的 portfolio 公司、Hacker News 一篇热门帖产生的 spike 都不能预测第六周。

AI-era trap: agentic technical debt is the deepest failure mode. Unlike traditional technical debt, which accrues linearly, AI technical debt compounds: with no written-down architectural constraints, every session re-derives the same foundational decisions, the decisions drift apart, and the codebase loses any coherent mental model. Next is frictionless scope creep: under agentic coding a feature takes a few hours, every single addition looks “reasonable”, and the product boundary slips its leash. Third is being insecure by inexperience: AI generates working code, but not inherently secure code. Functional bugs are easy to catch (the thing either works or it doesn’t); security holes surface only once exploited. Fourth is mistaking early buzz for PMF: a spike from your social circle, an investor’s portfolio companies, or one hot Hacker News post predicts nothing about week six.

工具组合——先用 Claude 设计架构约束并写入 CLAUDE.md（项目持久记忆，每个 session 自动加载）；然后用 Claude Code 在约束内建造，Plan Mode 强制结构化输出；每个 session 结束更新上下文文档；用 Claude Code Security 在任何真实用户接触前做安全审查；从 Day 0 就建立 measurement framework，不要等数据来了再选 metric。

Tool stack: first use Claude to design the architectural constraints and write them into CLAUDE.md (project-persistent memory, auto-loaded each session); then use Claude Code to build within those constraints, with Plan Mode forcing structured output; update the context document at the end of each session; run Claude Code Security before any real user touches the product; and stand up a measurement framework from day zero, without waiting for data to arrive before choosing the metric.

Stage 03 / Launch释放创始人Free the founder

从"做工作"转向"设计做工作的系统"

From “doing the work” to “designing the system that does the work”

目标——把早期 traction 转化为可重复、可持续的增长引擎。同时把创始人从"个人持有每一根线"的位置转向"设计让线自动运转的系统"的位置。这是把控制从微观操作升级到系统设计，而非放弃控制。

Goal: turn early traction into a repeatable, sustainable growth engine. At the same time, move the founder from “personally holding every thread” toward “designing the system that runs the threads automatically”. This is upgrading control from micro-operation to system design, not surrendering it.

退出条件——三个并行的里程碑必须同时达成：增长可重复且通道化（CAC、LTV、payback 是已知数字、可被外人质疑也能站住脚的数字）；产品能承受真实的生产负载（不只是你测试时的那种负载）；运营无需创始人瓶颈即可运转（你出差一周，公司不应该停摆）。

Exit condition: three parallel milestones must be met together. Growth is repeatable and channeled (CAC, LTV, and payback are known numbers that hold up when an outsider challenges them); the product can bear real production load (not just the load you generate while testing); and operations run without the founder as a bottleneck (if you travel for a week, the company should not stall).
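The “known numbers” in the first milestone can be written down as a small calculation. This is a sketch under simplifying assumptions: it uses the common steady-state approximation LTV = monthly margin per customer ÷ monthly churn, and every field name and input figure is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChannelMonth:
    spend: float          # sales + marketing spend on this channel
    new_customers: int    # customers acquired through the channel
    arpu: float           # average monthly revenue per customer
    gross_margin: float   # between 0 and 1
    monthly_churn: float  # share of customers lost per month, between 0 and 1

    def cac(self):
        """Customer acquisition cost."""
        return self.spend / self.new_customers

    def ltv(self):
        """Lifetime value, steady-state approximation (an assumption)."""
        return self.arpu * self.gross_margin / self.monthly_churn

    def payback_months(self):
        """Months of margin needed to recover the CAC."""
        return self.cac() / (self.arpu * self.gross_margin)


# Hypothetical channel figures, for illustration only.
channel = ChannelMonth(spend=30_000, new_customers=100, arpu=100,
                       gross_margin=0.8, monthly_churn=0.04)
print(round(channel.cac()), round(channel.ltv()), channel.payback_months())
```

Writing the formulas down is what makes the numbers challengeable by an outsider: anyone can dispute the churn input or the approximation itself rather than a bare headline figure.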

AI 时代陷阱——技术债开始还款。MVP 阶段为速度做的取舍，到 Launch 阶段开始计利息——产品流量、新功能、复杂度上升，让 MVP 的捷径变成结构性负债。创始人成为瓶颈——hands-on 在 MVP 是优势，在 Launch 是约束。可观察的症状：本该 1 小时的决策拖了一周；support ticket 堆积，因为只有你知道答案；运营任务只在你个人记得时才发生。过早扩张——新市场看起来像增长机会，但它们重新引入未验证的变量（用户行为、合规要求、支付基建、品类预期），让你失去对自己数据的解读能力。安全与合规不再可推迟——真实用户、真实数据、真实企业合同上桌后，"假设性风险"瞬间变成"真实暴露"。

AI-era trap: the technical debt comes due. The trade-offs the MVP stage made for speed start charging interest at Launch; rising traffic, new features, and growing complexity turn the MVP’s shortcuts into structural liabilities. The founder becomes the bottleneck: hands-on is an advantage in MVP and a constraint at Launch. Observable symptoms include a one-hour decision dragging out for a week, support tickets piling up because only you know the answers, and operational tasks happening only when you personally remember them. Premature expansion: new markets look like growth opportunities, but they reintroduce unvalidated variables (user behavior, compliance requirements, payment infrastructure, category expectations) and cost you the ability to read your own data. Security and compliance can no longer wait: once real users, real data, and real enterprise contracts are on the table, “hypothetical risk” instantly becomes “real exposure”.

工具组合：Claude Code 做架构审计与重构（输出技术债优先级清单）；Claude 把 founder 的当前注意力清单化为"可完全自动化 / 可委托但非 founder / 必须 founder"三类，前两类交给 Claude Cowork 自动化；产品管理流程系统化：sprint 节奏、bug 路由树、metric 报告自动按时运转，不需要创始人触发；Claude Code Security 配合人工审查做企业级安全姿态。

Tool stack: Claude Code for architecture audit and refactoring (producing a prioritized technical-debt list); Claude to sort the founder’s current attention into three buckets, “fully automatable / delegable but not founder-only / founder-required”, with the first two handed to Claude Cowork for automation; product-management processes systematized, so sprint cadence, bug-routing trees, and metric reports run on schedule without the founder triggering them; and Claude Code Security plus human review for an enterprise-grade security posture.

Stage 04 / Scale制度化与护城河Institutionalize & moat

从内部执行到对外角色

From internal execution to an external-facing role

目标——从数千用户到数百万，从单一市场到多市场。同时构建护城河——领域专长 × 用户数据 × 集成深度的复利，不是"我们用了 AI"这种被立刻复制的卖点。创始人的工作从产品内部转向公司外部——分析师简报、IPO 路演、企业级合同、监管与公关。

Goal: from thousands of users to millions, from a single market to many. At the same time, build a moat: the compounding of domain expertise, user data, and integration depth, not the “we use AI” pitch that is copied at once. The founder’s work shifts from inside the product to outside the company: analyst briefings, the IPO roadshow, enterprise contracts, regulation, and public relations.

退出条件——从单一里程碑，变成了阈值事件。三种典型形态：（一）可持续盈利无需外部资本；（二）IPO-ready，治理、合规、财务控制、战略叙事经得起公开市场审视；（三）被收购，且收购方愿意为护城河付溢价而非仅为团队。三种都要求增长系统化且可审计、产品护城河经得起 scrutiny、组织运营成熟到不再依赖创始人个人。

Exit condition: it moves from a single milestone to a threshold event. Three typical shapes: (1) sustainable profitability with no outside capital; (2) IPO-ready, with governance, compliance, financial controls, and strategic narrative that withstand public-market scrutiny; (3) acquisition, where the acquirer pays a premium for the moat rather than for the team alone. All three require growth that is systematic and auditable, a product moat that survives scrutiny, and operations mature enough to no longer depend on the founder personally.

AI 时代陷阱——护城河错觉是 Scale 阶段最危险的失败：以为"我们用了 AI"就是差异化，但通用 AI 能力两年内会被全行业平价化。真护城河是领域专长 × 时间锁定的用户数据 × 集成深度——竞争对手即使有同样模型也无法复制。委托危机——创始人难以放手已经习惯的运营层，handoff 标准不清，结果系统不被信任、决策回流到创始人。GTM 真空——Idea/MVP/Launch 阶段的 founder-led selling 撞墙后必须建立正式 GTM 功能：市场分层、信息架构、分析师关系、销售剧本，多数技术创始人从未做过这些。扩张前规模化——还没准备好就进入新市场或新品类，把验证过的 PMF 稀释回未验证状态。

AI-era trap: the moat illusion is the most dangerous failure of the Scale stage. It is the belief that “we use AI” counts as differentiation, when general AI capability gets commoditized across the whole industry within two years. The real moat is domain expertise, time-locked user data, and integration depth, which a competitor cannot copy even with the same model. The delegation crisis: the founder struggles to let go of the operating layer they have grown used to, handoff standards are unclear, and the result is a system no one trusts and decisions flowing back to the founder. The GTM vacuum: once the founder-led selling of the Idea, MVP, and Launch stages hits a wall, a formal go-to-market function must be built, with market segmentation, messaging architecture, analyst relations, and a sales playbook, none of which most technical founders have ever done. Expanding before you are ready: entering a new market or category too early dilutes a validated PMF back into an unvalidated state.

工具组合：Claude 把创始人的领域专长编码为产品专有知识（Skills、CLAUDE.md、Memory 系统的组合），这是护城河的基础设施；Claude Code 构建企业级 infrastructure（公共 API、SDK、第三方集成、SLA-grade observability）；Claude Cowork 接管 GTM 执行层（content pipelines、analyst briefings、CRM hygiene、PR cadence）；最终的护城河并非 AI 本身，而是 AI 与不可复制的领域知识的复合，时间越长越深。

Tool stack: Claude to encode the founder’s domain expertise into product-proprietary knowledge (a combination of Skills, CLAUDE.md, and the Memory system), which is the infrastructure of the moat; Claude Code to build enterprise-grade infrastructure (public APIs, SDKs, third-party integrations, SLA-grade observability); Claude Cowork to take over the GTM execution layer (content pipelines, analyst briefings, CRM hygiene, PR cadence). The final moat is not AI itself but the compound of AI and domain knowledge no one can copy, and it deepens the longer it runs.

CORE INSIGHT: 创始人位置在每一阶段都向"系统设计者"上移; The founder’s position rises toward “system designer” at every stage
WARNING: 从"快速建造"转向"系统化释放创始人"; From “building fast” to “systematically freeing the founder”
COMMON FAILURE: 在 Stage 1 跳过验证、在 Stage 2 跳过上下文、在 Stage 3 跳过释放、在 Stage 4 误判护城河; Skipping validation in Stage 1, context in Stage 2, founder release in Stage 3, and misjudging the moat in Stage 4

把四个阶段叠在一起看，AI Native 创业的核心节奏不是"加速建造"——这是浅层的误读。核心节奏是"加速验证 + 持续积累上下文 + 系统化释放创始人 + 把专长编码为护城河"。每一阶段，创始人的位置都在向"系统设计者"上移：Idea 阶段从"建造者"上移到"验证设计者"；MVP 阶段从"代码作者"上移到"上下文工程师"；Launch 阶段从"决策者"上移到"系统设计者"；Scale 阶段从"内部执行者"上移到"对外角色"。

Layer the four stages together and the core rhythm of AI Native founding is not “build faster”; that is the shallow misreading. The core rhythm is “validate faster, keep accumulating context, systematically free the founder, and encode expertise into a moat”. At every stage the founder’s position rises toward “system designer”: in the Idea stage from “builder” to “validation designer”; in MVP from “code author” to “context engineer”; in Launch from “decision maker” to “system designer”; in Scale from “internal executor” to “external-facing role”.

这条角色演化曲线，就是 AI Native 方法论的终极产物。它解释了为什么 Anysphere、Cognition、Replit 这样的公司能用数十到数百人创造十亿到数百亿美元估值——他们不只是"用 AI 的传统创业团队"，他们的创始人在每一阶段都把更多执行交给系统、把更多判断留给自己。媒体报道中 Anthropic 的"Hive Mind"工作方式（90 天最长规划、Slack 长文替代会议、Project Vend 让 Claude 独立运营——前两条出自报道与访谈，未经独立验证；第三条是官方公开的负结果实验[R18]）是同一逻辑的极端版本。AI Native 方法论的真正胜利不在工具——而在创始人本身的角色演化。如果你走完这四个阶段，发现自己还在做 Stage 1 时做的工作，那么方法论失败了，不论 ARR 多高。

That curve of role evolution is the ultimate product of the AI Native methodology. It explains why companies like Anysphere, Cognition, and Replit can create valuations of one to tens of billions of dollars with tens to hundreds of people. They are not just “traditional startup teams that use AI”; at every stage their founders hand more execution to the system and keep more judgment for themselves. Anthropic’s “Hive Mind” way of working as reported in the press (planning horizons of up to 90 days, long Slack write-ups in place of meetings, and Project Vend letting Claude run a business on its own; the first two come from reporting and interviews and are not independently verified, while the third is an officially published negative-result experiment [R18]) is an extreme version of the same logic. The real win of the AI Native methodology is in the evolution of the founder’s own role, not in the tools. If you finish these four stages and find yourself still doing the work you did in Stage 1, the methodology has failed, however high the ARR.

SECTION

OPERATOR PLAYBOOK · 实操路径

行动 · 落地计划Action · Rollout Plan

四阶段的操作者手册

The Four-Stage Operator’s Handbook

底本是 Anthropic《The Founder's Playbook》（2026），放大到组织，而不只是创业。

The source text is Anthropic’s The Founder’s Playbook (2026), scaled up to the organization rather than just the startup.

一句话In one line

上一章回答"你在哪"，这一章回答"明天早上做什么"：同样四个阶段换成操作者视角，给出每段的目标、退出条件、专属陷阱，以及 Claude 三种形态各自的岗位。第一个月不写代码，先把第一性原理对齐。The previous chapter answers “where are you?”; this one answers “what do I do tomorrow morning?” It takes the same four stages from the operator’s view and gives each stage’s goal, exit condition, signature trap, and the role of each of Claude’s three forms. In month one you write no code; you align first principles first.

The 6-Month Architect Playbook

从 0 到第一波生产 Agent 部署

From 0 to the first wave of production agent deployments

如果第 6 个月还在 Agent Theater，回到第 1 个月——你的第一性原理不清

If you are still in Agent Theater by month 6, go back to month 1: your first principles are not clear.

M.01 · Month 1

第一性原理对齐 — First Principles Alignment

不要先选工具栈，先回答 5 个问题。(1) 你的工作流图（不是组织图）是什么？把组织视为流的网络而非角色的网络。(2) 哪些步骤 AI 可以完成？哪些必须人来？(3) 你的判断锚点（不可逆决策、声誉决策、价值观决策）在哪里？(4) 你的数据飞轮在哪里？组织行动如何反哺 Agent 训练？(5) 当 AI 出错，责任在谁？这一个月不写代码，不部署 Agent。所有早期失败的根因都是第一性原理含糊。

Don’t pick the tool stack first; answer five questions first. (1) What is your workflow graph (not your org chart)? See the organization as a network of flows, not a network of roles. (2) Which steps can AI do, and which must a human do? (3) Where are your judgment anchors (irreversible decisions, reputational decisions, values decisions)? (4) Where is your data flywheel? How does organizational action feed back into agent training? (5) When the AI gets it wrong, who is responsible? This month you write no code and deploy no agents. The root cause of every early failure is vague first principles.

M.02 · Month 2

工作流代码化 — Workflow as Code

选 3 个最高频的工作流，用 Temporal / n8n / LangGraph 写出可执行版本。这一步是最痛但最关键的——它把组织流程从"人脑里"变成"代码里"。完成标准：3 个工作流可以由代码执行，可以版本化，可以被测试。没完成不要进入下一步——你还在用人脑跑流程，AI 加进来只会放大混乱。

Pick the three highest-frequency workflows and write executable versions with Temporal / n8n / LangGraph. This step is the most painful but the most important: it moves organizational process out of “people’s heads” and into “code.” Done when: the three workflows can be executed by code, versioned, and tested. Don’t move to the next step before this is done: you are still running processes in people’s heads, and adding AI will only amplify the chaos.
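To make the “executable, versioned, testable” completion standard concrete, here is a minimal plain-Python stand-in. It deliberately does not use Temporal, n8n, or LangGraph, whose APIs are not shown here; the `Workflow` class, the ticket-triage steps, and the queue names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    version: str  # bump on every change so runs are attributable
    steps: list   # ordered list of (step_name, function) pairs

    def run(self, payload):
        """Execute the steps in order and record which ones ran."""
        state = dict(payload)
        trace = []
        for step_name, fn in self.steps:
            state = fn(state)
            trace.append(step_name)
        state["_trace"] = trace
        state["_workflow"] = f"{self.name}@{self.version}"
        return state


def classify(state):
    # Hypothetical rule; in practice an agent call could sit here.
    text = state["text"].lower()
    state["category"] = "billing" if "invoice" in text else "general"
    return state


def route(state):
    queues = {"billing": "finance-ops", "general": "helpdesk"}
    state["queue"] = queues[state["category"]]
    return state


TRIAGE_V1 = Workflow("ticket-triage", "1.0.0",
                     [("classify", classify), ("route", route)])

result = TRIAGE_V1.run({"text": "Invoice 1042 is wrong"})
print(result["queue"], result["_trace"], result["_workflow"])
```

Because each step is an ordinary function, it can be unit-tested in isolation, and the version string in the output ties every run to the definition that produced it.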

M.03 · Month 3

上下文层建设 — Context Layer

建立向量数据库 + 决策日志 + Agent 可读文档结构。所有重要会议产生 Agent 可检索的总结。所有客户互动被结构化捕获。这是组织复利积累的开始——3 个月之后，你的 Agent 会比同样模型的竞争对手 Agent 显著更对齐你的组织。完成标准：核心知识可被 Agent 在 < 5 秒内检索到正确上下文。

Build a vector database plus a decision log plus an agent-readable document structure. Every important meeting produces an agent-retrievable summary. Every customer interaction is captured in structured form. This is where the organization’s compounding accumulation begins: three months on, your agents will be markedly better aligned to your organization than a competitor’s agents running the same model. Done when: an agent can retrieve the correct context for core knowledge in under 5 seconds.
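The completion standard (correct context retrieved in under 5 seconds) can be checked mechanically. The sketch below uses a naive keyword match as a stand-in for a real vector search, which is an assumption made to keep it dependency-free; the `Decision` fields and the sample entries are hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    decided_on: str  # ISO date
    owner: str
    summary: str
    tags: tuple


class DecisionLog:
    def __init__(self):
        self._entries = []

    def record(self, decision):
        self._entries.append(decision)

    def search(self, query):
        """Naive keyword match; a stand-in for vector retrieval."""
        terms = query.lower().split()
        hits = []
        for d in self._entries:
            haystack = (d.summary + " " + " ".join(d.tags)).lower()
            score = sum(term in haystack for term in terms)
            if score:
                hits.append((score, d))
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [d for _, d in hits]


def timed_search(log, query, budget_seconds=5.0):
    """Return the hits, the elapsed time, and whether the budget was met."""
    start = time.perf_counter()
    hits = log.search(query)
    elapsed = time.perf_counter() - start
    return hits, elapsed, elapsed < budget_seconds


# Hypothetical entries, for illustration only.
log = DecisionLog()
log.record(Decision("2026-01-12", "CFO", "Refunds above 500 USD need sign-off",
                    ("refund", "policy")))
log.record(Decision("2026-02-03", "CTO", "Adopt a second model vendor",
                    ("vendor", "architecture")))

hits, elapsed, within_budget = timed_search(log, "refund policy")
print(hits[0].owner, within_budget)
```

The latency check is the easy half; whether the top hit is the correct context still needs a labeled set of queries with known answers.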

M.04 · Month 4

多模型架构 + 可观测性 — Multi-Model + Observability

配置至少两家模型供应商（Anthropic + OpenAI 或 + Google），建立 evaluation harness（quality regression）。部署 LangSmith / Helicone / Arize。在能看见之前不要扩规模。完成标准：所有 Agent 调用被记录、可追溯、可重放；模型切换是 1 周以内的工程任务而不是 3 个月的重构。

Configure at least two model vendors (Anthropic + OpenAI, or + Google) and build an evaluation harness (quality regression). Deploy LangSmith / Helicone / Arize. Don’t scale before you can see. Done when: every agent call is logged, traceable, and replayable; switching models is an engineering task of under a week rather than a three-month rebuild.
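One way to keep model switching a small task is to route every call through a single thin interface that also writes the log. The sketch below is an assumption about how that could look: `ModelRouter` is hypothetical, and the two providers are local stub functions rather than real vendor SDK calls.

```python
import time
import uuid


class ModelRouter:
    """Provider-agnostic entry point that logs every call for replay."""

    def __init__(self, providers, default):
        if default not in providers:
            raise KeyError(default)
        self.providers = providers  # name -> callable(prompt) -> str
        self.default = default
        self.call_log = []

    def complete(self, prompt, provider=None):
        name = provider or self.default
        output = self.providers[name](prompt)
        self.call_log.append({
            "id": str(uuid.uuid4()),
            "provider": name,
            "prompt": prompt,
            "output": output,
            "logged_at": time.time(),
        })
        return output

    def replay(self, call_id, provider):
        """Re-run a logged prompt on another provider for comparison."""
        record = next(r for r in self.call_log if r["id"] == call_id)
        return self.complete(record["prompt"], provider=provider)


# Stub providers standing in for real vendor clients (hypothetical).
def vendor_a(prompt):
    return "A: " + prompt.upper()


def vendor_b(prompt):
    return "B: " + prompt.lower()


router = ModelRouter({"vendor_a": vendor_a, "vendor_b": vendor_b},
                     default="vendor_a")
first = router.complete("Summarize Q3 churn")
second = router.replay(router.call_log[0]["id"], provider="vendor_b")
print(first, "|", second, "|", len(router.call_log))
```

Replaying logged prompts against a second provider is also the raw material for the quality-regression harness: the same inputs, two outputs, one comparison.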

M.05 · Month 5

第一波 Agent 部署 — First Production Agents

从最低风险的工作流开始——内部知识检索、报告生成、代码审查、内部 ticket 分诊。不要从客户面开始（Klarna、Cursor "Sam"、Air Canada 都死在这）。设定明确的 human-in-the-loop 节点。每周 retro 一次。退出条件（挂可靠性折扣）：先跑通一个可逆的环（失败能便宜回滚），按 pass@k 而非 pass@1 验收——别拿单次通过率当上线线；参照 Gartner／MIT 的落地折扣给生产失败率设基线[R10][R14]。完成标准：至少 1 个 Agent 在生产稳定运行 4 周以上，错误率相对基线可量化、可改进。

Start with the lowest-risk workflows: internal knowledge retrieval, report generation, code review, internal ticket triage. Don’t start customer-facing (Klarna, Cursor’s “Sam,” and Air Canada all came to grief there). Set explicit human-in-the-loop nodes. Run a retro once a week. Exit conditions (with a reliability discount): first get one reversible loop working, meaning failures roll back cheaply. Accept on a pass@k basis rather than pass@1, so that a single-run pass rate is never the go-live bar. Set a production failure-rate baseline against the deployment discounts reported by Gartner and MIT[R10][R14]. Done when: at least one agent has run stably in production for more than 4 weeks, with an error rate that is quantifiable and improvable against the baseline.
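For the pass@k acceptance criterion, a minimal sketch of the combinatorial estimator widely used in code-generation evaluations: given n sampled runs of a task with c passes, it estimates the chance that at least one of k attempts passes. The sample counts below are hypothetical.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k from n sampled runs of which c passed."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures exist, so any k draws include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical evaluation: 20 runs of one workflow, 12 passed.
runs, passes = 20, 12
print(round(pass_at_k(runs, passes, 1), 3), round(pass_at_k(runs, passes, 3), 3))
```

Reporting both figures side by side shows how much of the loop’s reliability comes from retries rather than from a single run, which is the gap the rework budget has to cover.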

M.06 · Month 6

节奏建立 — Establish Cadence

确立 90 天滚动规划周期（Anthropic 模式）。形成 Agent 部署 + 监控 + 改进的稳定循环。开始考虑哪些工作流可以从 human-in-the-loop 升级到 human-on-the-loop。这不是"完成 AI Native 转型"——AI Native 没有完成态，只有持续演化态。完成标准：组织能够在不增加员工的情况下，每月增加 1-3 个新 Agent 工作流到生产环境。

Establish a 90-day rolling planning cycle (the Anthropic model). Form a stable loop of agent deployment, monitoring, and improvement. Begin considering which workflows can be promoted from human-in-the-loop to human-on-the-loop. This is not “completing the AI Native transformation”: AI Native has no finished state, only a state of continuous evolution. Done when: the organization can add 1 to 3 new agent workflows to production each month without adding headcount.

RULE: 没完成上一步不要进入下一步; Don’t move to the next step until the previous one is done
SIGN OF FAILURE: 第 6 月仍在演示而非生产; Still demoing rather than in production by month 6
RECOVERY: 回到 Month 1 的 5 个问题; Return to the five questions of Month 1

这个路径刻意做得最低限度——它不是关于"如何成为下一个 Anysphere"，是关于"如何不在前 6 个月陷入 AI Theater"。多数 AI Native 转型在第 3 个月就崩盘了，原因不是技术问题，是第一性原理没清就开始堆工具栈。建立了原则、代码化了流程、有了上下文层和可观测性——剩下的是时间和复利的工作。没建立这些底层，再多的 Agent 也只是 Theater。

This path is deliberately kept minimal: it is not about “how to become the next Anysphere” but about “how not to fall into AI Theater in the first six months.” Most AI Native transformations collapse by month 3, not for a technical reason but because teams start piling up tools before the first principles are clear. Once you have established the principles, turned process into code, and built the context layer and observability, what remains is the work of time and compounding. Without those foundations, any number of agents is still just Theater.

最小单位经济：一个可逆环，一个月，花多少

Unit Economics of One Loop

路径讲了"做什么"，没算"要花多少钱"。这张最小单位经济表不是预算，是让你在第一个环上线前能说出量级——只给区间与估法、不给假精确数字，照口径自己算。

The path covers “what to do” but not “what it costs.” This minimal unit-economics sheet is not a budget. Its job is to let you state the order of magnitude before the first loop goes live. It gives ranges and an estimation method, not falsely precise numbers; compute your own figures on the same basis.

单环月成本项 / Monthly cost of one loop	量级 / Order of magnitude	怎么估 / How to estimate
token 消耗 / Token spend	通常最小项 / usually the smallest item	每次运行输入＋输出 token × 单价 × 月运行次数；agent 迭代与重试按峰值乘。/ Tokens (in + out) per run × unit price × runs per month; agent iteration and retries multiply it, so size at peak.
验证人力时 / Verification labor-hours	常是最大项 / often the largest item	人审一件的分钟数 × 逃过机检的件数／月；机器可检占比越高越低，即工程卷"单位验证成本"。/ Minutes to review one output × items escaping machine checks per month; the higher the machine-checkable share, the lower this cost. This is what the Engineering volume calls “cost per verification.”
返工预算 / Rework budget	按返工率标度 / scales with the rework rate	别用 pass@1 的乐观数。按 pass@k 估：单次通过率 p，(1−p)^k 就是要人返工的比例，用它定返工工时。/ Don’t use the optimistic pass@1 figure. Estimate on a pass@k basis: with single-run pass rate p, (1−p)^k is the share of items that still fail after k attempts and need human rework (assuming independent attempts); size the rework hours from that.

盈亏平衡触发点：当"单环月成本 ÷ 该环替下的人力月成本"降到 1 以下，环才真正省钱，在此之前是投资、非节流。成本失控是 agentic 项目最常见的死法之一——Gartner 预测 40%+ 项目将在 2027 底前取消，成本与不清晰的价值列为主因[R10]。先算表，再决定这环值不值得上。

Break-even trigger: the loop only truly saves money once “monthly cost of the loop ÷ monthly labor cost it displaces” drops below 1; before that it is an investment, not a saving. Runaway cost is one of the most common ways an agentic project dies: Gartner forecasts that 40%+ of agentic projects will be canceled by the end of 2027, with cost and unclear value among the leading causes[R10]. Fill in this sheet before deciding whether the loop is worth turning on.
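The sheet and the break-even ratio can be worked through in a few lines. This is a sketch of the estimation method only: every input figure is hypothetical, attempts are assumed independent, and token spend is sized at the peak of k attempts per run as the table suggests.

```python
from dataclasses import dataclass


@dataclass
class LoopMonth:
    runs: int                     # loop executions per month
    tokens_per_run: int           # input + output tokens for one attempt
    usd_per_million_tokens: float
    attempts: int                 # k, the retry ceiling per run
    single_run_pass_rate: float   # p
    escape_share: float           # share of outputs machine checks cannot clear
    review_minutes: float         # human minutes to review one escaped item
    rework_minutes: float         # human minutes to redo one failed item
    hourly_rate: float

    def token_cost(self):
        peak_tokens = self.runs * self.attempts * self.tokens_per_run
        return peak_tokens / 1_000_000 * self.usd_per_million_tokens

    def verification_cost(self):
        items = self.runs * self.escape_share
        return items * self.review_minutes / 60 * self.hourly_rate

    def rework_share(self):
        # (1 - p)^k: share still failing after k independent attempts.
        return (1 - self.single_run_pass_rate) ** self.attempts

    def rework_cost(self):
        items = self.runs * self.rework_share()
        return items * self.rework_minutes / 60 * self.hourly_rate

    def total(self):
        return self.token_cost() + self.verification_cost() + self.rework_cost()


def break_even_ratio(loop, displaced_labor_cost):
    """Below 1 the loop saves money; at or above 1 it is still an investment."""
    return loop.total() / displaced_labor_cost


# Hypothetical inputs, for illustration only.
loop = LoopMonth(runs=2000, tokens_per_run=20_000, usd_per_million_tokens=5.0,
                 attempts=3, single_run_pass_rate=0.7, escape_share=0.2,
                 review_minutes=6, rework_minutes=30, hourly_rate=60)
print(round(loop.token_cost()), round(loop.verification_cost()),
      round(loop.rework_cost()), round(break_even_ratio(loop, 8000), 2))
```

With these made-up inputs the ordering matches the table: token spend is the smallest line and verification labor the largest, so raising the machine-checkable share moves the ratio far more than a cheaper model would.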

周一早上先填三张起手模板：Monday morning, fill three starter templates first: spec ↗ · checker ↗ · permissions ↗

SECTION

OPERATOR'S TOOLKIT · 实施工具包

行动 · 可拷贝模板ACTION · COPYABLE TEMPLATE

实施工具包

Operator’s Toolkit

这是一个会生长的工具箱，分两轨。给人的：纸笔或聊天框就能用——工作流图建模、自测与诊断、目的层三卡、操作协议五卡、可填工作表。给 AI 的：喂给 Claude Code / agent 的可执行件——架构师 skill 与六面配套 skill。开源 · MIT。

This is a toolbox that will grow, in two tracks. For humans: usable with pen and paper or any chatbox, covering workflow-graph modeling, self-tests and diagnostics, the three purpose-layer cards, five operating-protocol cards, and fill-in worksheets. For your AI: executables you feed to Claude Code or your agents, namely the Architect skill and six per-surface companion skills. Open-source, MIT.

一句话In one line

前十七节讲"为什么"和"按什么顺序"，这一节给两轨工具：给人的（建模法、自测、三卡、操作协议、工作表，纸笔就能用），给 AI 的（架构师与六面 skill，喂给 agent 就能跑）。把 M.01"组织即工作流图"从口号变成你今天就能填的脚手架。开源 · MIT。The first seventeen sections cover “why” and “in what order”; this one hands over tools in two tracks: for humans (the modeling method, self-tests, the three cards, operating protocols, and worksheets, all usable with pen and paper) and for your AI (the Architect and six per-surface skills, which run once you feed them to your agents). It turns M.01, organization-as-workflow-graph, from a slogan into a scaffold you can fill in today. Open-source, MIT.

两轨 · 别混着拿。给人的工具（①③④⑥⑦）不需要安装任何东西，纸笔或一个聊天框就够，产出的是你的判断；给 AI 的工具（②⑤）是喂给 Claude Code / agent 的可执行件，产出的是真实工作产物。建议顺序：先用给人的想清楚，再用给 AI 的把它跑起来。

Two tracks · do not mix them up. The tools for humans (①③④⑥⑦) need nothing installed: pen and paper or one chatbox suffice, and what they produce is your judgment. The tools for your AI (②⑤) are executables you feed to Claude Code or your agents, and what they produce is real work product. Suggested order: think it through with the human tools first, then make it run with the AI tools.

① 工作流图建模

① Workflow Graph Modeling

轨道：给人的 · 纸笔画图；填好的模板可直接喂给 AGENT

TRACK: FOR HUMANS · model on paper; the filled template feeds straight into agents

M.01 说"工作流图是真相"。但真相得能被画出来才可落地。建模法只有三类节点、四个标注——足够暴露"瓶颈在图的哪条边上"（这正是第 4 节十六瓶颈反复指认的：吞吐是图的属性，不是节点的属性）。

M.01 says “the workflow graph is the truth.” But a truth has to be drawable before it can be built on. The modeling method has only three node types and four annotations: enough to expose “which edge of the graph the bottleneck sits on” (exactly what Section 4’s sixteen bottlenecks keep pointing at: throughput is a property of the graph, not of a node).

agent · 执行：默认工种（M.02），近零边际成本生成/转换/执行。 human · 判断锚：决定什么值得做、为后果担责（M.05）。 policy · 门禁：不可逆动作前的自动门 + 例外上报（支柱 05）。

agent · runs: the default worker (M.02); generates, transforms, and executes at near-zero marginal cost. human · judgment anchor: decides what is worth doing and owns the consequences (M.05). policy · gate: an automatic gate before irreversible actions, plus exception escalation (pillar 05).

四个标注：可并行扇出（拆 B.01 串行链）· 判断锚（人承担后果处）· 不可逆门禁（policy 必签）· 复利上下文写入（M.03/M.04，下游可检索）。

Four annotations: parallel fan-out (break B.01’s serial chains) · judgment anchor (where a human carries the consequences) · irreversible gate (policy must sign off) · compounding-context write (M.03/M.04, retrievable downstream).

核心图 KEY FIG · FIG. 17.0 / 建模示例：一家风投基金的行业研究流程，改造前 → 改造后 / Worked example: a VC fund’s industry-research process, before and after the redesign · 看懂：左串行接力（流动效率<15%），右并行扇出 + 单判断锚 + 门禁 / Read this: left is a serial relay (flow efficiency < 15%); right is a parallel fan-out with one judgment anchor and a policy gate

Below is a skeleton you can copy directly (the full fillable version, plus a four-step guide to filling it in, is in templates/workflow-graph.md ↗). The example already filled in is the redesigned VC research flow from the right half of the figure above: three research agents scan in parallel, and people appear at only two judgment points, framing the thesis and making the IC call. Swap in the nodes of your own business and the skeleton works as-is:

# Three scans fan out in parallel; humans act only at thesis framing and the IC call.
workflow: VC-research-pipeline
nodes:
  - {id: thesis,    type: human,  owner: "GP",               parallelizable: false}
  - {id: scan_a,    type: agent,  owner: "",                 parallelizable: true}
  - {id: scan_b,    type: agent,  owner: "",                 parallelizable: true}
  - {id: scan_c,    type: agent,  owner: "",                 parallelizable: true}
  - {id: synth,     type: agent,  owner: "",                 parallelizable: false}
  - {id: ic_judge,  type: human,  owner: "IC",               parallelizable: false}
  - {id: term_gate, type: policy, owner: "partner sign-off", parallelizable: false}
judgment_anchors: [thesis, ic_judge]   # framing and the call are human judgment
policy_gates: [term_gate]              # issuing a term sheet requires sign-off

What this is · scaffold, not truth: The graph exists to find the serial edges worth cutting, not for its own sake. Scale trigger: when the team has fewer than 5 people, verbal alignment suffices, and the process is stable, do not reach for BPMN or a vector store yet. Build the graph when end-to-end throughput has not improved after you “added AI” and you cannot tell which edge of the graph the bottleneck sits on.
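One way to sanity-check a filled-in skeleton before handing it to agents is to load it into a small script. The sketch below is hypothetical: the skeleton lists nodes but no edges, so the `EDGES` list is an assumption read off the prose (the thesis fans out to three scans, which feed the synthesis, then the IC call, then the term-sheet gate), and the helper names `validate`, `fan_outs`, and `serial_edges` are illustrative rather than part of templates/workflow-graph.md. A "serial edge" is taken here to mean a hand-off into a node marked `parallelizable: false`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    type: str            # "agent", "human", or "policy"
    owner: str
    parallelizable: bool


# Nodes copied from the skeleton above.
NODES = [
    Node("thesis", "human", "GP", False),
    Node("scan_a", "agent", "", True),
    Node("scan_b", "agent", "", True),
    Node("scan_c", "agent", "", True),
    Node("synth", "agent", "", False),
    Node("ic_judge", "human", "IC", False),
    Node("term_gate", "policy", "partner sign-off", False),
]
JUDGMENT_ANCHORS = ["thesis", "ic_judge"]
POLICY_GATES = ["term_gate"]

# ASSUMPTION: the skeleton does not list edges; these are inferred from the prose.
EDGES = [
    ("thesis", "scan_a"), ("thesis", "scan_b"), ("thesis", "scan_c"),
    ("scan_a", "synth"), ("scan_b", "synth"), ("scan_c", "synth"),
    ("synth", "ic_judge"), ("ic_judge", "term_gate"),
]


def validate(nodes, anchors, gates):
    """Return a list of problems; an empty list means the annotations are consistent."""
    by_id = {n.id: n for n in nodes}
    problems = []
    for a in anchors:
        if a not in by_id or by_id[a].type != "human":
            problems.append(f"judgment anchor {a!r} is not a human node")
    for g in gates:
        if g not in by_id or by_id[g].type != "policy":
            problems.append(f"policy gate {g!r} is not a policy node")
    for n in nodes:
        if n.type in ("human", "policy") and not n.owner:
            problems.append(f"{n.type} node {n.id!r} has no named owner")
    return problems


def fan_outs(nodes, edges):
    """Map each node to its parallelizable successors when there are at least two."""
    by_id = {n.id: n for n in nodes}
    result = {}
    for src, dst in edges:
        if by_id[dst].parallelizable:
            result.setdefault(src, []).append(dst)
    return {src: dsts for src, dsts in result.items() if len(dsts) >= 2}


def serial_edges(nodes, edges):
    """Edges that hand off into a non-parallelizable node."""
    by_id = {n.id: n for n in nodes}
    return [(s, d) for s, d in edges if not by_id[d].parallelizable]


if __name__ == "__main__":
    print(validate(NODES, JUDGMENT_ANCHORS, POLICY_GATES))
    print(fan_outs(NODES, EDGES))
    print(serial_edges(NODES, EDGES))
```

The serial edges this prints are only candidates: whether a given one is worth cutting is still a human call, which is the point of keeping the graph a scaffold.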

② The AI-Native Architect · executable skill

TRACK: FOR YOUR AI · runs inside Claude Code

The first seventeen sections cover “what to do and in what order”; this piece draws the graph for you. Give it a business, a startup wedge, or an intent to rebuild an organization, and it first runs a scope gate that routes the case onto one of four tracks: greenfield build, an incremental from-zero carve-out, mere “AI-enablement” (judged honestly as outside this methodology’s target group and told so, not dressed up), or an emotional-labor boundary (AI assists, never leads). It then designs T1’s two structures from Section 3, the distribution of judgment and the flow of context: it redraws the workflow graph, lays the four-layer operating substrate, and opens the nine depth modules only as the case demands (economics that tie out, compliance grounded in the actual legal instrument, tracing harm to the last person harmed, a two-sided moat race, and so on). It closes on the kernel: once execution is nearly free and judgment is the scarce factor, how this architecture compounds along the model curves and leads rather than catches up.

Three node types, carried over from M.01: agent · runs; human · judgment anchor; policy · gate

# invoke inside Claude Code
$ /skill ai-native-architect
> "redesign this company around AI: ..."

  → scope gate · Track A / B / out-of-scope / boundary
  → T1 · workflow graph · four-layer substrate · depth modules · kernel
  → one AI-Native Architecture Blueprint

Open-source repository: github.com/watterfall/ai-native-architect ↗

What this is · the mirror of the methodology: The methodology is the “why” and the “what should be”; the skill is the executable “how.” Both share one vocabulary (T1 / the sixteen bottlenecks / the four-layer operating substrate / the seven pillars / the kernel). Licensed MIT; iterated and validated through a council-style multi-role review (10 roles × random cases, mean score 9.0/10). The system design is in the repository at docs/SYSTEM-DESIGN.md.

③ Lower-barrier tools, ready to hand

TRACK: FOR HUMANS · pen & paper, or any chatbox

Not everyone uses Claude Code. This set is lightweight: it needs nothing but pen and paper or any chatbox, requires no install, and runs from easiest to hardest, for people who want to touch the kernel before they are ready for the skill. The one question that decides everything:

redraw-vs-graft · the one self-test: Delete all the AI from your plan. Does the organization collapse back into an ordinary org chart, with the same roles and hand-offs as before? Yes = what you designed is AI-enablement; No = it cannot exist without agents, and that is the native form.

Self-test · AI-Native or AI-enabled? Which track? (pen and paper, ~2 min)
16-Bottleneck Scorecard · score your organization 0-16 and read off the band. (pen and paper, one page)
T1 Canvas · distribution of judgment × flow of context, one fill-in page. (~10 min)
Portable Prompt · paste into any chatbox (ChatGPT / Claude / Gemini) to run a lite blueprint. (no install)

All four, ready to hand: github.com/watterfall/ai-native-architect/tools ↗

④ Three Cards for the Purpose Layer

TRACK: FOR HUMANS · print and use

The first three pieces make the organization run; these three cards keep it running in the right direction. They turn the kernel’s humanist claim (machines absorb execution; judgment and meaning fall back into human hands), together with judgment distribution and org topology, into self-checks you can fill in today. Copy them as-is.

Card 1 · The Human Measure

This quarter, did the team’s judgment authority grow or shrink?
Did AI take the drudgery, or turn people into feeders for the machine?
Is anyone more engaged, and more willing to show up, because there is less busywork?
If every metric is up yet people are busier than ever, that is the alarm: means and ends have been inverted.

Card 2 · Locate Your Judgment

Mark where you sit today on the spectrum: one person → small core → hierarchy → network → holacracy → distributed autonomy.
Should you move toward the concentrated end (a few cores, fast and coherent) or the distributed end (an open network, resistant to single-point blind spots)?
What is this move betting on: speed and coherence, or diversity and redundancy?

Card 3 · Map Your Topology

Judgment nodes (people): list ___ of them, and which graph each is accountable for.
Agent network: which execution can be handed over wholesale? Expect ___ agents.
Context layer: where does your “shared world model” live? Who is still translating it by hand? That is the next bottleneck to delete.

⑤ Executable Companions · one skill per surface

TRACK: FOR YOUR AI · six executable skills for your agents

The Architect in ② designs the organization (it produces the blueprint); this section gets the work on each surface running. The one kernel now ships as a seven-piece system: one architecture-layer piece (the Architect) plus six executable companions: org-operate · engineering · design · research · learning · innovation. These are six faces of one kernel, mutually coupled, with no fixed starting point for reading. “Executable” here does not mean “execution-facing” in the system map: research, learning, and innovation remain upstream, on the cognition and value side; they simply have runnable skills of their own as well. The organization companion is ai-native-org: once the Architect has designed the organization, it runs it day to day by orchestrating the agent fleet, keeping the operating cadence, routing exceptions to named humans, keeping context from degrading with use, and running the review queue.

# invoke inside Claude Code (one of the six)
$ /skill ai-native-org
> "our 6-person AI support team is built, set its daily operating runbook"

  → an operating runbook · review-queue routing · context upkeep · exceptions to named humans

All six, ready to hand: github.com/watterfall/ai-native-architect/skills ↗ · install all at once: /plugin marketplace add watterfall/ai-native-architect

Executable companions · the other six of the seven: The Architect is the judgment tier (it designs the organization); the executable companions are the six skills that make each domain’s method runnable, and each produces a real work product (engineering = code + spec + eval suite; design = design artifact + taste rationale + fingerprint; research = a credibility ledger; learning = protecting the difficulty worth practicing; innovation = a bet sheet; organization = an operating runbook). The stop-line for the organization piece, ai-native-org: never put an agent at the center of a trust/safety node that the architecture reserved for a human. Exceptions go only to a named person, keep escalating upward under pressure, and are never auto-resolved. Open source, MIT.
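The stop-line can be read as a small routing rule. The sketch below is illustrative only and is not the ai-native-org implementation: the `ExceptionTicket` type, the `ExceptionRouter` class, the escalation chain, and the role names are all hypothetical. It shows the three properties the text states: every exception lands on a named person, unacknowledged exceptions move up the chain, and no code path closes an exception without a human. Restricting `resolve` to people on the chain is an added assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExceptionTicket:
    summary: str
    level: int = 0                      # index into the escalation chain
    resolved_by: Optional[str] = None   # stays None until a named person closes it
    history: list = field(default_factory=list)


class ExceptionRouter:
    """Routes exceptions to named humans; agents may raise but never resolve."""

    def __init__(self, chain):
        # Every escalation level must name a person (hypothetical rule for this sketch).
        if not chain or any(not name.strip() for name in chain):
            raise ValueError("every escalation level needs a named person")
        self.chain = list(chain)

    def assignee(self, ticket):
        return self.chain[ticket.level]

    def raise_exception(self, summary):
        ticket = ExceptionTicket(summary)
        ticket.history.append(f"routed to {self.assignee(ticket)}")
        return ticket

    def escalate(self, ticket):
        """Move an unacknowledged ticket up; at the top it stays with the top person."""
        if ticket.level < len(self.chain) - 1:
            ticket.level += 1
        ticket.history.append(f"escalated to {self.assignee(ticket)}")
        return self.assignee(ticket)

    def resolve(self, ticket, actor):
        """Only a person on the chain can close a ticket; anything else is refused."""
        if actor not in self.chain:
            raise PermissionError(f"{actor!r} may not resolve exceptions")
        ticket.resolved_by = actor
        ticket.history.append(f"resolved by {actor}")
```

There is deliberately no timeout that closes a ticket: under pressure the only automatic move is upward.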

⑥ Worksheet Pack · one sheet per surface

TRACK: FOR HUMANS · copy, fill, feed back into the context store

The first five pieces are scaffolds and skills; this set is worksheets you can copy and fill in directly: three cross-surface starter templates plus one sheet for each of the six surfaces. Copy, fill in, and feed the finished sheet back into that surface’s context store.

Three starters (cross-surface) · machine-checkable spec ↗ · independent checker ↗ · permission boundary ↗
Organization · workflow graph ↗
Engineering · eval dashboard ↗
Design · rule matrix ↗
Research · evidence ledger ↗ · research loop ↗
Learning · withdrawal drill ↗
Innovation · bet log ↗

All templates and how to use them: templates/README.md ↗

⑦ Operating Protocols for Humans · five cards

TRACK: FOR HUMANS · nothing to install

The tools above help you draw the organization and get it running; these five cards govern how people inhabit it day to day. Everything here comes from discipline already argued in the body text (each card ends with its source), gathered into actions you can take as soon as the meeting ends.

Card A · The Synchronous-Time Protocol

Async by default: workflow state is visible to everyone, hand-offs are triggered by events, and decisions are filed with their reasons.
Spend synchronous presence on only three things: resolving disagreements of judgment, building relationships, and handling crises.
For every standing meeting, answer once: “If we cancel this meeting, which decision slows down?” If there is no answer, cancel it. (Source: the coordination tax, Section 4)

Card B · The Decision Log

For every irreversible decision, record four lines: what was decided / on what information / what the bet is / what signal would show the bet was wrong.
Review once a month: of the wrong calls, how many came from information that never arrived (a context problem), and how many from judgment that was itself wrong (a human problem)? The two call for entirely different fixes.
(Source: Section 3, “organization = judgment × context”; the log is that formula’s raw data.)
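If the log is kept somewhere structured, the four lines and the monthly split can be captured in a few lines of code. This is a minimal sketch under stated assumptions: the `Decision` record, its field names, and the `"context"` / `"judgment"` labels are hypothetical conventions chosen for illustration, not a format the methodology prescribes.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    decided: str          # what was decided
    information: str      # on what information
    bet: str              # what the bet is
    wrong_signal: str     # what signal would show the bet was wrong
    # Set at the monthly review: None (not wrong), "context", or "judgment".
    failure: Optional[str] = None


def monthly_review(log):
    """Count wrong calls by cause: information that never arrived vs. judgment itself."""
    counts = Counter(d.failure for d in log if d.failure is not None)
    unknown = set(counts) - {"context", "judgment"}
    if unknown:
        raise ValueError(f"unexpected failure labels: {sorted(unknown)}")
    return {"context": counts["context"], "judgment": counts["judgment"]}
```

A month dominated by `"context"` points at the context store and the hand-offs; a month dominated by `"judgment"` points at who holds the anchor.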

Card C · The Hiring & Role Gauge

Interview for judgment, not output volume: give the candidate a decision you actually faced and ask, “What would you decide, and what information is still missing?”
Write the role description as “accountable for which consequences on which graph,” not “does which tasks”; the tasks belong to agents.
Beware of hiring a “human router”: if the role is mostly translating and relaying information, first ask why that edge still needs a person. (Source: the bottleneck list, Section 4)

Card D · The Weekly Operating Rhythm

Monday: look at end-to-end throughput, not individual output, and name the week’s most expensive serial edge.
Midweek: clear the judgment queue. A backlog of judgment costs far more than a backlog of execution, because the agents are waiting on you.
Friday: write what the week taught you back into the context store, so next week’s agents are smarter than this week’s. (Source: establishing the cadence, the operator’s handbook, Section 16)
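The Monday step can be made concrete with two small calculations. The sketch below assumes that "expensive" is measured as hours of waiting accumulated on an edge, and that flow efficiency (the measure quoted for the serial relay in the worked example) is the share of elapsed time in which work is actively moving, which is the usual lean definition. Both are assumptions, and the edge names and hours in the usage note are made up for illustration.

```python
def most_expensive_edge(wait_hours):
    """Return the (edge, hours) pair with the largest accumulated waiting time."""
    if not wait_hours:
        raise ValueError("no edges measured this week")
    edge = max(wait_hours, key=wait_hours.get)
    return edge, wait_hours[edge]


def flow_efficiency(active_hours, elapsed_hours):
    """Share of end-to-end elapsed time in which work was actively progressing."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return active_hours / elapsed_hours
```

For example, with hypothetical waits of 52 hours on `("synth", "ic_judge")` and 30 hours on `("ic_judge", "term_gate")`, the first edge is the one to name on Monday; 12 active hours inside a 96-hour lead time give a flow efficiency of 0.125.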

Card E · The Monthly Anti-Pattern Check

Each month, check yourselves against Section 11’s failure list: Are we performing (who is this project staged for)? Are people working for the AI? The metrics are up, but did real P&L move?
If any one of them hits: stop scaling first, go back to the T1 canvas and redraw, then continue.
(Source: the failure modes, Section 11. They are foreseeable and therefore avoidable, but only if someone looks on a regular schedule.)

REF	GR	SOURCE	Load-bearing claim
R1	Ⅲ	Shahidi, Rusak, Manning, Fradkin & Horton “The Coasean Singularity? Demand, Supply, and Market Design with AI Agents” NBER WP 34468 · 2025-11 · nber.org/papers/w34468 (handbook chapter PDF c15309)	The four components of transaction cost map exactly onto agent-executable tasks; the make-or-buy boundary shifts; a three-stage path for the installed base; agent-first markets designed backward from the endpoint
R2	Ⅲ	Hadfield & Koh “An Economy of AI Agents” arXiv:2509.01063 · 2025-09 (internally cites Chen-Elliott-Koh, JET 2023 [Ⅱ] · DOI 10.1016/j.jet.2023.105647) · arxiv.org/abs/2509.01063	The ceiling on firm size stems from intrinsic human constraints and does not apply to agents; a (conditional) prediction of a mega-firm phase transition
R3	Ⅱ	Agrawal, Gans & Goldfarb “Exploring the Impact of Artificial Intelligence: Prediction versus Judgment” NBER WP 24626 (2018) · Information Economics and Policy 47 (2019):1-6 · doi.org/10.1016/j.infoecopol.2019.05.001 · nber.org/w24626	What AI lowers is the cost of prediction; judgment is the human capacity invoked when the objective function cannot be coded; the delegation theorem
R4	Ⅲ	Agrawal, Gans & Goldfarb “The Economics of Bicycles for the Mind” NBER WP 34034 · 2025-07 · nber.org/w34034	Opportunity judgment is always complementary / return judgment is conditionally complementary / execution skills are substituted (all as model assumptions and theorems)
R5	Ⅲ	Gans “AI as Strategist” NBER WP 33650 · 2025 (Prop. 6, p.37 cites the 2025-04 version; a 2025-12 revision exists, so proposition/page numbers may drift) · nber.org/w33650	The marginal value of control declines as trustworthiness rises; AI control should be allocated domain by domain; transparency substitutes for authority
R6	Ⅳ	Karpathy “Software Is Changing (Again)” YC AI Startup School · 2025-06-16 · ycombinator.com/library/MW; “Power to the people: How LLMs flip the script on technology diffusion” 2025-04-07 · karpathy.bearblog.dev	Software 3.0; the diffusion reversal; the decade of agents (the Waymo argument, though Waymo disputes the “remote driving” characterization); anterograde amnesia; the verification bottleneck
R7	Ⅱ	Bick, Blandin & Deming “The Rapid Adoption of Generative AI” NBER WP 32966 · Management Science (2026) · doi.org/10.1287/mnsc.2025.02523 · nber.org/w32966	By the end of 2024, ~40% of the US population aged 18–64 used genAI; formal enterprise adoption was 5-9%; note: enterprise reaches 28% in two years, roughly the pace of the PC era
R8	Ⅳ	Mollick “Reshaping the tree: rebuilding organizations for AI” One Useful Thing · 2023-11-27 · oneusefulthing.org; 1855 McCallum diagram: US Library of Congress loc.gov/2017586274	Organizational technology presupposes human intelligence alone; the first modern org chart, 1855; the case for rebuilding the foundation (a normative claim)
R9	Ⅳ	Mollick “Making AI Work: Leadership, Lab, and Crowd” 2025-05 · oneusefulthing.org/making-ai-work; “Detecting the Secret Cyborgs” 2023 · oneusefulthing.org/secret-cyborgs; Sana podcast (verified against the official word-level transcript) · sanalabs.com/strange-loop	The layoff narrative drives employees to hide their AI gains; the Leadership / Lab / Crowd triad
R10	Ⅴ	Gartner press release · 2025-06-25 · gartner.com	40%+ of agentic projects canceled before the end of 2027 (forecast); definition of agent washing; “of thousands of vendors, roughly 130 are real”
R11	Ⅴ	Gartner press release · 2026-02-03 (underlying survey 2025-10, n=321 customer-service leaders) · gartner.com	50% of companies that attributed layoffs to AI will rehire before 2027 (forecast); only 20% of companies actually cut agent seats because of AI
R12	Ⅰ	Lavingia “No Meetings, No Deadlines, No Full-Time Employees” 2021-01-07 · sahillavingia.com/work; SEC EDGAR Form C/C-AR, CIK 1532978 · sec.gov/edgar (FY2020 C-AR)	Gumroad: zero full-time staff plus roughly 25 contractors; FY2020 net revenue $9.21M / net profit $1.06M (⚠ a pre-agent-era reference point; the model was abandoned after 2023-24)
R13	Ⅳ	Hoffman & Beato “Superagency: What Could Possibly Go Right with Our AI Future” Authors Equity · 2025-01-28 · superagency.ai	Once individuals are empowered by AI, their capability compounds and diffuses across society: a collateral source in the intellectual lineage of “the operator as orchestrator”
R14	Ⅲ	MIT NANDA “The GenAI Divide: State of AI in Business 2025” preprint · 2025-07 (report’s self-stated method: 52 organizational interviews + 153 executive surveys + 300+ public deployments reviewed; the widely circulated “150 interviews + 350 surveys” is a second-hand restatement; not peer-reviewed) · public mirror mlq.ai (v0.1 preliminary)	95% of custom pilots show no measurable P&L impact within a six-month window; attributed to the learning gap; buying succeeds about twice as often as building (a counter-signal)
R15	Ⅲ	METR randomized controlled trial (strong RCT design; arXiv preprint plus institutional report, not peer-reviewed, hence graded Ⅲ) · 2025-07 · arXiv:2507.09089 · arxiv.org/abs/2507.09089 · institutional original metr.org (16 senior open-source maintainers; the authors warn against extrapolating to greenfield work)	Senior developers measured 19% slower with AI yet felt 20% faster: a gauge of synthetic confidence
R16	Ⅱ	Acemoglu “The Simple Macroeconomics of AI” NBER WP 32487 (2024) · Economic Policy 40(121) 2025:13-58 · doi.org/10.1093/epolic/eiae042 · nber.org/w32487	AI’s cumulative GDP contribution over a decade is roughly 1.1-1.6% (near Acemoglu’s upper-bound scenario; his central TFP estimate is ~0.66%)
R17	Ⅰ	Moffatt v. Air Canada, 2024 BCCRT 149 (BC Civil Resolution Tribunal, CRT decision) · canlii.org/2024bccrt149	A company must bear legal responsibility for its chatbot’s promises: the first ruling that anchors liability (at tribunal level, not a higher-court precedent)
R18	Ⅳ	Anthropic “Project Vend” Phase 1 · 2025-06-27 · anthropic.com/research/project-vend-1 (official outcome: losses / manipulated / fabricated accounts; a late-2025 Phase 2 sequel at project-vend-2 self-reports marked improvement, but this registry still treats Phase 1 as the negative-result probe)	A negative-result probe from the ontological angle: cited as boundary evidence, not as a success case
R19	Ⅳ	Tobi Lütke internal memo (Shopify) · 2025-04-07, posted publicly by Lütke on X · x.com/tobi; Luis von Ahn all-hands email (Duolingo) · 2025-04-28, published on the official LinkedIn · linkedin.com/duolingo (2025-06: von Ahn publicly walked the stance back); Klarna: official press release 2024-02 · klarna.com/press + CEO 2025 interview (rehiring humans)	First-hand texts of AI-first transformations set against their walk-backs (Section 9 case sourcing)
R20	Ⅳ	Huang Yihe (黄益贺; host of the newtype community · AI practitioner / independent developer; not an insider at the three companies discussed) “The Core Logic of AI-Native Organizations” (《AI 原生组织的底层逻辑》) · 2026-06 · written version newtype.pro · video 2026-06-10 · bilibili.com/BV1FVEQ6cEfY (link pending re-check) · local transcript references/黄益贺-AI原生组织的底层逻辑-转写稿.md / references/huang-yihe-core-logic-of-ai-native-orgs-transcript.md	A three-stage narrative of organizational form (heuristic, not an empirical taxonomy; for each generation’s factual base see R46/R47); the serial-bottleneck exposure effect; the VC case of parallel production / serial consumption (practitioner observation)
R21	Ⅳ	Anthropic “How Anthropic teams use Claude Code” 2025-07-24 · anthropic.com/news; “How AI Is Transforming Work at Anthropic” 2025 · anthropic.com/research	10 teams (including legal and growth) embedded agentic workflows into parts of their process (company self-report, which does not support the stronger claim of “running the whole thing”); a survey sample of 132 engineers/researchers self-reports 60% of work aided by Claude / perceived productivity +50% / “fully delegable” only 0-20%
R22	Ⅳ	Ivan Zhao “Steam, Steel, and Infinite Minds” Notion · 2025-12-22 · notion.com/blog	AI is the steel / load-bearing wall of the organization; the watermill swap is the “bolt-on AI degrades it” argument; the company is a recent invention that degrades with scale; Notion 1000 employees / 700+ agents (self-reported); the legibility↔scale trade-off
R23	Ⅰ/Ⅱ	VOC 1602 (the first joint-stock company to issue transferable shares to the public) · UK Limited Liability Act 1855 / Joint Stock Companies Act 1856 · Chandler “The Visible Hand” 1977 · survey Micklethwait & Wooldridge “The Company: A Short History” 2003 · VOC reference britannica.com	The company in the modern, publicly traded sense is about 400 years old, an invention assembled in layers (joint-stock, limited liability, and hierarchy each young in its own right): the history is solid, while “the company is not eternal” is the argument
R24	Ⅱ	Edmondson “Psychological Safety and Learning Behavior in Work Teams” ASQ 1999, 44(2):350-383 · doi.org/10.2307/2666999	B.13 Radius of trust: teams with high psychological safety report more errors (the difference comes from willingness to report, not the rate of erring)
R25	Ⅱ	Pfeffer “Power in Organizations” Pitman 1981 (monograph); Bachrach & Baratz “Two Faces of Power” APSR 56(4) 1962:947-952 · doi.org/10.2307/1952796	B.14 Power gradient: agenda-setting is the second face of power
R26	Ⅱ	Deci, Koestner & Ryan meta-analysis “A Meta-Analytic Review of Experiments Examining the Effects of Extrinsic Rewards on Intrinsic Motivation” Psychological Bulletin 125(6) 1999:627-668 · doi.org/10.1037 (the empirical anchor of the self-determination-theory lineage); Frey & Jegen “Motivation Crowding Theory” J. Economic Surveys 15(5) 2001:589-611 · doi.org/10.1111/1467-6419.00150	B.15 Motivation drain: extrinsic control crowds out intrinsic motivation
R27	Ⅱ	Hannan & Freeman “The Population Ecology of Organizations” AJS 1977, 82(5):929-964 · doi.org/10.1086/226424 (primary anchor); for the algorithmic-feudalism frame see also Varoufakis “Technofeudalism” 2023 (a popular monograph, Ⅳ, still contested)	B.16 Niche lock-in: an organization’s fate is set by its ecological position and dependency structure
R28	Ⅱ	Holmström & Milgrom “Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design” JLEO 7 (Special Issue) 1991:24-52 · doi.org/10.1093/jleo	Incentive-dimension lens: under multitasking, measurable tasks crowd out unmeasurable ones (addressing “how do you measure judgment quality”)
R29	Ⅱ	Galbraith “Designing Complex Organizations” Addison-Wesley 1973 monograph (the separate “Organization Design” is a 1977 work) / “Organization Design: An Information Processing View” Interfaces 4(3) 1974:28-36 · doi.org/10.1287/inte.4.3.28 (“organizations as information processing systems” is the common OIPT paraphrase, not a direct quote)	Information-dimension lens: organization design is the match between information-processing capacity and demand
R30	Ⅱ	Simon, bounded rationality (the term first appears in “Models of Man” 1957, p.198; the 1947 “Administrative Behavior” uses “limits of rationality”; provenance per SEP plato.stanford.edu/bounded-rationality); cognitive load: Sweller “Cognitive Load During Problem Solving” Cognitive Science 12 (1988):257-285 · doi.org/10.1207	Cognitive-dimension lens: bounded attention and rationality are the root constraint on organizations
R31	Ⅲ	Holland《Hidden Order》1995（Addison-Wesley, ISBN 0-201-40793-0）／《Emergence: From Chaos to Order》1998	复杂适应系统——局部规则→全局涌现（理论框架，映射组织为类比 Ⅲ）Complex adaptive systems: local rules give rise to global emergence (a theoretical frame; mapping to organizations is an analogy, Ⅲ)
R32	Ⅲ	Kauffman《The Origins of Order》1993（OUP）／《At Home in the Universe》1995（OUP）	适应度景观 / NK 模型——探索-利用平衡（抽象数学模型，映射组织 Ⅲ）Fitness landscapes / the NK model: the exploration-exploitation balance (an abstract mathematical model; mapping to organizations is Ⅲ)
R33	Ⅲ	Grassé 1959《La théorie de la stigmergie》Insectes Sociaux 6(1):41-80 · doi.org/10.1007/BF02223791；Heylighen 2016《Stigmergy as a universal coordination mechanism I》Cognitive Systems Research 38:4-13（DOI 10.1016/j.cogsys.2015.12.002）· sciencedirect.com/S1389041715000376	stigmergy——间接协调=共享环境留痕（白蚁实证 Ⅱ，映射组织为类比 Ⅲ）Stigmergy: indirect coordination is traces left in a shared environment (termite empirics Ⅱ; mapping to organizations is an analogy, Ⅲ)
R34	Ⅲ	Forrest, Perelson, Allen & Cherukuri《Self-Nonself Discrimination in a Computer》IEEE S&P 1994:202-212；Hofmeyr & Forrest《Architecture for an Artificial Immune System》Evolutionary Computation 8(4) 2000:443-473（DOI 10.1162/106365600568257）	人工免疫——分布式异常检测（双层类比：免疫→安全→组织；研究本身 Ⅱ，映射组织 Ⅲ）Artificial immunity: distributed anomaly detection (a two-step analogy, immunity → security → organization; the research itself is Ⅱ, the mapping to organizations is Ⅲ)
R35	Ⅲ	Tero et al.《Rules for Biologically Inspired Adaptive Network Design》Science 327(5964) 2010:439-442（DOI 10.1126/science.1177894）· science.org/10.1126/science.1177894	黏菌自组织——无中央调度的资源再分配可逼近工程级最优网络（实验本身 Ⅱ 硬实证，映射组织为类比 Ⅲ——不得用黏菌实证给组织结论背书）Slime-mold self-organization: resource reallocation with no central scheduler can approach an engineering-grade optimal network (the experiment itself is hard empirics, Ⅱ; mapping to organizations is an analogy, Ⅲ; slime-mold evidence must not be used to endorse organizational conclusions)
R36	Ⅲ	Argyris & Schön《Organizational Learning: A Theory of Action Perspective》Addison-Wesley 1978（ISBN 0-201-00174-8）	单环/双环学习——self-improving 的人类尺度前身（高被引理论 Ⅱ/Ⅲ，映射 AI 自改进为类比 Ⅲ）Single-loop / double-loop learning: the human-scale forerunner of self-improving (a highly cited theory, Ⅱ/Ⅲ; mapping to AI self-improvement is an analogy, Ⅲ)
R37	Ⅳ／Ⅲ	Boyd《The Essence of Winning and Losing》简报 1995/96（OODA loop 首次完整出现，无正式专著）；同行评审锚briefing 1995/96 (the first complete appearance of the OODA loop; no formal monograph); peer-reviewed anchor Osinga《Science, Strategy and War: The Strategic Theory of John Boyd》Routledge 2007	OODA——感知-定向-决策-行动闭环（从业者简报 Ⅳ，Osinga 二手专著补 Ⅱ；映射组织 Ⅲ，"快循环即赢"为常见误读，Orient 才是枢纽）OODA: the observe-orient-decide-act loop (practitioner briefing Ⅳ, with Osinga’s secondary monograph adding Ⅱ; mapping to organizations is Ⅲ; “fast loop wins” is a common misreading, since Orient is the pivot)
R38a	Ⅱ	Coase《The Nature of the Firm》Economica 4(16) 1937:386-405（DOI 10.1111/j.1468-0335.1937.tb00002.x）	一人公司的经济学起点——企业边界由交易成本决定；AI 压低交易成本 → 边界向"个人+市场/agent"移动（外推为 Ⅲ 条件句）The economic starting point for the one-person company: the firm’s boundary is set by transaction costs; AI lowers those costs, so the boundary shifts toward “individual + market/agent” (the extrapolation is a Ⅲ conditional)
R38b	Ⅳ	Naval Ravikant《How to Get Rich (without getting lucky)》tweetstorm 2018-05-31（四杠杆框架：劳动力/资本/代码/媒体，后两者 permissionless） (the four-leverage frame: labor / capital / code / media, the latter two permissionless) · nav.al/product-media · 原帖original post x.com/naval	permissionless leverage——个人靠 code+media 获得过去需整个组织才有的杠杆（从业者一手框架 Ⅳ，非已验证规律）Permissionless leverage: through code and media an individual gains the leverage that once required a whole organization (a practitioner first-hand frame, Ⅳ, not a verified law)
R38c	Ⅳ	Paul Jarvis, Company of One: Why Staying Small Is the Next Big Thing for Business, Houghton Mifflin Harcourt 2019 (ISBN 978-1-328-97235-4)	Staying deliberately small is a sustainable strategy: “company of one” names a business philosophy of small-as-default, not literally a one-person firm (a practitioner’s normative claim, Ⅳ)
R39	Ⅱ/Ⅲ	Charles Perrow, Normal Accidents: Living with High-Risk Technologies, Basic Books 1984; revised edition Princeton University Press 1999 · press.princeton.edu/normal-accidents. Counterpoint anchor: Karl Weick & Kathleen Sutcliffe, Managing the Unexpected, Jossey-Bass (1st ed. 2001 / 3rd ed. 2015)	Normal Accident Theory (NAT): in systems that combine high interactive complexity with tight coupling, accidents are a structural product that cannot be designed away; HRO is the counterpoint school, holding that accidents are “mitigable but not eliminable.” The two sit at opposite poles of risk thinking and must not be conflated (classic sociological theory, Ⅱ/Ⅲ; the claim that “system accidents are inevitable” in multi-agent autonomous systems is a Ⅴ-grade analogical extrapolation, not Perrow’s original conclusion)
R40	Ⅱ/Ⅲ	NASA Technology Readiness Levels (TRL 1-9, proposed by Sadin 1974 / standardized to 9 levels by Mankins 1995) · nasa.gov/technology-readiness-levels; ISO 16290:2013, Space systems - Definition of the TRLs · iso.org/standard/56064; adopted by EU Horizon Europe	A 9-level technology readiness scale (TRL 1 basic principles → TRL 9 flight-proven on an actual mission); borrowed here as the maturity ruler for the “four curves of technological convergence” (the engineering scale itself is Ⅱ; applied to software/AI/organizational capability it is an analogical gauge, downgraded to a Ⅲ labeling tool, not NASA’s literal flight validation)
R41	Ⅴ	IEA, World Energy Outlook Special Report: Energy and AI, 2025-04 · iea.org/reports/energy-and-ai; Stanford HAI, 2025 AI Index Report (inference-cost data cited from Epoch AI) · hai.stanford.edu/ai-index/2025	Two opposing curves: data-center electricity rises from ~415 TWh in 2024 to ~945 TWh in the 2030 Base Case (more than double, about 2.3×, driven mainly by AI-accelerated servers); inference cost at GPT-3.5 level fell about 280× within roughly 18 months (Epoch AI: 9-900× per year). Energy is a Ⅴ-grade institutional scenario extrapolation, cost a Ⅲ-grade benchmark-specific trend line; “the net effect is near-free compute for the organization” is a Ⅴ-grade extrapolation that neither IEA nor Epoch asserts
R42	Ⅳ	MCP: Anthropic, “Introducing the Model Context Protocol,” 2024-11-25 · anthropic.com/news/model-context-protocol; A2A: Google Cloud 2025-04-09 (later handed to the Linux Foundation) · developers.googleblog.com/a2a; x402: Coinbase 2025 · coinbase.com/x402	The protocol layer of the agent economy took shape within a year: MCP connects tools, A2A connects agents, and x402 carries stablecoin M2M micropayments. This is the infrastructure realization that links back to R1’s Coasean Singularity (all are vendor first-hand announcements, Ⅳ; “an autonomous machine-trading economy has formed” is a Ⅴ-grade extrapolation, since real M2M volume is currently tiny and lacks an independent penetration audit)
R43	Ⅳ/Ⅴ	Humanoid-robot mass production and penetration (Figure 02/03 / Tesla Optimus / 1X NEO / Unitree G1) 2025-2026 · first-hand: Figure official figure.ai/production-at-bmw · BMW official release press.bmwgroup.com (first-hand sources also include the Tesla 2025 Q4 earnings call). ⚠️ Most deployment/billing figures come from second-hand industry-blog tracking and have not been cross-checked against first-hand filings or audits	Embodied intelligence moving from demo to early commercial deployment: the BMW Spartanburg pilot ran about 11 months, finished by the end of 2025, and retired Figure 02 (neither party disclosed unit counts; analysts estimate 10-30 units; the program then shifted to Figure 03 and expanded to the Leipzig plant); 1X NEO opened home preorders; Tesla concedes Optimus is “not yet in large-scale factory use” and slipped its mass-production target to summer 2026 (vendor statements Ⅳ plus second-hand tracking Ⅴ; explicitly downgraded because this verification round obtained no first-hand audit-grade data; controlled industrial cells TRL 6-7, open/home general manipulation TRL 4-5; extrapolating to “the dissolution of the organization’s physical boundary” is a Ⅴ-grade extrapolation, among the most uncertain)
R44	Ⅴ	BCI / biocomputing far field (Neuralink expanded early-stage clinical work in 2025 and received an FDA Breakthrough Device designation for speech restoration · neuralink.com/updates; Cortical Labs CL1: 800,000 living human neurons plus a silicon chip, called in 2025-03 “the world’s first commercial biological computer” · spectrum.ieee.org), vendor announcements plus 2025 neurotech reviews (STAT News / IEEE Spectrum)	The least mature and most uncertain far field of the four curves: BCI clinical TRL 3-5, biocomputing TRL 2-4, far short of any organizational production use (all vendor first-hand announcements, Ⅳ, plus early clinical / early commercial work, with no scaling and no independent long-term audit; extrapolating to “human-machine fusion reshapes the organization’s cognitive boundary” is purely the most distant Ⅴ-grade speculation, to be labeled “lowest maturity, highest uncertainty” and not treated on par with R41-R43)
R45	Ⅱ	Scenario planning (two-axis 2×2 / GBN): Pierre Wack, “Scenarios: Uncharted Waters Ahead,” HBR 1985-09 · hbr.org/1985/09; “Scenarios: Shooting the Rapids,” HBR 1985-11 · hbr.org/1985/11 (Shell Group Planning practice); Peter Schwartz, The Art of the Long View, Doubleday/Currency 1991 (ISBN 978-0-385-26732-8; Schwartz co-founded Global Business Network)	The methodological footnote for INSTRUMENT 05, the “Scenario Bench”: take the two most critical and most uncertain driving forces as the axes, spanning four quadrants and four scenarios; the aim is to widen perception, not to predict a single future (a classic methodology, Ⅱ, directly usable; but the specific four scenarios it generates remain Ⅴ-grade extrapolation, since the method’s reliability does not carry over to the scenario content)
R46	Ⅱ	Zhang & Murmann, “Transforming Product Development at Huawei: The IPD Initiative,” in The Management Transformation of Huawei, Ch. 3 · Cambridge University Press 2020 · cambridge.org (open-access version alexandria.unisg.ch); for a participant’s account see also Xia Zhongyi (夏忠毅), 《从偶然到必然：华为研发投资与管理实践》 (From Chance to Inevitability: Huawei’s R&D Investment and Management Practice), Tsinghua University Press 2019 (the author was a core member of Huawei’s IPD group, Ⅳ)	GEN 1 factual base: from 1998-1999, Huawei ran an IPD process transformation led by IBM consultants (academic case study Ⅱ plus participant monograph Ⅳ)
R47	Ⅳ	Zhang Yiming (张一鸣), 《如何应对公司变大之后的管理挑战》 (“Meeting the Management Challenges of a Growing Company”), Source Code Capital Code Class · 2017 · record on the host’s official site sourcecodecap.com (the original source of “Context, not Control” and the supercomputer-vs-distributed analogy)	GEN 2 factual base: the founder’s first-hand talk record of ByteDance’s “Context, not Control” organizational philosophy (Ⅳ)
R48	Ⅳ	GitHub Docs, “Adding repository custom instructions for GitHub Copilot” and “Best practices for Copilot coding agent” · accessed 2026 · docs.github.com/copilot	AGENTS.md / copilot-instructions.md turn repository-level collaboration rules into an agent-readable operating boundary; this supports the engineering path of putting context, specs, and responsibility into files
R49	Ⅳ	OpenAI, A Practical Guide to Building Agents (PDF) · 2025 · cdn.openai.com/business-guides	Agents should be designed as systems with tools, workflows, guardrails, and evals; used to update this methodology’s framing of an agent from “a prompt” to “a runtime component of the organization”
R50	Ⅳ	Anthropic, “Building Effective Agents” · 2024-12 · anthropic.com/engineering	Start with simple, direct patterns and add complexity only when needed; the workflow/agent distinction supplies the engineering discipline behind “do not mistake automation complexity for maturity”
R51	Ⅴ	Microsoft, 2026 Work Trend Index and 2025 Work Trend Index (Frontier Firm / agent boss) · microsoft.com/worklab	“Agent boss” is an institutional forecast and management narrative about changing organizational roles, not an audited fact; used to signal that the 2026 organization question has shifted from tool procurement to agent orchestration and accountability design
R52	Ⅱ/Ⅳ	Stanford HAI, 2026 AI Index Report · 2026-04 · hai.stanford.edu/ai-index/2026	A yearly cross-section of capability, cost, industry investment, and governance data; used to calibrate the time sensitivity of “AI-Native”: capability boundaries and deployment tempo are rewritten every year
R53	Ⅰ/Ⅳ	NIST, AI Risk Management Framework and AI 600-1 Generative AI Profile · 2023 / 2024 · nist.gov/ai-rmf · NIST.AI.600-1.pdf	The four governance functions (Govern / Map / Measure / Manage) plus the GenAI risk action list; used to land this methodology’s “redraw responsibility” thesis in concrete risk identification, measurement, and management actions
R54	Ⅳ	MindXO, AI Governance Framework Navigator · 2025-04-04 · mind-xo.com/ai-governance	Crosswalks NIST AI RMF, ISO/IEC 42001, the EU AI Act, OECD, UNESCO, and related frameworks; used to signal that governance hinges on aligning these different coordinate systems to the organization’s own risk map, not on copying any single framework wholesale
R55	Ⅴ	McKinsey, The State of AI: Global Survey · 2025-03 · mckinsey.com/state-of-ai	High performers more often connect genAI to workflow redesign, risk mitigation, and business impact; a consulting survey is a trend signal, not causal proof
R56	Ⅴ	Deloitte, 2026 State of Generative AI in the Enterprise (note: the 2026 edition was retitled “State of AI in the Enterprise,” dropping “Generative”) · 2026-01 · deloitte.com/state-of-genai	Enterprise narratives move from pilots toward process embedding and agentic-AI operations; used as survey-side support for “value comes from workflow redesign,” with the evidence still graded as an advisory forecast
R57	Ⅳ/Ⅴ	World Economic Forum, “Are you onboarding AI agents? Here’s how to govern them” · 2025 · weforum.org/ai-agents-governance	Frames agent onboarding as a design problem of identity, permissions, oversight, and accountability; its adoption-rate claims are advisory forecasts and are not cited as realized facts
R58	Ⅳ/Ⅴ	Ethan Mollick, “The State of AI: September 2025” and related One Useful Thing essays · oneusefulthing.org	Frontier observation from an opinion leader: models are getting better at helping ordinary users build agents and automations, but this is practitioner judgment, not an empirical law; used to note that “builder” capability is spreading down to non-specialists
R59	Ⅲ	Kwa, West, Becker et al., “Measuring AI Ability to Complete Long Tasks,” METR · 2025-03 · arXiv:2503.14499 · arxiv.org/abs/2503.14499 (preprint, not peer-reviewed)	The 50%-success time-horizon metric: roughly one hour for frontier models in early 2025, doubling about every seven months since 2019 (the authors note possible acceleration after 2024); used to recalibrate the review-layer sampling rate each quarter as the horizon moves
R60	Ⅲ	Xu, Song, Li et al., “TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks,” CMU · 2024-12 · arXiv:2412.14161 · arxiv.org/abs/2412.14161 (preprint plus public leaderboard)	175 simulated-company professional tasks: the strongest agent completes ~24% autonomously at v1 (2024-12) and ~30.3% on the 2025 leaderboard (Gemini 2.5 Pro); used as the quantitative baseline for “failure rate is a design parameter”
R61	Ⅳ	OWASP GenAI Security Project, OWASP Top 10 for LLM Applications v2025 · 2024-11 · genai.owasp.org (v2025 PDF) (industry community standard, not empirical research)	v2025 ranking: prompt injection holds first place for a second edition; “sensitive information disclosure” rises from sixth to second; used as an industry signal that data boundaries have become a first-class design object
R62	Ⅳ	Samsung internal-data leaks via ChatGPT (first reported by The Economist Korea; relayed by TechRadar/Forbes) · 2023-04/05 · techradar.com (press-reported account; the company has not confirmed each item)	Three leaks within about twenty days of the ban being lifted in late March 2023 (device source code, a test sequence, meeting notes); a company-wide ban on external generative AI followed from May 2023; used for “data boundaries are an organizational design problem, not a compliance problem”
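
The doubling figure in R59 can be turned into a quick calculation when recalibrating review sampling. What follows is a minimal sketch, not METR's method: it assumes clean exponential growth, takes the 7-month doubling time and the 1-hour early-2025 baseline from the registry note, and uses hypothetical function names. How a sampling rate should respond to the projected horizon is left open, because the source does not specify it.

```python
def horizon_multiplier(months: float, doubling_months: float = 7.0) -> float:
    """Growth factor of the 50%-success time horizon after `months` months,
    assuming steady exponential doubling every `doubling_months` months."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months / doubling_months)


def projected_horizon_hours(base_hours: float, months: float,
                            doubling_months: float = 7.0) -> float:
    """Projected time horizon in hours, starting from `base_hours`."""
    return base_hours * horizon_multiplier(months, doubling_months)


if __name__ == "__main__":
    # Assumed baseline: about 1 hour in early 2025 (from the R59 note).
    for months in (3, 7, 14):
        print(months, round(projected_horizon_hours(1.0, months), 2))
```

Under these assumptions one quarter multiplies the horizon by about 1.35, so a sampling rule tuned to last quarter's horizon is already a third out of date. If the post-2024 acceleration the authors mention holds, a shorter `doubling_months` makes the drift larger.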

REV	DATE	DESCRIPTION
2.0	2026-05	Architecture spec takes shape: worldview · pillars · foundation · cases
3.0	2026-06	Added the chapter on the twelve structural bottlenecks (then Section 3; Section 4 from 5.0 on)
3.1	2026-06	Folded in the first-hand transcript of 《AI原生组织的底层逻辑》 (“The Core Logic of AI-Native Organizations”) plus the METR / NANDA empirics
4.0	2026-06	Full-page layout overhaul: blueprint-style layout · the Amdahl bench · the twelve-bottleneck diagnostic table
4.1	2026-06	Evidence verification and calibration revisions · added control and failed-company case sets · falsifiability conditions · accessibility fixes. The single-file iteration regime starts here; older versions move to archive/
5.0	2026-06	Added the Section 3 kernel: the three-axiom derivation chain and the essence proposition · the disposition table for the five management functions · the organizational-form spectrum, cross-linked to the sister piece “One-Person Company Methodology” · INSTRUMENT 03, the coordination-tax calculator · the overview diagram
5.1	2026-06	Kernel illustration set (adapted from references/ai_native_core_figures): the two-layer proposition-anatomy diagram · the continuous-learning flywheel · the judgment-anchor map · the pillars × foundation assembly
6.0	2026-06	Sharper phrasing (seven-pillar contrast bars plus the “≠” misreading rows) · the reading hall (a five-act story · three interactive routes · the act-by-act table of contents) · the master map turned into a poster (title block plus review stamp) · the three-forces diagram redrawn in the unified drafting language · the floating diagnostic counter
6.1	2026-06	Reworked entry point and hierarchy: the above-the-fold contrast card (AI-enabled ≠ AI Native) · the problem / cause / redraw three-step chain · usage guidance · blueprint-nature tags and chapter-head notices (separating empirics from inference) · figure task tags and core-figure markers · a collapsible full table of contents
6.2	2026-06	Heavier evidence and dialogue with the literature: the three first-principles subsections wired to the primary sources (the Coasean Singularity · the Hadfield-Koh scale phase transition · the AGG judgment-economics trilogy + Gans’s domain-by-domain control) · the diffusion-reversal passage · Karpathy/Mollick brought into the worldview and pitfalls chapters · the Gumroad SEC audit-grade control sample · two new pitfalls, agent washing and the layoff backlash · the trajectory chapter’s calibration anchor and phase-transition bet · the APPENDIX evidence and citation registry (R1-R19 · five-level grading · 3-vote adversarial verification)
6.3	2026-06	Narrative made faithful to the evidence: the three-stage organizational form downgraded to a practitioner heuristic (Huang Yihe credited as source · corrected to paradigms coexisting in layers · GEN 3 changed to “hypothesis” and calibrated to Anthropic’s self-reported data: “10 teams embedded in parts of the process” replaces “running the whole thing,” with “fully delegable only 0-20%” added) · a “reported, not independently verified” caveat added at both the 90-day planning cycle and the Hive Mind · the VC case given a source · registry adds R20-R21
6.4	2026-06	The bottleneck six-dimension lens (cause analysis: information / incentive / power / cognition / time / ecology, each with a literature anchor) · four new structural bottlenecks B.13-16 (radius of trust / power gradient / motivation drain / niche lock-in) · INSTRUMENT 04, the dimension-lens bench (linked to the diagnostic table) · a short-history-of-the-company panel (VOC 1602 → limited liability 1855 → hierarchy 1870s, the “company is not eternal” deep-time frame) · Ivan Zhao’s steel / watermill anchor and the Notion 1000/700 data · the diagnostic table expanded to 16 · registry adds R22-R30
6.5	2026-06	Integration and living systems: the one-person company de-siloed into the canon (new Section 14, “The Lower Bound of Organization” · removed external links / VOL split / sister-piece references · oneperson archived) · new worldview M.06, the organization as a living system · new Section 6, living systems and emergence (emergence / NK landscape / immunity / mycelium / self-improving · unifying N=1 and N=many) · Section 15→17 renumbered · registry adds R31-R37 + R38a-c
6.6	2026-06	The extrapolation act: the future section promoted in place (a deep-time opening · four technological-convergence curves with TRL/falsifiability · INSTRUMENT 05, the interactive Scenario Bench, a four-quadrant “capability concentration × regulatory acceptance” grid · 3 design-fiction future artifacts · conditioned far-reaching effects) · registry adds R39-R45 · no renumbering
6.7	2026-06	Dark blueprint mode: a [data-theme=dark] Prussian-blueprint palette (inverting the original palette variables) · a topbar toggle (persisted in localStorage · defaulting to system preference · an anti-FOUC script) · SVG figures keep a light drawing panel in dark mode (the blueprint pinned to a dark wall) · WCAG AA contrast (ink-fade lightened to meet it) · forced light on print · content and numbering untouched
6.8	2026-06	Field toolkit (1): new Section 17, the construction toolkit (a container that can grow) · the workflow-graph modeling method (three node types agent/human/policy + four annotations) · FIG 17.0, the VC research pipeline before/after · a copyable template plus the real file templates/workflow-graph.md · scale trigger lines (the scaffold is not the truth) · closing moved to Section 18 · the 16 body chapters keep their numbering
6.9	2026-06	Bilingual-edition infrastructure (W1): [data-lang] runtime ZH/EN toggle (paired lang attributes + CSS hiding the non-active language + owl-lang persistence + FOUC script, orthogonal to light/dark) · the language toggle · English typography overrides · the bilingual authoring contract + the glossary docs/glossary-zh-en.md · proof: the hero masthead / the Section 2 three-forces SVG / the INSTRUMENT 03 dynamic string / the bilingual block for the two readings of “one” · untranslated content degrades gracefully to Chinese in EN
6.10	2026-06	Citation-authority check: all 45 registry entries fully traced to source (7-way parallel verification) · metadata corrections (Mollick 2023-11 · Galbraith 1973 book title · NANDA method restated to the report’s own 52+153+300 · METR downgraded to Ⅲ · Grassé volume/pages · Air Canada restated as a tribunal decision) · 20+ entries given DOIs / first-hand direct links (HBR/CanLII/EDGAR/Klarna/Lütke original post…) · R20 Huang Yihe upgraded: the written first-hand version on newtype.pro + a direct Bilibili video link + author-identity labeling · new R46 Huawei IPD (Cambridge 2020) / R47 ByteDance Context-not-Control (Source Code Capital 2017) factual-base anchors folded into the GEN cards · R43 robot framing updated to a “pilot complete” state
6.11	2026-06	Taste and structural mending (taste-skill audit): removed AI tells from the EN translation, em-dash count 334→1 (kept only in a direct quote) · 355 grammar / metaphor / slop rewrites (the glossary stays locked) · copy for the three reading routes now drawn from the same source as the data-r highlights (fixing a P0 contradiction left by the 6.5 renumbering) · the master-map count updated to eighteen blueprints and five instruments · each of the seven pillars labeled with its source model M.0x (delivering on “the pillars are nearly inferences”) · 05→06→07 seam transition sentences · the case survey’s “four types” corrected to five, with a control-group role sentence added · Section 16 retitled “The Operator’s Handbook” · the TOC gains an appendix entry
6.12	2026-06	Bilingual edition W2, full translation complete: Sections 11-18 plus every component (hero / nav / TOC / general-arrangement poster / APPENDIX R1-R47 / colophon, all 19 versions) translated in pairs · INSTRUMENT 01/02 i18n (all five instruments bilingual, re-rendering on langchange) · fixed the language-toggle hide-rule specificity bug (added !important to the hide rule; already-paired lang=zh nodes had still been showing in EN) · fixed a pre-existing bug where the closing-kernel strong was invisible on the dark panel · full-document EN browser audit shows 0 visible CJK (bilingual × dual-theme × 390px QA passed)
6.13	2026-06	Methodological calibration moved forward: Section 2 gains the “two ledgers and a technology bundle” method card, with the evidence ledger carrying reliability and the exploration ledger carrying leading indicators / boundaries / falsifiers; AI is reframed from the sole cause to the most buildable front-stage technology, adding constraint-migration lenses for protocols, machine payments, robotics, energy/compute, and bio/brain-computer interfaces · the future four-curves introduction now states that AI is not the only core technology · REV R20
6.14	2026-06	Generated visual layer: two paper-blueprint bitmap plates generated with imagegen, one for the hero constraint-migration view and one for the Section 12 technology-convergence view; added gen-plate / hero-side figure styling with bilingual captions; compressed display versions and source images saved under references/generated · REV R21
6.15	2026-06	Chapter-level illustration pass: the hero constraint-migration visual becomes a stronger V2 thesis plate and moves to the front; added three SVG plates (the structural-bottleneck overlay paradox, the seven-pillars × four-foundation assembly, and the N=1 one-person kernel); SVG plates now render without cropping · REV R22
6.16	2026-06	Figure-system recalibration: the hero becomes an imagegen bitmap V3, placed full-width at the top; generated plates now share a consistent size, width, and caption font size; the N=1 kernel becomes an imagegen bitmap V2 while SVG remains for precise structural diagrams, creating a mixed figure system of bitmap impact + SVG legibility · REV R23
6.17	2026-06	Generated-image style reset: the three imagegen bitmap plates move to a simple futuristic technology-interface language (hero V5, four-curves V2, N=1 V3); paper-blueprint texture and complex mechanical detail are removed in favor of large modules, large labels, and one value line, keeping each image informative without crowding · REV R24
6.18	2026-06	Illustration system quieted: the three large imagegen plates (hero, four-curves, N=1) are removed, returning the document to a text-first hierarchy; added five restrained left-rail AI miniatures (opening / three forces / structural bottlenecks / case atlas / N=1), with images serving only as conceptual cues while the text and captions carry the meaning · REV R25
6.19	2026-06	2026 source update and interaction detail: the homepage gains a six-cell SOURCE UPDATE entry; the registry adds R48-R58 (GitHub AGENTS.md / OpenAI agents guide / Anthropic agent patterns / Microsoft WTI / Stanford AI Index / NIST-MindXO governance / McKinsey-Deloitte-WEF / Mollick) · the six-module top navigation transition is slowed · restrained hover/focus micro-interactions are added to the left rail, source cards, route cards, AI guide, and detail controls · REV R26
6.20	2026-06	SVG figure recovery: all five old generated sidebar images on the organization page are replaced by inline SVGs (opening redraw / three forces / structural bottlenecks / case atlas / N=1 kernel); unreferenced JPG/PNG assets are removed, leaving only maintainable, compressible SVG figures tied to the surrounding argument · REV R27

AI Native Organization Methodology

The Boundary of AI Native

“AI Native”: Three Fundamentally Different Phenomena

Structural Perspective: Agent as Team Member

Operational Perspective: AI-First Workflows

Ontological Perspective: Agent-First Organization

The Premises Are Failing

The Coase Boundary, Redrawn

Agency Theory, Extended

Cybernetics, Returning

The Economics of Judgment Scarcity

The Designable Part of Organization

如果 agent 判断得比人更好，人还应该是最后的决定者吗？

If agents judge better than people, should people still make the final call?

让上下文自由流动，什么时候会变成一套监控系统？

When does free-flowing context become a surveillance system?

协调变便宜以后，组织会变小，还是会大到前所未有？

When coordination gets cheaper, do organizations shrink or grow beyond precedent?

责任一定要落在一个具体的人身上吗？

Must responsibility belong to a specific person?

瓶颈真的搬家了，还是成本只是被藏到了别处？

Did the bottleneck move, or was the cost merely hidden elsewhere?

AI-Native 原生于模型，还是原生于持续变化的能力条件？

Is AI-Native native to models, or to continuously changing capability conditions?

管理五职能的去向

What Happens to Fayol’s Five Functions

组织形态的光谱

The Spectrum of Organizational Forms

"加 AI"解不开的结构性瓶颈

The Structural Bottlenecks That Overlaying AI Cannot Solve

串行依赖链

The Serial Dependency Chain

协调成本平方律

The Quadratic Coordination Tax

决策带宽天花板

The Executive Bandwidth Ceiling

层级信息衰减

Hierarchical Signal Decay

部门墙与局部最优

Functional Silos & Local Optima

会议同步税

The Synchronous Coordination Tax

知识私有化

Tacit Knowledge Lock-in

审批链与责任稀释

Approval Chains & Diffused Accountability

人头即产能

Headcount-as-Capacity

规划节奏失配

The Planning Cadence Mismatch

试错成本与风险规避

Experiment Cost & Risk Aversion

指标剧场

Metric Theater

信任半径坍缩

Trust-Radius Collapse

权力梯度与议程垄断

Power Gradient & Agenda Capture

动机抽干

Motivation Crowding-Out

生态位锁定

Niche Lock-In

底层世界观

Foundational Worldviews

组织即工作流图

Organization-as-Workflow-Graph

Agent 即默认工种

Agent as the Default Role

上下文即核心资产

Context as the Core Asset

持续学习即操作系统

Continuous Learning as the Operating System

人即判断锚点

Humans as Judgment Anchors

组织即生命系统

Organization as a Living System

操作者即编排者

Operator as Orchestrator

组织作为生命系统

Organization as a Living System

涌现 · Emergence

Emergence · CAS & Stigmergy

适应度景观 · Fitness Landscape

Fitness Landscape · NK Model (Kauffman)
