This volume should not read like the others. The others tell you the next step: the bottleneck has moved, so a drawing can be made. This one hands you a compass that tells you which way is worth going. Direction has no assembly line: the container is still the SHEET, but each one is a single mark on one compass, not step 1→2→3. This is the upstream "value-discovery" volume. Start with how to read it:
① generation collapses to free → ② judgment retreats to "recognizing what deserves commitment along the externalizability gradient" → ③ context is the deep world-understanding AI cannot give → ④ people return to inner conviction about what truly matters. This volume only fills those four steps as they apply to innovation, and stands on its own without the organization volume.
AI-NATIVE DOCUMENT PACK · PART VI
Innovation Pack: calibrating “worth it?” with a compass
The innovation volume does not deliver a process diagram; it gives a judgment compass for spotting the intersection of real need, viable path, and inner conviction inside infinite plausibility.
Thesis
When possibility is abundant, the scarce thing is not ideas but value perception.
AI-Native innovation is not bulk brainstorming. It is recognizing the signal that truly links real need to viable path after the noise floor has risen. It is value discovery, not idea production.
Compass Marks
The compass is made of marks, not steps.
Signal-to-noise: plausibility increases; signal does not.
Value perception: real need × viable path × inner conviction.
Useless-tree space: protect exploration that is useless for now and cannot yet become a KPI.
Emergence literacy: not producing a new species in advance, but recognizing it after it appears.
Ecology Boundary
Consensus can be trained; for heterogeneous value, only the conditions for emergence can be cultivated.
Externalizable preference signals can be systematized; tacit anti-consensus value anchors become average when forced into a system. The volume’s base posture is ecology design: keep redundancy, slow lanes, and useful uselessness.
Run one direction through three axes: is the need lived, does a path really exist, is conviction not borrowed? If any axis is low, do not ask generation to make it prettier.
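The three-axis reading above can be sketched in a few lines. This is a minimal illustration, not the volume's instrument: the 0-to-1 scale, the `Reading` class, and the `floor` threshold are all assumptions introduced here. The only thing taken from the text is the multiplicative form (real need × viable path × inner conviction) and the rule that a low axis should not be papered over with more generation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One direction read on three axes, each scored 0.0 to 1.0 (assumed scale)."""
    need: float        # is the need lived?
    path: float        # does a path really exist?
    conviction: float  # is the conviction your own, not borrowed?


def value_perception(r: Reading) -> float:
    """Multiplicative reading: one near-zero axis pulls the product toward zero."""
    return r.need * r.path * r.conviction


def weakest_axis(r: Reading) -> str:
    """Name the lowest axis so attention goes there rather than to polishing."""
    scores = {"need": r.need, "path": r.path, "conviction": r.conviction}
    return min(scores, key=scores.get)


def should_polish(r: Reading, floor: float = 0.5) -> bool:
    """Only ask generation to polish when no axis sits below the assumed floor."""
    return min(r.need, r.path, r.conviction) >= floor


# A hypothetical direction that sounds good but answers no lived need.
polished_but_hollow = Reading(need=0.1, path=0.9, conviction=0.8)
print(weakest_axis(polished_but_hollow), should_polish(polished_but_hollow))
```

The product is used rather than an average because an average would let a strong path and strong conviction hide a missing need, which is the failure the paragraph warns about.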
Innovation's bottleneck moved from generating ideas to recognizing what deserves commitment
Load-bearing claim: AI-Native innovation shifts from "brainstorming in bulk with AI" to recognizing what deserves commitment once possibility is abundant. Once ideas, plans, and possibilities can be generated at near-infinite scale and near-zero cost, the scarce thing is no longer generating the new but value perception: recognizing, in an infinity of "looks-feasible," the signal that truly connects real need to a viable path and deserves resources. A difference of kind, not degree.
First, how to read it, because that governs how it is used. In the other volumes SHEET 02→07 is roughly a causal chain: mechanism → redrawn process → landing. Here SHEET 02→06 are several marks on one compass — signal-to-noise, value perception, the useless tree, the systematization fork, emergence — not sequential steps but different readings of one needle from different angles. Enter anywhere; SHEET 07 teaches you to read and calibrate it.
Ideas were scarce, so the methodology taught "how to have more ideas" — divergence tools, brainstorming drills, idea quotas. The bottleneck was assumed to sit at generation.
Generation collapses to free, and the bottleneck moves wholesale to recognition: every idea "looks feasible," and the signal drowns in infinite plausibility. The task flips from "produce more" to "recognize what deserves commitment, and protect the space where signal can emerge."
Why innovation sits furthest upstream: it supplies direction, not throughput
The series is two triads on one fuel chain: the upstream (research → learning → innovation = discovery of truth / capability / value) supplies the downstream (organization → engineering → design) with truth / capability / direction. Innovation is the apex of the upstream triad, and it supplies the fuel that is hardest to compensate for — direction, i.e. "what is worth doing." The three downstream volumes handle the moving of bottlenecks: organization moves people, engineering moves verification, design moves taste; once a bottleneck is located, the next construction drawing can be drawn. Innovation handles a deeper layer: whether the bottleneck should exist at all, whether this direction is worth anyone moving toward. That layer has no drawing, because it is not asking "how do we go faster" but "should this thing be done at all."
So the specialness of this volume is not rhetoric. The downstream volumes' kernel step ② is the α mechanism: the bottleneck moves, and it can be engineered, measured, and given fast feedback. The innovation volume's step ② sits closer to an "anti-α": it resists measurement, feeds back slowly, and deliberately keeps a layer of judgment that cannot be externalized. Whoever treats innovation as "use AI to produce more ideas" is wrong not about tool usage but about category: they have crammed an upstream question of direction into a downstream frame of throughput. The result is that cheaper generation manufactures more looks-feasible, pushing the real bottleneck (recognition) ever further back and burying it deeper.
FIG. 0.0 The funnel from idea-abundance to selection-scarcity · Read: the wider the mouth, the narrower the throat; generation cost went to zero, recognition cost did not.
Takeaway: the mouth of the funnel widens without limit as tools cheapen; the throat (recognition) does not. Progress widens only the mouth; this whole volume is about the throat.
Falsified if
If recognition itself one day collapses to free (that is, a tool reliably decides, out of infinite candidates, "which one truly links real need to viable path," and that verdict can be independently checked), then this volume's load-bearing claim is overturned: value perception is no longer scarce. All current evidence points the other way (the generation-verification asymmetry is an information-theoretic constant, see SHEET 02), but this is the death condition the volume names for itself.
"AI makes innovation faster" is a difference of degree; "AI relocated the entire bottleneck of innovation" is a difference of kind. The distinction between these two is not rhetorical intensity but the fact that they point to completely different methodologies. If it were only a difference of degree, the old idea methodology would still hold, needing only that each step be sped up — brainstorm faster, produce more ideas, shorter iteration cycles. If it is a difference of kind, then the whole assumption of the old methodology (the bottleneck is at generation) has failed, and accelerating generation only jams the real bottleneck (recognition) further. There is a clean test for which it is: drop generation cost to zero — does the old methodology still hold? The old idea methodology's core moves — diverge, produce more — degrade into a noise amplifier when generation is free, so it does not hold. That proves it: this is a difference of kind.
A difference of kind has a consequence that is often overlooked: past success can be actively harmful. Someone who succeeded in the idea-scarce era has the muscle memory of "think much, produce fast, seize every opportunity." These are virtues when generation is the bottleneck and vices when recognition is: the more you think up, the more you drown the signal, and "seize every opportunity" becomes the inability to abandon. So this volume is not "add a new set of tools" but a demand to recalibrate judgment habits, from "produce more, seize more" to "bet fewer, cut harder, hold the useless tree." This is exactly why the volume's load-bearing content is not "techniques for innovating with AI" but a body of direction judgment about "how to reallocate attention amid abundance."
Nail the volume in one line: it is value discovery, not idea production. The two phrases point to completely different activities. Idea production cares about "output" — more ideas, faster iteration, wider coverage, its marks of success being quantity and speed. Value discovery cares about "recognition" — in an already-infinite space of possibility, spotting the one that truly links real need to viable path, its marks of success being hits and abandonments. If an organization sets its innovation department's KPI as "how many ideas produced, how many pilots run," it is doing idea production and will most likely sink into innovation theatre (SHEET 08); if it sets the criterion as "hit rate, abandon rate, useless-tree retention, emergence-recognition latency," only then is it doing value discovery. The renaming is small; the relocation of the bottleneck is large — what this volume guards from beginning to end is exactly that relocation.
This also clarifies the distance between this volume and the popular phrase "innovating with AI." Most teaching on "AI innovation" treats AI as an accelerator of idea production: faster brainstorming, more options, shorter cycles. This volume's position is that such teaching grafts AI onto an old process whose bottleneck assumption has already failed, and what it accelerates is precisely noise. AI-Native innovation is not using AI to produce more but recognizing that generation is already free and recognition is the bottleneck, and therefore moving the entire weight of the methodology from the generation side to the recognition side. This is not a tweak to "innovating with AI" but a redraw of it: a difference of kind, not of degree. The fourteen sheets that follow are each the unfolding of that redraw on one concrete mark.
INV
01
KERNEL
Mechanism · Kernel
Possibility grows abundant; "is it worth it?" grows harder
Load-bearing claim: the same kernel master (① abundance → ② judgment → ③ context → ④ meaning) acting on the surface of innovation. The counter-intuitive core: the possibility explosion raises the difficulty of judgment, because every idea "looks feasible," the noise floor rises without limit, and signal-to-noise collapses.
Fill the kernel's four steps with the specifics of direction and you have the whole thesis of this volume. Note how far it sits from the α mechanism: the downstream volumes' step ② is "the bottleneck moves, a drawing can be made"; here step ② is "judging direction," and only a compass can be given.
① ABUNDANCE
Ideas / plans / possibilities
Infinite, batchable, near-free; generating new plans is no longer scarce, and the noise floor rises without limit.
② JUDGMENT
Value perception · S/N · "worth it?"
The new bottleneck is value recognition: spotting the one that truly links real need to viable path, not generating ideas.
③ CONTEXT
Deep understanding of the world
From lived experience, deep tenure, long friction with reality: precisely the part AI cannot give, not an indexable corpus.
④ MEANING
Conviction · protect the useless · spot emergence
People return to inner conviction about what truly matters, protect the space of useful uselessness, and recognize emergent new species after the fact.
Step ②'s fork: the externalizable consensus vs. the constitutive anti-consensus
Step ② is not one rung of a uniform retreat — it forks along the "externalizability gradient" into two branches. This fork is the spine of the volume and directly decides what the methodology becomes (see SHEET 05):
Systematizable branch → folds into ① abundance
The externalizable part of value perception: settled community consensus, expressible preference signals
RLCF shows it is learnable: "cull the unachievable," converge on consensus taste
It no longer "stays with humans"; it becomes another automated form of execution (the local training-manual part)
Constitutive branch → sinks into ④ the value bedrock
The tacit value anchor · the anti-consensus frontier: heterogeneous value that holds only for a given individual or group
RLCF learns community consensus = "predict taste without having taste"; over-optimization crowds out the anti-consensus
Forcing it into a system = manufacturing the average by hand. Only the conditions for its emergence can be cultivated; it cannot be taught directly
Evidence ledger · preprint grade
The fork's evidence converges: RLCF (Reinforcement Learning from Community Feedback, Li et al. 2025-06, exploration ledger · Grade III preprint) learns community consensus, and over-optimization crowds out the anti-consensus; MaxMin-RLHF and the impossibility theorem of aligning a single model to heterogeneous preferences (Grade III theory); Preference-Validity Compression (arXiv:2606.10569, Grade III preprint); RLHF≈Condorcet (arXiv:2506.12350, Grade III preprint). Together: consensus is learnable (the training-manual part); anti-consensus / heterogeneity is not (only emergence can be cultivated = the ecology guide). Word-for-word isomorphic with bedrock ①②, and with the Specification Trap's "from value specification to value emergence."
Make the "impossibility theorem" explicit, or it stays cited rather than derived. For an optimizer to judge "what is worth doing" on your behalf, it must aggregate many people's separate value orderings into a single "worth" order to serve as its objective. Arrow's theorem addresses exactly this: with at least three alternatives and at least two agents, any aggregation rule that accepts every possible profile of individual rankings (unrestricted domain) and always returns a complete, transitive social ordering cannot satisfy Pareto efficiency, independence of irrelevant alternatives, and non-dictatorship at once. A rule that keeps Pareto and independence either returns an incoherent (for example, cyclic) ordering on some profiles or degenerates into copying one person's ranking (dictatorship). Carry this onto alignment: a model aligned to group preference is solving for an aggregated "worth" order; Arrow says no coherent solution exists under those conditions, so the optimizer has only two exits: converge on consensus and grind heterogeneous value toward the mean (the gravity machine Doshi-Hauser measured empirically[R1], Grade I–II; i.e. SHEET 01's externalizable branch), or collapse into "dictatorship" and copy a single anchor (which hands the question "whose worth?" straight back to a human). Either way, "what is worth doing" cannot be losslessly outsourced to an optimizer: not because the engineering is unfinished, but because the aggregation function structurally does not exist. This is the load-bearing claim's second wall: not information theory (the generation-verification asymmetry, SHEET 02) but the impossibility of preference aggregation. This step is an argument, Grade Ⅴ: Arrow itself is a Grade I theorem, but "therefore worth is unoutsourceable" is an inference carrying the theorem into the alignment context, logged as Ⅴ by this volume's rule.
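A small worked case may make the aggregation problem concrete. The sketch below is not a proof of Arrow's theorem; it shows the classic Condorcet cycle under one specific rule, pairwise majority voting, which is the simplest instance of heterogeneous rankings failing to combine into a coherent order. The three agents and the directions `A`, `B`, `C` are hypothetical.

```python
from itertools import combinations

# Three hypothetical agents, each with a strict ranking over three directions.
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_prefers(x, y, rankings):
    """True if a strict majority of agents ranks x above y."""
    votes = sum(r.index(x) < r.index(y) for r in rankings)
    return votes > len(rankings) / 2


def pairwise_winners(rankings):
    """Map each pair of options to its majority winner (None on a tie)."""
    options = sorted(rankings[0])
    result = {}
    for x, y in combinations(options, 2):
        if majority_prefers(x, y, rankings):
            result[(x, y)] = x
        elif majority_prefers(y, x, rankings):
            result[(x, y)] = y
        else:
            result[(x, y)] = None
    return result


def has_condorcet_winner(rankings):
    """True if some option beats every other option head to head."""
    options = sorted(rankings[0])
    return any(
        all(majority_prefers(x, y, rankings) for y in options if y != x)
        for x in options
    )


print(pairwise_winners(rankings))
print("coherent top choice exists:", has_condorcet_winner(rankings))
```

Here A beats B, B beats C, and C beats A, so the group "ordering" is a cycle and no direction can be called the most worth doing without privileging one agent's ranking.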
FIG. 5.0 The impossibility theorem: why the wall has no door · Read: to judge "what's worth it" for you, an optimizer must first aggregate heterogeneous preferences into one ordering; Arrow says that ordering doesn't exist, leaving only two exits that both hand the question back to a human.
Takeaway: this is not "AI can't do it yet"; it is "structurally unsolvable." Compressing many people's heterogeneous orderings into one coherent "worth" order is impossible under three minimal constraints (Arrow); the optimizer is left with two exits, and both hand judgment back to a human. This is the mechanism behind the fixed wall of FIG 13.5: the wall is not engineering difficulty, it is a theorem.
The externalizability gradient: judgment retreats not down a step but along a slope
Treating "value perception" as one indivisible lump is the easiest mistake in this volume. It is not one lump. Inside value judgment runs an externalizability gradient: the closer to "settled community consensus," the more it can be expressed, labelled, trained as a reward signal — RLCF (reinforcement learning from community feedback) is precisely the externalization of that end; the closer to "the anti-consensus frontier," the more it is a conviction earned only by an individual's long friction with the world, unexpressible as a rule, and forcing it into a system merely grinds it toward the mean. Judgment retreats along this slope: the externalizable stretch gets automated like the downstream volumes and folds into ① abundance; the inexternalizable stretch sinks into ④, the value bedrock, for which the methodology can only cultivate emergence conditions.
This gradient resolves what would otherwise be a contradiction: why "AI can write something novel" (Psittacines of Innovation? arXiv:2404.00017, Grade III) and "AI systematically produces the average" (Doshi-Hauser et al., several journal-grade causal studies, Grade I–II) both hold at once. The first happens at the externalizable end of the gradient: the model recombines elements of settled consensus into novelty "distinct from humans." The second is its default gravity: absent deliberate force, post-training pulls the distribution toward the prototype (regression to prototype). So the axiom's correct statement is not "heterogeneity can only come from humans" (too strong, and falsified by novelty-search / MAP-Elites / open-ended algorithms: drop the single objective and machines produce heterogeneity too) but this: the enemy of heterogeneity is over-optimization of a single objective, not the machine itself. What is irreplaceably human is defining what is worth being different about.
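The point about single-objective over-optimization can be seen in a toy comparison. The sketch below is a heavily simplified MAP-Elites loop next to a plain hill climber; the 2-D genome, the one-peak `fitness` function, the 5 × 5 behaviour grid, and the mutation size are all assumptions chosen for illustration, not a reproduction of any cited experiment.

```python
import random

GRID = 5  # behaviour space is cut into GRID x GRID cells (assumed resolution)


def fitness(g):
    """Single objective with one peak at (0.5, 0.5)."""
    x, y = g
    return -((x - 0.5) ** 2 + (y - 0.5) ** 2)


def cell(g):
    """Behaviour descriptor: which grid cell the genome falls in."""
    x, y = g
    return (min(int(x * GRID), GRID - 1), min(int(y * GRID), GRID - 1))


def mutate(g, rng, sigma=0.1):
    """Gaussian perturbation, clamped to the unit square."""
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in g)


def hill_climb(steps, rng):
    """Single-objective baseline: keep only the best genome seen so far."""
    best = (rng.random(), rng.random())
    for _ in range(steps):
        cand = mutate(best, rng)
        if fitness(cand) > fitness(best):
            best = cand
    return best


def map_elites(steps, rng):
    """Keep the best genome per behaviour cell instead of one global best."""
    start = (rng.random(), rng.random())
    archive = {cell(start): start}
    for _ in range(steps):
        parent = rng.choice(list(archive.values()))
        child = mutate(parent, rng)
        c = cell(child)
        if c not in archive or fitness(child) > fitness(archive[c]):
            archive[c] = child
    return archive


best = hill_climb(2000, random.Random(0))
archive = map_elites(2000, random.Random(0))
print("hill climber keeps 1 solution, in cell", cell(best))
print("MAP-Elites keeps", len(archive), "solutions in distinct cells")
```

Both loops use the same mutation operator and the same fitness function; the only difference is what gets kept. The hill climber collapses onto the prototype at the peak, while the archive preserves a locally best solution in every behaviour cell it reaches. Which behaviour dimensions deserve a grid is still a choice made outside the algorithm.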
FIG. 1.0 The externalizability gradient · Read: left to right, value judgment grows harder to externalize; automation can only eat the left half.
Takeaway: this is not a "machine vs human" binary but a slope. The automation front moves right as capability grows, yet the constitutive value anchor at the right end is doubly walled by information theory and an impossibility theorem: it is not held temporarily but structurally.
Least like α: why the kernel "inverts" on the surface of innovation
The same kernel master acts on six surfaces, but on the surface of innovation its flavour is inverted. In the downstream volumes, ① abundance is pure good news — execution gets cheap and the freed effort can go to judgment; here ① abundance first manufactures a crisis — the possibility explosion makes the "is it worth it?" judgment harder (SHEET 02, signal-to-noise). The downstream volumes' ② judgment is the α mechanism: the bottleneck moves to a new spot that can be engineered, measured, guard-railed; here ② judgment resists all three — its core (constitutive value) resists externalization, feeds back slowly, and forcing it into engineering grinds it flat (SHEET 05, the fork). The downstream volumes' ③ context is agent-readable infrastructure (guardrails, specs, design systems); here ③ context is precisely the part AI cannot read — a person's deep understanding of the world (SHEET 03). The shape of the four steps is unchanged, but the sign of each step is flipped on the innovation surface.
This "inversion" is not an exception but a necessity of the series structure. The whole methodology is one fuel chain: the upstream supplies direction, the downstream supplies execution. The further downstream, the closer the bottleneck sits to "execution" and the more it can be handled by the α mechanism (move it, engineer it); the further upstream, the closer the bottleneck sits to "direction" and the more it resists engineering. The innovation volume is at the apex of the upstream triad, so it is the volume on the whole chain furthest from α and closest to γ (emergence). Grasp this and you grasp why the container is still the SHEET (keeping the series consistent) while the content logic deliberately departs from the drawing (compass marks, not process steps) — the form aligns with the series, the substance aligns with the series' upstream position. Reading it as "yet another drawing set" is exactly the misreading SHEET 00 keeps guarding against.
INV
02
SIGNAL/NOISE
Mechanism · Force analysis
The noise floor rises to infinity; the signal does not
Load-bearing claim (compass mark one): the asymmetry between generation and recognition — generating ideas becomes near-free, while "which one links real need to viable path" gets no cheaper. AI raises the noise floor to infinity while the absolute amount of signal is unchanged, so signal-to-noise collapses. What is needed is not more information but stronger value perception.
This is the sharpest place the volume parts from the others. Elsewhere "abundance" is good news — execution gets cheap, the bottleneck moves to judgment, judgment can be engineered. On the surface of innovation, "abundance" first manufactures a perception crisis: once everything looks feasible, "looking feasible" carries no information at all. Value perception — inner conviction about what truly matters — comes from deep understanding of the world, not from AI (see SHEET 03).
Candidate count → ∞, unit cost → 0. Every candidate wears the look of "feasible," because the model is good at making any direction sound coherent. The noise floor rises without limit.
The signal that truly links real need to viable path is unchanged in absolute terms — it is bounded by the real jobs that exist in the world, not by generation speed. Signal ÷ noise → 0, so the capacity to abandon (the nerve to cut "looks-feasible") becomes the new scarce skill.
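The arithmetic behind "signal ÷ noise → 0" is short enough to write out. In this sketch the count of true signals is held fixed at an arbitrary, assumed value while the candidate pool grows; the function and constant names are introduced here for illustration only.

```python
def signal_to_noise(candidates: int, true_signals: int) -> float:
    """Ratio of true signals to everything else in the candidate pool."""
    noise = candidates - true_signals
    if noise <= 0:
        return float("inf")  # no noise at all: every candidate is signal
    return true_signals / noise


# Assumed constant: bounded by real needs in the world, not by generation speed.
TRUE_SIGNALS = 5

for n in (50, 500, 5_000, 50_000):
    print(n, signal_to_noise(n, TRUE_SIGNALS))
```

Each tenfold increase in candidates cuts the ratio by roughly a factor of ten, because nothing on the generation side adds to the numerator.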
Why the asymmetry is a constant, not something the next model erases
"When models get stronger, recognition gets cheap too" is the most common rebuttal, and the most deeply wrong. The generation-verification asymmetry is not a flaw of current models but an information-theoretic structural constant: generating a candidate needs only local coherence (it sounds right, it does not contradict itself), whereas verifying that it truly links real need to viable path needs global consistency (it matches the jobs that actually exist in the world, and it matches the physical and economic constraints of feasibility). A language model can manufacture local coherence cheaply and in bulk; global consistency requires checking against something the model does not possess, namely the current state of the real world. This asymmetry is "code is cheap, verification is dear" in the engineering volume, Terence Tao's "ideas are cheap, truth is expensive" in the research volume, and "looks-feasible is cheap, worth-it cannot be made cheap" here. One information-theoretic constant, three faces.
The asymmetry has a measurable corollary that is harder than "signal-to-noise." Several empirical studies (Measuring Creativity in the Age of GenAI, arXiv:2604.19799, Grade II)[R2] find that after people share AI, output is not a single-peak collapse but a bimodal distribution: one cluster hugging the model default (high fluency, low originality), one clearly departing from it (human-driven recombination, reframing), with the middle sparse. Competitive advantage shifts wholesale to "individuals who can operate outside the mode the generative system dominates." This turns "heterogeneity" from a fuzzy compliment into a measurable distributional property: distinctiveness relative to an AI baseline. The practical meaning of the signal-to-noise mark is therefore very concrete: not "produce more creative things" but "deliberately move to the right peak of the distribution, and be able to explain why that cluster links to a real need."
FIG. 2.0 Two cost curves diverge; the signal gets buried · Read: generation cost plunges to zero, recognition cost stays flat; S/N is their ratio, so it collapses.
Takeaway: the two curves are governed by different things: generation by model capability (it plunges), recognition by the real count of real jobs in the world (it does not move). They are bound to diverge, so S/N is bound to collapse; this is not pessimism but positioning: spend your effort at the throat.
Test signal
Hit rate (directions that paid off ÷ total bets) and abandon rate (the share of "looks-feasible" you dared to cut) rise together: the real mark of improving S/N is not more candidates but fewer, sharper bets and harder cuts. (Exploration ledger: offered as a leading indicator; needs long-run team bookkeeping to calibrate, not asserted as established fact.)
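The two indicators can be computed from a simple ledger of candidates. The sketch below follows the definitions in the text (directions that paid off ÷ total bets; share of looks-feasible candidates cut); the `Candidate` record, its field names, and the sample ledger are hypothetical, and a real team would still need the long-run bookkeeping the text calls for.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    bet: bool               # did the team commit resources to it?
    paid_off: bool = False  # outcome; only meaningful when bet is True


def hit_rate(ledger):
    """Directions that paid off divided by total bets (None if no bets yet)."""
    bets = [c for c in ledger if c.bet]
    if not bets:
        return None
    return sum(c.paid_off for c in bets) / len(bets)


def abandon_rate(ledger):
    """Share of looks-feasible candidates that were cut instead of bet on."""
    if not ledger:
        return None
    return sum(not c.bet for c in ledger) / len(ledger)


# Hypothetical quarter: ten looks-feasible candidates, two bets, one hit.
ledger = [Candidate(f"idea-{i}", bet=False) for i in range(8)]
ledger += [
    Candidate("bet-1", bet=True, paid_off=True),
    Candidate("bet-2", bet=True, paid_off=False),
]
print("hit rate:", hit_rate(ledger))
print("abandon rate:", abandon_rate(ledger))
```

Tracking both numbers per period shows whether they rise together; neither one counts how many candidates were generated.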
The collapse of signal-to-noise has a spatial metaphor that makes its structural nature visible: the adjacent possible. At any moment, from where you currently stand, the possibilities genuinely within one step form a ring of "adjacent possible." Cheaper tools do not create value from nothing; they push that ring outward — a solution that yesterday took a team three months to reach, today one person prototypes in an afternoon. The ring grows, and the points inside it multiply. But here is the crux: the ring's area (possibility) is exploding, while the number of points inside that truly connect to a real need (signal) is not exploding in step. So you stand inside a far larger ring of possibility, able to go a hundred times more places, with not many more places worth going — the collapse of signal-to-noise is just another way of stating this spatial fact.
The metaphor also corrects an optimistic misreading: "a larger adjacent possible = more innovation opportunity." Opportunity does grow, but the burden of recognizing opportunity grows by the same factor — the bigger the ring, the harder it is to find the few worthy points in it, because the distractors (points that look feasible but connect to no real need) expand fastest. So the net effect of cheaper tools is not "innovation is easier" but "the bottleneck of innovation shifts from out-of-reach to unrecognizable." This is exactly why the volume spends all its effort at the recognition side: the expansion of the adjacent possible is both gift and curse — the gift is that you can reach more places, the curse is that the places worth reaching are drowned. The entire reason the compass exists is to point north inside this expanding ring.
FIG. 2.1 The adjacent possible pushed outward as tools cheapen · Read: the ring expands; the worthy points inside do not. Almost all the increase is distractors.
Takeaway: draw "S/N collapse" as space: the annulus between the old ring and the new is the possibility newly added by cheaper tools, and it is almost entirely hollow noise points, with only the occasional solid signal point. This explains how "more opportunity" and "harder to innovate" can both be true: the great majority of the added opportunity is not worth going to.
FIG. 2.2 The search space explodes; selection and responsibility bandwidth holds flat · Read: as tools cheapen, the searchable space grows tens-fold; the budget a human can responsibly select and stand behind is a near-flat ceiling, so the fraction you can responsibly cover collapses.
Takeaway: FIG 2.1 drew the ring expanding; this one places that expansion beside a constant human bandwidth so the scarcity becomes visible. Selection is not a "throw more compute at it" bottleneck: it is capped by the budget a human can be responsible for and stand behind (affordable loss), and that line does not rise as tools cheapen. Space grows hundred-fold, bandwidth holds, so the responsibly-coverable fraction tends to zero: scarcity has moved fully from "can't reach it" to "can recognize it and can afford to own it."
The collapse of signal-to-noise has a counter-intuitive corollary worth stating plainly: the scarcest skill of the abundance era is not "thinking it up" but "cutting it down." In the old idea-scarce era, abandoning an idea had a cost — it was hard-won, and cutting it might leave no next one; so "persistence" was a virtue and "casting a wide net" a strategy. Abundance flips this incentive entirely: ideas are no longer scarce, so the cost of holding onto every "looks-feasible" is not missing out but diluted attention — every looks-feasible you bet on crowds out the judgment bandwidth you should have spent on the few true signals. So the abandon rate (the share of looks-feasible you dared to cut) is a leading indicator even earlier than the hit rate: a team that cannot bear to cut anything is probably still on the old calibration, reading abundance as more opportunity rather than more noise.
Why is abandoning hard? Because it demands two things that run against human nature. One is admitting sunk cost: you have already put time into a looks-feasible direction, and cutting it means admitting that investment was wasted; the more invested, the harder to cut, which is the standard sunk-cost trap driven by loss aversion. The other is resisting the safety of "looking busy": keeping ten candidate directions looks more diligent, more responsible, and safer than betting on only two (the micro form of innovation theatre, SHEET 08). So "the nerve to abandon" is not merely a skill but a stance that needs institutional support: the bet-retrospective sheet (SHEET 10) records the reasons for cutting, so abandonment is reframed from "a loss" into the active choice of "freeing bandwidth for the true signal." This is also why the volume keeps stressing "generate many, bet few and sharp": generating many is free; betting few is the judgment.
噪声地板抬高,伤的不是信号本身,是信号的"可识别性"
A raised noise floor harms not the signal itself but its detectability
"噪声地板被抬到无限高,信号没变"这句话的承重,藏在一个容易被略过的细节里:被破坏的不是信号的质量,是信号的可识别性——你在一堆东西里把真信号挑出来的能力。这两者完全不同。一个真正连接真实需求的方向,它的内在价值并没有因为 AI 能批量生产看似可行而下降一分;下降的是它被认出来的概率。机制是信号检测论里最经典的一条:识别能力不取决于信号的绝对强度,取决于信号与噪声的相对可分性(信噪比)。当噪声地板被抬高,即使信号的绝对高度不变,信号探出噪声的那一截也被压薄了,判断者要把它与噪声分开就越来越难——这正是"地板抬高、信号没变"为什么仍然是灾难的原因。
The load-bearing weight of "the noise floor rises to infinity; the signal does not change" hides in a detail easy to skip: what is degraded is not the quality of the signal but its detectability — your ability to pick the true signal out of a heap. The two are entirely different. A direction that genuinely connects to a real need has not lost an ounce of its intrinsic value because AI can mass-produce looks-feasible; what has fallen is the probability it gets recognized. The mechanism is one of the most classic in signal-detection theory: the power to detect depends not on a signal's absolute strength but on its relative separability from noise (the signal-to-noise ratio). When the noise floor rises, even if the signal's absolute height is unchanged, the sliver of it poking above the noise is thinned, and the judge finds it ever harder to separate from noise — which is exactly why "the floor rises while the signal stays" is still a catastrophe.
更尖锐的破坏发生在基础率这一层,它解释了为什么"看起来在涨的命中"其实在跌。设想旧时代:能讲圆的方案稀少,假设一百个被认真提出的方向里,有十个是真信号(基础率 10%)——评审即使不完美,从一百个里挑出真信号也并不算难。现在 AI 把"能讲圆"的成本降到零:同样十个真信号还在,但它们被淹没在一万个看似可行里(基础率掉到 0.1%)。这里是关键的反直觉:哪怕你的判断力一点没退化、识别准确率还是九成,在 0.1% 的基础率下,你挑出来的"看起来对"的方向里,绝大多数仍然是假信号——这是贝叶斯定理的冷酷推论,叫精确率塌陷(false-discovery 飙升)。不是你变笨了,是基础率被噪声稀释后,同样的准确率会产出海量的假阳性。这把"信噪比塌陷"从一句比喻,钉成一个可算的机制。
The sharper damage happens at the layer of the base rate, and it explains why "hits that look like they are rising" are in fact falling. Picture the old era: coherent plans were scarce; suppose that of a hundred seriously proposed directions, ten are true signals (a 10% base rate). A review, even an imperfect one, does not find it especially hard to pick the true signals out of a hundred. Now AI drives the cost of "sounding coherent" to zero: the same ten true signals remain, but they are submerged in ten thousand looks-feasible candidates (the base rate drops to 0.1%). Here is the crucial counter-intuition: even if your judgment has not decayed at all and your detection accuracy is still ninety percent, at a 0.1% base rate the vast majority of the "looks-right" directions you pick out are still false signals. This is the cold corollary of Bayes' theorem, called precision collapse (the false-discovery rate soars). You did not get dumber; once the base rate is diluted by noise, the same accuracy yields a flood of false positives. This turns "signal-to-noise collapse" from a metaphor into a computable mechanism.
Run one concrete calculation and you see how counter-intuitive this is. Suppose your judgment is quite good: 90% recognition of true signals (sensitivity), 90% correct rejection of false ones (specificity). In the old era at a 10% base rate, of the directions you call "worth it," about 50% are true signals: fifty-fifty, a pool that later verification can still converge on. After AI presses the base rate to 0.1%, that same ninety-percent eye yields, among the directions you call "worth it," a true-signal share of about 0.9%: for every 100 you pick as promising, more than 99 are false positives. Your detection power did not change at all, yet the trustworthiness of the output fell by a factor of more than fifty, close to two orders of magnitude. This is why "generate more, review more, score more" does not solve the problem amid abundance but worsens it: it only enlarges the denominator (candidate count), does not change the base rate drowning you, and makes the absolute number of false positives grow linearly with candidates. The way out is not raising that 90% accuracy (realistic gains are small: even at 99% specificity, precision at a 0.1% base rate is still only about 8%) but doing two things that change the base-rate structure: pre-screen candidates with the falsification checklist before they enter review (artificially raising the in-pool base rate), and use affordable-loss trials to let reality, not appearance, do the culling. Together they switch judgment from "detecting within a sea of noise" to "verifying within an already pre-screened small pool."
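The calculation above can be reproduced with Bayes' theorem directly. The sketch below uses the sensitivity, specificity and base rates stated in the text; the function names are illustrative, not part of any instrument in this volume.

```python
def precision(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Share of 'worth it' calls that are true signals (Bayes' theorem)."""
    true_positive = sensitivity * base_rate
    false_positive = (1.0 - specificity) * (1.0 - base_rate)
    flagged = true_positive + false_positive
    if flagged == 0.0:
        raise ValueError("nothing is ever flagged under these parameters")
    return true_positive / flagged


def flagged_counts(candidates: int, true_signals: int,
                   sensitivity: float, specificity: float) -> tuple[float, float]:
    """Expected (true, false) counts among the candidates called 'worth it'."""
    noise = candidates - true_signals
    return sensitivity * true_signals, (1.0 - specificity) * noise


old = precision(0.10, 0.90, 0.90)   # old era: 10 true signals in 100
new = precision(0.001, 0.90, 0.90)  # abundance: 10 true signals in 10,000

print(f"old-era precision:   {old:.1%}")
print(f"abundance precision: {new:.2%}")
print(f"drop factor:         {old / new:.0f}x")
print("expected (true, false) picks:", flagged_counts(10_000, 10, 0.90, 0.90))
```

With ten true signals in ten thousand candidates, the same ninety-percent eye is expected to flag roughly 9 true directions alongside roughly 999 false ones. The skill of the judge is unchanged; only the pool changed.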
This precision-collapse mechanism in turn gives "the nerve to abandon" its hardest justification: abandoning is not conceding but actively managing the base rate. Every looks-feasible you cut presses the denominator down and, as long as the cuts fall mostly on noise, lifts the true-signal share of the pool; a team with a high abandon rate is essentially maintaining for itself an in-pool base rate far above the environment's, so the same judgment yields a far higher precision. It also explains why "cast a wide net, keep everything for now" is the worst strategy amid abundance: it does the exact opposite, enlarging the denominator without bound, diluting the base rate to the floor, and making each "promising" call more likely a false positive. Put differently, the nerve to abandon is upgraded from a virtue to a survival skill because it is the lever that moves the base-rate structure in a favorable direction, while realistic gains in that 90% acuity barely move it. Screen first, then verify; dare to cut more than you dare to keep. This is not a personality preference but the strategy the Bayesian arithmetic favors.
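A rough sketch of "abandoning as base-rate management" follows. It assumes a pre-screen that keeps true signals and noise at different rates; the two keep rates are hypothetical values picked for illustration, and the text itself gives no such figures.

```python
def precision(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Share of 'worth it' calls that are true signals (Bayes' theorem)."""
    true_positive = sensitivity * base_rate
    false_positive = (1.0 - specificity) * (1.0 - base_rate)
    return true_positive / (true_positive + false_positive)


def screened_base_rate(base_rate: float, keep_true: float, keep_noise: float) -> float:
    """In-pool base rate after a pre-screen that keeps a share of each class."""
    kept_true = keep_true * base_rate
    kept_noise = keep_noise * (1.0 - base_rate)
    return kept_true / (kept_true + kept_noise)


ENV_BASE_RATE = 0.001  # from the text: 10 true signals in 10,000 candidates
KEEP_TRUE = 0.80       # hypothetical: the checklist wrongly cuts 20% of true signals
KEEP_NOISE = 0.02      # hypothetical: the checklist lets 2% of the noise through

pool_rate = screened_base_rate(ENV_BASE_RATE, KEEP_TRUE, KEEP_NOISE)
print(f"in-pool base rate after cutting: {pool_rate:.2%}")
print(f"precision without pre-screen:    {precision(ENV_BASE_RATE, 0.9, 0.9):.2%}")
print(f"precision with pre-screen:       {precision(pool_rate, 0.9, 0.9):.2%}")

# A screen that cuts signal and noise at the same rate changes nothing.
print(f"blind cutting leaves base rate:  {screened_base_rate(ENV_BASE_RATE, 0.5, 0.5):.2%}")
```

The last line shows the condition the argument depends on: cutting helps only when it removes noise faster than it removes signal. Under these assumed rates the screen also loses a fifth of the true signals, which is the price paid for the higher precision.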
为什么"更多信息"不解决问题,反而加重它
Why "more information" does not solve the problem but worsens it
Facing the collapse of signal-to-noise, the most natural instinct is "then let me gather more information, do more analysis, generate more options to help me judge." This is precisely the most dangerous misdiagnosis. The problem is not too little information but too much noise — you are already drowning in infinite "looks-feasible," and adding information just pours another bucket into the same noise. More analysis often makes you more certain of a direction that should have been cut, because analysis can find support for any direction (models are especially good at this). What is truly missing is not information but value perception — inner conviction about what truly matters, which comes from understanding the world, not from the quantity of information (see SHEET 03). So the operational discipline of the signal-to-noise mark is a bit counter-intuitive: when judging direction, more information is a liability, not an asset; the move is not to gather more but to cut harder and bet sharper.
INV
03
PERCEPTION · 价值感知刻度
PERCEPTION
重画 · 原理
Redraw · Principle
"值得吗"来自世界理解,不来自 AI
"Is it worth it?" comes from understanding the world, not from AI
Load-bearing claim (compass mark two): value perception = real need × viable path × inner conviction, the point where all three meet. Its upstream is lived experience and deep tenure — starting from who you are, what you know, whom you know, not from a preset goal. This is where the kernel's step ③ "context not from AI" lands concretely.
The asymmetry, stated plainly: AI can vastly expand the search over viable paths — it has read more plans than any person. But real need and inner conviction are products of a person's long friction with the world, and AI cannot supply them. It can tell you how a path could be made to work; it cannot tell you whether what that path leads to is something a real person actually wants. This is the root difference between the downstream volumes' step ③ (agent-readable guardrails, specs, design systems) and this one: here the context is a person's deep understanding of the real world, not an indexable corpus.
Real need → JTBD's "job-to-be-done": people "hire" a product to get a real job done (Christensen / Ulwick's outcome-driven innovation[R8]). The test is "is there a real person, in a real situation, who truly needs to get this done" — not an imagined need.
可行路径 → 路径真实存在,还是只是"看起来可行"?这是 AI 最能帮的一轴——但帮的是搜索宽度,不是判定真伪。
Viable path → does the path truly exist, or merely "look feasible"? This is the axis AI helps most with — but it helps the breadth of search, not the verdict of truth.
Inner conviction → does this certainty come from your deep understanding of the world, or from "AI said so too"? Borrowed conviction is the most dangerous false signal in the noise.
接驳锚 · 手中之鸟
Cross-link · bird in hand
价值感知的起点是 effectuation 的"手中之鸟"(bird-in-hand):从你是谁、你知道什么、你认识谁出发,而非从预设目标倒推(Sarasvathy 五原则[R9])。它与设计卷切分清楚——设计判"好不好 / 为不为人"(品味);创新判"值不值得 / 连不连真实需求"(价值感知)。一个在产物的体验层,一个在产物该不该存在的方向层。
Value perception starts from effectuation's "bird in hand": begin from who you are, what you know, whom you know, not by reasoning back from a preset goal (Sarasvathy's five principles[R9]). It is cleanly split from the design volume: design judges "good or not / for people or not" (taste); innovation judges "worth it or not / connected to a real need or not" (value perception). One lives at the experience layer of the artifact; the other at the direction layer of whether the artifact should exist at all.
三轴里,只有一轴 AI 帮得上——这正是危险所在
Of the three axes, AI helps on only one — and that is exactly the danger
把三轴摊开看,会看到一个不对称的结构:AI 在可行路径这一轴上能力极强(它读过的方案比任何人多,能瞬间给出十条走通某目标的路),但在真实需求与内在确信两轴上几乎帮不上忙。危险恰恰从这里来:当一轴被极大增强、另两轴没动,人会下意识地用"可行路径很丰富"去顶替"真实需求被验证"——因为前者廉价、即时、看得见,后者昂贵、滞后、要离开屏幕去和真实的人摩擦。于是判断的重心被悄悄拽向 AI 擅长的那一轴,三轴的交点被一轴的丰盛冒充。这是"看似可行"伪信号的根部机制(见 SHEET 08)。
Lay the three axes side by side and an asymmetric structure appears: AI is extremely strong on the viable-path axis (it has read more plans than anyone and instantly offers ten ways to make a goal work), but is almost no help on the real-need and inner-conviction axes. The danger comes from precisely there: when one axis is hugely amplified while the other two stay put, people unconsciously substitute "viable paths are plentiful" for "real need is verified" — because the former is cheap, instant, visible, and the latter is expensive, lagging, and demands leaving the screen to rub against real people. The centre of gravity of judgment is quietly dragged toward the axis AI is good at, and the intersection of three axes is counterfeited by the abundance of one. This is the root mechanism of the "looks-feasible" false signal (see SHEET 08).
This is also the core of how entrepreneurship theory is being rewritten by GenAI (Journal of Management Studies 2026, Ramoglou/Chandra/Jin, Grade III): in the GenAI era the bottleneck of venturing is not a shortage of ideas but Knightian uncertainty — machine creativity expands the idea space by generating variation, human judgment contracts it by culling the unrealizable. Successful opportunity search "depends less and less on human creativity, more and more on eliminating what cannot be realized." In this volume's terms: the search over viable paths can be outsourced; the verdict on real need and the inner conviction about it cannot. Effectuation's "bird in hand" gains a precise meaning here — not reasoning back from an imagined market but starting from the small patch of the world you have truly worked and truly rubbed against, because only on that patch do your "real need" judgment and "inner conviction" have anything to stand on.
FIG. 3.0 价值感知=三轴的交点 · 看懂:三环相交才是信号;AI 只把一个环吹大,那一个环的丰盛不等于交点。
FIG. 3.0 Value perception = the intersection of three axes · Read: signal is the three-way overlap; AI only inflates one ring, and that ring's abundance is not the intersection.
看点:信号只在三环交点出现。AI 把"可行路径"环吹得极大,制造一种"信号很多"的错觉——但那只是一个环的面积,不是交点。两个人才有的环(真实需求、内在确信)才是把交点钉住的东西。
Takeaway: signal appears only at the three-way intersection. AI inflates the "viable path" ring enormously, producing an illusion of "lots of signal," but that is the area of one ring, not the intersection. The two human-only rings (real need, inner conviction) are what pin the intersection in place.
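One way to make "intersection, not sum" concrete is to score each axis between 0 and 1 and multiply. This is a sketch under assumptions: the 0-to-1 scale, the field names and the example scores are invented for illustration. The volume states only the product form, real need × viable path × inner conviction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    real_need: float         # 0..1, evidence from contact with real people
    viable_path: float       # 0..1, the one axis AI can inflate cheaply
    inner_conviction: float  # 0..1, conviction that would survive AI reversing itself

    def intersection(self) -> float:
        """Multiplicative score: a near-zero axis drags the whole product down."""
        return self.real_need * self.viable_path * self.inner_conviction

    def naive_average(self) -> float:
        """Additive score, shown only as the tempting but misleading alternative."""
        return (self.real_need + self.viable_path + self.inner_conviction) / 3


# Hypothetical candidates with made-up scores.
polished = Candidate("AI-polished plan", real_need=0.1, viable_path=1.0, inner_conviction=0.2)
grounded = Candidate("field-grounded bet", real_need=0.8, viable_path=0.5, inner_conviction=0.7)

for c in (polished, grounded):
    print(f"{c.name:<20} average={c.naive_average():.2f}  intersection={c.intersection():.3f}")
```

Under averaging, a perfect viable-path score lifts the polished plan to a respectable number. Under multiplication it cannot compensate for a thin real need, which is the substitution the figure warns against.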
借来的确信:充裕时代最隐蔽的自我欺骗
Borrowed conviction: the abundance era's most hidden self-deception
三轴里最该单独拎出来讲的是内在确信,因为它最容易被悄悄掉包。确信本来是一种昂贵的东西——它是你对世界长期摩擦后才长出的笃定,错了要你自己承担。但 AI 提供了一种廉价的替代品:"AI 也说可行"。这句话听起来像证据,实则是确信的赝品:它让你感觉有了笃定,却没有付出长出笃定该付的成本(亲历、试错、为判断买单)。借来的确信比没有确信更危险,因为没有确信的人会去找,而有了借来确信的人会停止找——他以为已经到了。受力分析:AI 把"听起来笃定"的成本降到零,于是笃定的卖相和笃定的实质脱钩,正如可行性的卖相和实质脱钩(SHEET 08)。同一条充裕逻辑,作用在确信这一轴上。
Of the three axes the one most worth pulling out separately is inner conviction, because it is the easiest to quietly swap out. Conviction is meant to be expensive — it is the certainty grown only from your long friction with the world, and being wrong is yours to bear. But AI offers a cheap substitute: "AI said it's viable too." That sentence sounds like evidence but is a counterfeit of conviction: it makes you feel certain without paying the cost certainty should cost (lived experience, trial and error, paying for the judgment). Borrowed conviction is more dangerous than no conviction, because someone without conviction goes looking, while someone with borrowed conviction stops looking — they think they have arrived. Force analysis: AI drops the cost of "sounding certain" to zero, so the appearance of certainty decouples from the substance of certainty, just as the appearance of feasibility decouples from its substance (SHEET 08). The same logic of abundance, acting on the conviction axis.
怎么分辨自己的确信是真是借?一个实操的问法(落进 INSTRUMENT 06 的确信轴):"如果 AI 明天改口说这条路不可行,我的笃定会动摇吗?"如果会,那份确信本就建立在 AI 的输出上,是借来的;如果不会——因为你的笃定来自一个 AI 无法触及的源头(你亲历过的、你深耕的领域里你才知道的东西)——那才是真的内在确信。这也回扣 effectuation 的 pilot-in-the-plane:真正的确信不是"我预测这条路会通",而是"我知道这件事值得做,并愿意用行动去塑造它通"——前者依赖预测(AI 能给),后者依赖价值判断(AI 给不了)。
How do you tell whether your conviction is real or borrowed? One operational question (landing in INSTRUMENT 06's conviction axis): "if AI reversed itself tomorrow and said this path is not viable, would my certainty waver?" If it would, the conviction was built on AI's output and is borrowed; if it would not — because your certainty comes from a source AI cannot touch (something only you know from the field you have lived and worked) — that is real inner conviction. This also ties back to effectuation's pilot-in-the-plane: real conviction is not "I predict this path will work" but "I know this is worth doing and am willing to shape it into working by acting" — the former leans on prediction (which AI can give), the latter on value judgment (which AI cannot).
真实需求:人雇用产物去完成的那件事
Real need: the job people hire a product to get done
Of the three axes, "real need" is the easiest for imagined need to counterfeit, and JTBD (Jobs-to-be-Done, Christensen / Ulwick's outcome-driven innovation) gives it a sharp criterion. JTBD's core flip: people do not "buy a product" but "hire a product to get a real job done." The famous example is the milkshake — someone buys one in the morning, and the "job" they hire it for is not "drink something sweet" but "have something to do on a long, dull commute that lasts until lunch." If you think the need is "a tastier milkshake," you optimize flavor; if you see the real job, you find the competitors are actually bagels and bananas. The criterion is therefore hard: is there a real person, in a real situation, who truly needs to get something done? If you can be concrete about "who, in what situation, getting what done," it is a real need; if you can only say "users would probably want this," it is an imagined need.
为什么这一轴 AI 帮不上、且最容易被它带偏?因为 AI 没有处境——它没有早上的通勤、没有撑到午饭的焦虑、没有一个具体身体在一个具体世界里的待办任务。它能基于读过的语料生成"听起来像需求"的描述,但那是对需求语言的模仿,不是对需求本身的接触。当你问 AI "用户要什么",你得到的是需求的平均表述,恰恰滤掉了真实 job 里那些反直觉的、具体的、只有亲历者才知道的细节(奶昔的真竞品是香蕉,这种洞察不在平均里)。所以真实需求轴的纪律是 effectuation 的"手中之鸟"落到操作层:从你真正深耕、真正有处境的那一小块出发,因为只有在那里,你才分得清真实的 job 和想象的需求。这也是为什么 SHEET 10 的田野脚本要你去现场问"你上次怎么办成的 / 卡在哪",而不是"你要不要"——前者逼出真实 job,后者只收到想象。
Why does AI not help on this axis, and most easily lead you astray on it? Because AI has no situation — it has no morning commute, no anxiety about lasting until lunch, no concrete body with a to-do in a concrete world. It can generate descriptions that "sound like a need" from the corpus it has read, but that is mimicry of the language of need, not contact with need itself. Ask AI "what do users want" and you get the average phrasing of need, which filters out precisely the counter-intuitive, concrete details of the real job that only the one who lived it knows (the milkshake's real competitor is a banana — that insight is not in the average). So the discipline of the real-need axis is effectuation's "bird in hand" landed at the operational level: start from the small patch you have truly worked and truly have a situation in, because only there can you tell a real job from an imagined need. This is why SHEET 10's fieldwork script has you go and ask "how did you get it done last time / where were you stuck," not "do you want this" — the former forces out the real job, the latter only collects the imagined.
INV
04
USELESS · 散木刻度
THE USELESS TREE
公理 · 反单一目标
Axiom · Anti single-goal
最大的创新风险,是效率吞掉了冗余
The largest innovation risk is efficiency devouring redundancy
承重命题(罗盘第三刻度 · 承"反单一目标过度优化"公理):真正的创新常来自"散木"——暂时无用、不在优化目标上的冗余探索。在 AI 极度优化一切的时代,所有探索都被对齐到当下可度量的目标,散木被砍光。方法论的核心任务之一,是主动保护"无用之用"的空间,把"看似无用"当作创新的种子库来经营。
Load-bearing claim (compass mark three · carrying the "anti single-goal over-optimization" axiom): real innovation often comes from the "useless tree" — redundant exploration that is temporarily useless and off the optimization target. In an age where AI over-optimizes everything, all exploration gets aligned to the currently-measurable goal and the useless tree is cut down. One core task of the methodology is to actively protect the space of useful uselessness, cultivating "looks-useless" as a seed bank for innovation.
Zhuangzi's useless tree lives out its natural span because it is useless: uselessness is protection, freedom from utility; what the yardstick of utility undervalues is, from another vantage, the very condition for flourishing. This is not poetic flourish: evolutionary biology supplies hard evidence. Optimal ≠ leanest: neutral networks and gene duplication show that seemingly redundant "useless" genes are precisely the raw-material bank for adapting to new environments; squeezing a system to the single-goal optimum cuts away its capacity to evolve. The enemy was never AI; it is optimizing the system to a single goal.
①
异质 · 反单一目标
Heterogeneity · anti single-goal
Quality-Diversity / Novelty-Search 证:机器放弃单一目标函数,反而能产出更异质的解。同质化的因果机制(Doshi-Hauser)是 AI 收敛,不是 AI 本身。
Quality-Diversity / Novelty-Search show: when the machine drops the single objective, it produces more heterogeneous solutions. The causal mechanism of homogenization (Doshi-Hauser) is AI-induced convergence, not AI itself.
②
散木 · 从公理升为定律
Useless tree · axiom to law
"最优 ≠ 最精简"有进化生物学硬证(中性网络、基因复制,Ⅱ 生物学)。冗余非浪费,是适应储备。
"Optimal ≠ leanest" has hard evolutionary-biology evidence (neutral networks, gene duplication, Grade II biology). Redundancy is not waste; it is the reserve for adaptation.
③
慢 · 某些过程价值在于慢
Slowness · some value lives in the slow
serendipity(有准备的头脑在偏离主线时撞见价值)+ 慢想。把所有探索压成即时可度量产出,serendipity 命中率归零。
Serendipity (the prepared mind stumbling on value off the main line) plus slow thinking. Compress all exploration into instantly measurable output and the serendipity hit rate goes to zero.
接驳锚 + 检验信号
Cross-link + test signal
这接组织卷人本主线:把人从执行里腾出来,是为了让人回到"什么值得"——在偏认知的这一端,它的形态是"保护无用"。检验信号:散木留存度(不在 KPI 上的探索占比)与意外收获率(serendipity 命中)。(探索账:留存度阈值无普适值,需各组织自定基线后跟踪,未作已证现实。)
This links to the organization volume's human through-line: freeing people from execution so they can return to "what is worth it"; on the cognition-facing end, its form is "protect the useless." Test signals: useless-tree retention (the share of exploration not on any KPI) and serendipity hit rate. (Exploration ledger: no universal retention threshold exists; each organization must set its own baseline and track it; not asserted as established fact.)
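The two test signals can be tracked with very little machinery. The sketch below is one possible reading, not a definition from this volume. It measures retention in hours rather than in project count, and it treats the serendipity hit rate as the share of off-KPI explorations that produced unplanned value. Both choices, and all record fields and sample data, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exploration:
    name: str
    hours: float
    on_kpi: bool           # aligned to any current KPI?
    unplanned_value: bool  # yielded value nobody had planned for?


def useless_tree_retention(log: list[Exploration]) -> float:
    """Share of exploration hours that sit on no KPI."""
    total = sum(e.hours for e in log)
    off_kpi = sum(e.hours for e in log if not e.on_kpi)
    return off_kpi / total if total else 0.0


def serendipity_hit_rate(log: list[Exploration]) -> float:
    """Share of off-KPI explorations that produced unplanned value (assumed definition)."""
    off_kpi = [e for e in log if not e.on_kpi]
    return sum(e.unplanned_value for e in off_kpi) / len(off_kpi) if off_kpi else 0.0


# Hypothetical quarter, for illustration only.
quarter = [
    Exploration("pricing experiment", 40, on_kpi=True, unplanned_value=False),
    Exploration("odd customer visit", 10, on_kpi=False, unplanned_value=True),
    Exploration("side prototype", 30, on_kpi=False, unplanned_value=False),
    Exploration("onboarding funnel", 20, on_kpi=True, unplanned_value=False),
]

print(f"useless-tree retention: {useless_tree_retention(quarter):.0%}")
print(f"serendipity hit rate:   {serendipity_hit_rate(quarter):.0%}")
```

Because no universal threshold exists, the useful output is the trend against the organization's own baseline rather than any single quarter's number.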
效率悖论:AI 放大的是利用,不是探索
The efficiency paradox: AI amplifies exploitation, not exploration
March 1991's explore-exploit frame[R10] is still the floor: exploration (search / variation / risk / experiment / discovery) and exploitation (refinement / selection / execution / efficiency) compete for the same budget, and exploitation tends to win — it is predictable, measurable, fast to feed back. Every AI deployment emits a signal of "progress" (faster, cheaper, more output), and those signals land almost entirely on the exploitation side. A convergent observation calls this the efficiency paradox: "polishing a better steam engine while the world turns to electricity." The mechanism is decisive: freed capacity does not automatically become slack — capacity saved by technology is typically reallocated to do more of the same (more volume), not something different; and "what gets measured gets managed; what cannot be measured gets cut" — slack, being unmeasurable, is always the first thing cut.
This yields a clean dividing line (Of Termites & Tokens): replacing people with tokens = exploitation; augmenting people with tokens = exploration. The former's story is clean, CFO-friendly (how many headcount saved); the latter's is vague, demanding imagination (no one can pre-compute what the added agency will grow into). An organization's gravity tilts to the former by default. So "protect the useless tree" is not a moral exhortation but an engineering requirement against the system's default gravity: unless you deliberately set up independent exploration units, appraise by "learning" rather than "output," and explicitly measure and defend the time that sits on no KPI, the organization gets locked into exploitation — optimizing for an unchanging world while the world keeps changing.
"最优 ≠ 最精简"的硬证据来自进化生物学
"Optimal ≠ leanest" has hard evidence from evolutionary biology
Why is the useless tree a law and not a poem? Because it has cross-domain hard evidence. Robustness creates evolvability (Andreas Wagner, Proc. R. Soc. B[R11]): robustness produces genotype networks / neutral networks — large sets of redundant genotypes with identical phenotypes, over which a population spreads, accumulating cryptic variation, thereby reaching more new phenotypes. Redundancy is not waste; it is the reserve pool for innovation. Gene duplication plus drift (Ohno's dilemma, PNAS) is more direct: a new-function gene becomes possible only by "first duplicating redundantly, then letting the copy drift in the neutral / mildly-deleterious zone long enough" to acquire a rare beneficial mutation — the "temporarily useless" copy is the precondition for new function. The chaperone HSP90 buffers mutations, keeping an unstable system alive long enough to await a compensatory one — robustness "buys time" for innovation. Three lines of evidence point to one conclusion: compressing all redundancy into the optimum severs evolvability. This gives "optimal ≠ leanest" a second wall beyond information theory.
FIG. 4.0 探索 / 利用的资源竞争,与散木被砍的位置 · 看懂:省下的产能默认流回利用;散木在度量边界外,第一个被砍。
FIG. 4.0 The explore/exploit budget contest, and where the useless tree gets cut · Read: freed capacity defaults back to exploitation; the useless tree sits beyond the metric boundary and is cut first.
看点:三件事同时发生——利用块吸走所有"进步"信号、省下的产能默认回流利用、散木坐在度量边界外第一个被砍。保护散木=刻意在三处反向施力:独立单元、按学习考核、明确度量并守护不在 KPI 上的时间。
Takeaway: three things happen at once: the exploitation block absorbs every "progress" signal, freed capacity defaults back to exploitation, and the useless tree, sitting beyond the metric boundary, is cut first. Protecting it means deliberately applying counter-force at all three: independent units, appraisal by learning, and explicitly measuring and defending off-KPI time.
serendipity 不是运气,是可被设计的暴露面
Serendipity is not luck but a designable exposure surface
"The useless tree protects the use of the useless" sounds like a defense of waste, until you understand the mechanism of serendipity. Serendipity is not luck falling from the sky but the prepared mind stumbling on value in exploration off the main line — it needs two conditions to hold at once: a prepared mind (able to recognize that what it stumbled on has value), and an exposure surface large enough and off the main line (so that "stumbling" has a chance to happen). The danger of the AI era is precisely that the second condition is systematically compressed by efficiency: when all exploration is aligned to the currently-measurable goal, the off-main-line exposure surface is cut away, and the probability of "stumbling" goes to zero — not because the mind is unprepared but because there is no off-main-line place to stumble in. What the useless-tree reserve does is precisely to turn this exposure surface from "what luck leaves over" into "what is deliberately designed": explicitly fencing off exploration space aligned to no KPI raises the probability of serendipity from random to cultivable.
这把"保护无用"从一句道德口号变成一个可操作的设计问题:暴露面要多大、放什么样的人进去、它和主线之间留多少摩擦。约束理论(Theory of Constraints)给了一条相关的硬提醒——局部优化每一步都"无价值",真正决定系统创造价值能力的是约束,而 AI 自动化多发生在成本侧(局部优化),真正的增长来自"让新形式的人类能动性在经济上可行"。换句话说,散木保护区不是成本,是对系统级约束的投资:你砍掉的每一寸暴露面,单看都省了钱,合起来却切断了系统撞见下一个增长曲线的唯一通道。这就是为什么本卷把散木留存度列为先行指标——它度量的不是"浪费了多少",是"还剩多少撞见新物种的可能"。
This turns "protect the useless" from a moral slogan into an operable design question: how large the exposure surface should be, what kind of people go into it, how much friction to keep between it and the main line. The Theory of Constraints gives a related hard reminder: optimizing each step locally adds no value by itself; what truly governs a system's capacity to create value is the constraint, and AI automation mostly happens on the cost side (local optimization), while real growth comes from "making new forms of human agency economically viable." In other words, the useless-tree reserve is not a cost but an investment in a system-level constraint: every inch of exposure surface you cut saves money in isolation, but together the cuts sever the system's only channel to stumble on its next growth curve. This is why the volume lists useless-tree retention as a leading indicator: it measures not "how much was wasted" but "how much possibility of stumbling on a new species remains."
Ohno 困境的组织版:副本要先没用够久,才可能有用
Ohno's dilemma at the org level: the copy must be useless long enough first
In evolutionary biology, Ohno's dilemma says: a gene that can acquire a new function almost always first passes through a useless period of "redundant duplication plus drifting in the neutral zone long enough" — the copy must exist first, and not be culled immediately, before it can, over long drift, hit the rare beneficial mutation. Carry this mechanism to organizations and you get a counter-intuitive but rigorous corollary: an exploration direction that is ultimately valuable almost always first passes through a "looks-useless" period, and the length of that period often exceeds the patience of any quarterly KPI. If the organization's rule is "exploration must prove its use as fast as possible or be cut," it has effectively decreed that every copy must die before it can drift into a beneficial mutation — systematically killing the sole source of innovation.
This gives the useless-tree reserve a design principle harder than "leave some room": the reserve's time scale must exceed the drift period, or it is reserve in name only. A useless-tree plot protected for only one quarter effectively decrees that all exploration must become useful within a quarter — what it protects is not the use of the useless but "fast usefulness," still exploitation in essence, merely renamed. A real useless-tree reserve must be appraised by "learning" rather than "output," must tolerate a period with no reportable result, and must have someone whose job is to hold the judgment "it is not yet time to cut it." This is exactly why the volume elevates the useless tree from a metaphor to a law — it is not urging you to romantically tolerate waste but telling you: pressing all redundancy into the optimum on the efficiency books is, mathematically and biologically, severing evolvability, and that cost is irreversible. By the time you find you need the direction you cut, it is no longer there.
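A toy calculation shows why the length of protection matters so much. It assumes, purely for illustration, that a protected direction has a small, constant and independent chance each quarter of finally proving useful. Neither that model nor the 5% figure comes from this volume or from the biology it cites.

```python
def chance_of_payoff_before_cut(p_per_quarter: float, protected_quarters: int) -> float:
    """Probability a direction proves useful at least once before the reserve expires."""
    if not 0.0 <= p_per_quarter <= 1.0:
        raise ValueError("p_per_quarter must be a probability")
    if protected_quarters < 0:
        raise ValueError("protected_quarters must be non-negative")
    return 1.0 - (1.0 - p_per_quarter) ** protected_quarters


P_PER_QUARTER = 0.05  # hypothetical: a 'rare beneficial mutation' per quarter

for quarters in (1, 4, 12, 20):
    chance = chance_of_payoff_before_cut(P_PER_QUARTER, quarters)
    print(f"protected for {quarters:>2} quarters: {chance:.1%} chance of payoff before the cut")
```

Under this assumed model, a reserve protected for a single quarter kills nearly every direction before it can pay off. The curve only becomes meaningful once protection spans many quarters, which is the sense in which a one-quarter reserve is a reserve in name only.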
为什么效率故事总赢:它干净,增长故事模糊
Why the efficiency story always wins: it is clean, the growth story is vague
The useless tree is cut not because decision-makers are foolish but because of an asymmetry in the tellability of two stories. The efficiency story is clean and CFO-friendly: replace people with tokens, so many headcount saved, so much cost down — every figure writes into the quarterly report, a story with definite numbers. The growth story is vague and demands imagination: augment people with tokens, and what the added agency grows into no one can pre-compute — a story with no definite numbers, only possibility. In any setting that requires reporting upward or justifying a budget, the clean story naturally overpowers the vague one. So even when a decision-maker rationally knows the long-term value of the useless tree, at each concrete decision point they are still pushed by "which story is easier to tell" to cut it — not a cognitive error but the gravity of narrative structure.
This gives "protect the useless tree" a countermeasure more effective than "reasoning": do not try to win the clean efficiency story with the vague growth story at every decision point — that is a structurally unwinnable fight. The right move is to move the useless tree off the track that requires case-by-case justification: fix it by institution (a metrics-exempt reserve written into the rules) so it need not re-prove its usefulness every quarter. This is exactly why SHEET 11 keeps stressing that the reserve's "boundary must be hard" — the function of a hard boundary is not physical isolation but turning the useless tree from "having to win a narrative duel every time" into "present by default unless there is a very strong reason to touch it." Moving the battlefield from "case-by-case justification" to "one-time institution-building" is the only way the useless tree survives long-term under the gravity of efficiency.
INV
05
FORK · 系统化分叉刻度
THE FORK
命根 · 双账本
Spine · Two ledgers
价值感知能被系统化吗——能的部分给练法,不能的给栖息地
Can value perception be systematized — teach the teachable, build a habitat for the rest
Load-bearing claim (the spine; the fork itself gets its own sheet): the externalizable part of value perception can be taught; the constitutive core can only have its emergence conditions cultivated — force the latter into a system and you manufacture the average by hand. The stance taken here: both tracks, with ecology as the floor and training as the teachable local part. This sheet marks the tension honestly; it does not pretend the matter is settled.
This is the ruling on the top open question. Both extremes are wrong: a pure "training manual" pretends a personal trait can be copied; a pure "ecology guide" abandons the part that is plainly teachable. The right stance treats the fork itself as the map — first use SHEET 01's "externalizability gradient" to judge which branch the piece in hand falls on, then decide between a drill and a habitat.
可系统化支 · 训练手册(局部)
Systematizable branch · training manual (local)
押注复盘:每个押注事后记账——押中/押错、理由、信号来源
Bet retrospectives: book every bet after the fact — hit/miss, reasoning, signal source
真实需求田野:去真实处境里验"待办任务",不在会议室里想象需求
Real-need fieldwork: verify the "job" in the real situation, not imagine needs in a meeting room
Affordable-loss trials: bet only what you can afford to lose (effectuation), making trial-and-error a routine you can sustain
反"看似可行"的证伪训练:默认对每个候选问"它为假的条件是什么"
Anti-"looks-feasible" falsification drills: by default ask each candidate "what would make this false"
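The bet-retrospective drill above can be sketched as a small ledger. This is a minimal illustration, not the volume's own instrument: the `Bet` record, its field names, the sample bets, and the per-source hit rate are all assumptions introduced here to show what "book every bet after the fact" might look like as a bookkeepable trace.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bet:
    """One booked bet. All field names are illustrative, not from the source."""
    name: str
    hit: bool            # hit / miss, judged after the fact
    reasoning: str       # why the bet was placed
    signal_source: str   # where the signal came from


def hit_rate_by_source(ledger: list[Bet]) -> dict[str, float]:
    """Share of hits per signal source, so sources can be compared over time."""
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for bet in ledger:
        totals[bet.signal_source] += 1
        hits[bet.signal_source] += int(bet.hit)
    return {source: hits[source] / totals[source] for source in totals}


# Hypothetical sample entries, invented for the example.
ledger = [
    Bet("offline mode", True, "seen in three field visits", "fieldwork"),
    Bet("voice input", False, "raised in a planning meeting", "meeting room"),
    Bet("bulk export", True, "users built a workaround", "fieldwork"),
    Bet("dark theme", False, "competitor shipped it", "meeting room"),
]
rates = hit_rate_by_source(ledger)
```

Grouping by signal source is one possible cut; the same ledger could equally be grouped by reasoning type. With only a handful of bets the rates are anecdotes, not statistics.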
不可系统化支 · 生态设计指南(底)
Non-systematizable branch · ecology design guide (the floor)
留白:不被即时产出填满的时间,是反共识价值的孵化器
Slack: time not filled by immediate output is the incubator of anti-consensus value
容错:错误成本低到敢押反共识方向,否则只剩安全的平均
Tolerance for error: error cost low enough to dare anti-consensus bets, else only the safe average remains
散木保护区:明确划出不对齐 KPI 的探索地带(接 SHEET 04)
Useless-tree reserve: explicitly fence off an exploration zone not aligned to any KPI (see SHEET 04)
多样性 · 慢通道:抵抗收敛到单一最优,给慢的过程一条不被砍的通道
Diversity · a slow lane: resist convergence to a single optimum; give slow processes a lane that does not get cut
为假的条件 · 命题可证伪
Falsification condition · the claim is falsifiable
本卷核心命题为假的条件:若能证明"异质、构成性的价值感知可被无损系统化 / 学习"(即一套训练或一个模型能让任意个体习得只对另一个体成立的反共识价值,且不退化为平均),则本卷倒,全卷应改写。这正是它是命题而非口号的原因。(前沿悬案,见 SHEET 06 与最后一层动态三分;走探索账。)
The condition under which this volume's core claim is false: if it can be shown that "heterogeneous, constitutive value perception can be losslessly systematized / learned" (that is, a drill or a model lets any individual acquire anti-consensus value that holds only for another individual, without degrading to the average), then this volume falls and should be rewritten. That is exactly why it is a claim and not a slogan. (A frontier open question; see SHEET 06 and the closing dynamic trichotomy; on the exploration ledger.)
为什么"可学的恰恰是同质化":RLCF 的双刃
Why "what's learnable is exactly the homogenization": the double edge of RLCF
分叉不是抽象的姿态选择,它有一个尖锐的实证支点。RLCF(从社群反馈中强化学习)证明"科学品味"局部可学——把社群偏好外化成 reward,模型能学会逼近共识口味。但这恰恰暴露了分叉的危险:它能学到的,正是"偏离当前社群平均"会被惩罚的那个信号。RLCF 学的是"predict taste without having taste"——预测口味,而非拥有口味。于是用它去系统化价值感知,系统化的恰恰是同质化:它把判断拉向社群当下的共识,而前沿价值按定义就是偏离当下共识的那部分。这就是 SHEET 06 列为关键实验的那个前沿悬案——RLCF 能不能学到"偏离当前社群平均"的前沿价值?若不能(当前证据倾向于不能),则它系统化的恰是要被守护的对立面。
The fork is not an abstract choice of stance; it has a sharp empirical pivot. RLCF (reinforcement learning from community feedback) shows "scientific taste" is locally learnable — externalize community preference into a reward and the model learns to converge on consensus taste. But that is exactly what exposes the fork's danger: what it can learn is precisely the signal that "departing from the current community average" gets penalized. RLCF learns to "predict taste without having taste." Using it to systematize value perception therefore systematizes the homogenization: it pulls judgment toward the community's current consensus, while frontier value is by definition the part that departs from current consensus. This is the frontier open question SHEET 06 lists as the decisive experiment — can RLCF learn the frontier value that "departs from the current community average"? If it cannot (current evidence leans toward cannot), then what it systematizes is precisely the opposite of what must be protected.
The theoretical wall is harder still: aligning a single model to heterogeneous preferences faces an impossibility theorem (the MaxMin-RLHF line; RLHF under standard aggregation ≈ Condorcet-style majority voting, arXiv:2506.12350, Grade III). Compress plural, mutually conflicting human values into one reward and mathematics dooms you to either sacrifice the minority or degrade to the average — cognate with Arrow's impossibility theorem[R4] in social choice. The implication for this volume is structural: externalizable, aggregatable preference signals (the consensus stretch) can be trained, but "heterogeneous value that holds for a given individual or group" cannot be losslessly absorbed by a single system. So the two tracks are not a compromise but a consequence forced by a theorem: hand the consensus stretch to the training manual, the heterogeneous stretch to the ecology guide — the latter does not try to "learn" heterogeneous value; it cultivates a habitat where different values each survive without being flattened to the mean.
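The aggregation problem behind this paragraph can be seen in miniature with the classic Condorcet paradox. The sketch below is not the theorem of the cited paper and makes no claim about any RLHF implementation; it only shows, with three hypothetical groups and three hypothetical options, that pairwise majority voting over conflicting rankings can yield a cycle, so no single ranking (and hence no single scalar reward) represents all three groups.

```python
from itertools import combinations

# Three equally sized groups with conflicting rankings (best first).
# The options and rankings are hypothetical, chosen to form a classic cycle.
rankings = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_winner(x: str, y: str, ballots: list[list[str]]) -> str:
    """Return whichever of x, y is ranked higher by more ballots."""
    x_votes = sum(b.index(x) < b.index(y) for b in ballots)
    return x if x_votes > len(ballots) - x_votes else y


pairwise = {
    (x, y): majority_winner(x, y, rankings)
    for x, y in combinations("ABC", 2)
}
# Majorities prefer A over B and B over C, yet C over A: a cycle.
```

Any single ordering of A, B, C contradicts at least one of these majority verdicts, which is the small-scale version of "either sacrifice the minority or degrade to the average."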
"Both tracks" is easily misread as not daring to take a side, splitting the difference. It is not. Its precise meaning: the two branches of the fork handle different parts of value perception, not two answers to the same question. For the externalizable consensus stretch, the evidence (RLCF) says it is learnable, so honestly give drills and fold it into ① abundance — no need to pretend it is mysterious; for the inexternalizable anti-consensus stretch, the evidence (IndieValueCatalog's 55–65%[R13], the impossibility theorem) says it cannot be learned, so honestly give a habitat and sink it into ④ the bedrock — no need to pretend it is teachable. Fence-sitting is "both sides are a little right"; "both tracks" is "first cut along the externalizability gradient, then handle each by its own nature." The criterion is sharp, not fuzzy: can the piece in hand be written as a bookkeepable trace? If yes, into the training manual; if no, into the habitat (SHEET 10 and 11 deliver the two branches respectively).
The weighting "ecology as the floor" is not arbitrary either; it has an asymmetric reason. Bet the weight the other way — training manual as the floor, ecology as supplement — and the day you misjudge the boundary (taking a piece that is actually constitutive and trying to train it), the cost is irreversible: you systematically manufacture the average, and because the output "looks like innovating" (innovation theatre, SHEET 08) the error is hard to self-detect. The other way, with ecology as the floor, the worst case is merely "kept a bit more space that could in fact have been trained" — a cost that is reversible and bearable (see affordable loss). When the boundary is uncertain, betting the weight toward the side you can retreat from if wrong is itself a demonstration of this volume's value judgment: not gambling on which side is right but controlling the downside of being wrong.
边界不是固定的:自动化前线在右移,但右端有底
The boundary is not fixed: the automation front moves right, but the right end has a floor
The fork is easily read as a fixed line; in fact it is dynamic. On the externalizability gradient, the automation front moves right continuously with model capability — some preference signals that today need human judgment may tomorrow be externalized into a trainable reward. So the training-manual branch expands over time, folding more and more judgment once "kept for the human" into ① abundance. This is true; the volume does not deny it. But the rightward move has a floor: the constitutive value anchor at the far-right end of the gradient is doubly walled by information theory (the generation-verification asymmetry) and an impossibility theorem (aligning a single model to heterogeneous preferences). These two walls are not "current models cannot yet do it" but structural — they do not retreat as capability rises. So the correct reading of the fork is: the line moves, but the movement has an endpoint; the small region right of that endpoint is structurally kept for the human.
This rescues the volume from two common wrong stances. One is the techno-optimist's "once models are strong enough, value judgment will be learned too": half right (the externalizable stretch will be) and half wrong (the constitutive stretch has structural walls). The other is the humanist-pessimist's "AI will replace all human judgment," which mistakes the dynamic rightward move for total defeat and ignores the floor at the right end. The volume's stance lies between the two, but it is not a compromise: it states precisely "which stretch will be automated, which will not, and why." This is also why SHEET 13 lists "whether anti-consensus frontier value is learnable" as a frontier open question: it is the decisive experiment on whether this floor can be breached. Until it is breached, the fork holds; if it is breached, the volume falls. The volume honestly ties its fate to a falsifiable experiment, not to a belief.
INV
06
EMERGENCE · 涌现识别刻度
EMERGENCE
前沿 · 接 γ 机制
Frontier · to the γ mechanism
从生产创新,翻转为事后认出新物种
From producing innovation to recognizing a new species after the fact
Load-bearing claim (compass mark five · the closest point in the whole series to the γ mechanism): innovation's ultimate form may no longer be the innovation of an individual or team but the emergence of human-machine co-evolution — no one can design it in advance, only recognize it after the fact. This flips the methodology from "producing innovation" into emergence literacy: training not "produce innovation" but "in the chaos that has already happened, recognize which is a new species and which is worth amplifying."
Emergence in complex systems: the whole exhibits properties no part can design in advance; they can only be recognized after the fact, never pre-orchestrated. This is the endpoint of the leverage-point climb in this volume — innovation's leverage moves from "ideas → combinations → judging direction" all the way to recognizing emergence, isomorphic with the genealogy volume's "leverage climbs floor by floor." It connects to the project's existing γ mechanism (the emergence of a new species): γ is not designed but recognized and amplified. At this mark the form of value perception changes — not "which direction to bet on" but "in this thing that has already happened, which is the new species worth amplifying."
检验信号 · 探索账
Test signal · exploration ledger
事后识别延迟(从涌现发生到被认出 / 放大的时滞)与放大命中率:延迟越短、命中越准,组织的涌现识别力越强。前沿悬案:能否系统化训练"认出反共识新物种"的能力,是 SHEET 05 分叉的关键未决项。这里只给先行指标与适用边界,不写成已证现实;γ 涌现本身是 Ⅲ 级理论推演,不作规划依据。
Recognition latency (the lag from when emergence happens to when it is recognized / amplified) and amplification hit rate: the shorter the latency and the sharper the hit, the stronger the organization's emergence literacy. Frontier open question: whether the capacity to "recognize the anti-consensus new species" can be systematically trained is the decisive unresolved item of the SHEET 05 fork. Only a leading indicator and an applicability boundary are given here, not asserted as established fact; γ emergence itself is a Grade III theoretical extrapolation, not a basis for planning.
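The two leading indicators can be made concrete with a small calculation. What follows is a sketch under stated assumptions: the `Emergence` record, its fields, the use of the median, and the sample dates are all invented for illustration; the source defines the indicators only in words.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Emergence:
    """One observed anomaly. Field names are illustrative assumptions."""
    first_seen: date             # when the anomalous usage first appeared
    recognized: Optional[date]   # when someone recognized it, if ever
    amplified: bool              # whether it was chosen for amplification
    proved_valuable: bool        # judged later, after the fact


def recognition_latency_days(events: list[Emergence]) -> Optional[float]:
    """Median days from appearance to recognition, over recognized events."""
    lags = [
        (e.recognized - e.first_seen).days
        for e in events
        if e.recognized is not None
    ]
    return median(lags) if lags else None


def amplification_hit_rate(events: list[Emergence]) -> Optional[float]:
    """Share of amplified events that later proved valuable."""
    amplified = [e for e in events if e.amplified]
    if not amplified:
        return None
    return sum(e.proved_valuable for e in amplified) / len(amplified)


# Hypothetical sample events.
events = [
    Emergence(date(2024, 1, 1), date(2024, 1, 11), True, True),
    Emergence(date(2024, 2, 1), date(2024, 3, 2), True, False),
    Emergence(date(2024, 3, 1), None, False, False),
]
latency = recognition_latency_days(events)
hit_rate = amplification_hit_rate(events)
```

Note the blind spot in this sketch: events that were never recognized drop out of the latency figure entirely, so a short median latency can coexist with many missed windows. Counting the unrecognized events separately would be a reasonable companion number.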
为什么"识别"而非"生产":legibility 问题逼出的角色
Why "recognize" not "produce": the role forced by the legibility problem
Why is the endpoint "recognize" rather than "produce more"? Because once generation is pushed to the limit, the next principal contradiction is not too little output but output racing past what humans can digest (the legibility problem). In the extrapolation of autonomous research the bottleneck-migration sequence is clear: typing → local debugging → experiment scaffolding → result summarization cheapen in turn, and then review / judgment / compute allocation / governance become scarce; meanwhile AI output grows ever less legible to humans, requiring a dedicated "explanation layer / translation layer" before anyone can see what happened. In that situation "producing an innovation" stopped being scarce long ago; what is scarce is "in a large, half-illegible mass of output that has already happened, recognizing which is the genuine new species." That is emergence literacy — not a new capacity to produce but a new capacity to read.
This also corrects a common misreading: emergence literacy is not passive waiting, and it is not hindsight. It is active engineering that leaves interfaces for emergence: letting different parts of the system combine unexpectedly, making the results of edge exploration visible, giving "seemingly irrelevant" outcomes a channel to be noticed, and keeping the latency of amplification decisions short. Here this volume merges with the project's existing γ mechanism (the emergence of a new species): γ is never designed; what the methodology can do is turn "recognizing γ and amplifying it fast" from a matter of luck into a matter of institution, which is exactly why SHEET 06 offers the two leading indicators of recognition latency and amplification hit rate.
FIG. 6.0 · 从设计创新,到为涌现留接口、事后识别
FIG. 6.0 · From designing innovation to leaving interfaces and recognizing after the fact
看懂:三层各自的人类角色不同;越往右,越是"读"而非"造"。
Read: three layers, each with a different human role; rightward, it becomes reading not making.
看点:这不是说"人不再造东西",而是创新的杠杆点上移了一层:当生产被推到极限,真正稀缺的不是再多产一个,是认出已经长出来的那个值得放大的。这是全系列最靠近 γ 的位置,也最该诚实标 Ⅲ 级:它是推演,不是已证现实。
Takeaway: this is not "humans stop making things"; the leverage point of innovation has climbed a layer. Once production is pushed to the limit, the truly scarce act is not producing one more but recognizing the one already grown that is worth amplifying. This is the closest point in the series to γ, and the one that most needs an honest Grade III mark: it is extrapolation, not established fact.
为什么"人机共同进化"不是科幻修辞
Why "human-machine co-evolution" is not science-fiction rhetoric
"The emergence of human-machine co-evolution" sounds like grand narrative, but it has a plain mechanism. Co-evolution means: a person changes how a tool is used, the tool changes the boundary of the person's capability, and the changed person finds new uses for the tool — a feedback loop whose output no side can design in advance. Its embryonic form is already observable today: how a person works with an agent reshapes the problems they can conceive, and the new problems they can conceive reshape how they work with the agent. After the loop runs a few rounds, the way of working that grows out of it was neither preset by the tool's designer nor planned by the user at the start — it is the emergent product of the loop. This is exactly why "recognize" replaces "produce": in a co-evolving loop there is no position for a "designer," only positions for "participant" and "recognizer."
这给方法论一个具体的转向:不再问"我要设计出什么创新",而问"我和我的工具的回路,正在长出什么我没设计的东西,其中哪个值得放大"。这个转向把人的角色从回路的外部设计者挪到回路的内部识别者——人仍然不可替代,但不可替代的方式变了:不是因为人能造出 AI 造不出的东西,而是因为人能认出回路里值得放大的东西、并为放大它负责(接 SHEET 07.5)。诚实标注:这一整套是 Ⅲ 级推演,γ 涌现本身没有一手实证,回路机制是合理的类比而非测量过的规律。它在本卷的位置是"最值得继续追的前沿",不是"已经站住的地基"——这也是为什么它和它的两个先行指标全部记在探索账上(SHEET 13)。
This gives the methodology a concrete turn: no longer "what innovation should I design" but "what is the loop of me and my tools growing that I did not design, and which of it is worth amplifying." The turn moves the human role from the loop's external designer to its internal recognizer — the human is still irreplaceable, but the way of being irreplaceable has changed: not because the human can make what AI cannot, but because the human can recognize what in the loop is worth amplifying, and bear responsibility for amplifying it (see SHEET 07.5). Stated honestly: this whole construction is Grade III extrapolation; γ emergence has no first-hand empirics, and the loop mechanism is a reasonable analogy, not a measured regularity. Its place in this volume is "the frontier most worth pursuing," not "foundation already standing" — which is why it and its two leading indicators all sit on the exploration ledger (SHEET 13).
解释层:当产出快过人能消化,翻译成了瓶颈
The explanation layer: when output races past digestion, translation becomes the bottleneck
涌现识别有一个常被忽略的前置条件:你得看得懂已经发生的东西,才谈得上识别哪个是新物种。而 legibility 问题恰恰让这件事变难——当 AI 的产出快过人能消化的速度,且越来越多以人不易读的形式存在(密集的中间状态、非线性的推理链、跨多个系统的涌现行为),"识别"之前还隔着一道"读懂"。这就是为什么瓶颈迁移序列的末端不只是判断,还有一个新角色:解释层 / 翻译层——把 AI 的产出翻译成人能审视、能判断的形式。没有这层,涌现识别在结构上就不可能:你不可能识别一个你根本读不懂的新物种。
Emergence literacy has an often-overlooked precondition: you must be able to read what has already happened before you can recognize which is a new species. And the legibility problem makes precisely this harder — when AI's output races past what humans can digest, and increasingly exists in forms hard for humans to read (dense intermediate states, non-linear reasoning chains, emergent behavior across many systems), there is a "reading" gap before the "recognizing." This is why the end of the bottleneck-migration sequence is not only judgment but a new role: the explanation layer / translation layer — translating AI's output into a form humans can scrutinize and judge. Without this layer, emergence literacy is structurally impossible: you cannot recognize a new species you cannot read at all.
这对创新方法论是个具体的转向,也是一个值得守护的人类角色。解释层不是把 AI 的输出"翻译成自然语言摘要"那么浅——那种摘要恰恰会丢掉涌现里最反直觉、最不可读、也最可能是新物种的那部分(接 SHEET 12 的保守偏置:自动摘要倾向于把异常压回均值)。真正的解释层要求一种特殊的人类能力:在半不可读的产出里,保留住那些"看起来不对劲、但说不定是新东西"的信号,而不是把它们当噪声清掉。这又回到了 emergence literacy 的本质——它是一种阅读能力,而且是一种抵抗把异常读成噪声的阅读能力。这也是为什么本卷反复说,人在涌现刻度上的不可替代,不是因为人能造,是因为人能在一团混沌里,认出那个连模型自己都会忽略的、值得放大的反常。
For the methodology this is a concrete turn, and a human role worth protecting. The explanation layer is not as shallow as "translate AI's output into a natural-language summary" — such a summary precisely drops the part of emergence that is most counter-intuitive, least legible, and most likely to be a new species (see the SHEET 12 conservative bias: auto-summary tends to press anomalies back toward the mean). A real explanation layer demands a special human capacity: in half-illegible output, to preserve the signals that "look off, but might be something new" rather than clearing them as noise. This returns to the essence of emergence literacy — it is a reading capacity, and specifically a reading capacity that resists reading the anomalous as noise. This is why the volume keeps saying that the human's irreplaceability at the emergence mark is not because the human can make, but because the human can, in a mass of chaos, recognize the worth-amplifying anomaly that even the model itself would ignore.
认出有一个窗口期:错过它,新物种就被当噪声清掉了
Recognition has a window: miss it and the new species is cleared as noise
Emergence cannot be produced, but the act of "recognizing" has a temporal structure — it cannot be done at any time later. A new species' lifeline runs roughly: it first appears as a faint signal, nearly invisible among normal requests; if efficiency does not filter it out, it grows spontaneously by small amounts; at some moment it enters a recognition window — grown visible enough, yet not yet cleared as "noise / abuse." If someone in the window recognizes it, gives it an instrument (the SHEET 06 emergence dashboard), and ratifies it into a formal form, it survives; miss the window and it is either nudged back on track or cleared as an anomaly, and the new species dies in the filter (Case 4's Copilot Chat ran exactly the "recognized within the window" line). The timeline below draws this structure — its point is not "how often to check" but that recognition is time-bound, and hesitation defaults to abandonment.
FIG. 6.5 · 涌现识别时间轴:从噪声到追认,或到被清掉
FIG. 6.5 · Emergence-recognition timeline: from noise to ratification, or to being cleared
看懂:同一股异常用法两条命运分叉,窗口内被认出则上行成新物种,窗口外被当噪声则下行被清掉。
Read: one anomalous usage forks into two fates: recognized within the window it rises into a new species; outside it, treated as noise, it falls and is cleared.
看点:这张图把"涌现不能生产、只能认出"翻译成一个可操作的时间约束。新物种从噪声地板里冒头时信号极弱,很容易被当成滥用清掉;它的命运在"认出窗口"里分叉:窗口内有人盯着异常并问"这是不是一个没被设计的真实需求",它上行成新物种;没人在窗口里认出,它下行被清掉。仪表盘(SHEET 06)的全部意义,就是让这个窗口不被错过:不是去生产涌现,是确保涌现发生时有人看得见、且看得见时还来得及追认。
Takeaway: this figure translates "emergence cannot be produced, only recognized" into an actionable temporal constraint. A new species' signal is extremely weak as it surfaces from the noise floor, easily cleared as abuse; its fate forks inside the "recognition window": within it, someone watching the anomaly asks "is this a real need I did not design for," and it rises into a new species; with no one to recognize it in the window, it falls and is cleared. The whole point of the dashboard (SHEET 06) is to keep this window from being missed: not to produce emergence but to ensure that when emergence happens someone can see it, and that once it is seen, there is still time to ratify it.
Load-bearing claim (applicability sheet · hard gate): the value compass fits situations where direction is genuinely open and the cost of failure is bearable — greenfield exploration, choosing a product direction, early bets in research or a venture. It does not fit situations where direction is locked by external hard constraints (heavy compliance, safety-critical, a single fixed goal): there what is needed is execution discipline, not value perception. Forcing the compass onto these is manufacturing noise where divergence does not belong.
绿地 / 方向开放 · 用罗盘
Greenfield / open direction · use the compass
"做什么值得做"仍是真问题,多个方向都技术可行
"What is worth doing" is still a live question; several directions are technically viable
失败成本可承受(affordable loss),允许押反共识
Failure cost is bearable (affordable loss); anti-consensus bets are allowed
价值由你(或你的群体)异质地定义,无外部唯一正确答案
Value is defined heterogeneously by you (or your group); there is no externally unique right answer
方向锁死 / 增量 · 非目标群体
Locked / incremental · not the target
强合规、安全关键、监管硬约束已锁定方向——直说非本卷目标群体
Heavy compliance, safety-critical, hard regulatory constraints already lock direction — plainly not this volume's target group
单一确定目标的纯执行场景:要的是工程纪律,去下游卷
Pure execution toward a single fixed goal: it wants engineering discipline; go to the downstream volumes
增量优化既有产物:先问"是重画还是嫁接",多数情况不需要罗盘
Incremental optimization of an existing artifact: first ask "redraw or graft"; in most cases no compass is needed
总闸 · greenfield vs transformation
Master gate · greenfield vs transformation
一句话总闸:方向开放 → 用罗盘(本卷);方向锁死 → 用施工图(下游卷)。罗盘最危险的误用,是在方向其实已被锁死的地方假装它开放,于是把执行问题伪装成价值问题、制造无谓发散。与设计卷再切一刀:设计判好不好,创新判值不值得;都不该在执行纪律的场景里发散。
The gate in one line: direction open → use the compass (this volume); direction locked → use the drawing (the downstream volumes). The compass's most dangerous misuse is pretending direction is open where it is in fact locked, thereby disguising an execution problem as a value problem and manufacturing pointless divergence. One more cut against the design volume: design judges good-or-not, innovation judges worth-it-or-not; neither should diverge in a situation that calls for execution discipline.
重画还是嫁接:用一道测试决定要不要拿出罗盘
Redraw or graft: one test for whether to take the compass out at all
"方向开放"听起来主观,其实有一道可操作的测试,借自组织卷的"重画 vs 嫁接":把你面前的问题写成一句话,然后问——要解决它,是得重新画一张图(重新定义做什么、为谁做、价值锚在哪),还是只需把 AI 嫁接到已有流程上(目标不变、只是更快更便宜)?若是后者,方向其实没开放,你要的是下游卷的施工图,把罗盘收起来;若是前者,方向真的开放,罗盘才有用武之地。这道测试挡住的是本卷最常见的滥用——在一个其实只需要执行纪律的地方,因为"AI 让一切看起来都能重做"而误以为方向开放,于是把执行问题伪装成价值问题。
"Direction is open" sounds subjective, but there is an operational test, borrowed from the organization volume's "redraw vs graft." Write the problem in front of you as one sentence, then ask — to solve it, must you draw a new diagram (redefine what to do, for whom, where the value anchor sits), or do you merely need to graft AI onto an existing process (goal unchanged, just faster and cheaper)? If the latter, direction is not actually open; you want a downstream drawing, so put the compass away. If the former, direction is genuinely open and the compass has work to do. The test blocks this volume's most common abuse — in a place that actually needs only execution discipline, mistaking "AI makes everything look redoable" for "direction is open," and thereby disguising an execution problem as a value problem.
The second boundary is bearable failure cost. Even if direction is genuinely open, if the cost of a single failure is irreversible and would land on an innocent third party (see SHEET 07.5 and INSTRUMENT 08's reversibility / consequence-attribution axes), that is not a "free-divergence" scenario either — it wants a caution closer to safety engineering than the explorer's stance of a value compass. So the applicability boundary is really two gates in series: direction open ∧ failure cost bearable. Pass both, then take the compass. This is also why the volume keeps stressing affordable loss — it is not merely a mindset but a pillar of the applicability boundary itself: pressing a single failure into the bearable range is what turns a situation that is otherwise "too dangerous to diverge" back into one you "can explore with the compass."
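The two gates in series can be written down as a small decision function. This is a sketch, not an instrument from the volume: the `Situation` fields, the `Tool` labels, and the function name are hypothetical, and the second gate encodes only the condition stated above (a single failure that is irreversible and lands on a third party).

```python
from dataclasses import dataclass
from enum import Enum


class Tool(Enum):
    COMPASS = "value compass (this volume)"
    DRAWING = "execution drawing (downstream volumes)"
    SAFETY = "safety-engineering caution"


@dataclass(frozen=True)
class Situation:
    """Answers to the gate questions. Field names are illustrative."""
    needs_redraw: bool               # redraw (True) vs graft AI on (False)
    failure_reversible: bool         # can a single failure be undone?
    cost_lands_on_third_party: bool  # does the cost spill onto outsiders?


def choose_tool(s: Situation) -> Tool:
    """Apply gate 1 (direction open), then gate 2 (failure cost bearable)."""
    if not s.needs_redraw:
        # Goal unchanged, only faster and cheaper: direction is not open.
        return Tool.DRAWING
    if not s.failure_reversible and s.cost_lands_on_third_party:
        # Direction is open, but one failure is unbearable.
        return Tool.SAFETY
    return Tool.COMPASS


greenfield = Situation(needs_redraw=True, failure_reversible=True,
                       cost_lands_on_third_party=False)
choice = choose_tool(greenfield)
```

The order of the checks mirrors the text: the redraw-vs-graft test comes first, and the failure-cost gate is consulted only once direction is known to be open. Real situations will rarely reduce to three booleans; the function is meant to make the conjunction explicit, not to replace the judgment behind each answer.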
"AI 让一切可重做"是适用边界最常见的幻觉
"AI makes everything redoable" is the most common illusion at the boundary
适用边界最容易被一句话冲垮:"反正 AI 让一切都能重做,那一切方向都开放了,都该用罗盘。"这句话听起来顺,但混淆了两件根本不同的事:技术上能重做,和方向上值得重新选。AI 确实让"重做"的技术成本骤降,但"方向开放"问的不是能不能重做,是"做什么"这个问题本身是否仍有真正的选择空间。一个被强合规锁死的领域,就算 AI 让你能一夜重写整个系统,你的方向仍然不开放——你能改的是怎么做,不是做什么。把"技术可重做"误当"方向开放",就是"AI 赋能"冒充"AI 原生"的那个经典错误在创新面上的形态:工具变了,问题的类别没变,却假装它变了。
The applicability boundary is most easily washed away by one sentence: "since AI makes everything redoable, every direction is open, so use the compass everywhere." It sounds smooth but conflates two fundamentally different things: technically able to redo and worth re-choosing the direction. AI does crash the technical cost of "redoing," but "direction is open" asks not whether you can redo but whether the question "what to do" still has real room for choice. A field locked by heavy compliance stays direction-closed even if AI lets you rewrite the whole system overnight — what you can change is how, not what. Mistaking "technically redoable" for "direction is open" is the innovation-surface form of the classic error of "AI-enabled" masquerading as "AI-native": the tool changed, the category of the problem did not, yet it pretends it did.
四类明确不适用的处境:把它说死,比含糊更诚实
Four situations where it explicitly does not apply: saying so flatly is more honest than hedging
The two gates "direction is open ∧ failure cost is bearable" conversely fence off four kinds of situation where the compass explicitly should not appear. Naming them one by one does not leave the methodology an escape hatch; a good tool's honesty shows first in its nerve to say "this is not mine to govern."
First, execution problems whose direction is locked: compliance filing, tax calculation, implementing a signed-off spec. There is no "is it worth it" question here, only a "did you do it right" question, which calls for the downstream volumes' drawing and verification discipline, not value-divergence. Taking out the compass here is brainstorming over a question that has one correct answer.
Second, high-stakes decisions where failure is irreversible and the cost spills outward: drug dosing, bridge load-bearing, the safety margins of a braking system. Even if the "direction" technically appears to have several implementations, the cost of one failure lands on innocent third parties and cannot be withdrawn. Such problems want a caution close to safety engineering (narrower acceptable bands, more redundancy and review), not the exploratory stance of "bet many, bet fast, undo if wrong." The precondition of affordable loss (the loss is bearable) simply does not hold here.
Third, situations where value is uniquely fixed by an external hard constraint: heavily-regulated financial disclosure, the statutory elements of medical informed consent, accessibility standards that must be met. Here "what is valuable" is not an open question but is pinned in advance by law, ethics, or safety norms; the space for judgment is legitimately compressed to near zero, and the correct stance is to treat the constraint as a non-negotiable boundary condition, not as "a direction awaiting divergence."
Fourth, pure preference aggregation with no constitutive heterogeneity: when a choice really is only "which one do most people like" and there is no "anti-consensus value that holds only for one group" (say, which home-style dishes the canteen serves next week), a vote or simple tally suffices, and wheeling out the three-axis compass is using a sledgehammer to crack a nut, adding only ritual cost (itself a form of the SHEET 08 innovation theatre).
The four share a clear common thread: either direction is not open, or failure is not bearable, or value is externally fixed, or there is simply no tacit value that needs "perceiving."
举一个本卷明确不适用的真例,把这道边界钉到具体处境上:一家做航空电子飞控软件(受 DO-178C 适航认证约束)的团队,想"用 AI 原生的创新方法论加速我们的开发"。诚实的回答是:不适用,且强行套用会制造真实危害。逐轴看,它四道门全撞:方向不开放——飞控的功能与安全需求由适航标准与系统设计预先确定,不存在"值不值得做这个功能"的发散空间;失败不可逆且代价外溢到无辜第三方(机上乘客)——这是 affordable-loss 的反面,单次失败无法用"错了就退"兜底;价值被外部强约束唯一确定——DO-178C 的每条目标都是不可协商的边界条件;最后,这里要的恰恰是与价值罗盘相反的姿态:更窄的可接受区间、可追溯到每行代码的需求、穷尽式的验证覆盖。对这家团队,本卷能给的唯一诚实建议是"这不是你的工具"——AI 在这里的正当用法是下游卷的范畴(在锁死的规格内做可验证的执行加速),而不是本卷的价值发散。一个连自己不适用谁都说不清的方法论,比这个边界本身更危险。
Take one real case where this volume explicitly does not apply, to nail the boundary onto a concrete situation: a team building avionics flight-control software (under DO-178C airworthiness certification) wants to "use the AI-native innovation methodology to accelerate our development." The honest answer is: it does not apply, and forcing it on would manufacture real harm. Axis by axis, it hits all four gates: direction is not open — flight-control function and safety requirements are fixed in advance by airworthiness standards and system design, with no "is this feature worth doing" divergence space; failure is irreversible with cost spilling to innocent third parties (passengers aboard) — the opposite of affordable loss, where one failure cannot be backstopped by "undo if wrong"; value is uniquely fixed by an external hard constraint — every DO-178C objective is a non-negotiable boundary condition; and finally, what is wanted here is precisely the opposite stance to the value compass: narrower acceptable bands, requirements traceable to every line of code, exhaustive verification coverage. To this team, the only honest advice this volume can give is "this is not your tool" — AI's legitimate use here belongs to the downstream volumes' domain (verifiable execution acceleration inside a locked spec), not this volume's value-divergence. A methodology that cannot even state whom it does not fit is more dangerous than the boundary itself.
所以适用边界其实是在保护罗盘的信噪比,而不只是划分场景。每一次在方向其实锁死的地方拿出罗盘,都是往判断带宽里灌噪声——你会对着一个其实只有一个正确答案的问题"发散",制造一堆看似可行的伪选项,然后还要花力气把它们砍掉。这是双重浪费。正确的纪律是:先用"重画 vs 嫁接"测试 + 失败成本可承受这两道门筛一遍,只有真正双门都过的处境才动用价值罗盘;其余的,老实承认它要的是下游卷的施工图、是执行纪律,把罗盘收起来。知道何时不用一个工具,和知道何时用它,是同一种判断力的两面——这也正是本卷反复示范的"敢于不做"。
So the applicability boundary is really protecting the compass's signal-to-noise, not merely sorting scenarios. Every time you take the compass out where direction is in fact locked, you pour noise into the judgment bandwidth — you "diverge" on a question that actually has one right answer, manufacture a heap of looks-feasible pseudo-options, then spend effort cutting them. A double waste. The correct discipline: first screen with the "redraw vs graft" test plus the bearable-failure-cost gate, and bring out the value compass only for situations that genuinely pass both; for the rest, honestly admit they want a downstream drawing and execution discipline, and put the compass away. Knowing when not to use a tool and knowing when to use it are two faces of one judgment — exactly the "nerve not to do" this volume keeps demonstrating.
INV
07.5
RESPONSIBILITY · 价值与责任刻度
RESPONSIBILITY
命题 · ③↔④ 接缝
Claim · the ③↔④ seam
"值得吗"的另一半,是谁为后果买单
The other half of "is it worth it?" is who bears the consequence
Load-bearing claim (compass mark four · value judgment must come bound to responsibility): "is it worth it?" is not only "worth it for me" but also "who bears its cost." The moment value discovery outsources execution to AI, a hidden fracture appears: the person who defines value and the person who bears the consequence begin to decouple. The stance here is explicit — value judgment and responsibility attribution are two faces of one judgment node, and you cannot keep the former while outsourcing the latter. Quietly thinning the consequence until no one actually bears it is innovation's most hidden, and most falsifiable, failure.
为什么这一刻度必须独立成一张?因为内核③(上下文=对世界的深理解)与④(人回归意义)之间有一道接缝,而这道接缝正是后果承担被稀释的地方。决策归属缺口(attributability gap, Sci Eng Ethics 2024,Ⅲ)说得很准:AI 决策支持系统让人难以辨认"决策里反映的价值判断该归于谁"——技术里隐含的价值判断未必归属于使用 AI 的人。一旦价值判断悄悄从人转移到工具,"为后果买单"的人就不再是"定义了什么值得追求"的人,③ 与 ④ 脱钩。这不是抽象担忧,它有正在发生的制度形态。
Why must this mark stand as its own sheet? Because between the kernel's step ③ (context = deep understanding of the world) and step ④ (people return to meaning) there is a seam, and that seam is exactly where consequence-bearing gets diluted. The attributability gap (Sci Eng Ethics 2024, Grade III) names it precisely: AI decision-support systems make it hard to discern "to whom the value judgment reflected in a decision should be attributed" — the value judgment implicit in the technology need not attribute to the person using the AI. Once value judgment quietly migrates from human to tool, the person who pays for the consequence is no longer the person who defined what was worth pursuing, and ③ decouples from ④. This is not an abstract worry; it has an institutional form that is already happening.
责任不是被消灭的,是被摊薄到没人承担的
Responsibility is not abolished; it is thinned until no one carries it
学界主流不承认"无人负责"——议会否了 AI 电子人格,闭合责任缺口的理论也在推进(按前提性控制 + 预期收益分配,每个缺口里总至少有一个该负责的人)。真正的危险不是法律宣布无人负责,而是三条更隐蔽的稀释路径:责任外移(把后果转成可定价、可转移、可池化的成本——严格责任 + 保险,把"道德承担"工程化成"成本内部化");责任摊薄(liability overlaps:多方互相甩锅,最后所有人都逃脱,武器技术式);归属错配(moral crumple zone / 道德皱缩区:责任被推给最近的人类操作员以保护技术系统完整性,但那个人对结果几乎没有控制,于是承担变成"皱缩区表演",真正的价值定义者被保护)。三条都不"消灭"责任,它们让责任在形式上有人担、实质上无人担。
The academic mainstream does not concede "no one is responsible" — parliaments rejected AI e-personhood, and theories for closing the responsibility gap advance (allocate by antecedent control plus expected benefit; in every gap there is always at least one person who should bear it). The real danger is not a law declaring no one responsible, but three more hidden dilution paths: responsibility offloaded (turning consequence into a priceable, transferable, poolable cost — strict liability plus insurance engineer "moral bearing" into "cost internalization"); responsibility thinned (liability overlaps: many parties pass the blame until everyone escapes, the weapons-technology pattern); attribution misplaced (the moral crumple zone: responsibility is pushed onto the nearest human operator to protect the integrity of the technical system, yet that person has almost no control over the outcome, so bearing becomes "crumple-zone theatre" and the true value-definer is shielded). None of the three abolishes responsibility; they make it formally borne and substantively unborne.
The implication for this volume is structural: every reading of the value compass should come with a responsibility reading. To ask "is it worth it?" you must at the same time ask "on whom does the cost land, and does that person have commensurate control?" When a value-definer outsources execution to an agent and then offloads the consequence to insurance or thins it down a blame chain, they capture all the upside of "value discovery" while shedding the downside — the mirror image of the top claim's "people voluntarily stop defining value": people have not stopped defining value; they have stopped paying for the value they define. A value judgment that bears no responsibility is not a lighter judgment but a hollowed-out one.
FIG. 7.5 · 价值与责任映射:定义者与买单者的脱钩 · 看懂:健康态是一条对角线(谁定义谁买单);三条稀释路径把点拉离对角线。
FIG. 7.5 · Value & responsibility map: the decoupling of definer from payer · Read: the healthy state is a diagonal (definer = payer); three dilution paths pull the point off the diagonal.
看点:这张图不是道德说教,是一个判据。把任何创新放到这个平面上:若定义价值的人和承担后果的人是同一个(落在对角线上),③↔④ 接缝完好;若你能轻易把后果外移、摊薄、错配(点被拉离对角线),那这个"值得"多半是借后果稀释换来的,不是真值得。Takeaway: this is not a sermon but a criterion. Put any innovation on this plane: if the value-definer and the consequence-bearer are the same (on the diagonal), the ③↔④ seam is intact; if you can easily offload, thin, or misplace the consequence (the point is pulled off-diagonal), then that "worth it" is mostly bought by diluting the consequence, not truly worth it.
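The diagonal test can be made concrete with a small record per bet. This is a minimal sketch, not part of the pack: the class name `ResponsibilityReading`, its fields, and the returned labels are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsibilityReading:
    """Hypothetical record: who defined the value, who pays, and with what control."""
    value_definer: str
    consequence_bearer: str
    bearer_has_control: bool  # does the bearer have commensurate control over the outcome?


def seam_status(reading: ResponsibilityReading) -> str:
    """Place one reading relative to the FIG 7.5 diagonal (definer == payer)."""
    if reading.value_definer == reading.consequence_bearer:
        return "on-diagonal: definer pays"
    if not reading.bearer_has_control:
        # The cost was pushed onto someone who could not steer the outcome.
        return "off-diagonal: misplaced attribution (crumple zone)"
    return "off-diagonal: consequence moved away from the definer"


founder = ResponsibilityReading("founder", "founder", True)
operator = ResponsibilityReading("product lead", "on-call operator", False)
print(seam_status(founder))
print(seam_status(operator))
```

The function only sorts a reading into a label; deciding whether an off-diagonal bet is still acceptable stays with the person making the bet.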
Set each of the three axes to judge whether a bet is one you can afford, reverse, and answer for: affordable loss (effectuation) × reversibility (can a wrong bet be undone) × consequence attribution (on whom the cost lands). The bench synthesizes a one-line allocation diagnosis. It does not bet for you; it puts the responsibility half of "is it worth it?" on the table.
① · 损失可承受度 · Affordable loss
② · 可逆性 · Reversibility
③ · 后果归属 · Consequence attribution
分配原则 · 押注组合
Allocation principle · the portfolio of bets
把多个押注摆在一起,分配台给出的不是"押哪个",是"怎么配比":可逆 × 输得起 × 自己担的押注可以多下、快下(双向门,错了就退);不可逆 × 伤筋动骨 × 代价外移的押注必须少下、慎下,且先把后果拉回自己头上再决定。这就是 affordable-loss 组合的实操——不预测哪个会赢,而是控制每个押注的下行,让组合整体输得起。最危险的一格是"不可逆 × 代价落在他人":那不是大胆,是把自己的上行建立在别人的下行上(接 FIG 7.5 的离对角线点)。(探索账:诊断阈值为启发式,不可逆/可承受的判定须结合具体处境,非校准判据。)Put several bets together and the allocator gives not "which to bet" but "how to weight them": reversible × affordable × self-borne bets can be placed more, and fast (two-way door, undo if wrong); irreversible × ruinous × cost-offloaded bets must be placed sparingly and slowly, and only after pulling the consequence back onto yourself. This is the practice of an affordable-loss portfolio — not predicting which wins but controlling the downside of each bet so the whole portfolio is something you can afford to lose. The most dangerous cell is "irreversible × cost lands on others": that is not boldness but building your upside on someone else's downside (see the off-diagonal point in FIG 7.5). (Exploration ledger: the diagnosis thresholds are heuristic; the irreversibility/affordability verdicts must be read with the concrete situation, not as calibrated criteria.)
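The allocation principle above can be sketched as a tiny diagnosis function. This is an illustration under stated assumptions: the name `diagnose`, the `Bet` fields, and the reduction of each axis to a yes/no notch are hypothetical simplifications, and the wording for mixed readings is a conservative default chosen here, not a calibrated rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bet:
    """Hypothetical three-axis reading of one bet; each axis is a yes/no notch here."""
    name: str
    affordable: bool   # axis 1: is the loss one you can absorb?
    reversible: bool   # axis 2: can a wrong bet be undone?
    self_borne: bool   # axis 3: does the cost land on the person who defined the value?


def diagnose(bet: Bet) -> str:
    """Return a one-line pacing diagnosis; it never picks the bet for you."""
    if not bet.reversible and not bet.self_borne:
        # "Irreversible x cost lands on others" is the cell the text calls most dangerous.
        return f"{bet.name}: most dangerous cell, pull the consequence back before deciding"
    if bet.affordable and bet.reversible and bet.self_borne:
        return f"{bet.name}: two-way door, place more and place fast"
    # Assumption: any other combination defaults to the cautious pacing.
    return f"{bet.name}: mixed reading, place sparingly and slowly"


portfolio = [
    Bet("pricing experiment", affordable=True, reversible=True, self_borne=True),
    Bet("automated rejection letters", affordable=True, reversible=False, self_borne=False),
]
for bet in portfolio:
    print(diagnose(bet))
```

The output is a pacing hint per bet, which matches the stated purpose of weighting a portfolio rather than predicting a winner.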
把后果定价,是否等于让人不再负责
Does pricing the consequence amount to no one being responsible
The most realistic direction of responsibility engineering today is strict liability plus insurance: engineering "moral bearing" into "cost internalization" — the consequence of AI-caused harm is turned into a priceable, transferable, poolable cost. This path has its merits: it does make someone pay for the consequence rather than leaving it hanging. But it also hides a tension this volume must face: once the consequence is fully priced, "being responsible for the consequence" decays from a moral relation into a financial arrangement. The problem is not compensation itself but whether pricing quietly changes the value-definer's judgment — when an irreversible harm becomes "a budgetable cost," the person defining value may start treating it as another optimizable expense rather than a consequence that should or should not be created. This is the financial version of the ③↔④ decoupling: responsibility closes on the books while being offloaded, morally, into a number.
This volume's stance is not against compensation or insurance — those are advances of civilization. What it opposes is substituting pricing for judgment: mistaking "the cost can be paid" for "this cost is worth creating." These are different judgment nodes: the former asks "who pays if something goes wrong," the latter asks "should this be done at all, is its irreversible harm justified by the value it creates." INSTRUMENT 08's "consequence attribution" axis deliberately keeps this question on the human side and does not let "we already bought insurance" skip it. The attributability-gap research warns that value judgment quietly migrates from human to tool and then gets diluted down a blame chain; this volume's countermeasure is plain: in every compass reading, force "who defines value" and "who bears the consequence" onto the same row so that their decoupling is visible to the naked eye (FIG 7.5). This does not solve all the institutional difficulties of the responsibility gap, but it at least keeps the methodology from being an accomplice to the decoupling.
外部性失明:代价落在不在场的人身上
Externality-blindness: the cost lands on those not in the room
All three paths of responsibility dilution assume there is someone in the system to "pass the blame" to; there is a more hidden form where even the someone is not in the room — externality-blindness. When value judgment narrows to "worth it for me (or for my users)," the cost may land on those outside the system, or on the future: the environment, the unrepresented, the next generation. AI is an amplifier here, not the cause: it optimizes the objective function you give it, is blind to any externality not written into that function, and writes the plan ever cleaner and more credible, making "the cost not seen" vanish entirely from the appearance. This is the other face of the same abundance logic as "looks-feasible" — one makes the infeasible look feasible, the other makes the costly look costless.
This volume's countermeasure is not to have the methodology solve every externality problem — that is the work of governance and policy, beyond the boundary of one innovation methodology. What it can do is something concrete and limited: in the tools of value judgment, force "on whom the cost lands" to be an axis you cannot skip. INSTRUMENT 08's third axis is built for exactly this — it does not compute externalities for you, but it forces you, before each bet, to state explicitly where the cost is attributed, and does not let "the objective function didn't mention it" pretend it does not exist. This turns externality from an easily-forgotten blind spot into a field that must be filled in. It does not solve all the hard problems of responsibility, but it at least makes the excuse "I didn't think of it" unsayable after using the compass — and that is the share of responsibility the methodology can, and should, bear on this hard problem.
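Turning "on whom the cost lands" into a field that cannot be skipped can be sketched as a record that refuses to exist without it. The class `BetRecord` and its field names are hypothetical, introduced only to illustrate the idea; the check does not evaluate the answer, it only rejects a blank one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetRecord:
    """Hypothetical bet record whose cost-attribution field cannot be left blank."""
    name: str
    objective: str
    cost_lands_on: str  # who bears the cost, including people outside the system

    def __post_init__(self) -> None:
        # Reject empty or whitespace-only answers so the question cannot be skipped.
        if not self.cost_lands_on.strip():
            raise ValueError(f"{self.name}: state on whom the cost lands before betting")


ok = BetRecord("route optimizer", "cut delivery time", "drivers and residents on the new routes")

try:
    BetRecord("route optimizer", "cut delivery time", "  ")
except ValueError as err:
    print(err)
```

A filled-in field is not proof that the externality was understood; the sketch only removes the option of leaving it unstated.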
INV 08
TRAP · 看似可行陷阱
THE TRAP
失败模式 · 本卷最常见误用
Failure mode · how this goes wrong
"看起来可行"是充裕时代最贵的伪信号
"Looks feasible" is the abundance era's most expensive false signal
Load-bearing claim (failure-mode overview): this volume goes wrong along one trunk — mistaking "looks feasible" for signal. The model can make any direction sound coherent, so the appearance of feasibility decouples entirely from feasibility itself. The six failures below are all variants of that trunk; each comes with a leading indicator (how to spot it one step early) and a fix.
为什么"看似可行"在充裕时代特别危险,而旧时代不那么危险?旧时代,"想清楚一个方案"本身就要付出认知成本——能把方案讲圆的人,多半真想过。那份成本是一道天然过滤器:卖相和实质大致同涨。AI 把这道过滤器拆了——把方案讲圆的成本降到零,卖相可以独立于实质无限生产。于是"它讲得通"不再携带"有人真想过"的信息。受力分析:陷阱不来自 AI 说谎,来自 AI 太擅长把任何方向写得可信,而人的判断习惯还停在"讲得通≈想过了"的旧校准上。
Why is "looks feasible" especially dangerous in the abundance era and less so before? Before, "thinking a plan through" itself cost cognitive effort — anyone who could make a plan hold together had probably actually thought about it. That cost was a natural filter: appearance and substance rose together. AI dismantled the filter — the cost of making a plan sound coherent fell to zero, and appearance can now be mass-produced independently of substance. So "it hangs together" no longer carries the information "someone really thought about this." Force analysis: the trap comes not from AI lying but from AI being too good at making any direction sound credible, while human judgment is still calibrated on the old "coherent ≈ thought-through."
Old calibration:
"A coherent plan was probably thought through" — cost served as the filter. Judgment could lazily use "does it read smoothly" as a proxy for "does it hold up."
New calibration:
Coherence is supplied without limit by the generation side and decouples from soundness. The only proxy that did not depreciate is "can a falsifying condition be constructed, can reality break it," because the cost of falsification did not fall. Judgment must switch from "does it read right" to "does it survive falsification."
六种误用 · 先行指标与修法
Six ways it goes wrong · leading indicators and fixes
①
卖相当信号
Appearance as signal
先行指标:评审里说"这个写得真好 / 逻辑很顺"次数 > 说"它会在哪失败"次数。修法:每个候选先问"为假的条件",再谈优点(接 SHEET 05 证伪训练)。Leading indicator: in review, "this is well-written / the logic flows" is said more often than "where would it fail." Fix: ask each candidate "what would make this false" before its merits (see SHEET 05's falsification drill).
②
借来的确信
Borrowed conviction
先行指标:押注理由里出现"连 AI 都说可行 / 大家都在做"。修法:把确信溯源到一次亲历的现实摩擦——说不出来,就是借的(接 SHEET 03 内在确信轴)。Leading indicator: the bet's rationale contains "even AI says it's viable / everyone is doing it." Fix: trace conviction back to one lived friction with reality; if you cannot name it, it is borrowed (see SHEET 03's conviction axis).
③
想象的需求
Imagined need
先行指标:需求陈述里没有一个具体的人、在一个具体处境里、真的要把某事办成。修法:下场做一轮真实需求田野,把"我觉得有人要"换成"我见过谁在什么处境下要"(JTBD 待办任务)。Leading indicator: the need statement names no concrete person, in a concrete situation, truly needing to get something done. Fix: go run a round of real-need fieldwork; replace "I think someone wants this" with "I have seen who, in what situation, needs it" (the JTBD job).
④
效率吞冗余
Efficiency eats redundancy
先行指标:所有探索都被要求对齐当下 KPI,散木留存度趋零(接 SHEET 04)。修法:显式划一块不汇报、不对齐的保护区(下面 INSTRUMENT 07 自检)。Leading indicator: all exploration is required to align to current KPIs; useless-tree retention trends to zero (see SHEET 04). Fix: explicitly fence off a reserve that does not report and does not align (the INSTRUMENT 07 self-check below).
⑤
把异质强行系统化
Forcing heterogeneity into a system
先行指标:用一套打分 / 一个模型给"反共识方向"判分,分数总把它们压到平均线下。修法:先用可外化性梯度判它落哪支——构成性支别打分,给栖息地(接 SHEET 05 分叉)。Leading indicator: one scoring rubric / one model scores "anti-consensus directions" and always pushes them below the average line. Fix: first judge which branch it falls on by the externalizability gradient; do not score the constitutive branch, give it a habitat (see SHEET 05's fork).
⑥
在锁死处假装开放
Pretending open where it is locked
先行指标:对一个其实方向已锁死(强合规 / 安全关键 / 单一目标)的任务做发散头脑风暴。修法:过 SHEET 07 总闸——方向锁死就去用施工图,别在执行问题上制造价值发散。Leading indicator: running divergent brainstorming on a task whose direction is actually locked (heavy compliance / safety-critical / single goal). Fix: pass the SHEET 07 master gate — if direction is locked, use the drawing; do not manufacture value-divergence over an execution problem.
反指标 · 怎么知道没掉进陷阱
Counter-indicator · how to know you avoided it
做对的反指标不是"押中的多",而是砍掉的多且砍得早:放弃率(敢砍"看似可行"的比例)随评审升高,且砍的理由能落到"它为假的条件被现实击穿",而非"感觉不对"。一个健康团队的会议室里,"它会在哪失败"的发言密度应高于"它哪里好"。(探索账:作先行指标提出,需团队记账校准,未作已证现实。)The counter-indicator for doing it right is not "many hits" but many cuts, made early: the abandon rate (the share of "looks-feasible" you dared to cut) rises through review, and the reasons land on "its falsifying condition was broken by reality," not "it felt off." In a healthy team's room, the density of "where would this fail" should exceed that of "what is good about it." (Exploration ledger: offered as a leading indicator, needs team bookkeeping to calibrate; not asserted as established fact.)
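The two review-room readings mentioned above (praise versus "where would it fail" remarks, and the abandon rate) are simple tallies. A minimal bookkeeping sketch follows; the `ReviewLog` fields and function names are assumptions made for illustration, and the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewLog:
    """Hypothetical tally from one review session."""
    praise_remarks: int       # e.g. "well written", "the logic flows"
    failure_questions: int    # e.g. "where would this fail?"
    candidates_reviewed: int
    candidates_cut: int


def abandon_rate(log: ReviewLog) -> float:
    """Share of reviewed candidates that were cut."""
    if log.candidates_reviewed == 0:
        return 0.0
    return log.candidates_cut / log.candidates_reviewed


def appearance_warning(log: ReviewLog) -> bool:
    """True when praise outnumbers failure questions (the indicator for failure 1)."""
    return log.praise_remarks > log.failure_questions


session = ReviewLog(praise_remarks=9, failure_questions=4, candidates_reviewed=20, candidates_cut=3)
print(appearance_warning(session))
print(abandon_rate(session))
```

As the text notes, these are leading indicators that a team would need to log and calibrate over time; a single session's numbers say little.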
三个系统级失败:创新剧场、外部性失明、把可度量的优化到死
Three system-level failures: innovation theatre, externality-blindness, optimizing the measurable to death
The failures above are individual; scaled to the organization, the "looks-feasible" trap grows three system-level forms, each harder to self-detect. Innovation theatre: an organization gets enamoured of activities that "look like innovating" — hackathons, innovation labs, the count of AI pilots — and the output of those activities is precisely the most easily-generated "looks-feasible." The test for theatre is simple: ask "how many of these bets actually staked an affordable loss and actually went to verify a real need?" If the answer is near zero, it is theatre, not innovation. Leading indicator: the count of innovation activities climbs while the hit rate (the share of directions that paid off) does not move.
Externality-blindness: narrowing "is it worth it?" to "worth it for me," blind to the cost landing on people outside the system or on the future (see SHEET 07.5). AI makes this blindness cheaper — it optimizes the objective function you give it, ignores any externality not written into that function, and writes the plan ever more cleanly and credibly. Fix: run every bet through INSTRUMENT 08's "consequence attribution" axis, forcing the question "on whom does the cost land." Optimizing the measurable to death: the innovation form of Goodhart's law — once a proxy metric (active users, idea count, patent count) is taken for innovation itself, the system optimizes that proxy and squeezes out the real, hard-to-measure value. This shares a root with the SHEET 04 efficiency paradox and the SHEET 12 convergence bias: "what gets measured gets managed; what cannot be measured gets cut first." Together the three are the full organizational-scale picture of the looks-feasible trap.
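The theatre test and its leading indicator can be written down as two small checks. This is a hedged sketch: the dictionary keys, the function names, and the `tolerance` used to call a hit rate "flat" are hypothetical choices, not thresholds given by the text.

```python
def staked_share(bets: list[dict[str, bool]]) -> float:
    """Share of bets that both staked an affordable loss and verified a real need."""
    if not bets:
        return 0.0
    real = sum(1 for b in bets if b["staked_affordable_loss"] and b["verified_real_need"])
    return real / len(bets)


def looks_like_theatre(activity_counts: list[int], hit_rates: list[float],
                       tolerance: float = 0.01) -> bool:
    """Activity climbs period over period while the hit rate stays within `tolerance`."""
    if len(activity_counts) < 2 or len(hit_rates) < 2:
        return False  # not enough history to say anything
    climbing = all(later > earlier
                   for earlier, later in zip(activity_counts, activity_counts[1:]))
    flat = max(hit_rates) - min(hit_rates) <= tolerance
    return climbing and flat


bets = [
    {"staked_affordable_loss": True, "verified_real_need": False},
    {"staked_affordable_loss": False, "verified_real_need": False},
]
print(staked_share(bets))
print(looks_like_theatre([4, 9, 15], [0.05, 0.05, 0.05]))
```

A share near zero, or a climbing activity count against a flat hit rate, is a prompt to ask the question in the room, not a verdict in itself.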
FIG. 8.0 · 价值罗盘:探索/利用 × 可逆/不可逆 · 看懂:这是一具罗盘的两根轴,不是流程步骤——它告诉你该多探还是该收,该快下还是该慎下。
FIG. 8.0 · The value compass: exploration/exploitation × reversible/irreversible · Read: two axes of one compass, not process steps; it tells you whether to explore or exploit, and whether to place fast or with care.
看点:这是本卷唯一一张刻意画成"罗盘"而非"流程"的图。两根轴——探索/利用、可逆/不可逆——划出四象限,但它们不是要你依次走过,而是定位:你现在这个押注落在哪格,就该用哪种节奏。右上"探索 × 可逆"是 AI 时代最该多下的格(双向门、affordable loss),左下"利用 × 不可逆"则该交给下游卷的施工图。
Takeaway: this is the one figure in this volume deliberately drawn as a "compass," not a "process." Two axes (explore/exploit, reversible/irreversible) cut four quadrants, but they are not a sequence to walk through; they are a locator: whatever cell your current bet falls in dictates the tempo. Top-right "explore × reversible" is the cell to place most in the AI era (two-way door, affordable loss); bottom-left "exploit × irreversible" should be handed to the downstream volumes' drawings.
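Read as a locator, the compass is a lookup from a cell to a tempo. The sketch below encodes only the two cells the takeaway spells out and deliberately leaves the other two unset; the names `TEMPO` and `locate` are hypothetical.

```python
from typing import Optional

# Only two cells are spelled out in the text; the other two are left unset on purpose.
TEMPO: dict[tuple[str, bool], Optional[str]] = {
    ("explore", True): "place most here: two-way door, affordable loss",
    ("exploit", False): "hand to the downstream volumes' drawings",
    ("explore", False): None,
    ("exploit", True): None,
}


def locate(mode: str, reversible: bool) -> str:
    """Return the tempo for the cell a bet falls in; this is a lookup, not a sequence."""
    if mode not in ("explore", "exploit"):
        raise ValueError("mode must be 'explore' or 'exploit'")
    tempo = TEMPO[(mode, reversible)]
    return tempo if tempo is not None else "not specified in the text; judge in context"


print(locate("explore", True))
print(locate("exploit", False))
print(locate("explore", False))
```

Leaving the two unspecified cells empty is a design choice for this sketch: it avoids inventing guidance the figure does not give.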
怎么重新校准:把"讲得通"从证据降级为候选
How to recalibrate: demote "it sounds right" from evidence to candidate
Recognizing the trap is not the same as climbing out of it. Climbing out needs an explicit reset of judgment habits: demote "it sounds right" from evidence to a candidate awaiting verification. On the old calibration, a self-consistent plan carried some credibility — because in the era of expensive generation, sounding coherent itself required thought. On the new calibration, the information content of "it sounds right" approaches zero, because it can be manufactured in bulk at no cost. So the reset is very concrete: whenever a direction "reads smoothly, sounds correct," do not count that as a plus but treat it as a flag demanding extra falsification — the smoother it reads, the more you ask "what is its falsifying condition, can it be broken by reality at low cost." This is not pessimism but moving judgment's anchor from "appearance" back to "can it be punctured."
After the reset, all six failures share one antidote: before betting, force a pass through "its falsifying condition." The looks-feasible trap is blocked by the falsification point; imagined need by "is there a concrete person"; borrowed conviction by "would I waver if AI reversed itself"; innovation theatre by "how many bets truly staked an affordable loss"; externality-blindness by INSTRUMENT 08's consequence-attribution axis; optimizing the measurable to death by "is this metric squeezing out the real value." The six blocks share one root: amid abundance, doubt appearance by default and deliberately hunt for the falsifying condition. Once internalized, this habit is the drillable half of "value perception" in this volume's sense — it does not guarantee you bet right, but it systematically catches the false signals that merely "look feasible."
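The six blocks can be kept as a pre-bet checklist. This is a sketch only: the `BLOCKS` mapping restates the six questions from the paragraph above, while the function name `open_questions` and the idea of collecting written answers in a dictionary are assumptions added for illustration.

```python
# Failure mode -> the question that blocks it, restated from the paragraph above.
BLOCKS: dict[str, str] = {
    "looks-feasible trap": "What is the falsification point?",
    "imagined need": "Is there a concrete person?",
    "borrowed conviction": "Would I waver if AI reversed itself?",
    "innovation theatre": "How many bets truly staked an affordable loss?",
    "externality-blindness": "On whom does the cost land?",
    "optimizing the measurable": "Is this metric squeezing out the real value?",
}


def open_questions(answers: dict[str, str]) -> list[str]:
    """Return the blocking questions that still have no written answer."""
    return [question for failure, question in BLOCKS.items()
            if not answers.get(failure, "").strip()]


draft = {"imagined need": "Clinic receptionist re-keying referrals every morning"}
for question in open_questions(draft):
    print(question)
```

The checklist can show which questions were skipped; it cannot tell whether an answer is honest, which is the part that remains a human judgment.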
INV 08.5
LEGACY · 旧创新机器的失效
LEGACY MACHINE
结构批判 · 点名机制
Structural critique · named mechanism
旧创新机器是为点子稀缺造的,它管的不是值得
The old innovation machine was built for idea scarcity — and it never managed worth
Load-bearing claim: the twentieth-century innovation machine (the stage-gate funnel, the KPI roadmap, the hackathon, the idea-count metric, the "fail fast" slogan, the central R&D lab) rests on one shared premise: good ideas are scarce and expensive to think up, so the machine's job is to efficiently screen and advance a flow of existing ideas. When generation drives the cost of "thinking up a coherent idea" to near zero, that premise fails wholesale. These structures do not "need tuning": once the bottleneck moves, the gate they guard stands empty. They are named one by one below, with mechanism, not mood.
First state the shared root, so this does not sound like six unrelated complaints. These structures are all built on an implicit scarcity assumption: credible plans are scarce, producing a plan costs heavy cognitive effort, so what is worth managing is "the flow of plans and a quality gate on them." The machine is designed accordingly — the funnel screens flow, the roadmap prioritizes, the hackathon drives volume, the metric counts output, the lab concentrates capacity. But this volume's first mark already said it: the bottleneck moved from "generating ideas" to "recognizing what deserves commitment" (SHEET 01). When a plan's appearance can be mass-produced at zero cost, every machine that "manages flow, volume, or appearance-quality" is managing something no longer scarce, while the genuinely scarce thing — value perception — is precisely what these structures structurally cannot squeeze out and cannot retain. The six below are six concrete failure points growing from that root.
六种旧结构 · 它守的关口为什么空了
Six legacy structures · why the gate each guards stands empty
①
阶段闸漏斗 Stage-Gate
Stage-gate funnel
它假设:方案稀缺,所以宽口进、逐闸筛,每道闸用"看起来够不够成熟"放行(Cooper 的 Stage-Gate[R19])。为什么空了:闸门判的是"卖相成熟度",而卖相恰恰被生成端无限供给——漏斗现在过滤的是"谁更会把方案写圆",不是"谁更接真实需求"。它会系统性地放过高可行 · 低真实需求的看似可行陷阱(SHEET 08),因为那正是最容易过闸的形态。机制:过滤器的判据(成熟卖相)与稀缺物(价值感知)正交,于是筛得越勤,离值得越远。It assumes: plans are scarce, so enter wide, screen gate by gate, each gate passing on "does it look mature enough" (Cooper's Stage-Gate[R19]). Why empty: the gate judges "appearance-maturity," and appearance is exactly what the generation side now supplies without limit — the funnel today filters for "who is better at making a plan read polished," not "who is closer to a real need." It systematically passes the high-feasibility · low-real-need looks-feasible trap (SHEET 08), because that is precisely the form most able to clear a gate. Mechanism: the filter's criterion (polished appearance) is orthogonal to the scarce thing (value perception), so the harder it screens, the further it drifts from worth.
②
KPI 路线图 KPI Roadmap
KPI-driven roadmap
它假设:方向已基本确定,剩下的是把执行排进季度、对齐可度量目标。为什么空了:它把"方向开放"的探索强行塞进"方向锁死"的执行框(违反 SHEET 07 总闸)。一切候选都要先证明"对当季 KPI 有贡献"才进表,于是凡是反共识的、当下指标看不出价值的方向,结构性地排不进路线图——而反共识恰是 AI 充裕里唯一还稀缺的信号源。机制:用利用期的工具(路线图)管探索期的工作(找方向),等于让收敛偏置(SHEET 12)制度化,散木留存度(INSTRUMENT 07)被路线图本身压到零。It assumes: direction is largely settled; what remains is scheduling execution into quarters against measurable targets. Why empty: it forces direction-open exploration into a direction-locked execution frame (violating the SHEET 07 master gate). Every candidate must first prove "it contributes to this quarter's KPI" to make the table, so any anti-consensus direction whose value current metrics cannot see is structurally locked out of the roadmap — and anti-consensus is the one signal source still scarce amid AI abundance. Mechanism: using an exploitation-phase tool (the roadmap) to manage exploration-phase work (finding direction) institutionalizes the convergence bias (SHEET 12); useless-tree retention (INSTRUMENT 07) is driven to zero by the roadmap itself.
③
黑客松仪式 Hackathon-as-ritual
Hackathon-as-ritual
它假设:把人关进 48 小时、给压力和咖啡,点子的产量就上去——产量是瓶颈。为什么空了:产量从来不是瓶颈了。48 小时里 AI 能产出的"看似可行 demo"比整个团队过去一年都多,于是黑客松的产出几乎全是最易生成的那一类伪信号。它退化成创新剧场(SHEET 08 系统级失败之一):看起来很创新,可几乎没有一个押注承担了 affordable loss、去验过真实需求。机制:仪式催的是产量曲线,而曲线的瓶颈已经移走;催一个不再稀缺的量,只会把噪声地板(SHEET 02)再抬高一截。It assumes: lock people in for 48 hours, add pressure and coffee, and idea volume rises — volume is the bottleneck. Why empty: volume stopped being the bottleneck. In 48 hours AI can produce more "looks-feasible demos" than the whole team did last year, so a hackathon's output is almost entirely the most easily-generated kind of false signal. It decays into innovation theatre (one of SHEET 08's system-level failures): it looks innovative, yet barely a single bet staked an affordable loss or went to verify a real need. Mechanism: the ritual drives the volume curve, but the curve's bottleneck has moved; driving a quantity no longer scarce only lifts the noise floor (SHEET 02) one more notch.
④
点子数指标 Idea-count metric
Idea-quantity metric
它假设:提案数、专利数、点子库条目数,是创新健康度的代理——多多益善。为什么空了:这是 Goodhart 定律的创新版(SHEET 08)。一旦数量成了被奖励的指标,生成就让它免费爆表:提案数可以一夜十倍,且每一条都讲得头头是道。指标飙升而识别命中率不动,正是噪声地板被抬高、信号没变的精确读数(FIG 2.1)。机制:把代理变量(数量)当目标,系统就优化数量、挤出难度量的真价值;在生成充裕下,这个挤出效应不是变弱而是变强,因为代理变量的边际成本归零了。It assumes: proposal count, patent count, idea-bank entries are proxies for innovation health — more is better. Why empty: this is the innovation form of Goodhart's law (SHEET 08). Once quantity becomes the rewarded metric, generation makes it free to max out: proposal counts can go tenfold overnight, each one reading coherent. The metric soars while the hit rate does not move — the precise readout of a lifted noise floor against a flat signal (FIG 2.1). Mechanism: take a proxy variable (count) for the goal and the system optimizes the count, squeezing out the hard-to-measure real value; under generation abundance this squeeze does not weaken but strengthens, because the proxy's marginal cost went to zero.
⑤
Fail-fast cargo cult
它假设:多试、快试、不怕错,好东西自然冒出来——失败本身被当成美德。为什么空了:原版的"fail fast"有个被省略的前提:每次失败都要便宜且能学到东西(effectuation 的 affordable loss + 复盘回流)。货物崇拜只抄了"多失败"的形,丢了"可承受 + 可学习"的实。在生成充裕下,"快速试很多看似可行的方向"恰恰是最贵的失败——因为试的全是伪信号,且没有 affordable-loss 额度与证伪点兜底,失败既不便宜也学不到东西。机制:把一个有前提的纪律抄成无前提的口号,等于鼓励团队在噪声里高速空转——快速地失败,但从不快速地认出该砍什么。It assumes: try a lot, try fast, fear no error, and good things emerge — failure itself is treated as a virtue. Why empty: the original "fail fast" had an omitted precondition: each failure must be cheap and must teach something (effectuation's affordable loss + retrospective feedback). The cargo cult copied the form of "fail more" and dropped the substance of "affordable + learnable." Under generation abundance, "quickly trying many looks-feasible directions" is the most expensive failure — because what is tried is all false signal, and with no affordable-loss size or falsification point to backstop it, the failure is neither cheap nor instructive. Mechanism: copying a preconditioned discipline as an unconditioned slogan encourages a team to spin at high speed inside noise — failing fast, but never fast at recognizing what to cut.
⑥
中央研发实验室 Central R&D lab
Central R&D lab
它假设:把最聪明的人集中到一处、给资源和隔离,创新产能就最大化——创新是可被集中的稀缺产能。为什么空了:价值感知的原料是亲历与深耕(SHEET 03)——它分布在一线、在与真实用户摩擦的边缘,恰恰不可被集中到中央。当生成产能不再稀缺(人人桌上都有),把稀缺物错认成"集中的智力产能"就指错了方向:真正稀缺的是贴着真实需求的判断,而它天然是分布式的。机制:集中模型优化的是"产能密度",但瓶颈已从产能移到"贴近真实处境的价值判断";离真实处境越远的中央实验室,越容易把看似可行当信号——它有最强的生成力,却离 JTBD 现场最远。It assumes: concentrate the smartest people in one place, give resources and isolation, and innovation capacity is maximized — innovation is a scarce capacity that can be centralized. Why empty: the raw material of value perception is lived experience and deep tenure (SHEET 03) — distributed at the front line, at the edge that rubs against real users, precisely what cannot be centralized. When generation capacity is no longer scarce (everyone has it on their desk), mistaking the scarce thing for "centralized intellectual capacity" points the wrong way: what is truly scarce is judgment pressed against real need, and that is inherently distributed. Mechanism: the central model optimizes "capacity density," but the bottleneck moved from capacity to "value judgment close to the real situation"; the further a central lab sits from real situations, the more easily it mistakes looks-feasible for signal — it has the strongest generation power yet sits furthest from the JTBD scene.
共同诊断 · 一句话
Shared diagnosis · one line
六者不是各自坏,是同一个误判的六种制度化:它们都在管"方案的流量、产量、卖相质量",因为它们诞生时这些确实稀缺。瓶颈一移,它们守的关口集体落空,而真正稀缺的价值感知——分布式、不可外化、靠亲历养——恰恰是它们结构上接不住的。所以解法不是"修好漏斗 / 改进 KPI",是把整套机器的设计目标从"高效推进点子流"换成"高保真守住价值感知的栖息地"(SHEET 11)。(这些为结构机制论断与从业观察,非对照实验结论;走探索账。)The six are not each separately bad; they are six institutionalizations of one misjudgment: all manage "the flow, volume, and appearance-quality of plans," because those were genuinely scarce when they were born. Move the bottleneck and the gate each guards falls empty together, while the truly scarce thing — value perception, distributed, non-externalizable, grown from lived experience — is exactly what they structurally cannot catch. So the fix is not "repair the funnel / improve the KPI" but to swap the whole machine's design goal from "efficiently advance the idea flow" to "faithfully hold the habitat of value perception" (SHEET 11). (These are structural-mechanism claims and practitioner observations, not controlled-trial conclusions; on the exploration ledger.)
FIG. 8.5 · 旧机器管的量都不稀缺了,它没在管值得 · 看懂:三条曲线——生成产量暴涨、卖相质量随之涨、真实需求贴合度没动;六个旧结构全锚在前两条上。
FIG. 8.5 · What the old machine manages stopped being scarce; it never managed worth · Read: three curves (generation volume surges, appearance-quality follows, real-need fit stays flat); all six legacy structures anchor to the first two.
看点:把三条曲线叠在同一张图上,旧机器的失效就不再是态度问题,而是几何问题:它们设计来管理的量(产量、卖相质量)随生成成本归零而暴涨,唯独"贴合真实需求"那条线纹丝不动。六个旧结构无一例外锚在前两条上——它们越高效,越把资源投在不稀缺的维度,离那条不动的线越远。Takeaway: stack the three curves on one chart and the old machine's failure stops being an attitude problem and becomes a geometry problem: the quantities it was designed to manage (volume, appearance-quality) surge as generation cost goes to zero, while the "real-need fit" line does not budge. All six legacy structures, without exception, anchor to the first two — the more efficient they are, the more resource they pour into non-scarce dimensions, and the further they drift from the flat line.
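The geometry can be checked with toy arithmetic. The numbers below are invented, and reading "the flat line" as a constant count of candidates that fit a real need is an assumption of this sketch, not a measurement from the pack.

```python
def share_of_real_fits(real_fits: int, candidates: int) -> float:
    """Fraction of candidates that fit a real need."""
    if candidates <= 0:
        raise ValueError("candidates must be positive")
    return real_fits / candidates


# Hypothetical figures: volume goes tenfold, the number of real fits does not move.
before = share_of_real_fits(real_fits=5, candidates=50)
after = share_of_real_fits(real_fits=5, candidates=500)

print(f"before: {before:.1%}, after: {after:.1%}")
print(f"reviews needed per real fit: {50 // 5} -> {500 // 5}")
```

Under these assumptions a volume metric reports a tenfold improvement while the effort needed to find each real fit also rises tenfold, which is the squeeze the six structures cannot see.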
INV 09
ALLOCATION · 押注分配矩阵
ALLOCATION
决策矩阵 · 哪步交 AI / 哪步留人
Decision matrix · AI vs human
生成全交 AI,押注的"为什么"留给人
Hand generation to AI; keep the "why" of the bet for the human
Load-bearing claim (a division of labor you can follow): the innovation workflow is not "human or AI" but routing each step to where it belongs along one line. The principle: expanding the possibility space → to AI; judging whether a bet is worth it → to the human; context (real need and conviction) flows from human to AI, never the reverse. The table below locates each of the six steps of value discovery; it is directly actionable.
Force analysis: who each step goes to is set by two quantities — the step's output externalizability (can it be written as a spec / signal an AI can read) and the irreversibility of failure (if the bet is wrong, can it be withdrawn at affordable loss). Externalizable and reversible → to AI; non-externalizable or irreversible → to the human. Note the context flow is one-way: real need and inner conviction can only be injected by the human; once the AI has them it can expand the search, but it cannot reverse the flow and generate conviction for the human — that is the load-bearing sentence of SHEET 03.
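The routing rule in the force analysis is a two-input decision and can be sketched directly. The function name `route` and the sample steps with their flag values are illustrative assumptions; the rule itself (externalizable and reversible goes to AI, everything else to the human) is the one stated above.

```python
def route(externalizable: bool, reversible: bool) -> str:
    """Externalizable and reversible goes to AI; anything else stays with the human."""
    return "AI" if externalizable and reversible else "human"


# Hypothetical readings of a few steps: (output externalizable?, failure reversible?)
steps = {
    "divergent generation": (True, True),
    "listing falsifying conditions": (True, True),
    "real-need verdict": (False, True),
    "bet decision and size": (False, False),
}

for name, (externalizable, reversible) in steps.items():
    print(f"{name}: {route(externalizable, reversible)}")
```

Assigning the two flags to a real step is itself a judgment call, so the function only makes the rule explicit; it does not decide the inputs.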
交 AI · 扩张可能性
To AI · expand possibility
发散生成:批量产候选方向、变体、组合——这是 AI 的绝对主场,越多越好
Divergent generation: produce candidate directions, variants, combinations in bulk — AI's home turf, the more the better
可行路径搜索:给定一个方向,搜遍"怎么走通"的已知方案(SHEET 03 可行路径轴)
Viable-path search: given a direction, search known ways to make it work (the SHEET 03 viable-path axis)
证伪辅助:替每个候选生成"它为假的条件"清单,供人审——生成清单交 AI,判定是否被击穿留人
Falsification assist: generate, for each candidate, a list of "what would make it false" for human review — generating the list to AI, judging whether it is broken to the human
共识信号汇总:聚合可外化的偏好 / 引用 / 采纳信号(RLCF 那一支,仅作输入不作裁决)
Consensus-signal aggregation: aggregate externalizable preference / citation / adoption signals (the RLCF branch, as input only, never as verdict)
留人 · 判值不值得
To human · judge worth
真实需求判定:有没有真实的人在真实处境里真要——只能由亲历者判(JTBD,不可外化)
Real-need verdict: is there a real person in a real situation who truly needs it — only the one who lived it can judge (JTBD, non-externalizable)
内在确信归属:这份笃定是你的还是借来的——确信无法由 AI 代生成(SHEET 03 承重句)
Conviction attribution: is this certainty yours or borrowed — conviction cannot be generated by AI (the SHEET 03 load-bearing sentence)
押注决定与额度:押哪个、押多少(affordable loss)——不可逆的资源承诺留人
Bet decision and size: which to bet, how much (affordable loss) — irreversible resource commitments stay with the human
Anti-consensus / emergence recognition: recognize heterogeneous value that holds only for this group, and the new species worth amplifying (the SHEET 05/06 constitutive branch)
把六步串成一条流水,上下文的流向就清楚了——它单向从人流向 AI,再不反向:
String the six steps into one line and the context flow becomes clear — it runs one-way from human to AI, never back:
① 注入上下文(人) → 把真实需求、亲历、确信写成一段"我为谁、解决什么真实任务"的简报,喂给 AI。上下文起点在人。
① Inject context (human) → write real need, lived experience, and conviction into a brief, "for whom, what real job," and feed it to the AI. Context originates with the human.
② 发散(AI) → 在该上下文约束下批量生成候选方向与路径。
② Diverge (AI) → under that context, generate candidate directions and paths in bulk.
③ 证伪(AI 生成 · 人裁) → AI 列"为假的条件",人判哪些被现实击穿。
③ Falsify (AI generates · human judges) → AI lists falsifying conditions; the human judges which are broken by reality.
④ 收敛押注(人) → 用价值罗盘(INSTRUMENT 06)合成读数,人决定押哪个、押多少。
④ Converge and bet (human) → synthesize a reading with the value compass (INSTRUMENT 06); the human decides which to bet and how much.
⑤ affordable-loss 试错(人定额度 · AI 助执行) → 只投得起的损失,把试错变成可负担的常规。
⑤ Affordable-loss trial (human sets the size · AI assists execution) → bet only what you can afford to lose, making trial-and-error a sustainable routine.
⑥ 复盘回流(人 → 上下文) → 押中/押错的理由回流,更新①的上下文。闭环回到人。
⑥ Retrospect and feed back (human → context) → the reasons for hits and misses flow back to update the context in ①. The loop closes at the human.
最危险的失效是上下文流向倒灌:让 AI 替你生成"真实需求"与"确信",再把它的输出当成你的上下文喂回判断。一旦倒灌,价值源头就开始干涸(agentic flattening),这正是顶层命题里"人自愿停止定义价值"的微观形态。先行指标:你的简报(①)越来越多由 AI 起草、越来越少来自亲历。修法:①永远人写,AI 从②起才入场。(接 SHEET 13 边界;倒灌风险为机制论断,走探索账。)
The most dangerous failure is the context flow running backward: letting the AI generate your "real need" and "conviction," then feeding its output back as your context. Once it runs backward, the value source begins to dry up (agentic flattening), which is the micro form of the top claim's "people voluntarily stop defining value." Leading indicator: your brief (①) is increasingly drafted by AI and decreasingly drawn from lived experience. Fix: ① is always written by the human; the AI enters only from ② onward. (See SHEET 13's boundary; the backward-flow risk is a mechanistic claim, on the exploration ledger.)
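The fix above ("① is always written by the human") can be held by a small provenance check. What follows is a minimal sketch, not a prescribed tool: the `Brief` record, its `author` tag, and the function names are all hypothetical, and it assumes the team is willing to label honestly who drafted each brief.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brief:
    """Step 1 context: for whom, what real job, why me."""
    text: str
    author: str  # hypothetical provenance tag: "human" or "ai"


class BackflowError(Exception):
    """Raised when AI-authored text is offered as step-1 context."""


def inject_context(brief: Brief) -> str:
    # Step 1 gate: only a human-written brief may seed the loop.
    if brief.author != "human":
        raise BackflowError("step 1 must be written by the human")
    return brief.text


def ai_drafted_share(briefs: list[Brief]) -> float:
    # Leading indicator: fraction of recent briefs drafted by AI.
    if not briefs:
        return 0.0
    return sum(b.author == "ai" for b in briefs) / len(briefs)
```

A rising `ai_drafted_share` over successive rounds is the early signal the text names; the gate in `inject_context` only works if the tag is filled in truthfully.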
为什么"押注的为什么"留给人:预测变便宜,判断没有
Why "the why of the bet" stays with humans: prediction got cheap, judgment did not
"生成全交 AI、押注的为什么留给人"不是分工偏好,是有经济学结构撑着的。Agrawal、Gans、Goldfarb[R14](Prediction versus Judgment, NBER WP 24626 / Information Economics and Policy 2019,Ⅱ)把这条结构讲透:AI 降低的是预测的成本——"在给定目标函数下,哪条路最可能走通";它没有降低判断的成本——"目标函数本身该是什么、各种结果的相对价值是多少"。当预测变得近乎免费,判断的相对价值反而上升,因为它成了瓶颈。本卷的六步循环就是这条定理的落地:②发散、③证伪里 AI 擅长的全是预测;④押注、⑤定额度、⑥复盘里留给人的全是判断——目标函数无法被编码的那部分。
"Hand generation to AI, keep the why of the bet with humans" is not a preference about division of labor; it rests on an economic structure. Agrawal, Gans, and Goldfarb[R14] (Prediction versus Judgment, NBER WP 24626 / Information Economics and Policy 2019, Grade II) make the structure plain: what AI lowers is the cost of prediction — "under a given objective function, which path is most likely to work"; it does not lower the cost of judgment — "what the objective function itself should be, what the relative values of the outcomes are." When prediction becomes near-free, the relative value of judgment rises, because it becomes the bottleneck. The six-step loop here is that theorem landed: in ② diverge and ③ falsify, everything AI is good at is prediction; in ④ bet, ⑤ size, and ⑥ retrospect, everything kept with the human is judgment — the part of the objective function that cannot be coded.
进一步,Agrawal 等在后续工作(Bicycles for the Mind, NBER WP 34034,Ⅲ)把判断再切两层:机会判断(这个方向值不值得追)恒与 AI 互补——AI 越强,机会判断越值钱;收益判断(追了能得多少)条件互补;而实现技能(把方案做出来)被替代。本卷的分工正好压在这条线上:把可被替代的实现交给 AI,把恒互补的机会判断("押哪个方向值得")牢牢留人。这也解释了上下文为什么必须单向从人流向 AI——机会判断的原料是亲历与确信(SHEET 03),一旦让 AI 替你生成机会判断,你替换掉的恰恰是那个随 AI 变强而越来越值钱的能力,自废武功。
Further, Agrawal et al.'s later work (Bicycles for the Mind, NBER WP 34034, Grade III) splits judgment into two layers and sets them against execution: opportunity judgment (is this direction worth pursuing) is always complementary to AI, so the stronger AI gets, the more opportunity judgment is worth; return judgment (how much pursuing it yields) is conditionally complementary; execution skill (making the thing) is substituted. This volume's division of labor sits exactly on that line: hand the substitutable execution to AI, and keep the always-complementary opportunity judgment ("which direction is worth betting on") firmly with the human. It also explains why context must flow one-way from human to AI: the raw material of opportunity judgment is lived experience and conviction (SHEET 03). The moment you let AI generate your opportunity judgment, you replace precisely the capacity that grows more valuable as AI grows stronger, and you disarm yourself.
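The prediction/judgment split described above can be made concrete with a toy expected-value choice. This is an illustrative sketch only: the function name, the states, and every number are invented for the example, and it is not the model from the cited papers. Prediction supplies the probabilities; judgment supplies the payoffs.

```python
def best_action(probabilities, payoffs):
    """Pick the action with the highest expected payoff.

    probabilities: {state: p}, the part prediction (the AI side) supplies.
    payoffs: {action: {state: value}}, the part judgment (the human side) supplies.
    """
    def expected(action):
        return sum(probabilities[s] * payoffs[action][s] for s in probabilities)

    return max(payoffs, key=expected)


# Same prediction, two different human judgments about what failure costs.
prediction = {"works": 0.7, "fails": 0.3}
judgment_a = {"bet": {"works": 10, "fails": -5}, "skip": {"works": 0, "fails": 0}}
judgment_b = {"bet": {"works": 10, "fails": -40}, "skip": {"works": 0, "fails": 0}}

choice_a = best_action(prediction, judgment_a)  # expected 5.5 vs 0
choice_b = best_action(prediction, judgment_b)  # expected -5.0 vs 0
```

With identical probabilities the decision flips from "bet" to "skip" purely because the payoff table changed, which is the sense in which cheaper prediction leaves judgment as the bottleneck.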
上下文单向流动,是一种道德结构而非只是分工
One-way context flow is a moral architecture, not just a division of labor
六步循环里上下文单向从人流向 AI、再不反向,这条规则表面是工程纪律,底层是一种道德结构。回到价值与责任那道接缝(SHEET 07.5):定义价值的人和承担后果的人必须是同一个。上下文倒灌——让 AI 替你生成"真实需求"和"内在确信"、再把它的输出当成你的判断依据——恰恰是在这道接缝上动刀:它让价值定义悄悄从人转移到工具,于是"为后果负责"的人不再是"真正定义了价值"的人。所以坚持上下文单向,不只是为了保住判断质量,是为了保住价值定义与责任承担的同一性——让做决定的人始终是那个该为决定买单的人。
In the six-step loop, context flows one-way from human to AI and never back; on the surface this is engineering discipline, underneath it is a moral architecture. Return to the value-and-responsibility seam (SHEET 07.5): the one who defines value and the one who bears the consequence must be the same. Context running backward — letting AI generate your "real need" and "inner conviction," then taking its output as the basis of your judgment — cuts precisely at that seam: it quietly migrates value definition from human to tool, so that the one "responsible for the consequence" is no longer the one who "truly defined the value." So insisting on one-way context is not only about preserving judgment quality but about preserving the identity of value-definition and consequence-bearing — keeping the one who decides always the one who should pay for the decision.
这给"agentic flattening"(人自愿停止定义价值)一个具体的早期信号,落在这条循环的第①步:你的简报越来越多由 AI 起草、越来越少来自亲历。第①步本该是上下文的起点、纯粹由人写,它是你把"我为谁、解决什么真实任务、为什么是我"的笃定注入系统的地方。一旦这一步开始由 AI 代笔,你就不再是从"手中之鸟"出发,而是从模型的先验出发,回路的源头被悄悄换成了均值。修法很硬:第①步永远人写,AI 从第②步发散起才入场。这条纪律不浪漫,但它是把"人回归于意义"从口号变成可执行约束的具体一招:意义不是被宣告的,是被一条"谁写第①步"的规则守住的。
This gives "agentic flattening" (people voluntarily ceasing to define value) a concrete early signal, landing at step ① of this loop: your brief is increasingly drafted by AI and decreasingly drawn from lived experience. Step ① is meant to be the origin of context, written purely by the human; it is where you inject into the system your conviction about "for whom, what real job, why me." Once this step starts being ghost-written by AI, you no longer start from the "bird in hand" but from the model's prior, and the loop's source is quietly swapped for the mean. The fix is strict: step ① is always written by the human; AI enters only from step ② (divergence) onward. The discipline is unromantic, but it is the concrete move that turns "people return to meaning" from a slogan into an executable constraint: meaning is not declared but held by a rule about "who writes step ①."
押多少,由"输得起多少 × 代价落谁头上"两轴定
How much to bet is set by "what you can afford to lose × on whom the cost lands"
分工解决了"哪步交谁",还剩一个问题:决定押下去之后,押多少。effectuation 给的答案不是"按预期回报定额度"(那需要可靠的概率分布,方向开放处境里没有),而是按 affordable loss——你输得起多少,就投多少。但 affordable loss 只是一根轴。本卷加上第二根:代价落谁头上(后果归属,接 SHEET 07.5)。两轴交叉出一张分配矩阵,它不替你算外部性,但逼你在加注前同时回答两个问题:这一注我自己输得起吗?万一输了,代价会不会落到没在决策桌上的人或未来身上?两个问题都过,额度才放出去。
The division of labor settles "which step to whom"; one question remains: once you decide to bet, how much. Effectuation's answer is not "size by expected return" (which needs a reliable probability distribution, absent in a direction-open situation) but by affordable loss — invest what you can afford to lose. But affordable loss is only one axis. This volume adds a second: on whom the cost lands (consequence attribution, see SHEET 07.5). The two axes cross into an allocation matrix; it does not compute externalities for you, but it forces you, before raising the stake, to answer two questions at once: can I afford to lose this bet? And if I lose, will the cost land on people not at the decision table, or on the future? Only when both pass is the size released.
FIG. 9.0 · 押注额度分配:输得起 × 代价归属
FIG. 9.0 · Bet-size allocation: affordable × who bears the cost
看懂:只有"我输得起 ∧ 代价落在自己头上"的格才放开额度;代价外溢的格,再便宜也先停。
Read: only the "I can afford it ∧ the cost lands on me" cell releases size; any cell where the cost spills outward stops first, however cheap.
看点:多数 affordable-loss 讨论只画一根轴(输得起多少),于是会得出"反正便宜,多试无妨"的危险结论,因为它默默假设代价只落在试的人头上。加上"代价归属"这根轴,右下格(自己输得起、代价却外溢给他人或未来)立刻暴露出来:它在第一根轴上看是安全的,在第二根轴上是不该做的。这正是 INSTRUMENT 08 第三轴存在的理由:把外部性从一个易被遗忘的盲区,变成一个加注前必须填写的栏位。
Takeaway: most affordable-loss discussions draw only one axis (how much you can lose), and so reach the dangerous conclusion "it's cheap, no harm trying a lot," quietly assuming the cost lands only on the one who tries. Add the "consequence-attribution" axis and the bottom-right cell (affordable to you, yet the cost spills to others or the future) is immediately exposed: safe on the first axis, ought-not on the second. This is exactly why INSTRUMENT 08's third axis exists: it turns externality from an easily forgotten blind spot into a field that must be filled in before raising the stake.
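The two-question gate of FIG. 9.0 can be written as a very small function. This is a sketch under stated assumptions: the names `CostLandsOn` and `release_size` are hypothetical, and the matrix is reduced to its simplest two-by-two form, with "affordable" meaning the stake does not exceed a self-declared loss limit.

```python
from enum import Enum


class CostLandsOn(Enum):
    SELF = "self"
    OTHERS_OR_FUTURE = "others_or_future"


def release_size(stake: float, affordable_loss: float, lands_on: CostLandsOn) -> float:
    """Return the stake to release, or 0.0 when either question fails."""
    affordable = stake <= affordable_loss       # question 1: can I afford to lose this?
    contained = lands_on is CostLandsOn.SELF    # question 2: does the cost stay with me?
    return stake if (affordable and contained) else 0.0
```

Only one of the four cells releases anything: an affordable stake whose cost spills outward returns `0.0`, however small the stake is.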
INV
09.5
CASES · 四个走过的真例
WORKED CASES
案例 · 把罗盘读数走一遍
Cases · the compass read end to end
罗盘怎么读,用四个真例走一遍
How the compass reads, walked through four real cases
Load-bearing claim: if the earlier marks stop at principle, they read as fine words. This sheet pairs each of the four most common readings with one worked case, three of them named and one (Case 2) a composite: a three-axis value-perception triage, a looks-feasible direction punctured at its falsification point before polish set in, a useless tree that was worthless at the time and later paid back, and an emergence recognized after the fact rather than produced. Each gives "what was seen then, how the compass read it, what happened after," with the conclusion unembellished.
案例一 · 三轴分诊:Notion 的 AI 功能为什么先慢一步
Case 1 · Three-axis triage: why Notion's AI feature deliberately came a step late
2022 年底 ChatGPT 引爆后,文档协作工具集体面临同一个押注:要不要立刻把"AI 写作助手"塞进产品。可行路径轴在那一刻被 AI 自己抬到满格——接一个生成接口、做个侧边栏,技术上几周可成,几乎零门槛。多数工具据此快速上线了"AI 写作"。用三轴罗盘读这个方向:可行路径=满格(人人都能接),但这正是危险信号——当一轴被 AI 吹满、它就不再是区分度的来源。关键问题落到另两轴:真实需求——用户雇用 Notion 去完成的 job 是什么?是"在一个结构化工作区里组织知识与协作",不是"得到一段生成文本"。内在确信——这份"该做 AI 写作"的笃定是你的,还是"大家都在做"借来的?
After ChatGPT detonated in late 2022, document-collaboration tools faced the same bet at once: should an "AI writing assistant" be jammed into the product immediately. The viable-path axis was, at that moment, pushed to full by AI itself — wire up a generation endpoint, build a sidebar, technically doable in weeks, near-zero barrier. Most tools shipped "AI writing" fast on that basis. Read this direction with the three-axis compass: viable path = full (anyone can wire it), which is precisely the danger signal — when one axis is inflated by AI, it stops being a source of differentiation. The decisive questions fall to the other two axes: real need — what job do users hire Notion to do? It is "organize knowledge and collaborate inside a structured workspace," not "get a paragraph of generated text." Inner conviction — is the certainty that "we should do AI writing" yours, or borrowed from "everyone is doing it"?
"AI writing is technically a few weeks of work, everyone is shipping it, we fall behind if we don't" — taking one axis at full as the signal to bet. The result is a sidebar identical to every competitor's and disconnected from Notion's real job.
三轴一起读
Reading all three axes together
可行满格但不区分;真实需求指向"在结构化工作区里 AI 帮你组织而非替你写"。Notion 的 AI 后来落在数据库属性自动填充、会议纪要结构化、知识库问答——接住了原本的 job,而非追一段生成文本。先慢一步,是因为在另两轴上等到了真实确信。
Viable is full but non-differentiating; real need points to "in a structured workspace, AI helps you organize rather than write for you." Notion's AI later landed on auto-filling database properties, structuring meeting notes, querying the knowledge base — catching the original job rather than chasing a generated paragraph. Coming a step late was the cost of waiting until the other two axes carried real conviction.
Reading and result: judge the me-too "AI writing" as viable-full · real-need-weak · conviction-borrowed — a textbook looks-feasible trap, do not bet yet; judge "AI helps you organize in a structured workspace" as three-axis-aligned — worth a bet. In hindsight, what that "slow" step bought was AI features actually growing on the product's job rather than floating on its surface. The load-bearing point of this case is not what Notion bet right, but that it demonstrates "viable path at full" is exactly the moment to be wary — when AI maxes that axis, differentiation can only come from the two axes it cannot help with. (The product timeline is publicly verifiable fact; the "why it was bet this way" attribution is this volume's reading through the three-axis frame, an analytic reconstruction, not Notion's official account; on the exploration ledger.)
案例二 · 证伪先行:一个"AI 法律助手"在打磨之前被一句话击穿
Case 2 · Falsification first: an "AI legal assistant" punctured by one sentence before polish
A common looks-feasible direction: an "AI contract-review assistant" for small and mid-size businesses. The generation side can write it up exhaustively: market size, user persona, pricing, technical path, competitive differentiation, a ten-page business plan in one night that reads as airtight. This is the most dangerous form of "looks feasible": the appearance is perfect, and the longer it is polished the more real it looks. This volume's discipline is to pass the falsification point before polishing (the SHEET 10 falsification checklist): do not ask "where is it good"; ask first "can its falsifying condition be written out, and can reality break it at low cost."
The falsifying condition, written out, is one sentence: "would a small-business owner who can be held liable for a misreviewed contract dare hand that contract to a tool that occasionally hallucinates with a straight face, with no named lawyer underwriting the result?" This sentence needs no ten-page plan to test; going to the field and asking three real small-business owners punctures it: their answers are nearly identical — "who is responsible when something goes wrong?" The real job of contract review is not "understand the clauses" but "someone takes the fall for this judgment." AI can give the former, not the latter; and the latter is precisely the job's load-bearing weight. One sentence punctured the real-need axis among the three: users hiring a contract-review service are hiring the bearing of responsibility, not text comprehension (see the SHEET 07.5 value-responsibility seam).
读数 · 证伪点的杠杆
Reading · the leverage of a falsification point
对比两条路:打磨派会花两周把十页计划做成二十页、做个 demo、再融资,半年后撞上"没人敢用"的墙;证伪派花半天写一句证伪点、问三个人,当天就把方向降级。差别不在谁更聪明,在判断的锚下在打磨之前还是之后。生成时代打磨极其便宜,于是"先打磨再验证"等于让伪信号有充足时间把自己装扮成真信号;先证伪,是把判断的锚抢在打磨抬高卖相之前钉下。(案例为本卷综合常见情形构造的代表性示例,非单一可指名公司复盘;机制论断走探索账。)
Contrast two paths: the polishing camp spends two weeks turning the ten-page plan into twenty, builds a demo, raises money, and six months later hits the "nobody dares use it" wall; the falsification camp spends half a day writing one falsifying sentence, asks three people, and demotes the direction that day. The difference is not who is smarter but whether judgment's anchor drops before or after polish. In the generation era polish is extremely cheap, so "polish first, verify later" gives the false signal ample time to dress itself as a true one; falsifying first nails judgment's anchor before polish can lift the appearance. (The case is a representative example this volume composes from common situations, not a single nameable company's retrospective; the mechanism claim is on the exploration ledger.)
案例三 · 散木回本:Slack 在 Tiny Speck 游戏失败的废墟里
Case 3 · The useless tree pays back: Slack in the ruins of Tiny Speck's failed game
The test for a useless tree: something that current KPIs cannot value, that looks "worthless" now, yet is deliberately kept because of high conviction (SHEET 04). Slack's origin is a textbook useless-tree story. Stewart Butterfield's company Tiny Speck spent years building a web game called Glitch, which failed and shut down in 2012 — by any roadmap KPI, a project to be cut clean. But to collaborate on the game the team had built an internal messaging tool: channels, search, integrations. Under the goal of "make a game," this tool was pure byproduct, a textbook "useless tree" — not on the roadmap, not aligned to any business KPI of the day. The game died; the team did not cut this useless tree with it but recognized it had itself solved a real job.
This is exactly the useless-tree reserve's mechanism: a team with high useless-tree retention (INSTRUMENT 07) still holds, when the main goal fails, byproducts that efficiency did not cut early, and those byproducts occasionally hide value larger than the main goal. Had Tiny Speck strictly enforced "align everything to the game KPI, cut anything unrelated," that internal tool would never have grown, let alone been recognized after the game failed. Slack opened its preview release in 2013, listed publicly in 2019, and was acquired by Salesforce in a deal announced in December 2020 and completed in 2021, for roughly 27.7 billion dollars[R20]. That return came not from "betting on the game" but from "not cutting, in efficiency's name, the tree that was worthless at the time."
读数 · 别把保险费当浪费
Reading · do not mistake the premium for waste
散木的回报天然是滞后且偶发的,所以它在任何当期 KPI 上都像浪费,这正是它在 AI 效率压力下最先被砍的原因。但保住散木的成本,本质是一笔对"价值源头不被效率提前耗尽"的保险费。多数散木确实不会回本,这不否证保护区的价值,正如多数保险不会理赔不否证买保险的理性。承重的是分布的尾部:少数散木的巨大回报,覆盖了保住全部散木的成本。把散木留存度压到零,等于退掉这份保险,赌"主目标永远不失败";在 Knightian 不确定性主导的方向开放处境里,这是最贵的赌。(Slack/Glitch/收购为公开事实;"散木机制"为本卷以 SHEET 04 框架的解读;走探索账。)
A useless tree's payback is inherently lagged and occasional, so on any current-period KPI it looks like waste, which is exactly why it is cut first under AI efficiency pressure. But the cost of keeping useless trees is essentially a premium on "the value source not being depleted early by efficiency." Most useless trees indeed never pay back; this does not falsify the reserve's value, just as most insurance never paying out does not falsify the rationality of buying it. What is load-bearing is the tail of the distribution: the enormous return of a few useless trees covers the cost of keeping all of them. Driving useless-tree retention to zero is cancelling that insurance and betting "the main goal never fails"; in a direction-open situation dominated by Knightian uncertainty, that is the most expensive bet there is. (Slack / Glitch / the acquisition are public facts; the "useless-tree mechanism" is this volume's reading through the SHEET 04 frame; on the exploration ledger.)
案例四 · 事后认出:GitHub Copilot 的"聊天"不是被规划出来的
Case 4 · Recognized after the fact: Copilot's "chat" was not planned into being
This volume says it repeatedly: emergence cannot be produced, only recognized after the fact (SHEET 06). A clear example is the code assistant's turn from "completion" to "conversation." Tools like Copilot were first designed for inline code completion — you write half a line, it continues. But many users began using it as something else: writing natural-language questions in comments, using completion to "ask" how to fix a bug, treating it as a conversable copilot. This usage was not in the original spec; it was a new species that grew in real usage rather than a feature the product team designed. The early signal was faint and noise-like — a few users' odd behaviors, mixed into a sea of normal completion requests.
The point is not that "the team didn't foresee it" but that recognizing emergence needs a stance different from producing. The producing stance filters "odd usage not in the spec" away as noise; the recognizing stance watches what users actually do and asks "is this usage deviating from the normal path telling me about a real need I did not design for?" The later Copilot Chat and the various conversational coding assistants are essentially after-the-fact ratification, into a formal product, of usage that had already emerged in the wild. This is exactly what the SHEET 06 emergence dashboard is built to illuminate: not to produce innovation but to instrument it, so that the anomalous usage that "deviates from the designed path yet grows spontaneously" becomes visible rather than being filtered out as noise.
"We designed completion; we measure completion acceptance against the spec." Conversational usage outside the spec is deviation, is noise — filtered, nudged back on track. Emergence dies in the filter.
"A spontaneously growing usage deviates from our design — it is telling us a real need we did not design for." Give it an observation instrument, ratify it, rather than nudging it back to completion. The new species is invited from the wild into the product.
Taken together, the four cases demonstrate four readings of one compass: triage tells you not to take one axis at full as signal; falsification-first tells you to nail judgment's anchor before polish; useless-tree protection tells you not to mistake the value source's premium for waste; recognizing-after-the-fact tells you innovation is recognized more than produced. They are not four steps but one compass pointing under four situations — which is also why this volume keeps refusing to write value discovery as a process: direction has no "next step," only "given this reading now, which way to lean." (Copilot's usage evolution is publicly observable fact; the "emergence mechanism" attribution is this volume's reading; on the exploration ledger.)
INV
09.7
FALSIFIER · 看似可行证伪器
FALSIFIER
仪器 · 押注前的三问
Instrument · the three pre-bet questions
押注之前,先让它去经受证伪
Before you bet, put it through falsification first
Load-bearing claim: the value compass (INSTRUMENT 06) synthesizes a "worth" reading, but it assumes you have already told true signal from looks-feasible. This falsifier supplies the step before: score one concrete direction on three axes — can a falsifying condition be written · can reality break it at low cost · is the conviction lived or borrowed — and read a verdict. It does not bet for you; it demotes "looks right" to "a candidate awaiting falsification."
Why must this step be independent of the value compass? Because the most expensive error of the abundance era is not "misjudging the worth" but "taking a false signal for a true one, then earnestly computing worth on the false signal." The compass's three axes (real need × viable path × inner conviction) assume the inputs are real; the looks-feasible that the generation side supplies is precisely able to counterfeit the appearance of those three. So before synthesizing worth there must be a falsification gate — it does not ask "is this direction good" but "does it survive falsification." Score one on each axis and the falsifier returns one of six verdicts: unfalsifiable (refuse to bet), borrowed conviction (go get friction first), looks-feasible trap (cut), needs a field test (go verify), a signal that survived (bet few and sharp), or weak (keep sharpening).
心里锁定一个你正在考虑要不要押的具体方向,三轴各选一项。
Hold one concrete direction you are weighing whether to bet on, and pick one on each axis.
① 可证伪性 · Falsifiability
② 现实能否低成本击穿 · Cheap reality test
③ 确信来源 · Source of conviction
读法:确信若是借来的,先去摩擦,另两轴的读数都不作数,因为借来的确信是噪声里最危险的伪信号。证伪器只拦伪信号,不替你判值得度;过了证伪闸再上价值罗盘。
How to read: if conviction is borrowed, go get friction first; the other two axes do not count, because borrowed conviction is the most dangerous false signal in the noise. The falsifier only catches false signals; it does not judge worth for you. Pass the falsification gate, then take it to the value compass.
这具证伪器的设计本身就编码了一条优先级:确信来源是第一道闸。无论可证伪性与现实检验读数多好,只要确信是借来的,证伪器都先把你打回去摩擦——因为借来的确信会让你停止寻找,它比没有确信更危险(SHEET 03)。第二道闸是可证伪性:连为假的条件都写不出来的方向,不是好方向,是一个故事;它不可能被现实纠错,只会被打磨无限装扮。两道闸都过,才轮到看"现实能否击穿"——这一轴决定它是看似可行陷阱(能击穿却没去验就信了)、还是扛住的真信号(给了机会没断)。三轴串起来,恰好复刻了案例二里那个"一句话击穿 AI 法律助手"的判断顺序:先问确信是不是你的,再问为假的条件能不能写出来,最后让现实去试着击穿它。
The falsifier's design itself encodes a priority: the source of conviction is the first gate. However good the falsifiability and reality-test readings, if conviction is borrowed the falsifier sends you back to get friction first — because borrowed conviction makes you stop looking, more dangerous than no conviction (SHEET 03). The second gate is falsifiability: a direction whose falsifying condition cannot even be written is not a good direction but a story; it cannot be corrected by reality, only dressed up endlessly by polish. Pass both gates and only then does "can reality break it" come into play — that axis decides whether it is a looks-feasible trap (breakable yet believed without testing) or a signal that survived (given the chance and did not break). Strung together, the three axes replicate exactly the judgment order in Case 2's "one sentence punctured the AI legal assistant": first ask whether the conviction is yours, then whether a falsifying condition can be written, and finally let reality try to break it.
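The gate order just described (conviction first, falsifiability second, reality last) can be sketched as a short decision function. The six verdict labels come from this sheet; the option names on each axis and the exact mapping from options to verdicts are assumptions made for the sketch, since the sheet does not list the instrument's options.

```python
# Hypothetical option names for each axis (not specified in the sheet).
CONVICTION = {"lived", "borrowed"}
FALSIFIABILITY = {"written", "vague", "none"}
REALITY = {"survived", "skipped", "pending"}


def falsifier_verdict(conviction: str, falsifiability: str, reality: str) -> str:
    if (conviction not in CONVICTION or falsifiability not in FALSIFIABILITY
            or reality not in REALITY):
        raise ValueError("unknown axis option")
    # Gate 1: borrowed conviction overrides the other two readings.
    if conviction == "borrowed":
        return "borrowed conviction: go get friction first"
    # Gate 2: no writable falsifying condition means a story, not a direction.
    if falsifiability == "none":
        return "unfalsifiable: refuse to bet"
    if falsifiability == "vague":
        return "weak: keep sharpening"
    # Gate 3: only now does the reality test decide.
    if reality == "survived":
        return "signal that survived: bet few and sharp"
    if reality == "skipped":  # cheaply breakable, yet believed without testing
        return "looks-feasible trap: cut"
    return "needs a field test: go verify"
```

Because the conviction check runs first, `falsifier_verdict("borrowed", "written", "survived")` still sends the direction back for friction, which is the priority the paragraph above insists on.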
INV
10
FIELD MANUAL · 价值感知田野手册
FIELD MANUAL
可拷贝工件 · 训练手册的局部
Copyable artifact · the teachable part
能练的那一半,给一套可照抄的练法
For the teachable half, a set of drills you can copy
Load-bearing claim (delivering the SHEET 05 training-manual branch): the externalizable part of value perception can be drilled. This sheet turns SHEET 05's four drill types into copyable artifacts: a bet-retrospective sheet, a real-need fieldwork script, an affordable-loss trial protocol, a falsification checklist. Honest boundary: these only calibrate the externalizable half; the constitutive core cannot be drilled, and belongs to SHEET 11's habitat.
Why the drills work, in one force-analysis line: they all externalize tacit judgment into a bookkeepable trace, so judgment's hits and misses can be calibrated after the fact. Borrowed conviction, imagined need, looks-feasible paths — all otherwise hide inside "a feeling" beyond correction; written as a trace, they stand exposed to falsifiable light. That is the precise meaning of "the externalizable part": what can be written as a trace can be drilled; what lives only in intuition belongs to the habitat.
Bet-retrospective sheet (filled after each bet): ① the bet in one line · ② the strongest "true" reason before betting · ③ the strongest "false" reason before betting · ④ signal source (lived / data / borrowed) · ⑤ outcome (hit / miss / open) · ⑥ the thing only seen clearly afterward. Purpose: turn hit rate and abandon rate into a countable ledger.
真实需求田野脚本(去现场前填):① 我猜的待办任务是什么 · ② 我要找谁、在什么处境里观察 · ③ 我会问的不是"你要不要"而是"你上次怎么办成的 / 卡在哪" · ④ 证伪点:什么观察会推翻"这是真实需求"。作用:把想象需求挡在投入之前(接陷阱③)。
Real-need fieldwork script (filled before going to the field): ① what job I am guessing · ② whom I will find, in what situation to observe · ③ I will not ask "do you want this" but "how did you get it done last time / where were you stuck" · ④ falsification point: what observation would overturn "this is a real need." Purpose: stop imagined need before investment (see trap ③).
affordable-loss 试错规约(开试前定):① 我投得起输掉的额度(钱 / 时间 / 声誉)· ② 这一轮要证伪的一个假设 · ③ 多久、看什么信号收手 · ④ 输了我学到什么。来源:effectuation affordable-loss + pilot-in-the-plane(未来可被行动塑造,非预测)。
Affordable-loss trial protocol (set before starting): ① the amount I can afford to lose (money / time / reputation) · ② the one assumption this round falsifies · ③ how long, on what signal, I stop · ④ what I learn if I lose. Source: effectuation's affordable loss + pilot-in-the-plane (the future is shaped by action, not predicted).
Falsification checklist (run over each candidate): □ can its falsifying condition be written out · □ can that condition be broken by reality at low cost · □ do I say "viable" because I lived it, or because it "reads smoothly" · □ is the conviction mine or borrowed · □ does the real need name a concrete person. Purpose: stop traps ①②③ before the bet.
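As one way to make the bet-retrospective sheet "a countable ledger," its six fields can be held in a record and the hit rate computed over closed bets. This is a sketch, not the manual's own tooling: the class and function names are hypothetical, and abandon rate is left out because the sheet as listed has no "abandoned" outcome to count.

```python
from dataclasses import dataclass


@dataclass
class BetRetro:
    bet: str                  # 1. the bet in one line
    true_reason: str          # 2. strongest "true" reason before betting
    false_reason: str         # 3. strongest "false" reason before betting
    signal_source: str        # 4. "lived" | "data" | "borrowed"
    outcome: str              # 5. "hit" | "miss" | "open"
    seen_afterward: str = ""  # 6. the thing only seen clearly afterward


def hit_rate(ledger):
    """Share of hits among closed bets; None when nothing is closed yet."""
    closed = [r for r in ledger if r.outcome in ("hit", "miss")]
    if not closed:
        return None
    return sum(r.outcome == "hit" for r in closed) / len(closed)


def hit_rate_by_source(ledger):
    """Hit rate split by where the signal came from."""
    return {
        source: hit_rate([r for r in ledger if r.signal_source == source])
        for source in {r.signal_source for r in ledger}
    }
```

Splitting by `signal_source` is the useful view: over enough rounds it shows whether bets resting on borrowed signals actually fare worse than bets resting on lived ones.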
边界 · 这是半套手册
Boundary · this is half a manual
诚实标注:以上是训练手册支,只对价值感知的可外化部分有效。它练不出反共识的前沿判断、tacit 价值锚、对"什么真正重要"的构成性确信,那些归 SHEET 11 的栖息地设计。把这套手册当全部,正是 SHEET 05 警告的"强行系统化=亲手制造平均"。两半合起来才是完整姿态:能练的给练法(本张),不能练的给栖息地(下一张)。(这些工件为方法论提案,非经对照实验验证的处方;走探索账。)
Stated honestly: the above is the training-manual branch, effective only on the externalizable part of value perception. It cannot drill anti-consensus frontier judgment, the tacit value anchor, or constitutive conviction about "what truly matters"; those belong to SHEET 11's habitat design. Treating this manual as the whole is exactly what SHEET 05 warns against: "forcing systematization = manufacturing the average by hand." Only both halves form the full stance: drills for the teachable (this sheet), a habitat for the rest (the next). (These artifacts are methodological proposals, not prescriptions validated by controlled trials; on the exploration ledger.)
练法的底层逻辑:塑造未来,而不是预测它
The logic beneath the drills: shape the future, do not predict it
The four drills are not a random assortment; they share one underlying logic — effectuation (Sarasvathy). The traditional logic of decision is "causation": fix a goal, then find the optimal means to reach it; it assumes the future can be predicted. But in a situation where direction is genuinely open and Knightian uncertainty dominates, prediction is bound to fail, because there is no reliable probability distribution to compute. Effectuation inverts it: start from what is already in your hand (who you are, what you know, whom you know — bird-in-hand), bet with affordable loss (not expected return), treat surprise as a resource (lemonade), and believe the future is shaped by action, not predicted (pilot-in-the-plane). Each drill delivers one principle: the fieldwork script = bird-in-hand, the trial protocol = affordable loss, the retrospective sheet = turning surprise into the next round's resource, the falsification checklist = puncturing the illusion of "predictable."
这正是创业理论被 GenAI 重写后的核心姿态(Journal of Management Studies 2026):当机器创造力把点子空间扩到无限,人类判断的工作不是"预测哪个点子会赢",而是"用可承受的损失,逐个淘汰不能被实现的"——靠行动收缩可能性,而非靠预测挑选可能性。所以训练手册支练的不是"预测力",是"在不可预测中行动的纪律":怎么从手里已有的出发、怎么把每次下注的损失控制在输得起的范围、怎么让每次失败都变成下一轮更准的输入。这套纪律可外化、可记账、可练——这恰是它落在分叉"可系统化支"的原因(SHEET 05)。
This is exactly the core stance of entrepreneurship theory after GenAI rewrote it (Journal of Management Studies 2026): when machine creativity expands the idea space to infinity, the work of human judgment is not "predict which idea wins" but "cull the unrealizable one by one, with affordable loss" — contracting possibility by acting, not selecting possibility by predicting. So the training-manual branch drills not "prediction power" but "the discipline of acting amid the unpredictable": how to start from what is in hand, how to keep each bet's loss within what you can afford, how to make each failure a sharper input to the next round. This discipline is externalizable, bookkeepable, and drillable — which is precisely why it lands on the "systematizable branch" of the fork (SHEET 05).
手册的复利:把判断变成一个会自我校准的回路
The manual's compounding: turn judgment into a self-calibrating loop
Seen alone, the four drills are just tables; strung together, they are a calibration loop that compounds. The loop's shape: the fieldwork script collects traces of real need → the falsification checklist blocks looks-feasible false signals → the affordable-loss protocol turns the remaining bets into bearable experiments → the bet-retrospective sheet books each experiment's hit or miss and feeds back to update your prior on "which signals are reliable." Run one round and your nose for real need, your sensitivity to falsification points, your honesty about the source of your own conviction are each calibrated a notch. Run many rounds and these calibrations compound — this is the precise mechanism of the part of value perception that "can be drilled": not a rise in talent but the engineering process of repeatedly externalizing tacit judgment into traces and re-calibrating against outcomes.
The loop has an easily-missed precondition: it compounds only when bets are actually placed and outcomes actually booked. A team that fills in tables but never bets gets theatre, not calibration (another innovation theatre); a team that bets but never retrospects starts from the same prior every time and never improves. So the manual's load-bearing weight is not in the four tables but in the discipline that closes them into a loop: every bet has an affordable size, a clear falsification point, an honest after-the-fact accounting. This ties back to effectuation's pilot-in-the-plane — the future is not predicted but shaped and calibrated by round after round of affordable action. What the manual drills is exactly turning this capacity to "act amid uncertainty and learn from it" from a matter of intuition into a matter of process.
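The precondition above (the loop compounds only when bets are placed and outcomes booked) can be checked with three counts. This is a deliberately crude sketch: the function name, the inputs, and the state labels are all hypothetical, chosen only to mirror the two failure modes the paragraph names.

```python
def loop_state(tables_filled: int, bets_placed: int, outcomes_booked: int) -> str:
    """Classify a team's drill activity for one period."""
    if bets_placed == 0:
        # Tables without bets: theatre, not calibration.
        return "theatre" if tables_filled > 0 else "not started"
    if outcomes_booked == 0:
        # Bets without retrospectives: the prior never updates.
        return "betting without learning"
    return "calibrating"
```

A period such as `loop_state(12, 0, 0)` reads as theatre however many sheets were filled in, and `loop_state(3, 4, 0)` reads as betting from the same prior every time.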
Load-bearing claim (delivering the SHEET 05 ecology branch · the floor of the volume): constitutive, anti-consensus value perception cannot be taught; only the habitat in which it emerges can be cultivated. This sheet turns SHEET 04/05's "slack / tolerance / useless-tree reserve / diversity / slow lane" into designable habitat elements, with a self-check instrument that asks: is your habitat being devoured by efficiency?
Why a habitat and not a course, by force analysis: the judgment of heterogeneous value cannot be externalized into teachable rules (the spine of SHEET 05, the Specification Trap's "from specification to emergence"). What a rule can teach is, by definition, already-settled consensus, so teaching it only replicates the average. What the methodology can do on this half is therefore not "teach judgment" but "not kill the conditions under which judgment emerges." Habitat design is negative engineering: it works mainly by subtraction, removing the forces that press all exploration toward a single goal.
① 留白
Slack
不被即时产出填满的时间。设计要素:不汇报的时段、无议程的探索块。它是反共识价值的孵化器:没有留白,只剩对齐 KPI 的安全平均。
Time not filled by immediate output. Design elements: blocks with no reporting duty, agenda-free exploration slots. It is the incubator of anti-consensus value: without slack, only the KPI-aligned safe average remains.
② 容错
Tolerance for error
错误成本低到敢押反共识方向。设计要素:把单次失败的代价压到 affordable-loss 区间,让试错不需要勇气、只需要预算。
Error cost low enough to dare anti-consensus bets. Design element: push the cost of a single failure into the affordable-loss range, so that a trial needs no courage, only a budget.
③ 散木保护区
Useless-tree reserve
明确划出不对齐任何 KPI 的探索地带。设计要素:一块写进制度的、免于度量的地(接 SHEET 04)。保护区的边界要硬,否则会被效率慢慢蚕食。
An exploration zone explicitly aligned to no KPI. Design element: a metrics-exempt plot written into the organization's rules (see SHEET 04). Its boundary must be hard, or efficiency erodes it bit by bit.
④ 多样性
Diversity
抵抗收敛到单一最优。设计要素:保住异质的人、异质的来源、异质的方法;这正是反"单一目标过度优化"公理在组织层的落点(QD / Novelty-Search)。
Resist convergence to a single optimum. Design element: keep heterogeneous people, sources, and methods; this is the organizational landing of the anti-single-goal axiom (QD / Novelty-Search).
⑤ 慢通道
Slow lane
给慢的过程一条不被砍的通道。设计要素:区分"该快的执行"与"该慢的酝酿",别用同一条效率尺子量两者(serendipity 与慢想活在这条通道里)。
A lane for slow processes that does not get cut. Design element: distinguish "execution that should be fast" from "incubation that should be slow"; do not measure both with one efficiency ruler (serendipity and slow thinking live in this lane).
Tick the symptoms in your organization that show "the useless tree is being devoured by efficiency": the more you hit, the lower the retention. This is not a router (it allocates no work) but a mirror of habitat health. Each symptom is the inverse of one of the five habitat elements above.
检验信号 + 反指标Test signal + counter-indicator
正向信号:散木留存度(不在 KPI 上的探索占比)与意外收获率(serendipity 命中)。反指标——栖息地正在死的早期征兆:保护区边界开始"临时挪用"、留白时段被会议填满、慢通道被要求给即时产出。一旦六条征兆命中四条以上,价值源头多半已在干涸,不是缺人才,是栖息地塌了。(探索账:留存度无普适阈值,需各组织自定基线后跟踪;自检为启发式镜子,非校准判据。)Positive signals: useless-tree retention (the share of exploration not on any KPI) and serendipity hit rate. Counter-indicator — early signs the habitat is dying: the reserve's boundary starts being "temporarily borrowed," slack blocks fill with meetings, the slow lane is asked for immediate output. Once four of the six symptoms hit, the value source is likely already drying up; it is not a talent shortage but a collapsed habitat. (Exploration ledger: retention has no universal threshold; each organization sets its own baseline and tracks it; the self-check is a heuristic mirror, not a calibrated criterion.)
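The reading rule above is simple enough to write down. The following is a minimal sketch under stated assumptions: retention is measured in hours (the text only says "share of exploration"), the symptom labels are supplied by the caller because this section names only three of the six, and the middle band ("early erosion") is an illustrative reading of the advice to intervene early. `useless_tree_retention` and `habitat_reading` are hypothetical names. As the text says, the result is a heuristic mirror, not a calibrated criterion.

```python
def useless_tree_retention(off_kpi_hours, total_exploration_hours):
    """Share of exploration time that is aligned to no KPI."""
    if total_exploration_hours <= 0:
        raise ValueError("no exploration time recorded")
    return off_kpi_hours / total_exploration_hours


def habitat_reading(symptoms, alarm_at=4):
    """Count ticked symptoms and return (hits, reading).

    symptoms: mapping of symptom label -> True if it is observed.
    alarm_at: four of six, following the text; each organization may
              prefer its own baseline.
    """
    hits = sum(1 for ticked in symptoms.values() if ticked)
    if hits >= alarm_at:
        reading = "value source likely drying up"
    elif hits >= 1:
        reading = "early erosion: intervene now"
    else:
        reading = "no erosion ticked"
    return hits, reading
```

Tracking the retention figure against the organization's own baseline over time matters more than any single reading, since the text gives no universal threshold.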
栖息地是一门否定的工程:主要做减法
A habitat is negative engineering: it works mainly by subtraction
The most counter-intuitive thing about habitat design: it is mainly not "adding things" but "not killing." Because the judgment of anti-consensus value cannot be externalized into teachable rules (the spine of SHEET 05), what the methodology can do on this half is not install a machine that promotes innovation but remove the forces that press all exploration toward a single goal. This is negative engineering, the opposite of the downstream volumes' positive engineering of "install guardrails, set specs." Biology gives the precise analogy: genotype networks are not designed; they are a byproduct of robustness — as long as a system tolerates the survival of many redundant "same-phenotype" variants, a population can spread across them and accumulate cryptic variation. Habitat design is the organizational version of exactly this: not producing innovation directly but maintaining a neutral zone that tolerates redundancy and tolerates the temporarily useless, so heterogeneous value has somewhere to survive until the day it is recognized.
Negative engineering has an operational consequence: a habitat dies chronically and bloodlessly — it is rarely cut down in one stroke but eroded bit by bit by efficiency. The reserve's boundary is "temporarily borrowed" once, the slack block is filled by one "important meeting," the slow lane is asked to "show some results this quarter too." Each step looks reasonable in isolation (each emits a "progress" signal, see the SHEET 04 efficiency paradox); together they are the habitat's slow death. So INSTRUMENT 07's six symptoms are not for scoring and bragging but for early warning: intervening while the erosion is on only one or two of them is far cheaper than discovering it after the value source has dried up. The maintenance cost of a habitat is almost entirely in "holding the boundary against erosion by reasonable-sounding reasons."
多样性不是政治正确,是对抗均值引力的保险
Diversity is not political correctness but insurance against the pull to the mean
栖息地五要素里,多样性最容易被当成口号,其实它有最硬的功能性理由。AI 的默认引力是把分布拉向原型(regression to prototype,SHEET 01);在组织层,这条引力表现为人、来源、方法的收敛——大家用同一套工具、读同一批语料、按同一种最优解工作,于是集体的判断分布越来越窄。多样性是对抗这条引力的结构性保险:保住异质的人(不同背景、不同直觉)、异质的来源(不只喂同一批数据)、异质的方法(不只跑同一种最优),等于在分布上保留多个互不重叠的视角。当默认引力把每个个体都往均值拉时,只有视角的异质性能让集体不塌成单峰。这正是反"单一目标过度优化"公理在组织层的落点:异质性的敌人不是 AI,是所有人被同一个最优解同化。
Of the five habitat elements, diversity is the easiest to take as a slogan, yet it has the hardest functional reason. AI's default gravity pulls the distribution toward the prototype (regression to prototype, SHEET 01); at the organizational level this gravity shows up as the convergence of people, sources, and methods — everyone uses the same tools, reads the same corpus, works to the same optimum, so the collective's judgment distribution narrows. Diversity is structural insurance against this gravity: keeping heterogeneous people (different backgrounds, different intuitions), heterogeneous sources (not fed the same data), heterogeneous methods (not running the same optimum) preserves several non-overlapping vantage points across the distribution. When the default gravity pulls every individual toward the mean, only the heterogeneity of vantage points keeps the collective from collapsing into a single peak. This is the organizational landing of the anti-single-goal axiom: the enemy of heterogeneity is not AI but everyone being assimilated to one optimum.
This logic has a counter-intuitive operational implication: the return on diversity is non-linear and lagging. Most of the time heterogeneous vantage points look like redundancy or even friction — they slow decisions, make consensus harder, a pure negative on the efficiency books (so always the first to be cut, SHEET 04). Their value pays off only in rare moments: when the environment shifts abruptly, when the mainstream optimum fails, when a direction no one thought of is needed — then the heterogeneous vantage point long treated as redundant becomes the only one that can see a way out. This is the same logic as the useless tree, and as neutral networks buying time for evolvability (the SHEET 04 biology). So maintaining diversity is in essence prepaying a premium for a shift whose arrival time you do not know — it looks like a loss day to day, but when the shift hits it is the only reserve not yet assimilated, still able to think of something new. Cutting it on the everyday efficiency books is cancelling that insurance.
INV
12
DASHBOARD · 涌现识别仪表盘
DASHBOARD
信号清单 · 接 SHEET 06
Signal list · to SHEET 06
涌现没法生产,但能被仪表盘照亮
Emergence cannot be produced, but it can be lit by a dashboard
Load-bearing claim (turning SHEET 06 into observable signals): emergence literacy trains "recognizing a new species in the chaos that has already happened." It has no process but has readable gauges: a set of leading indicators (emergence is happening) plus a set of counter-indicators (you are missing or killing it). The gauges do not judge which is the new species — that is constitutive, kept for the human; they only shorten your lag from "emergence happens" to "it is recognized."
受力分析:涌现的定义就是"非任何部件可预先设计",所以它不可能有生产流程——任何"产出涌现"的流程都自相矛盾。但可观测不等于可设计。复杂系统的涌现总在边缘留下痕迹:意料之外的组合开始反复出现、一个非计划的用法被用户自发放大、人与 AI 的交互长出没人设计的回路。仪表盘做的就是把这些边缘痕迹抬到可见,让事后识别快一点——因为延迟越短,放大窗口越大。
Force analysis: emergence is by definition "not designable in advance from any of its parts," so it cannot have a production process; any process that claims to "produce emergence" contradicts itself. But being undesignable does not make it unobservable. Emergence in complex systems always leaves traces at the edge: an unexpected combination starts recurring, an unplanned use is spontaneously amplified by users, the human-AI interaction grows a loop no one designed. The dashboard lifts these edge traces into view so that after-the-fact recognition comes sooner, because the shorter the lag, the more of the amplification window remains.
先行指标 · 涌现正在发生
Leading indicators · emergence is happening
非计划用法在上升:用户/团队自发把产物用在你没设计的地方,且频次在涨
Unplanned uses are rising: users/teams spontaneously use the artifact where you did not design it, and the frequency climbs
意料外的组合反复出现:某两个本不相关的部件总被一起用——可能长出了新物种
Unexpected combinations recur: two unrelated parts keep getting used together — a new species may be growing
边缘比中心更活跃:增长/讨论发生在你规划之外的边缘,不在你押注的中心
The edge is livelier than the center: growth/discussion happens at the unplanned edge, not at the center you bet on
人机回路自长:人与 AI 的协作长出没人写进流程的稳定回路
Human-AI loops self-grow: human-AI collaboration grows a stable loop no one wrote into the process
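One of the leading indicators, "unexpected combinations recur," lends itself to a mechanical first pass. A minimal sketch, assuming usage is logged as sessions listing the parts used together and that the roadmap supplies the planned pairings; `recurring_unplanned_pairs` and the threshold `min_count` are hypothetical. It only surfaces candidates; whether a recurring pair is a new species stays a human call.

```python
from collections import Counter
from itertools import combinations


def recurring_unplanned_pairs(sessions, planned_pairs, min_count=3):
    """Return {(part_a, part_b): count} for off-roadmap pairs seen together
    in at least min_count sessions."""
    planned = {frozenset(pair) for pair in planned_pairs}
    counts = Counter()
    for parts in sessions:
        # Each distinct pair counts once per session.
        for a, b in combinations(sorted(set(parts)), 2):
            pair = frozenset((a, b))
            if pair not in planned:
                counts[pair] += 1
    return {tuple(sorted(pair)): n for pair, n in counts.items() if n >= min_count}
```

Excluding planned pairs is the point of the sketch: a gauge that only counts in-plan combinations would reproduce the "only the bet-on center is watched" counter-indicator.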
反指标 · 你正在错过 / 扼杀它
Counter-indicators · you are missing / killing it
识别延迟在拉长:从涌现发生到被认出的时滞越来越久,放大窗口被错过
Recognition latency lengthens: the lag from emergence to recognition grows; the amplification window is missed
非计划用法被当噪声清掉:偏离路线图的信号被当成"用错了"删掉,而非当成种子
Unplanned uses are cleared as noise: off-roadmap signals are deleted as "misuse" instead of treated as seeds
只看押中的中心:仪表只盯计划内指标,边缘根本不在视野里
Only the bet-on center is watched: the gauges track only in-plan metrics; the edge is not in view at all
放大窗口被效率关掉:新物种刚冒头就被要求"证明 ROI",在能被识别前被砍
The amplification window is closed by efficiency: a new species, barely emerged, is asked to "prove ROI" and is cut before it can be recognized
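The first counter-indicator, recognition latency, can be tracked once a team is willing to backdate when each emergence actually began. A minimal sketch under that assumption; the half-versus-half comparison is an illustrative choice, not a calibrated test, and `recognition_latencies` and `latency_is_lengthening` are hypothetical names.

```python
from statistics import mean


def recognition_latencies(events):
    """events: iterable of (emerged_on, recognized_on) date pairs.
    Returns lags in days, ordered by emergence date."""
    return [(recognized - emerged).days for emerged, recognized in sorted(events)]


def latency_is_lengthening(events, min_events=4):
    """Compare the mean lag of the later half with the earlier half.
    Returns None when there are too few events to say anything."""
    lags = recognition_latencies(events)
    if len(lags) < min_events:
        return None
    half = len(lags) // 2
    return mean(lags[half:]) > mean(lags[:half])
```

Because the emergence date is only assigned in hindsight, the figure is itself a judgment-laden estimate and belongs on the exploration ledger.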
证据锚 · 收敛偏置是已发生的硬信号Evidence anchor · the convergence bias is a hard signal that has already happened
仪表盘的反指标不是空想——它有已发生的硬证据撑腰。Hao, Xu, Li & Evans《AI tools expand scientists' impact but contract science's focus》, Nature 649(8099) 2026, DOI 10.1038/s41586-025-09922-y[R15](Ⅱ 同行评议 + 开放数据/代码,观测性文献计量·有选择效应口径):4129 万篇论文里,AI 增强使个体影响力涨(引用 4.84×),但集体层面主题覆盖收缩 4.63%、学者互动↓22%、winner-take-all(Gini 0.754 vs 0.690)。机理=AI 向数据丰富区聚集、自动化既有领域而非探索新领域。这正是"放大窗口被效率关掉""只看押中的中心"的宏观版——生成层本身有保守偏置,会把涌现拉回数据丰富的已知区。配套:James Evans《After Science》(方法论单一化)。(涌现识别学本身仍是 Ⅲ 级理论推演;收敛偏置 Ⅱ 级,但因果解读须谨慎;走探索账。)The dashboard's counter-indicators are not speculation — they are backed by evidence that has already happened. Hao, Xu, Li & Evans, "AI tools expand scientists' impact but contract science's focus," Nature 649(8099) 2026, DOI 10.1038/s41586-025-09922-y[R15] (Grade II peer-reviewed plus open data/code, an observational bibliometric study with a selection-effect caveat): across 41.29 million papers, AI augmentation raised individual impact (4.84× citations) but at the collective level topic coverage contracted 4.63%, scholar interaction fell 22%, winner-take-all (Gini 0.754 vs 0.690). The mechanism: AI clusters into data-rich regions, automating existing fields rather than exploring new ones. This is the macro version of "the amplification window closed by efficiency" and "watching only the bet-on center" — the generation layer itself carries a conservative bias that pulls emergence back toward data-rich known regions. Companion: James Evans, "After Science" (methodological monoculture). (Emergence literacy itself remains a Grade III theoretical extrapolation; the convergence bias is Grade II but causal reading must stay cautious; on the exploration ledger.)
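For readers unfamiliar with the Gini figures quoted above, the coefficient measures how concentrated a quantity is: 0 means perfectly even, and values near 1 mean a few holders take almost everything. The sketch below is the standard textbook formula applied to made-up numbers; it is not the study's data, and the study's own computation is not restated in this sheet.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = even, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum form of the mean absolute difference.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n
```

Read this way, a move from 0.690 to 0.754 says attention is concentrating on fewer winners, which is the "winner-take-all" reading given in the text.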
仪表盘的两个用途:缩短延迟,对抗保守偏置
Two uses of the dashboard: shorten the lag, fight the conservative bias
The dashboard solves two distinct problems; do not conflate them. The first is the lag problem: after emergence happens there is an amplification window — before it is recognized and resourced, it is still fragile and easily cleared as noise. The longer the lag, the higher the chance of missing the window. The dashboard's leading indicators (unplanned uses rising, unexpected combinations recurring, the edge livelier than the center) light that window early, giving people a chance to recognize it before it closes. This is pure observation engineering, not the judgment of which is the new species — that step is constitutive and kept with the human.
The second problem is deeper: the generation layer itself carries a conservative bias. The mechanism from Hao, Xu, Li, and Evans's Nature 2026 study (41.29M papers, Grade II): AI tends to cluster into data-rich regions, automating existing fields rather than exploring new ones; individual impact rises (4.84× citations) but at the collective level topic coverage contracts, scholar interaction falls 22%, winner-take-all intensifies. In other words, hand recognition to the same generative system and it will systematically pull you back toward the known, data-rich regions, missing exactly the sparse edge most likely to incubate a new species. So the dashboard's counter-indicators (watching only the bet-on center, clearing unplanned uses as noise) are not airy admonitions but a micro hedge against a macro bias that has been measured: humans must deliberately allocate attention to the edge the generative system ignores, or emergence literacy is quietly hollowed out by this conservative bias.
放大决策:在还看不清时就得动手的两难
The amplification decision: acting while it is still unclear
After the dashboard lights up emergence, a genuinely hard judgment remains: when to act and amplify? There is an inherent dilemma here. Amplify too early — the new species is not yet stable, evidence is thin, and you pour resources into amplifying something that may not hold at all, the cost of mistaking noise for signal; amplify too late — the window has closed, the new species is either cleared by efficiency or recognized first by someone else, the cost of missing out. Both sides carry cost, and you must decide on insufficient evidence, because by the time evidence is sufficient the window has usually closed. This is exactly why emergence literacy is judgment, not computation: no threshold tells you "once this many signals accumulate, amplify"; it demands the capacity to bet under uncertainty.
本卷给这个两难的对策不是一个公式,是一套姿态,借自前面所有刻度:用 affordable loss 把"放大太早"的代价压到可承受(INSTRUMENT 08)——小额、可逆地先投一点,看新物种是否在投入下变强;用证伪检查把"放大太早"的概率压低——问"它为假的条件是什么、这一轮的早期投入能不能击穿它";用散木保护区把"放大太晚"的概率压低——让新物种在被正式放大前,有一片不被效率清掉的地方先活着。换句话说,放大决策不是一个孤立的判断,是前面整具罗盘的合用:信噪比刻度教你认出它、价值感知刻度教你判它值不值、散木刻度给它存活空间、责任刻度提醒你放大它的后果由谁担。仪表盘只负责把它点亮——决定动不动手、动多大,永远是那个不可外化的、留给人的判断。
This volume's answer to the dilemma is not a formula but a stance, borrowed from every mark before it: use affordable loss to press the cost of "amplifying too early" into the bearable range (INSTRUMENT 08) — invest a little first, reversibly, and watch whether the new species strengthens under the input; use falsification checks to lower the probability of "amplifying too early" — ask "what is its falsifying condition, can this round's early input puncture it"; use the useless-tree reserve to lower the probability of "amplifying too late" — let the new species survive in a place not cleared by efficiency before it is formally amplified. In other words, the amplification decision is not an isolated judgment but the whole compass used together: the signal-to-noise mark teaches you to spot it, the value-perception mark to judge whether it is worth it, the useless-tree mark gives it survival space, the responsibility mark reminds you who bears the consequence of amplifying it. The dashboard only lights it up — deciding whether to act, and how big, is always that inexternalizable judgment kept for the human.
INV
13
EVIDENCE · 证据锚与边界
EVIDENCE
双账本 · 谁适用 · 起步
Two ledgers · who · start
把这具罗盘的承重,逐条摆到证据等级上
Put this compass's load-bearing claims, one by one, onto the evidence grades
Load-bearing claim (two ledgers · honest grading): this volume's spine claims are not from intuition — they have first-hand evidence, but at uneven grades. The evidence ledger carries reliability; the exploration ledger carries leading indicators and extrapolation. The table below places each load-bearing claim on grades I–V, marking which is settled, which is still a preprint theory, which is a Grade III extrapolation, without mixing the books.
Ⅱ
异质价值学不到(实证已坐实)Heterogeneous value is not learnable (settled empirically)
IndieValueCatalog(Jiang, Sorensen, Levine, Choi, ACL 2025 Long Papers pp.6757–6794, DOI 10.18653/v1/2025.acl-long.336;arXiv:2410.03868):前沿 LM 预测个体价值仅 55–65%,人口统计学无法近似。坐实"AI 学得到平均、学不到异质"。承 SHEET 03/05 基岩。IndieValueCatalog (Jiang, Sorensen, Levine, Choi, ACL 2025 Long Papers pp.6757–6794, DOI 10.18653/v1/2025.acl-long.336; arXiv:2410.03868): frontier LMs predict individual values at only 55–65%, and demographics cannot approximate them. This settles "AI learns the average, not the heterogeneous." Carries the SHEET 03/05 bedrock.
Ⅱ
收敛偏置(已发生的硬信号)Convergence bias (a hard signal that has happened)
Hao, Xu, Li & Evans, Nature 649(8099) 2026, DOI 10.1038/s41586-025-09922-y:4129 万篇论文,主题覆盖收缩 4.63% / 学者互动↓22% / Gini 0.754。观测性、有选择效应口径。承 SHEET 12 反指标与"生成层保守偏置"。Hao, Xu, Li & Evans, Nature 649(8099) 2026, DOI 10.1038/s41586-025-09922-y: 41.29M papers, topic coverage contracted 4.63% / scholar interaction down 22% / Gini 0.754. Observational, with a selection-effect caveat. Carries the SHEET 12 counter-indicators and the "conservative bias of the generation layer."
Ⅱ
散木=定律(生物学硬证)The useless tree = law (hard biology)
中性网络(neutral networks)与基因复制:看似冗余的"无用"基因是适应新环境的原料库。把"最优≠最精简"从启发式叙事升为有一手证据的定律。承 SHEET 04/11。Neutral networks and gene duplication: seemingly redundant "useless" genes are the raw-material bank for adapting to new environments. This lifts "optimal ≠ leanest" from a heuristic narrative to a law with first-hand evidence. Carries SHEET 04/11.
Ⅲ
价值须转向涌现(preprint 理论)Value must turn to emergence (preprint theory)
Spizzirri《The Specification Trap》arXiv:2512.03048(单人·哲学论证·未同行评议):内容式价值对齐在能力扩张下结构性失败,三支柱=Hume is-ought + Berlin 价值多元不可公度 + 扩展框架问题;结论"从价值规约转向价值涌现"。与本卷生态指南姿态逐字同构。引用须写"论证/主张"非"已证明"。承 SHEET 05/11。Spizzirri, "The Specification Trap," arXiv:2512.03048 (single-author, philosophical argument, not peer-reviewed): content-based value alignment fails structurally under capability expansion; three pillars = Hume's is-ought + Berlin's incommensurable value pluralism + the extended frame problem; conclusion, "from value specification to value emergence." Word-for-word isomorphic with this volume's ecology-guide stance. Cite as "argues / claims," not "proven." Carries SHEET 05/11.
Ⅲ
共识可学 / 反共识不可学(preprint)Consensus learnable / anti-consensus not (preprint)
RLCF(Li et al. 2025-06)学社群共识="predict taste without having taste"、过度优化挤出反共识;配套 MaxMin-RLHF 不可能定理、Preference-Validity Compression(arXiv:2606.10569)、RLHF≈Condorcet(arXiv:2506.12350)。支持(preprint 级)SHEET 05 分叉:可外化共识可系统化(练法)、反共识不可学(栖息地)。
RLCF (Li et al. 2025-06) learns community consensus, that is, it can "predict taste without having taste," and over-optimization crowds out the anti-consensus. Companion results: MaxMin-RLHF's impossibility theorem, Preference-Validity Compression (arXiv:2606.10569), RLHF≈Condorcet (arXiv:2506.12350). Supports, at preprint grade, the SHEET 05 fork: externalizable consensus can be systematized (drills); the anti-consensus cannot be learned (habitat).
effectuation 五原则(Sarasvathy:bird-in-hand / affordable-loss / crazy-quilt / lemonade / pilot-in-the-plane)· JTBD/ODI(Christensen / Ulwick)· 庄子散木(《人间世》"无用之用")。诚实标注:effectuation 与散木已核实;JTBD/ODI 据通识引用、未逐一抓一手页面。承 SHEET 03/04/10。Effectuation's five principles (Sarasvathy: bird-in-hand / affordable-loss / crazy-quilt / lemonade / pilot-in-the-plane) · JTBD/ODI (Christensen / Ulwick) · Zhuangzi's useless tree ("the use of the useless," In the World of Men). Honest note: effectuation and the useless tree are verified; JTBD/ODI cited from general knowledge, not each traced to a first-hand page. Carries SHEET 03/04/10.
最弱的一环 · 诚实摆出The weakest link · stated honestly
最弱的一环必须摆出来:涌现识别学(SHEET 06/12)整体是 Ⅲ 级理论推演——γ 涌现本身没有一手实证,先行指标(识别延迟 / 放大命中率)是提案、非校准过的判据,不作规划依据,全走探索账。本卷核心命题可证伪(SHEET 05):若证明异质构成性价值可被无损系统化,全卷倒。FRI ForecastBench 拆分 Brier、RLCF 能否学反共识前沿价值,是两个待坐实的关键前沿(见最后一层动态三分)。这才是命题而非口号。The weakest link must be put on the table: emergence literacy (SHEET 06/12) is a Grade III theoretical extrapolation as a whole — γ emergence has no first-hand empirics, and its leading indicators (recognition latency / amplification hit rate) are proposals, not calibrated criteria, not a basis for planning, all on the exploration ledger. This volume's core claim is falsifiable (SHEET 05): if heterogeneous constitutive value is shown to be losslessly systematizable, the whole volume falls. FRI ForecastBench's split Brier, and whether RLCF can learn anti-consensus frontier value, are two key frontiers still to be settled (see the closing dynamic trichotomy). That is what makes it a claim and not a slogan.
为什么分两本账:把可靠性和先行指标分开记
Why two ledgers: keep reliability and leading indicators on separate books
This volume deliberately keeps its load-bearing claims on two ledgers, unmixed. The evidence ledger holds claims with first-hand empirics that can be independently rechecked — they carry the methodology's reliability, and may be cited as "settled." The exploration ledger holds leading indicators, mechanistic claims, and Grade III theoretical extrapolations — they point a direction and pose hypotheses but are not yet settled, and may only be cited as "the model predicts / a proposal," never "proven." Mixing the books is the most common way this kind of methodology loses trust: telling an attractive Grade III extrapolation (say, emergence literacy) as if it were a Grade II fact, so that once the reader notices, the whole volume's credibility is implicated. The benefit of separation: the reliable part is not dragged down by the extrapolation, and the extrapolation need not pretend to be hard — it sits honestly on the exploration ledger as "a frontier worth pursuing," not "a conclusion already standing."
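维度 Dimension | 证据账 Evidence ledger | 探索账 Exploration ledger
记什么 Records | 有一手实证、可独立复核的命题 Claims with first-hand empirics, independently recheckable | 先行指标、机理性主张与 Ⅲ 级理论推演 Leading indicators, mechanistic claims, and Grade III theoretical extrapolations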
How to read the table is simple: whenever someone takes a claim from this volume to make a decision, first ask which ledger it is on. What is on the evidence ledger can serve as a basis; what is on the exploration ledger can only serve as a hypothesis — worth verifying, worth trialing, but do not stake an unbearable loss on it (see INSTRUMENT 08). This is also this volume's concrete definition of "honesty": not saying less, but marking the reliability grade of every claim it does say.
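The "first ask which ledger it is on" rule can be sketched as a small lookup. Two things here are assumptions, not statements from the volume: the cut-off (grades I and II on the evidence ledger, III to V on the exploration ledger, inferred from the entries above), and the choice to let evidence-ledger claims pass the stake check unconditionally. `Claim`, `ledger_for`, `citation_verb`, and `may_stake` are hypothetical names.

```python
from dataclasses import dataclass

EVIDENCE_MAX_GRADE = 2  # assumption: grades I-II are evidence, III-V exploration


@dataclass(frozen=True)
class Claim:
    text: str
    grade: int  # 1..5, standing for grades I..V


def ledger_for(claim):
    if not 1 <= claim.grade <= 5:
        raise ValueError("grade must be between 1 and 5")
    return "evidence" if claim.grade <= EVIDENCE_MAX_GRADE else "exploration"


def citation_verb(claim):
    """How the claim may be cited, following the two-ledger rule."""
    return "settled" if ledger_for(claim) == "evidence" else "a proposal"


def may_stake(claim, loss_if_wrong, affordable_loss):
    """Exploration-ledger claims may only carry a loss the team can bear."""
    if ledger_for(claim) == "evidence":
        return True  # assumption: evidence-ledger claims can serve as a basis
    return loss_if_wrong <= affordable_loss
```

The design choice worth noting is that the ledger is derived from the grade rather than stored separately, so a claim cannot be quietly moved to the evidence ledger without its grade changing.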
INV
13
FRONTIER · 推演幕(13·5)
FRONTIER (13·5)
前瞻 · 自标死亡条件
Projection · self-named death conditions
自动化前线右移,而喉部仍在原地
The automation front moves right, while the throat stays put
Load-bearing claim: the vertical line in SHEET 01 — "the automation front moves right over time" — is not rhetoric; it has concrete coordinates and moves right year by year. This act nails it into a dated arc (2026→2030→2032), gives each of the three forces pushing it a named falsification condition, and honestly records the strongest counter-bet against this volume. The front always eats the mouth of the funnel; the throat this volume guards (recognition) is un-eaten in every projection here — which is exactly the point to be falsified.
Projection is not prophecy. Its use is to take "the front moves right" — a claim this volume leans on repeatedly — and turn it from a slogan into a set of concrete bets reality can slap down: dated, force-named, and each force tagged with the observation that would extinguish it. The right way to read this act is as a ledger of wagers: which one reality redeems first, and which it falsifies first, decides whether this compass still points north in 2032.
For a projection to count as "slappable by reality," it must meet three hard conditions this volume sets itself, or it is merely a slogan dressed as a prediction. One, date it: not "eventually" or "sooner or later" but "by 2028 this line reaches X" — a dateless prophecy is forever right and therefore forever uninformative. Two, name the force: say plainly what pushes the front rightward (model capability, toolchain maturity, the cost curve), not appeal to a subjectless force like "the trend"; a force that can be named is a force that can be tracked. Three, state the extinguishing condition: for each force, write in advance "what observation would make me admit this force is in fact not pushing" — this is the step that nails a projection into a wager rather than a faith. Meet all three and the arc earns entry into the same ledger as SHEET 01; lacking any one, it should be demoted back to "a vision," unworthy of a reader's judgment bandwidth. This self-discipline is the volume's own root — "doubt appearance by default, deliberately hunt the falsifying condition" (SHEET 08) — turned on itself: a methodology that demands others falsify must first make its own core claim falsifiable.
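The three conditions read naturally as a validation check. A minimal sketch, assuming a projection is recorded as a small structured entry; the field names, the `SUBJECTLESS_FORCES` list, and the example values are illustrative, and deciding whether a stated force is genuinely trackable still takes judgment that a string check cannot supply.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: phrases treated as naming no trackable force.
SUBJECTLESS_FORCES = {"the trend", "progress", "the market"}


@dataclass(frozen=True)
class Projection:
    claim: str
    by_year: Optional[int] = None          # condition one: a date
    force: Optional[str] = None            # condition two: a named force
    extinguished_if: Optional[str] = None  # condition three: what would end it


def missing_conditions(p):
    """List which of the three hard conditions the projection fails."""
    missing = []
    if p.by_year is None:
        missing.append("date")
    if not p.force or p.force.strip().lower() in SUBJECTLESS_FORCES:
        missing.append("named force")
    if not p.extinguished_if:
        missing.append("extinguishing condition")
    return missing


def classify(p):
    """A projection is a wager only if all three conditions are met."""
    return "vision" if missing_conditions(p) else "wager"
```

Returning the list of missing conditions, rather than a bare pass or fail, keeps the demotion to "a vision" explainable to whoever wrote the projection.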
The projection must also separate two things often conflated: what moves and what does not. What moves is the coordinate of the automation front — it shifts right year by year, bringing into machine reach ever more of the judgment that yesterday needed a human; this arc draws exactly that. What does not move is the "throat": wherever the front advances to, a final stretch of value judgment stays on the human side — not because technology temporarily cannot reach it, but because its raw material (lived experience, real need, the inner conviction of one who pays for the consequence) is in principle non-externalizable (see SHEET 03 / 07.5). Telling the two apart matters greatly, because the most common misreading is precisely reading "the front is moving" as "the throat too will eventually be swallowed, the human will hand it all over in time." Drawing a moving arc here is exactly to set off the line that does not move: the further the arc is pushed, the clearer which stretch is truly immovable — the other face of this volume's self-written obituary condition. If the front truly swallows the throat, this volume is wrong; if the front advances while the throat remains, this volume's load-bearing claim is confirmed by reality one year at a time.
FIG. 13.5 · 自动化前线的有日期弧
FIG. 13.5 · The dated arc of the automation front
看懂:同一条竖线,逐年右移;但它永远停在"识别墙"左侧,墙右是结构性守住的反共识价值。
Read: the same vertical line, moving right year by year, yet it always halts left of the "recognition wall"; right of the wall is the structurally held anti-consensus value.
看点:前线的右移是真的、可观测的,且本卷不否认它会继续。本卷唯一的赌注是那道墙不动:它由信息论(生成易、验证难)和偏好聚合的不可能定理双重支撑。把这道墙画在固定位置,就是把本卷的可证伪点画了出来:哪天前线越过墙,本卷就错了。
Takeaway: the front's rightward march is real and observable, and this volume does not deny it will continue. The volume's only wager is that the wall does not move; it is held up by both information theory (generation easy, verification hard) and the impossibility theorem of preference aggregation. Drawing the wall at a fixed position draws the volume's falsification point: the day the front crosses the wall, the volume is wrong.
有日期的弧:前线在 2026 / 2030 / 2032 各停在哪
The dated arc: where the front sits in 2026 / 2030 / 2032
In 2026 the front sits at the gradient's left stretch: style, lint, labelable taste, settled community consensus. RLCF (reinforcement learning from community feedback) is externalizing this stretch into a reward signal. Practical marker: teams start handing "which option fits our design spec" to the model for bulk filtering, while keeping "should we be doing this at all" in human hands. Generation is already free; the externalizable subset of recognition begins to loosen.
MID · 2030
前线逼近"异质口味",撞上不可能定理
The front reaches "heterogeneous taste" and hits the impossibility theorem
The front advances to the gradient's middle. Here comes the first structural deceleration: the impossibility theorem of aligning a single model to heterogeneous preferences (the MaxMin-RLHF line, grade Ⅲ theory) begins to bite — stuffing more people's taste into one reward model only converges it to a Condorcet-style majority median, systematically crowding out the anti-consensus. A wave of "personalized alignment" products will try to route around it; this volume predicts they either degrade into shallow persona-presets or hand judgment back to humans. The externalizable stretch of recognition is largely consumed; the inexternalizable stretch has not budged.
FAR · 2032
前线贴住识别墙,价值发现成为唯一稀缺岗位
The front presses against the recognition wall; value discovery becomes the one scarce role
The front presses against the left edge of the recognition wall and stops. Right of the wall — constitutive value, the anti-consensus frontier, the conviction earned only through long friction with the world — is still held by people, because it resists externalization (information theory) and resists aggregation (the impossibility theorem). Roles for "producing more ideas" zeroed out long ago; everyone left does the same thing: betting which direction in the expanding adjacent possible is worth it, and owning the consequences (see SHEET 07.5). The volume's entire thesis is either redeemed or bankrupt in this 2032 cell.
推前线右移的力,每一股都可能熄火——所以每一股都标了证伪条件
The forces driving the front right can each stall — so each carries a falsification condition
The front does not move on its own; three nameable forces push it. They are listed separately because each can stall — and each stall reshapes the arc. Every force below is tagged with the observation under which it should be judged to have stopped.
共识口味的可学性 · CONSENSUS LEARNABILITY
Consensus Learnability
推力:RLCF 一系证明"已成形的社群共识"可被当奖励信号学会,梯度左段被持续吃进 ① 充裕。这是前线右移最直接的引擎(证据级 Ⅲ preprint)。
Pushes by: the RLCF line shows that "settled community consensus" can be learned as a reward signal, so the gradient's left stretch is continuously eaten into ① abundance. This is the most direct engine of the front's advance (grade Ⅲ preprint).
证伪:若三年内出现一个对齐方法,能在不挤出反共识的前提下学会异质口味(即绕过 MaxMin 不可能定理),则前线不止吃左段,会越过中段,本卷的"识别墙不动"被推翻。
Falsified if: within three years an alignment method learns heterogeneous taste without crowding out the anti-consensus (i.e. routes around the MaxMin impossibility theorem). The front would then eat past the middle, not just the left, and this volume's "the wall does not move" is overturned.
生成成本继续坠落 · GENERATION COLLAPSE
Generation Cost Collapse
推力:推理单价继续向零坠落,邻近可能的圈以更快倍率外推(SHEET 02 FIG 2.1)。它不直接吃识别,但把噪声地板推得更高,反向加重识别负担;它推的是漏斗入口,不是喉部。
Pushes by: inference unit-price keeps falling toward zero; the ring of the adjacent possible expands at a faster multiple (SHEET 02 FIG 2.1). It does not eat recognition directly, but it raises the noise floor and so worsens the recognition burden; it pushes the funnel mouth, not the throat.
证伪:若推理成本反而因算力地租、能源或监管而抬升并稳住,则"生成免费"前提松动,整卷的"瓶颈已迁到识别"会退回程度之别。但 2024–2026 的价格曲线指向反面。
Falsified if: inference cost instead rises and holds, due to compute rent, energy, or regulation. The "generation is free" premise then loosens and the whole volume's "the bottleneck has moved to recognition" reverts to a difference of degree. But the 2024–2026 price curve points the other way.
异质性的可计算化 · COMPUTABLE NOVELTY
Computable Novelty
推力:novelty-search / MAP-Elites / 开放式算法表明:放弃单一目标函数,机器也能产异质(SHEET 01)。若"什么值得不同"本身可被形式化为搜索目标,前线就能侵入墙右。这是最该警惕的一股力(证据级 Ⅲ)。
Pushes by: novelty-search, MAP-Elites, and other open-ended algorithms show that, once the single objective is dropped, machines can produce heterogeneity too (SHEET 01). If "what is worth being different about" can itself be formalized as a search target, the front can invade right of the wall. This is the force to watch most (grade Ⅲ).
证伪:若有系统能自己设定"值得不同"的目标(而非由人喂入多样性度量),并且其产出被独立判定为连接了真实需求,那么 ④ 的"人定义什么值得不同"也塌了,本卷的承重墙整面倒下。目前所有开放式算法的多样性度量仍由人给定。
Falsified if: a system can set for itself the target of "worth being different" (rather than being fed a diversity metric by humans), and its output is independently judged to connect to a real need. Then ④'s "humans define what is worth being different about" collapses too, and the volume's load-bearing wall falls wholesale. So far the diversity metric of every open-ended algorithm is still human-supplied.
The best way to make the 2032 cell touchable is not another paragraph of argument but to show you an object that would really exist in that world. The job posting below is fictional, yet every line of it is derivable from this volume's claims: what a job ad looks like once "producing ideas" has zeroed out and recognition is the one scarce role.
SPECULATIVE · Fiction
ARTIFACT · 2032 Job Posting
Hiring: Problem-Selection Lead — "idea-output" résumés will not be read
From the ~4,000 "looks-feasible" directions our agent fleet generates weekly, bet on no more than 3 per quarter — and own the abandonment of all the rest. Your output is not proposals; it is cuts.
≥ 8 years of first-hand friction in some real domain (the inexternalizable understanding of the world, see SHEET 03). We do not count how many ideas you have produced — an agent produces more in an afternoon than you will in a lifetime.
Evaluated on
Hit rate, abandon rate, emergence-recognition latency (how fast you name a new species after the fact). Output volume is not evaluated. Theatre-style "pilots run" counts against you.
Compensation
Base + a dividend on directions you cut that were later proven genuinely not-worth-doing. We pay you for the things you did not do.
This document is a projection instrument, not a predictive assertion: it folds the claims "recognition > generation," "the nerve to abandon is the new scarce skill," and "people retreat to inexternalizable understanding of the world" into one concrete object, so you can test whether those claims still hang together in 2032. If it reads as absurd, some claim has just been falsified by your intuition — which is exactly the reaction it is built to trigger.
The counter-bet: where this volume is most likely wrong
Honesty demands recording the strongest counter-argument, not only the evidence that flatters us. This volume bets that "the recognition wall does not move"; the strongest wager against it is "computable heterogeneity": open-ended algorithms (the novelty-search, MAP-Elites, quality-diversity line) have shown that, dropping the single objective, machines produce genuine heterogeneity rather than regressing to a prototype. The volume's defense is "the diversity metric is still human-supplied — humans define what is worth being different about." But that defense has a crack: if one day a system can infer for itself, from real interaction with the world, which dimensions of diversity are worth pursuing (rather than being fed them), then step ④'s "humans define value" is eroded, and the recognition wall is breached from the right.
Reading this arc as a ledger, the most useful move is not guessing which force is strongest but working out which wager reality redeems first and which it falsifies first — because whichever turns over first dictates how fast you should adjust your stance. The most likely to be redeemed first is the rightward shift of the "viable-path search" stretch: the model's coverage of "how to make an already-set direction work" widens year by year, and this will be confirmed repeatedly well before 2032. The one most worth watching, and most likely to "slap" this volume, is the "real-need verdict" stretch — if some day a system appears that can, without a human injecting lived experience, stably tell a real job from an imagined need (and that telling survives affordable-loss trials rather than after-the-fact cherry-picking), then a corner of this volume's load-bearing claim that "value perception is non-externalizable" is broken by reality. This volume does not fear that day arriving; what it fears is handing over the judgment before it arrives — taking "the model says this is a real need too" for a verified real need. The ledger's discipline is therefore two-way: it forces the frontier school to state its extinguishing conditions and forces this volume to state its own obituary condition, and whoever reality turns over first is the one who must concede.
This volume does not pretend the crack is absent; its wager is that the crack will not close — because "worth pursuing" embeds a value premise, and the source of a value premise (constitutive conviction about the world) is exactly the part doubly walled by information theory and the impossibility theorem. Which wager redeems first is the most trackable point of divergence after this volume. If, before 2030, a system appears that sets its own diversity target and whose output is independently judged to connect to a real need, please file this volume as one overconfidence of the "difference-of-degree" school — that is the obituary condition the volume writes for itself.
INV
14
LANDING · How to use the compass
Landing · how to read it
Not a three-step process, but how to use and calibrate this compass
Load-bearing claim (the closing layer · dynamic trichotomy): this volume gives no assembly line — it gives principles, signals, starting moves, and one playable compass. The closing layer offers no static answer; it splits into invariant / shifting / frontier: which mark is settled, which is moving, which remains open.
Principles: generate many · bet few and sharp / real need before looks-feasible / protect the useless tree / leave an interface for emergence and recognize it after the fact.
Signals: hit rate / abandon rate / useless-tree retention / serendipity hit rate / emergence-recognition latency. (All on the exploration ledger: offered as leading indicators, to be calibrated by your own bookkeeping.)
Start: run one round of falsifying "looks-feasible" / fence off one useless-tree reserve / give the team one shared compass (the instrument below).
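The signals listed above can be tallied from a plain ledger of considered directions. What follows is a minimal sketch in Python, assuming a hypothetical `LedgerEntry` record; the field names and the exact definition of each rate are illustrative assumptions, not criteria this volume has calibrated. Only hit rate and abandon rate are shown; the other signals would need timestamps and a reserve register that this sketch does not model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LedgerEntry:
    """One considered direction (hypothetical record shape)."""
    name: str
    backed: bool                 # True if the direction received a bet
    hit: Optional[bool] = None   # outcome of a backed bet; None while still open


def ledger_signals(entries: list[LedgerEntry]) -> dict[str, float]:
    """Return abandon rate and hit rate for a list of ledger entries.

    Assumed definitions:
      abandon_rate = directions not backed / directions considered
      hit_rate     = backed bets that hit / backed bets with a known outcome
    """
    considered = len(entries)
    backed = [e for e in entries if e.backed]
    resolved = [e for e in backed if e.hit is not None]

    abandon_rate = (considered - len(backed)) / considered if considered else 0.0
    hit_rate = sum(1 for e in resolved if e.hit) / len(resolved) if resolved else 0.0
    return {"abandon_rate": abandon_rate, "hit_rate": hit_rate}
```

Keeping the definitions in one small function makes it easy to recalibrate them against your own bookkeeping, which is how the signals are offered here.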
INVARIANT
The tacit value anchor can only be cultivated
Constitutive, heterogeneous value definition cannot be losslessly outsourced; the methodology can only cultivate the conditions for its emergence, never teach it directly. The bedrock sits at ④.
SHIFTING
Externalizable signals can be systematized
The RLCF model predicts that "culling the unachievable" and converging on consensus taste can be learned, so the externalizable part of value perception is being automated (grade Ⅲ preprint, exploration ledger).
FRONTIER
Whether anti-consensus frontier value is learnable
The decisive open question of the innovation fork: if it is learnable without degrading to the average, this volume's claim falls (the SHEET 05 falsification condition). Unresolved for now; on the exploration ledger.
Take an idea or direction and set each of three axes one notch: real need × viable path × inner conviction. The compass synthesizes a reading plus a one-line diagnosis — this is not a router (it does not allocate work) but a compass for calibrating value perception. All three axes come from the SHEET 03 value-perception formula; the reading re-renders on language toggle.
① · Real need
② · Viable path
③ · Inner conviction
Reading note
The compass does not decide for you: it splits "is it worth it?" into three conversable axes so that borrowed conviction, looks-feasible paths, and imagined needs have nowhere to hide. The fourth radar vertex is a synthesized "worth score," for visualization only; the real diagnosis is in the one line. (Exploration ledger: the diagnosis thresholds are heuristic, not calibrated criteria.)
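The three-axis reading can be sketched in a few lines of Python. This is an illustrative sketch, not the instrument's actual logic: the 0 to 4 notch scale, the thresholds, and the wording of each diagnosis are assumptions introduced here. Only the multiplicative form (real need × viable path × inner conviction) and the "high viable path, low real need" trap come from the text.

```python
from dataclasses import dataclass

MAX_NOTCH = 4  # assumed scale: each axis is set from 0 to 4 notches


@dataclass(frozen=True)
class Reading:
    worth: float     # synthesized "worth score" in [0, 1], for visualization only
    diagnosis: str   # the one-line diagnosis, which carries the real reading


def read_compass(real_need: int, viable_path: int, inner_conviction: int) -> Reading:
    axes = {
        "real need": real_need,
        "viable path": viable_path,
        "inner conviction": inner_conviction,
    }
    for name, notch in axes.items():
        if not 0 <= notch <= MAX_NOTCH:
            raise ValueError(f"{name} must be between 0 and {MAX_NOTCH}")

    # Multiplicative synthesis: a zero on any axis zeroes the whole score.
    worth = (real_need * viable_path * inner_conviction) / MAX_NOTCH ** 3

    low, high = 1, 3  # heuristic thresholds (assumed, not calibrated)
    if viable_path >= high and real_need <= low:
        diagnosis = "looks-feasible trap: high viable path, low real need"
    elif real_need >= high and viable_path >= high and inner_conviction <= low:
        diagnosis = "conviction gap: need and path hold, but the conviction is weak or borrowed"
    elif min(axes.values()) >= high:
        diagnosis = "candidate bet: all three axes hold; write out its falsifying condition"
    else:
        weakest = min(axes, key=axes.get)
        diagnosis = f"unresolved: test the weakest axis first ({weakest})"
    return Reading(worth=worth, diagnosis=diagnosis)
```

The product, rather than a sum, is what lets one empty axis veto the score: a direction with a perfect path and no real need reads as zero, not as two-thirds worth it.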
Series cross-links
Innovation (direction) → design (good or not) → engineering (right or not) → organization (who does it). SHEET 03 links to effectuation's "bird in hand"; SHEET 04's useless tree links to the organization volume's human through-line; SHEET 06's emergence literacy links to the γ mechanism; cleanly split from the design volume (design judges good-or-not, innovation judges worth-it-or-not).
How to actually start: three minimal moves you can make today
A compass is not "used" merely by being read; it must be picked up and calibrated. Three starting moves follow, each deliberately cut down to a minimal version you can begin today without any approval.
First, run one round of falsifying "looks-feasible." Take the three most favored directions in hand and run each through INSTRUMENT 06's three axes plus SHEET 10's falsification checklist, asking: can its falsifying condition be written out, and can it be broken by reality at low cost? In most cases at least one of the most favored directions is exposed in this round: high viable path, low real need, a classic looks-feasible trap. The output of this round is not "cutting one direction" but shifting the centre of judgment back from appearance to real need.
Second, fence off one useless-tree reserve. It need not be large: mark a clear exploration block or budget aligned to no KPI, write it into the system, and assign one person to hold its boundary (see SHEET 11). The point is not its size but the hardness of its boundary: whether it can withstand the first "temporary borrowing" request.
Third, give the team one shared compass. Turn INSTRUMENT 06's three-axis language into the team's public vocabulary for assessing directions. When an idea is discussed, the question is no longer "I feel it's viable" but "on the real-need axis, is it a verified job or an imagined need; on the conviction axis, is it yours or borrowed." The value of a shared compass is not in scoring but in leaving "looks-feasible" and "borrowed conviction" nowhere to hide in team conversation.
Why does the closing layer use "invariant / shifting / frontier" rather than give a static answer? Because this volume handles direction, and the criteria for direction are themselves in motion. Laying the load-bearing claims out as a dynamic trichotomy is the volume's last honesty to the reader: which cell (the tacit value anchor can only be cultivated) is settled and can serve as foundation; which cell (externalizable signals can be systematized) is moving and must be continually re-tested; which cell (whether anti-consensus frontier value is learnable) is still open and is where this volume's falsification condition lives. The right way to read this compass is not to memorize a conclusion but to know each cell's reliability at this moment, and to update it as evidence arrives.
Why a compass, not an assembly line: direction has no "next step"
By now it should be clear why this volume must be a compass and not a drawing. A drawing can exist because the bottleneck has been located — once the bottleneck is fixed, there is a drawable path from here to there, and so there is a "next step." But direction judgment has no such fixed bottleneck: every "is it worth it?" judgment depends on a shifting situation, a context that belongs only to the judge, a set of mutually conflicting and incommensurable values. On such a problem any "standard process" is fake — it either compresses heterogeneous value into one average objective function (and so manufactures the average, SHEET 05) or pretends the direction question has one right answer that fits everyone (and so is misused, SHEET 07). What the compass gives is not a path but orientation: it tells you where you currently stand on each axis and which way to lean, but which step to take, and how far, is forever your judgment in your situation.
This returns to the human spine of the whole series. The downstream volumes move the bottleneck to the judgment node and have the human hold the judgment — already "people return to meaning"; the innovation volume goes one layer further upstream and guards the source of meaning — what is worth pursuing, what is worth being different about, what is worth being made. This layer cannot, and should not, be outsourced: outsourcing it to a system that pulls toward the mean is voluntarily ceasing to define value, which is exactly the failure the top claim most warns against. So what this volume ultimately guards is not "the efficiency of innovation" but the human's position as the definer of value. Cheaper execution is never the end; recognizing and protecting what is worth it is what this volume truly guards. Putting this compass in your hand is not to set your direction for you — it is to make sure that setting direction stays, always, in your hands.
One line to take away: generate many, bet few and sharp
If every mark, every piece of evidence, and every failure mode in this volume were compressed into a single line, it would be this: generate many, bet few and sharp. "Generate many" is the gift of abundance: use AI freely to spread possibility as wide as it goes; this step is nearly free, so do not be stingy. "Bet few and sharp" is the duty of judgment: in the spread of possibility, dare to cut the great majority of looks-feasible, invest resources only in the few that truly link real need to viable path, and attach to each bet a responsibility reading (who pays). This single line answers all three of the volume's marks at once: signal-to-noise (generate many, bet few, because signal did not rise with noise); value perception (bet sharp, because it rests on real need × viable path × inner conviction); responsibility (bet what you can afford, because the consequence lands on you). It is unromantic, but it is the shortest north a compass can be compressed into. Hold on to it, and the remaining fifteen SHEETs are its unfolding and calibration; forget everything else but remember this one line, and you already hold everything load-bearing in this volume.
This volume teaches "how to read the compass"; this piece actually runs innovation for you — it does not design an innovation org (that is the architect piece), it does the real work of this surface. Hand it a pile of ideas, an undecided direction, or "we want to innovate but don't know what to back," and it gives the near-free generation entirely to agents while putting the human at the one scarce node: the bet. The flow follows this volume's six-step compass — generate → diverge & search → falsify → read value-perception (human) → allocate the bet (human) → run the affordable-loss trial — each gated by Step 0 (the compass comes out only when direction is genuinely open and a single failure is affordable; locked direction → downstream; irreversible third-party harm → closer to safety engineering; strong-trust emotional labor → boundary). It produces a conversable, reusable bet sheet, not the innovation theatre of "we ran more hackathons and counted more pilots."
# install once (Claude Code plugin marketplace)
$ /plugin marketplace add watterfall/ai-native-architect
# invoke inside Claude Code
$ /skill ai-native-innovation
> "we have a pile of directions: help me decide which to back, and how much"
→ scope gate · compass / out-of-scope downstream / safety engineering / boundary
→ signal filters · falsification log (falsify before polish)
→ one Innovation Portfolio & Bet Sheet (affordable-loss × who-pays)
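The Step 0 scope gate described above can be read as a small routing function. The Python sketch below is illustrative only and is not the skill's implementation: the function name, the order of the checks, and the `HOLD` route for a direction that is open but whose single failure is not affordable are assumptions, since the text names only four outcomes.

```python
from enum import Enum


class Route(Enum):
    COMPASS = "compass"
    DOWNSTREAM = "downstream"
    SAFETY_ENGINEERING = "safety engineering"
    BOUNDARY = "boundary"
    HOLD = "hold"  # assumed extra route, not named in the text


def scope_gate(
    direction_open: bool,
    single_failure_affordable: bool,
    irreversible_third_party_harm: bool,
    strong_trust_emotional_labor: bool,
) -> Route:
    """Decide whether the compass applies at all (hypothetical check order)."""
    if strong_trust_emotional_labor:
        return Route.BOUNDARY            # strong-trust emotional labor: boundary
    if irreversible_third_party_harm:
        return Route.SAFETY_ENGINEERING  # closer to safety engineering
    if not direction_open:
        return Route.DOWNSTREAM          # locked direction: hand to downstream
    if not single_failure_affordable:
        # Assumption: shrink the trial until one failure is affordable.
        return Route.HOLD
    return Route.COMPASS                 # open direction, affordable failure
```

Putting the exclusions first means the compass is the last route reached, which matches the text's wording that it "comes out only when" both conditions hold.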
What this is · the innovation executable companion
The architecture layer (architect) designs the org; the six companion pieces are one each for innovation / engineering / design / research / learning / organization: one kernel, mutually coupled, with no fixed reading entry. This piece runs the innovation volume's value-discovery method into a bet sheet. Judgment node = value-perception: which signal is real and what to back. Generation is abundant and belongs to agents; the bet is judgment and stays human. Stop-line: the conviction must be yours, not borrowed ("if the AI reversed itself tomorrow, would my conviction shake?"), and who-pays cannot be offloaded; falsify before you polish. Having bought insurance does not skip the question of whether the thing is worth doing at all.
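One row of a bet sheet, with the stop-line checks applied to it, might look like the following Python sketch. The `Bet` record and its field names are hypothetical; the text specifies only that a bet pairs an affordable loss with a who-pays reading, that the conviction must be the bettor's own, and that falsification comes before polish.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    """One row of a bet sheet (hypothetical shape)."""
    direction: str
    affordable_loss: float    # the most you accept losing on the trial
    who_pays: str             # the named human who bears the consequence
    conviction_is_mine: bool  # "if the AI reversed itself tomorrow, would my conviction shake?"
    falsified_first: bool     # a falsification attempt ran before any polish


def stop_line_violations(bet: Bet) -> list[str]:
    """Return the stop-line rules this bet breaks; an empty list means it may proceed."""
    violations = []
    if not bet.who_pays.strip():
        violations.append("who-pays cannot be offloaded: name the person who bears the loss")
    if not bet.conviction_is_mine:
        violations.append("conviction is borrowed: it must be yours")
    if not bet.falsified_first:
        violations.append("falsify before you polish")
    if bet.affordable_loss <= 0:
        # Assumption: a bet with no stated affordable loss is not yet a bet.
        violations.append("state an affordable loss greater than zero")
    return violations
```

Returning the list of broken rules, rather than a single pass/fail flag, keeps the sheet conversable: each violation names the question the human still has to answer.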
SPEC.V / AI NATIVE METHODOLOGY / OWL METHODOLOGY SERIES
SCOPE / One methodology · the full organizational spectrum N=1 → N=many (from the one-person company to the agent network, on a single set of first principles)
SERIES / Six volumes, one kernel · this volume is one surface; the full wiring is above under "The Series."
APPENDIX · SOURCES / Evidence and citation registry; grading key: Ⅰ audit-grade empirics (cross-checked against regulatory filings) · Ⅱ peer-reviewed · Ⅲ theoretical model / working paper (citations must read "the model predicts," never "proven") · Ⅳ practitioner first-hand account · Ⅴ advisory forecast (a forecast, not a fact). This volume's sources passed 3-vote adversarial verification (2026-06; all passed, 0 overturned).
REF
GR
SOURCE
Load-bearing claim
R1
Ⅰ/Ⅱ
Doshi & Hauser, "Generative AI enhances individual creativity but reduces the collective diversity of novel content," Science Advances 10(28) 2024 · doi.org/10.1126/sciadv.adn5290
Under AI assistance individual works get "better" while the collective converges toward the mean: the empirical anchor for the homogenization gravity (controlled experiment, Ⅰ–Ⅱ)
R2
Ⅱ
"Measuring Creativity in the Age of Generative AI" · arXiv:2604.19799 (controlled study · multiple empirics) · arxiv.org/abs/2604.19799
After sharing AI, output forms a bimodal distribution (near the model default / human-driven deviation), not a single-peak collapse: a measurable corollary harder than "signal-to-noise"
R3
Ⅲ
RLCF (Reinforcement Learning from Community Feedback), Li et al. · 2025-06 · preprint
An "already-formed community consensus" can be learned as a reward signal: the externalizable left segment of the gradient is steadily eaten into ① abundance (the model predicts, not proven)
R4
Ⅰ/Ⅴ
Arrow, "Social Choice and Individual Values," Cowles Foundation / Wiley 1951 (the impossibility theorem itself, Ⅰ; migrated into the preference-alignment context = grade Ⅴ argument)
With ≥3 alternatives and ≥2 heterogeneous agents, no aggregation function satisfies IIA / Pareto / non-dictatorship at once: "what is worth doing" cannot be losslessly outsourced to an optimizer (load-bearing for FIG 5.0)
Forcing heterogeneous preferences into a single reward model either sacrifices the minority or degenerates to the mean: isomorphic with the Arrow result in social choice (the model predicts)
Abandoning a single objective, machines too can produce heterogeneity: hence the axiom's correct form is "the enemy of heterogeneity is single-objective over-optimization, not the machine itself" (algorithmic empirics Ⅱ; mapping to innovation is an analogy, Ⅲ)
R7
Ⅴ
Kauffman, "Investigations," Oxford University Press 2000 ("the adjacent possible" concept, a theoretical frame)
Reachable states expand outward with the resources at hand; borrowed for the "adjacent possible expanding" in FIG 2.1/2.2: what expands is the space, not the worthy points (theoretical frame, Ⅴ)
R8
Ⅱ/Ⅳ
Christensen et al., "Know Your Customers' 'Jobs to Be Done'," HBR 2016-09 · hbr.org/2016/09; Ulwick, "What Customers Want," McGraw-Hill 2005 (Outcome-Driven Innovation, ODI)
"Real need" = JTBD's job-to-be-done: in a real situation a person "hires" a product to get something done. This is the source of the first-axis criterion in value perception (theory plus practitioner frame, Ⅱ/Ⅳ)
R9
Ⅱ
Sarasvathy, "Causation and Effectuation," Academy of Management Review 26(2) 2001:243-263 · doi.org/10.5465/amr.2001.4378020 (the five effectuation principles)
Bird-in-hand / affordable loss / lemonade / the future is shaped by action: the starting point of value perception and the source of the "responsibility bandwidth" (FIG 2.2)
R10
Ⅱ
March, "Exploration and Exploitation in Organizational Learning," Organization Science 2(1) 1991:71-87 · doi.org/10.1287/orsc.2.1.71
Exploration and exploitation contend for the same budget and exploitation tends to win (predictable / measurable / fast feedback): the base for the efficiency paradox and "the useless tree gets cut" (FIG 4.0)
R11
Ⅱ
Wagner, "Robustness and Evolvability in Living Systems," Princeton University Press 2005; "The role of robustness in phenotypic adaptation and innovation," Proc. R. Soc. B 279(1732) 2012:1249-1258 · doi.org/10.1098/rspb.2011.2293
Robustness begets evolvability: cryptic variation accumulated on genotype / neutral networks is what makes new phenotypes reachable. This is the cross-domain hard evidence for "redundancy is the reserve pool of innovation" (biological empirics Ⅱ; mapping to organizations is an analogy)
R12
Ⅱ
Ohno, "Evolution by Gene Duplication," Springer 1970 (gene duplication + drift, the "Ohno's dilemma" lineage); chaperone buffering: Rutherford & Lindquist, "Hsp90 as a capacitor for morphological evolution," Nature 396 1998:336-342 · doi.org/10.1038/24550
A new-function gene arises only when a redundant copy drifts long enough in neutral / mildly deleterious space to catch a rare beneficial mutation; Hsp90 buffering keeps unstable systems alive until compensatory mutations arrive: "temporarily useless is the precondition for new function" (empirics Ⅱ; mapping is an analogy)
R13
Ⅲ
IndieValueCatalog (the source of the 55–65% inexternalizable band) · preprint
A large share of individual value judgments fall in the anti-consensus region a model cannot learn (roughly 55–65%): together with the impossibility theorem this supports "the inexternalizable segment sinks into the habitat" (model/data prediction, Ⅲ)
R14
Ⅱ
Agrawal, Gans & Goldfarb, "Exploring the Impact of Artificial Intelligence: Prediction versus Judgment," NBER WP 24626 (2018) · Information Economics and Policy 47 (2019):1-6 · doi.org/10.1016/j.infoecopol.2019.05.001 · nber.org/w24626
AI lowers the cost of prediction; what it does not lower is judgment. "Hand generation to AI, keep the bet with the human" rests on an economic structure, not a division-of-labor preference
R15
Ⅱ
Hao, Xu, Li & Evans, "AI tools expand scientists' impact but contract science's focus," Nature 649(8099) 2026 · doi.org/10.1038/s41586-025-09922-y (peer-reviewed plus open data/code; observational bibliometrics with selection effects)
AI tools expand individual scientists' impact yet contract the focus of science as a whole: the already-happened hard evidence for the dashboard's anti-metric "convergence on a single optimum" (Ⅱ)
R16
Ⅲ
Holland, "Hidden Order: How Adaptation Builds Complexity," Addison-Wesley 1995; Kauffman, "At Home in the Universe," Oxford University Press 1995 (the NK fitness landscape)
Complex adaptive systems: local rules give rise to global emergence; the NK landscape formalizes the "exploration-exploitation balance." This is the theoretical frame for SHEET 06's "leave interfaces for emergence and recognize after the fact" (mapping to innovation, Ⅲ)
R17
Ⅱ
Christensen, "The Innovator's Dilemma," Harvard Business School Press 1997 (disruptive innovation; the same author lineage as R8's JTBD)
Incumbents keep improving along existing metrics yet get undercut from below in a new value network: the classic counterpart to "polishing a better steam engine while the world turns to electricity" (a highly cited case theory, Ⅱ)
R18
Ⅴ
The multi-source consistent observation of the efficiency paradox ("polishing a better steam engine while the world shifts to electricity"), synthesizing the mechanism of March 1991 [R10] with the common retelling of diffusion history
The "progress" signals of AI deployment fall almost entirely on the exploitation side, systematically crowding out exploration: a load-bearing observation of this volume, not a single citable theorem (a synthetic argument, Ⅴ)
R19
Ⅱ/Ⅴ
Cooper, "Winning at New Products: Creating Value Through Innovation," Basic Books (the Stage-Gate funnel proper, grade Ⅱ; migrated into the "gates judge appearance-maturity" critique = grade Ⅴ argument)
The stage-gate funnel passes gate by gate on "does it look mature enough." Once appearance is mass-produced at zero cost, the filter's criterion is orthogonal to the scarce thing (SHEET 08.5 structural critique ①)
R20
Ⅳ
The public history of Slack / Tiny Speck / Glitch and the Salesforce acquisition (announced December 2020, about $27.7B), cross-checkable against company announcements and mainstream financial reporting (practitioner first-hand record, grade Ⅳ)
After the main goal (the game) failed, a kept "useless" byproduct (the internal messaging tool) was recognized as the larger value: a representative historical instance of the useless-tree reserve mechanism (SHEET 09.5 Case 3)
The grading and the "migrated into the alignment context = grade Ⅴ argument" convention are inherited from the organization volume; a few primary theorems (Arrow, grade Ⅰ) are used here in an inferential role and logged as Ⅴ per this volume's rule. Trace to the original for final grading.
REV
DATE
DESCRIPTION
1.0
2026-06
The innovation-methodology volume takes shape: the idea-abundance / selection-scarcity funnel · the externalizability gradient · the signal-to-noise collapse · the adjacent possible · the three axes of value perception · the explore/exploit budget and the useless tree · emergence interfaces · the value-responsibility decoupling · the value compass · the dated arc of the automation front; this volume's own source registry (R1-R20, grading conventions inherited from the organization volume)
1.1
2026-06
Argument visualization and registry rebuild: added FIG 2.2 (the exploding search space × flat responsibility bandwidth) and FIG 5.0 (the impossibility-theorem derivation: why the wall has no door) · #refs rebuilt to this volume's own sources (R1-R20), inline citations linked to [R#] · removed the inherited organization-volume R1-R47 and the organization-volume version history
1.2
2026-06
Deep enrichment: added SHEET 08.5, the structural critique of the legacy innovation machine (naming stage-gate / KPI roadmap / hackathon / idea-count / "fail fast" cargo cult / central R&D lab, with mechanism) + FIG 8.5 · SHEET 09.5, four named worked cases (Notion three-axis triage / AI legal-assistant falsification / Slack useless-tree payback / Copilot Chat recognized after the fact) · SHEET 09.7, INSTRUMENT 12 looks-feasible falsifier (interactive, re-renders on language change) · added FIG 9.0 bet-size allocation matrix and FIG 6.5 emergence-recognition timeline · sources extended with R19 (Cooper Stage-Gate) / R20 (the Slack instance)