ENGINEERING · X-ENG STACK CHAPTER

The Migrating Leverage Point

In four years the field minted seven "X Engineering" terms: prompt → context → spec → harness → loop, and on to fleet. It looks like terminology inflation; it is actually one building. Each new term optimizes the layer exactly one abstraction above the previous one. The mechanism is this series' kernel plus a time axis: every time models gain a notch, the layer humans were operating by hand gets absorbed by products, and the human leverage point climbs a floor. This volume traces the kernel's step ② over four years; underneath, it is cybernetics reborn.

SHEET

PROLOGUE · The Concept

Definition · Draw the line first

Not inflation, but a one-way climb

Lay the seven terms out by date and you see not a bubble but a monotonic upward trajectory. Each new term optimizes the layer one abstraction above the previous one; no jump is random.

On the surface it is terminology inflation: every few months an influential engineer posts, a new "X Engineering" is born, and the companion repo, CLI, and parody video all appear within seventy-two hours. In June 2026, loop engineering compressed that cycle to under a week.

But trace the sequence: from the wording of a single call, to the distribution of information in the context window, to the source file of intent, to a single agent's runtime environment, to the closed-loop system that drives the agents. Each time model capability rises a notch, the layer humans used to operate by hand gets absorbed by products, humans are pushed up a level, and the new level needs a name. This volume does not chase the hype; it puts the seven terms back into one building and judges which parts are load-bearing structure and which are just decoration.

SHEET

THE KERNEL · The time axis

Thesis · Load-bearing

The judgment node does not just retreat; it migrates upward

The series kernel: execution becomes abundant → judgment retreats → context becomes infrastructure → people return to meaning. This volume adds a time axis: the node that judgment retreats to moves one way, up the abstraction stack, as capability rises. The building is the trace of step ② over four years.

MASTER TEMPLATE · with a time axis

① ABUNDANCE

Each capability notch

As models strengthen, the next manual layer is absorbed by products.

② JUDGMENT ↑

Leverage climbs a floor

Humans are pushed to a higher abstraction; the new floor is the new X-Engineering.

③ CONTEXT

Lower floors become infrastructure

Wording, context management, and environment setup settle, layer by layer, into off-the-shelf infrastructure.

④ MEANING

Hold the bedrock

Attention is limited, intent is irreplaceable, and action needs boundaries: the disciplines that pin these constraints do not expire.

The sheets ahead: SHEET 02 stacks the seven terms into a building, SHEET 03 shows that the building is cybernetics reborn, SHEET 04 dissects the five pieces of the current top floor (Loop), SHEET 05 separates it from self-improving agents, SHEET 06 uses a single criterion to pick out the load-bearing floors, and SHEET 07 gives trends, predictions, and practice. A final instrument helps you locate which floor you are on.

SHEET 02

THE MODEL · The building

Framework · Key figure

Seven words, one building

On the left is each floor and what it optimizes; on the right is its counterpart component in control theory. Agentic engineering is not a floor; it is the name of the whole building. The elevator in this building only goes up.

Floor	Term · date	What it optimizes	Control-theory counterpart
(whole building)	Agentic engineering	the umbrella term = the name of the whole building	field name
(above 5F)	Fleet / factory · speculative, unnamed	org-level: scheduling, governance, and resource economics of many loops	hierarchical control
5F	Loop engineering · 2026.06	time: who triggers, who verifies, how state persists across sessions	closed-loop feedback
4F	Harness engineering · early 2026	space: a single agent's environment (tools, memory, sandbox, guardrails)	actuator constraint
3F	Spec engineering · late 2025	intent: the right to define "done"; the spec as source of truth	reference signal / setpoint
2F	Context engineering · mid-2025	information: under finite attention, what enters the window and what stays on disk	sensor / observable state
1F	Prompt engineering · 2022–23	wording: getting one call's phrasing right (now absorbed as basic literacy)	single control signal

Each floor up, humans hand the floor below to products: wording is absorbed by stronger instruction-following, context management by automatic compression and retrieval, and environment setup is being absorbed by off-the-shelf harnesses. There is a single test of load-bearing versus decoration: does the constraint the floor pins disappear as models get stronger? Finite attention (2F), intent that no one can define for you (3F), and action that needs boundaries (4F) are physical constraints; the disciplines built around them do not expire, they only change names.

SHEET 03

THEORY · Cybernetics reborn

Inference · Lineage

This vocabulary is half a century old

Triggering, execution, verification, state persistence, generator/checker separation, supervisor vigilance decay: these terms have existed for fifty years in cybernetics, Sheridan's supervisory control theory, and the human-factors literature. The field is reinventing feedback control systems; it just does not call them that.

Look at the right column of SHEET 02. Each floor maps precisely to one control-theory component: single control signal, sensor, setpoint, actuator constraint, closed-loop feedback, hierarchical control. This is no coincidence. A working agent loop is, at bottom, a feedback control system with a setpoint (the spec), sensors (context), actuators (the harness), a closed loop (the loop itself), and a supervisor (the human).
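The mapping above can be made concrete with a toy controller. This is a minimal sketch under stated assumptions, not a description of any real agent framework: `FeedbackLoop`, `bounded_step`, `TARGET`, and `MAX_STEP` are hypothetical names invented for illustration, and the "state" is just an integer standing in for the work product.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class FeedbackLoop:
    is_done: Callable[[int], bool]    # setpoint (spec): defines "done"
    observe: Callable[[int], int]     # sensor (context): what the controller gets to see
    act: Callable[[int, int], int]    # actuator (harness): a bounded action on the state
    max_rounds: int = 50              # stop condition chosen by the supervisor (the human)

    def run(self, state: int) -> Tuple[int, int, bool]:
        """Iterate observe -> compare with setpoint -> act, until done or out of rounds."""
        for round_no in range(self.max_rounds):
            reading = self.observe(state)
            if self.is_done(reading):
                return state, round_no, True
            state = self.act(state, reading)
        return state, self.max_rounds, self.is_done(self.observe(state))


TARGET = 10    # toy setpoint (assumption for the example)
MAX_STEP = 3   # toy actuator limit per round (assumption for the example)


def bounded_step(state: int, reading: int) -> int:
    """Move toward TARGET, but never by more than MAX_STEP in one round."""
    error = TARGET - reading
    return state + max(-MAX_STEP, min(MAX_STEP, error))


loop = FeedbackLoop(
    is_done=lambda reading: reading == TARGET,
    observe=lambda state: state,   # a perfect sensor; real context is lossy
    act=bounded_step,
)
final_state, rounds, converged = loop.run(0)
print(final_state, rounds, converged)   # prints: 10 4 True
```

The point of the sketch is the separation of roles: the spec only says what "done" means, the harness only limits what one action may do, and the loop is the part that keeps comparing the two.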

This hands us a ruler: whatever already has a name in cybernetics will stay; whatever is a stopgap for one model generation's shortcoming will be absorbed by the next generation. More importantly, it exposes a trap that has been studied for forty years (expanded in SHEET 05): the more reliable the system, the harder it is for the supervisor to keep watching it.

Load-bearing test

A discipline that is a control-theory idea in a new skin will stay; one that is a band-aid for a single model generation gets absorbed. Names change; the feedback-control skeleton does not.

SHEET 04

ANATOMY · Inside a Loop

Mechanism · The current top floor

Five pieces, one spine

A working loop (per Osmani's breakdown) = five primitives + one external memory. Claude Code and Codex each ship all five, and the pieces map one-to-one, which means loop design is becoming a tool-independent, portable skill.

01 · HEARTBEAT

Scheduled trigger

Automations / scheduled tasks / hooks: the loop starts itself on a cadence to do discovery and triage. This is what makes it a loop rather than a one-off run.

02 · ISOLATION

Worktrees

Each parallel agent gets its own checkout, so mechanical conflicts drop to zero. Human review bandwidth, though, remains the real cap on parallelism.

03 · KNOWLEDGE

Skills

SKILL.md: write project conventions down once, outside the model; otherwise every round has to guess your project from scratch. This is the cure for intent debt.

04 · REACH

Connectors

MCP connectors and plugins let the loop open PRs, edit tickets, and send messages, moving it from "tells you how" to "done."

05 · CHECK

Subagent: separate doing from checking

A model that writes the code grades its own work too leniently. An independent checker (optionally a different model) is the structure's one load-bearing wall.

+ · SPINE

External state: memory on disk

A markdown file or a board that lives outside any single conversation and tracks done / doing / to-do. The model wakes up amnesiac every time; the repo does not.

Assemble these six and you have a loop that can run unattended. Note the load-bearing wall at 05: without independent verification, a loop merely amplifies its own mistakes at remarkable speed.
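As a rough illustration of two of those pieces working together (the on-disk spine and the independent checker), here is a minimal sketch. Everything in it is an assumption made for the example: the JSON state file layout, `run_round`, `toy_generator`, and `toy_checker` are hypothetical stand-ins, not the API of Claude Code, Codex, or any other tool, and the "task" is just reversing a string.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

State = Dict[str, List]


def run_round(state_path: Path,
              generate: Callable[[str], str],
              check: Callable[[str, str], bool]) -> State:
    """One heartbeat: reload state from disk, do one task, verify it, persist."""
    state: State = json.loads(state_path.read_text())   # the spine: memory lives on disk
    if not state["todo"]:
        return state                                     # nothing to do this round
    task = state["todo"].pop(0)
    output = generate(task)                              # the "do" side
    bucket = "done" if check(task, output) else "rejected"   # the independent "check" side
    state[bucket].append({"task": task, "output": output})
    state_path.write_text(json.dumps(state, indent=2))
    return state


def toy_generator(task: str) -> str:
    """Hypothetical stand-in for the writing model: reverses text, but is wrong on long inputs."""
    return task[::-1] if len(task) <= 4 else task


def toy_checker(task: str, output: str) -> bool:
    """Independent check that shares no code with the generator."""
    return len(output) == len(task) and all(
        output[i] == task[-1 - i] for i in range(len(task))
    )


def demo() -> State:
    with tempfile.TemporaryDirectory() as tmp:
        state_path = Path(tmp) / "loop_state.json"
        state_path.write_text(json.dumps({"todo": ["abc", "hello"], "done": [], "rejected": []}))
        for _ in range(3):   # three heartbeats; the last one finds nothing to do
            state = run_round(state_path, toy_generator, toy_checker)
        return state


print(demo())
```

Each round starts from the file rather than from anything held in memory, which mirrors the "amnesiac model, persistent repo" point. The generator's mistake on `"hello"` lands in `rejected` only because the checker does not reuse the generator's logic.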

SHEET 05

LINEAGE · Meta-levels

Inference · The line

Where it differs from self-improving agents

A loop is often mistaken for a kind of self-improving agent. There is a single dividing line: who defines the loop's structure. All four levels share one skeleton: generator + independent verifier + external persistent state, that is, variation / selection / retention.

Ⅰ · 2023–24

Human in the loop

The human is a serial node, approving each step before it executes. Remove the human and the system halts.

Ⅱ · 2025

Human on the loop

The system is autonomous by default; the human supervises in parallel and can stop it at any time. Remove the human and it keeps running.

Ⅲ · 2026 ← now

Human designs the loop

The human defines structure, verification, and stop conditions; intervention moves to design time and to exceptions.

Ⅳ · frontier

Loop designs itself

Output feeds back into the loop's own mechanism: evolving prompts, skill libraries, even the verifier. AlphaEvolve and AI Scientist sit here.

A coding loop uses tests and CI as selection pressure: cheap, deterministic, millisecond-scale. A research loop uses experiments and nature as selection pressure: expensive, noisy, week-scale. That is the whole reason coding is the first kind of intellectual labor to be turned into loops: its verification cost is the lowest. Whichever domain's verification cost drops next is the next to go.

The move from level Ⅱ to level Ⅲ answers Bainbridge's 1983 "Ironies of Automation": the more reliable the system, the faster supervisor vigilance decays, and exactly when a takeover is most needed, the human has already lost situational awareness. "Human on the loop" is bound to fail at watching. The loop's answer is not to make the human watch harder but to write verification into the structure (checker subagents, machine-checkable conditions), and then to move human intervention from real-time supervision to asynchronous triage, so the human no longer has to stay in sync with the machine's clock.

SHEET 06

VERDICT · Survive or scaffold

Thesis · Bedrock

Three hard constraints are the bedrock

There is one criterion: does the discipline pin a constraint that does not vanish as models improve? Those that do survive; those that do not get absorbed into basic literacy. The whole verdict table stands on three hard constraints.

BEDROCK 01

Finite attention

The context window is scarce and context rot is real, so deciding what information to keep is always an engineering problem (holds up 2F).

BEDROCK 02

Verification is expensive

Generation is nearly free, but confirming correctness is still costly; the generation/verification asymmetry draws the boundary of automation.

BEDROCK 03

Intent is irreplaceable

No machine can decide what you want; the right to define "done" is the last asset you cannot outsource (holds up 3F).

Concept	Constraint it bets on	Verdict
Prompt engineering	instruction-sensitivity (fading)	absorbed as literacy
Context engineering	finite attention · context rot	long-term
Spec engineering	intent is irreplaceable	principle endures
Harness engineering	action needs bounds · knowledge must be injected	long-term
Loop engineering	verification is expensive · closed-loop feedback	long-term
Agentic engineering	name of the whole era	umbrella term stays
Vibe coding / engineering	cultural phenomenon / lost the naming war	a footnote to the era
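The criterion behind the table can be written as a one-line rule. The sketch below encodes only the five floor rows; the umbrella term and the cultural footnote are left out because they are not judged by a constraint. The `Floor` type, the field name `fades_with_stronger_models`, and the boolean values are this example's own encoding of the table's judgments, not data from any external source.

```python
from typing import List, NamedTuple


class Floor(NamedTuple):
    name: str
    constraint: str
    fades_with_stronger_models: bool   # the judgment call the table records


FLOORS: List[Floor] = [
    Floor("Prompt engineering", "instruction-sensitivity", True),
    Floor("Context engineering", "finite attention / context rot", False),
    Floor("Spec engineering", "intent is irreplaceable", False),
    Floor("Harness engineering", "action needs bounds", False),
    Floor("Loop engineering", "verification is expensive", False),
]


def is_load_bearing(floor: Floor) -> bool:
    """The single test: a floor survives only if its pinned constraint does not fade."""
    return not floor.fades_with_stronger_models


load_bearing = [f.name for f in FLOORS if is_load_bearing(f)]
absorbed = [f.name for f in FLOORS if not is_load_bearing(f)]
print(load_bearing)
print(absorbed)
```

The rule itself is trivial; the real work is in the boolean column, which is exactly where the judgment about each constraint has to be made and defended.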

SHEET 07

PLAYBOOK · Trends & practice

Action · Operable

Know the stack, not the word

The names will keep changing; the stack will not. Three trends set the direction, and one practice loop gives the moves. The core in one sentence: locate your floor, hand the floors below to products, and spend your judgment one floor up.

Three trends

TREND 01

Verification becomes the only bottleneck

A loop that runs unattended also errs unattended. Over the next two years the biggest investment will flow to verification infrastructure: evals, observability, adversarial review. A domain's automation progress ≈ how fast its verification cost falls.

TREND 02

Primitives converge across vendors

The five pieces map one-to-one across the two major camps, so the field is discovering load-bearing scaffolding rather than following separate fashions. Knowledge captured in SKILL.md and in loop design is no longer tied to any single tool.

TREND 03

The human role settles into three jobs

Write the spec (define done), design the loop (define the process), and do the final review (own the outcome). The same loop yields opposite results for two different people: the loop cannot tell the difference; the person can.

The practice loop

① Locate your floor

Which floor is your team still operating by hand? Find it with the instrument below.

② Know the stack, not the word

Don't chase the new word; see which bedrock constraint it pins. If it pins one, learn it; if not, wait for it to become basic literacy.

③ Assemble the five

Heartbeat / isolation / knowledge / reach / check + spine, and above all the load-bearing wall of independent verification.

④ Design, don't watch

Move intervention from real-time supervision to asynchronous triage. Vigilance decay is a law of nature; you cannot keep watching.

⑤ Hold the bedrock

Attention, verification, intent: these three are always yours; don't outsource them.

⑥ Ratchet + better questions

Each floor you climb, lock it in and don't fall back; spend the judgment you free up on asking the next floor's questions.

INSTRUMENT 08 · FLOOR LOCATOR