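07 - Extension Guide

For adding features, changing the backend, or adding a new detection family. Required reading: skipping it means repeating the mistakes of the 19 rounds R10-R28.

Mandatory SOP for backend changes

Any change to a backend .py file (new feature, bug fix, LLM prompt change) goes through these 6 steps:

Step 1: Investigate before starting a phase (no jumping straight into code)

Find the root cause. Don't guess: measure in the venv and reproduce against a real-backend fixture: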

cd workspace/system_monitor
# Load a real-backend fixture and check the current state of the suspected root cause
PYTHONPATH=. ./backend/.venv/bin/python -c "
import json
d = json.load(open('test_backup/fixtures_real_backend/r37_b_dns_multi.json'))
# inspect the fields you suspect...
"

Count the same-origin lineage items with grep. The count decides whether to fix the source or add an exit patch:

ls docs/agent-handoff/midterm-stage-5-operator-trial/phase-archive/phase67-exit-patch-lineage/ | \
    grep -E "^phase67(h|e|p|q|r|s)" | sed -E 's/-.+$//' | sort -u | wc -l

Lesson 1: once there are ≥3 same-origin exit-literal caveats, the fix must go to the source. Adding an (N+1)th caveat is not allowed.
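
The counting rule can also be expressed in Python, which is handy inside a test or a guard script. This is a minimal sketch that mirrors the shell pipeline (same regex, same prefix stripping); the function names and the example filenames are hypothetical, not part of the repository.

```python
import re

# Exit-patch series pattern, copied from the shell check.
EXIT_PATCH_RE = re.compile(r"^phase67(h|e|p|q|r|s)")
BLOCK_THRESHOLD = 3  # lesson 1: at 3 or more, fix the source instead


def count_exit_patch_lineages(filenames):
    """Count distinct exit-patch lineage prefixes among archive filenames."""
    prefixes = set()
    for name in filenames:
        if EXIT_PATCH_RE.match(name):
            # Same effect as sed -E 's/-.+$//': drop everything from the first "-".
            prefixes.add(re.sub(r"-.+$", "", name))
    return len(prefixes)


def must_fix_source(filenames):
    """True when another exit patch is no longer allowed."""
    return count_exit_patch_lineages(filenames) >= BLOCK_THRESHOLD


# Hypothetical filenames, for illustration only.
example = [
    "phase67h-caveat-prompt.md",
    "phase67h-caveat-summary.md",
    "phase67e-wording-prompt.md",
    "phase67p-banner-prompt.md",
    "phase68a-silent-zero-prompt.md",
]
print(count_exit_patch_lineages(example), must_fix_source(example))  # 3 True
```

Two files from the same lineage (phase67h) count once, which is what `sort -u` does in the shell version.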

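Check which layer the root cause sits in. Prefer going one layer deeper (fix the source) over one layer shallower (patch the exit literal):
- Wrong LLM output → the root cause is usually upstream in the assembly chain: a silent 0 being swallowed (Phase 68.A), field alignment (Phase 68.C), or call timing (Phase 68.D)
- It is not fixed by adding a caveat to the LLM's exit literal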

Step 2: Write the phase prompt. It must pass the 5 checks of plugin v0.1.2

Use the phase-handoff skill or write the prompt by hand. When done, run:

PROMPT="docs/agent-handoff/midterm-stage-5-operator-trial/phase<NN>-coding-agent-prompt.md"

# Check 1: at most 2 sub-tasks
grep -cE "^- \*\*sub-task" "$PROMPT"   # must be ≤ 2

# Check 2: line count (self-contained ≤ 300)
wc -l < "$PROMPT"   # ≤ 300

# Check 3: red lines, verbatim (these strings are matched literally against the prompt)
for rl in "不删除现有 docs/" "不修改安全三旗" "不在 commit / 日志"; do
    grep -qF "$rl" "$PROMPT" && echo "OK $rl" || echo "❌ MISSING $rl"
done

# Check 4: exit-patch blacklist
ls docs/agent-handoff/midterm-stage-5-operator-trial/phase-archive/phase67-exit-patch-lineage/ | \
    grep -E "^phase67(h|e|p|q|r|s)" | sed -E 's/-.+$//' | sort -u | wc -l
# ≥ 3 = BLOCK: no new exit-patch may be added

# Check 5: self-contained (executable by a coding agent with zero context)
grep -cE "读这.{0,10}份|读.{0,20}preliminary|沿用.{0,20}phase[0-9]+|MEMORY 第|R[0-9]+ (现场|教训|评估|真后端|持续)" "$PROMPT"
# 0 = self-contained, >0 = must fix

If any check fails, the prompt must not be committed or handed to the coding agent.
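
For reference, the five checks can be folded into one function that returns every failure at once. This is a sketch, not the plugin's implementation: the regexes and red-line strings are copied from the shell commands, the 150-line limit for non-self-contained prompts comes from the check table later in this guide, and `check_prompt` plus its `adds_exit_patch` flag are hypothetical names. Treating check 4 as applying only to prompts that add an exit patch is an assumption.

```python
import re

RED_LINES = ["不删除现有 docs/", "不修改安全三旗", "不在 commit / 日志"]
SUBTASK_RE = re.compile(r"^- \*\*sub-task")
EXTERNAL_REF_RE = re.compile(
    r"读这.{0,10}份|读.{0,20}preliminary|沿用.{0,20}phase[0-9]+|MEMORY 第"
    r"|R[0-9]+ (现场|教训|评估|真后端|持续)"
)


def check_prompt(text, exit_patch_lineages, self_contained=True,
                 adds_exit_patch=False):
    """Return a list of failed checks; an empty list means the prompt passes."""
    lines = text.splitlines()
    failures = []

    # Check 1: at most 2 sub-tasks.
    if sum(1 for ln in lines if SUBTASK_RE.search(ln)) > 2:
        failures.append("check 1: more than 2 sub-tasks")

    # Check 2: mode-aware line limit.
    limit = 300 if self_contained else 150
    if len(lines) > limit:
        failures.append(f"check 2: {len(lines)} lines exceeds {limit}")

    # Check 3: every red line must appear verbatim.
    for red_line in RED_LINES:
        if red_line not in text:
            failures.append(f"check 3: missing red line {red_line!r}")

    # Check 4: exit-patch blacklist (assumed to gate only new exit patches).
    if adds_exit_patch and exit_patch_lineages >= 3:
        failures.append("check 4: exit-patch lineage is blocked")

    # Check 5: a self-contained prompt must not lean on outside context.
    if self_contained and any(EXTERNAL_REF_RE.search(ln) for ln in lines):
        failures.append("check 5: references external context")

    return failures
```

Returning all failures rather than stopping at the first one keeps the fix-and-rerun loop short.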

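Step 3: Run the coding (the user runs it, no sub-agent)

The lesson from feedback_no_subagent_for_coding: when the user runs the prompt, the actual values at every step stay under control; a sub-agent tends to "tidy up" on its own initiative, deleting docs or drifting the logic.

Give the user the prompt that passed the 5 checks in copy-pastable form. The user runs it and returns a summary.

Step 4: After the change, restart the backend, verify mtime, and clear .pyc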

cd workspace/system_monitor

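# Kill the old backend. pgrep -a prints the command line too, so the
# "bash -c" wrapper can actually be filtered out before taking the PID.
OLD_PID=$(pgrep -af "backend/app.py" | grep -v "bash -c" | awk '{print $1}' | head -1)
[ -n "$OLD_PID" ] && kill "$OLD_PID" && sleep 3

# Clear .pyc (lesson 11, learned in Phase 69.A)
find backend/ -name "__pycache__" -type d -exec rm -rf {} + 2>/dev/null

# Restart
set -a; source .env.local; set +a
nohup ./backend/.venv/bin/python backend/app.py > logs/backend-verify.log 2>&1 &
NEW_PID=$!
sleep 6

# Verify mtime (lesson 6): the process must have started after the last source edit
BACKEND_TS=$(stat -c %Y "/proc/$NEW_PID")
SRC_TS=$(stat -c %Y backend/assistant/<file-you-changed>.py)
[ "$BACKEND_TS" -ge "$SRC_TS" ] && echo "✅ change applied" || echo "❌ restart failed"

# Run the unittest baseline
PYTHONPATH=. ./backend/.venv/bin/python -m unittest discover test_backup 2>&1 | tail -3
# Expected: 1850+ tests passing (unittest prints "Ran N tests" followed by "OK")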
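
The two verification ideas in this step (no stale bytecode, no source file newer than the running process) can be sketched in Python as follows. The helper names are hypothetical. On Linux the process start time could be approximated by the mtime of /proc/<pid>, as the shell version does; here it is passed in as a plain timestamp so the logic stays testable.

```python
import os
import shutil
import tempfile


def clear_pycache(root):
    """Delete every __pycache__ directory under root; return how many were removed."""
    removed = 0
    for dirpath, dirnames, _ in os.walk(root):
        if "__pycache__" in dirnames:
            shutil.rmtree(os.path.join(dirpath, "__pycache__"))
            dirnames.remove("__pycache__")  # do not descend into the deleted dir
            removed += 1
    return removed


def sources_newer_than(process_start_ts, paths):
    """Return the source files modified after the process started (stale code)."""
    return [p for p in paths if os.stat(p).st_mtime > process_start_ts]


# Demo on a throwaway directory (paths and timestamps are illustrative).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg", "__pycache__"))
    src = os.path.join(root, "pkg", "mod.py")
    open(src, "w").close()
    os.utime(src, (1000, 1000))  # pretend the file was last edited at t=1000
    print(clear_pycache(root))              # 1
    print(sources_newer_than(2000, [src]))  # [] -> the restart picked up the edit
    print(len(sources_newer_than(500, [src])))  # 1 -> the backend is stale
```

An empty list from `sources_newer_than` corresponds to the "change applied" branch of the shell check.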

Step 5: Run the real-backend evaluation R(N). Fixtures must be committed to git

cd workspace/system_monitor
./scripts/run_round_evaluation.sh r<round_num>

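The script runs a 6-step pipeline:

backend health + mtime
capture 4 endpoints
sanitize
git add fixture (lesson 3: every round's fixture goes into git)
unittest discover
logger slice

At the end it prints the input checklist for the evaluation agent. Feed that to the evaluation agent to run the R(N) evaluation (using the acceptance-evaluator-prompt.md template).

Step 6: Planning-agent second review: derive the root cause independently

Per lesson 4, any "suggested fix" from the evaluation agent must first be verified first-hand:

Evaluation values quoted verbatim from the fixture text (grep)
Root cause inferred independently (measure with python -c / grep the source)
R(N-1) → R(N) trend diff

When the second review disagrees with the evaluation agent's root-cause verdict, the planning second review wins. In R30/R31 the evaluation misjudged the root cause 3 times, and each was fixed correctly only after the planning second review overturned the verdict.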
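
The first item of the second review (every quoted value must appear verbatim in the fixture) is easy to automate. A minimal sketch, assuming the fixture is JSON; `missing_quotes` is a hypothetical helper, and the fixture content below is invented for illustration (only `model_name` and `tool_trace` are names taken from this guide).

```python
import json


def iter_scalars(node):
    """Yield every key and leaf value of a parsed JSON document as a string."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield str(key)
            yield from iter_scalars(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_scalars(item)
    else:
        yield str(node)


def missing_quotes(fixture, quotes):
    """Return the quotes that appear in no key or value of the fixture."""
    haystack = list(iter_scalars(fixture))
    return [q for q in quotes if not any(q in s for s in haystack)]


# Invented fixture content, for illustration only.
fixture = json.loads("""
{
  "report": {
    "model_name": "dns_resolver_v1",
    "summary": "resolver 10.0.0.53 timeout rate elevated"
  },
  "tool_trace": [{"tool": "query_dns_xxx_new"}]
}
""")
claims = ["dns_resolver_v1", "timeout rate elevated", "packet loss"]
print(missing_quotes(fixture, claims))  # ['packet loss']
```

Any quote that comes back in the list was paraphrased or invented by the evaluation and should be rechecked before its verdict is trusted.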

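Adding a new ES tool (growing 14 → 15+)

1. Design the aggregation query

A new tool must satisfy:

aggregate only (no _source, no reading raw docs): this holds the raw_log_scan=false physical boundary
whitelisted ES indices only (dns-dnstap-* / syslog-bras-* / iaaa_*): never read an unreviewed index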
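
Both constraints can be checked mechanically before a query is sent. The sketch below is an assumption about how such a guard might look, not existing project code: `validate_es_request` is a hypothetical name, and rejecting `top_hits` is an added precaution (that aggregation returns matching documents) beyond what the list states.

```python
from fnmatch import fnmatchcase

# Whitelisted index patterns, as listed in this guide.
ALLOWED_INDEX_PATTERNS = ("dns-dnstap-*", "syslog-bras-*", "iaaa_*")
# Keys that would pull raw documents into the response.
FORBIDDEN_KEYS = {"_source", "top_hits"}


def find_forbidden_keys(node):
    """Yield forbidden keys found anywhere in a nested request body."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in FORBIDDEN_KEYS:
                yield key
            yield from find_forbidden_keys(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_forbidden_keys(item)


def validate_es_request(index, body):
    """Return a list of violations; an empty list means the request is allowed."""
    problems = []
    if not any(fnmatchcase(index, pat) for pat in ALLOWED_INDEX_PATTERNS):
        problems.append(f"index {index!r} is not whitelisted")
    if body.get("size") != 0:
        problems.append("size must be 0 (no hits returned)")
    if "aggs" not in body and "aggregations" not in body:
        problems.append("request has no aggregation")
    for key in sorted(set(find_forbidden_keys(body))):
        problems.append(f"forbidden key {key!r}")
    return problems


# The field name "rcode" is illustrative.
ok_body = {"size": 0, "aggs": {"by_rcode": {"terms": {"field": "rcode"}}}}
print(validate_es_request("dns-dnstap-*", ok_body))  # []
```

Calling the guard from the executor, just before `store.search`, would turn the boundary into a hard failure instead of a convention.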

2. Register it in tool_registry

# backend/assistant/tool_orchestrator/dns_tool_registry.py
DNS_TOOL_SPECS = {
    ...
    "query_dns_xxx_new": {
        "required_args": {"start", "end", "resolver_ip"},
        "allowed_args": {"start", "end", "resolver_ip", "protocol", "<new_arg>"},
        "requires_window": True,
    },
}

# Define the executor before EXECUTORS so the name exists when the dict is built
def _execute_xxx_new(args, store):
    # Real ES DSL (aggregate only: size 0, aggs only)
    body = {"size": 0, "aggs": {...}}
    result = store.search(index="dns-dnstap-*", body=body)
    return {
        "<structured_field>": ...,
        "raw_log_scan": False,
        "preview_only": True,
        "execution_enabled": False,
    }

EXECUTORS = {
    ...
    "query_dns_xxx_new": _execute_xxx_new,
}
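
To make the spec fields concrete, here is one way a dispatcher might validate arguments against such an entry. It is a sketch under assumptions: `validate_tool_args` is hypothetical, and the meaning given to `requires_window` (both `start` and `end` present, with `start` strictly before `end`) is a guess at the intent, valid only when both values are comparable (for example epoch seconds).

```python
def validate_tool_args(spec, args):
    """Return a list of problems with args for the given tool spec."""
    problems = []

    missing = sorted(spec["required_args"] - set(args))
    if missing:
        problems.append(f"missing required args: {missing}")

    unknown = sorted(set(args) - spec["allowed_args"])
    if unknown:
        problems.append(f"args not allowed: {unknown}")

    # Assumed semantics: a window means start and end exist and are ordered.
    if spec.get("requires_window") and "start" in args and "end" in args:
        if not args["start"] < args["end"]:
            problems.append("window is empty or reversed")

    return problems


spec = {
    "required_args": {"start", "end", "resolver_ip"},
    "allowed_args": {"start", "end", "resolver_ip", "protocol"},
    "requires_window": True,
}
good_args = {"start": 1700000000, "end": 1700003600, "resolver_ip": "10.0.0.53"}
print(validate_tool_args(spec, good_args))  # []
```

Keeping `required_args` a subset of `allowed_args`, as in the spec above, means a call can never be both complete and rejected.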

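3. Update the LLM model planner prompt whitelist (if the LLM should be able to select this tool)

# backend/assistant/tool_orchestrator/multiround_agent.py
# In the comment section of `build_round_planner_input`, add the new tool name + a description of its required_args

4. Add tests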

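# test_backup/test_phase<NN>_xxx_new_tool.py
import unittest

class TestXxxNewTool(unittest.TestCase):
    def test_basic_aggregate(self):
        # mock the store and verify the returned structure
        self.skipTest("TODO: mock store + assert on the returned structure")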
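
A fuller version of such a test might look like the sketch below. Everything in it is illustrative: `FakeStore` is a hand-written stand-in for the real store, and `_execute_xxx_new` is a simplified local copy of a hypothetical executor (a real test would import it from the registry module). The aggregation name `by_resolver` and the output field `resolver_counts` are invented.

```python
import unittest


class FakeStore:
    """Records search calls and returns a canned ES aggregation response."""

    def __init__(self, response):
        self.response = response
        self.calls = []

    def search(self, index, body):
        self.calls.append({"index": index, "body": body})
        return self.response


def _execute_xxx_new(args, store):
    # Stand-in executor (hypothetical); aggregate only, no raw documents.
    body = {"size": 0, "aggs": {"by_resolver": {"terms": {"field": "resolver_ip"}}}}
    result = store.search(index="dns-dnstap-*", body=body)
    buckets = result["aggregations"]["by_resolver"]["buckets"]
    return {
        "resolver_counts": {b["key"]: b["doc_count"] for b in buckets},
        "raw_log_scan": False,
        "preview_only": True,
        "execution_enabled": False,
    }


class TestXxxNewTool(unittest.TestCase):
    def setUp(self):
        canned = {"aggregations": {"by_resolver": {"buckets": [
            {"key": "10.0.0.53", "doc_count": 42},
        ]}}}
        self.store = FakeStore(canned)
        self.out = _execute_xxx_new({"resolver_ip": "10.0.0.53"}, self.store)

    def test_basic_aggregate(self):
        self.assertEqual(self.out["resolver_counts"], {"10.0.0.53": 42})

    def test_query_is_aggregate_only(self):
        call = self.store.calls[0]
        self.assertEqual(call["body"]["size"], 0)
        self.assertNotIn("_source", call["body"])

    def test_safety_flags(self):
        self.assertFalse(self.out["raw_log_scan"])
        self.assertTrue(self.out["preview_only"])
        self.assertFalse(self.out["execution_enabled"])
```

Asserting on the recorded request body, not only on the return value, is what catches a query that quietly stops being aggregate-only.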

5. Verify with a real-backend R(N) run

./scripts/run_round_evaluation.sh r<N> → the new tool should appear in tool_trace.

Adding a new endpoint (growing 4 → 5+)

1. Design the endpoint contract

# backend/api/log_assistant_api.py
from flask import jsonify, request

@app.route("/api/v1/log-assistant/<new-feature>", methods=["POST"])
@require_role("admin")
def api_xxx_new():
    # 1. Read the payload
    payload = request.get_json()
    # 2. Call the assistant business module
    report = call_xxx_assistant(payload)
    # 3. Re-apply the safety flags + sanitize
    report["raw_log_scan"] = False
    report["preview_only"] = True
    report["execution_enabled"] = False
    return jsonify({"report": report, ...})
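
Because every endpoint repeats the same three assignments, the flag step can be pulled into one helper. A minimal sketch; `apply_safety_flags` is a hypothetical name, while the flag values are the ones this guide requires.

```python
# The three safety flags and their mandatory values.
SAFETY_FLAGS = {
    "raw_log_scan": False,
    "preview_only": True,
    "execution_enabled": False,
}


def apply_safety_flags(report):
    """Return a copy of report with the three safety flags forced.

    Whatever the business module (or the LLM) put in these fields is
    overwritten, so a wrong value upstream cannot reach the client.
    """
    return {**report, **SAFETY_FLAGS}


upstream = {"summary": "ok", "execution_enabled": True}
print(apply_safety_flags(upstream)["execution_enabled"])  # False
```

Returning a copy leaves the upstream dict untouched, which makes it possible to log what the business module originally produced.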

2. Add the business module

# backend/assistant/xxx_diagnostics.py
def call_xxx_assistant(payload):
    # business logic
    return {...}

3. Add the frontend API client

// frontend/src/api/index.js
export async function runXxxNew(payload) {
    const resp = await axios.post("/api/v1/log-assistant/<new-feature>", payload)
    return resp.data
}

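4. Add a frontend Vue component (if a UI is needed)

// frontend/src/components/logAssistant/XxxNewCard.vue
// Reuse the BusinessImpactCard / AIReasoningCard patterns

5. Update evaluator-query-design.md (the new endpoint must be pinned down for R(N) evaluation)

# docs/agent-handoff/midterm-stage-5-operator-trial/evaluator-query-design.md
# §1 endpoint table: add a new row (aligned with the R25 flag table)
| E | POST /api/v1/log-assistant/<new-feature> | <session_kind> | <protocol> | admin |

Then, per lesson 2 (the evaluator query design must be pinned down), update capture_real_backend.sh and the sentinel test as well, so that all 3 places stay in sync.

6. Verify with a real-backend R(N) run

Changing the LLM prompt (with caution)

⚠ Risk: changing a single line of the LLM prompt can affect the literal output of every real-backend response. It must be verified with an R(N) evaluation.

Practice

Don't edit the main prompt directly: first dump the current prompt so that git diff has a backup
Change one sentence, then run an R(N) evaluation: one sentence at a time, no batching
Compare against the R(N-1) trend: check whether the evaluator score regressed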
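
The "one sentence at a time" rule can be enforced with a small diff check before the evaluation runs. This is a sketch assuming one sentence per line in the prompt file; `changed_line_count` is a hypothetical helper and the prompt text is invented.

```python
from difflib import SequenceMatcher


def changed_line_count(old_prompt, new_prompt):
    """Count lines that were replaced, inserted, or deleted between two prompts."""
    old_lines = old_prompt.splitlines()
    new_lines = new_prompt.splitlines()
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replace of n old lines by m new lines counts as max(n, m).
            changed += max(i2 - i1, j2 - j1)
    return changed


# Invented prompt text, for illustration only.
old = "You are a DNS assistant.\nAnswer briefly.\nNever guess."
new = "You are a DNS assistant.\nAnswer in at most three sentences.\nNever guess."
print(changed_line_count(old, new))  # 1
```

A result above 1 means the edit is a batch and should be split before running R(N), otherwise a score change cannot be attributed to a single sentence.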

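Case study

Phase 64.F P2-3 changed model_identity (forcibly overriding the model_name literal that the LLM had been free-forming). Before the change, the LLM returned arbitrary literals such as model_name=ds-chat/v1. After the change, the value is forced to dns_resolver_v1 / dhcp_resolver_v1. This was a source fix (the AI_MODEL_IDENTITY constant), not a replacement of the exit literal.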
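
The shape of that source fix can be sketched as follows. The constant name and the two identity values come from the case above; keying the constant by protocol and the `enforce_model_identity` helper are assumptions made for illustration.

```python
# Assumed shape: one fixed identity per protocol.
AI_MODEL_IDENTITY = {
    "dns": "dns_resolver_v1",
    "dhcp": "dhcp_resolver_v1",
}


def enforce_model_identity(report, protocol):
    """Return a copy of report whose model_name comes from the constant.

    The LLM's own model_name is discarded rather than pattern-matched and
    rewritten, so new free-form variants cannot slip through.
    """
    fixed = dict(report)
    fixed["model_name"] = AI_MODEL_IDENTITY[protocol]
    return fixed


llm_report = {"model_name": "ds-chat/v1", "summary": "ok"}
print(enforce_model_identity(llm_report, "dns")["model_name"])  # dns_resolver_v1
```

An exit-literal patch would instead need one replacement rule per observed bad value, which is exactly the lineage growth that lesson 1 forbids.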

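Adding a new detection family (e.g. WiFi / VPN / SNMP)

The system currently covers only DNS / DHCP. Adding a new detection family (e.g. WiFi access authentication, VPN tunnels, SNMP traffic) is a major change:

Steps

Design the new ES index (e.g. a wifi-auth-* index; coordinate with the ES administrator)
Add a new diagnostics module (wifi_diagnostics.py, modeled on dns_diagnostics.py)
Add a new tool registry (wifi_tool_registry.py)
Add a new reasoner (wifi_hypothesis_reasoner.py)
Add a new endpoint
Add a new frontend Vue component
Add the WiFi controller assets to the asset_registry yaml
Run a real-backend R(N) evaluation (a new round series is recommended, e.g. r-wifi-1)

Estimated effort: 10-15 phases, comparable to phase15-17 + phase32-37 + phase49 + phase68 combined. Several weeks to several months.

Per lesson 1, follow the source-fix lineage and fix at the source; do not take the exit-patch route.

Changing the frontend UI

Changing a single Vue component

Edit the .vue file (template + script + style sections)
vite HMR reloads automatically, no restart needed
Run npm run build to verify that it passes
The backend is untouched; a frontend change does not break the backend

Adding a new Vue component

Add NewCard.vue under frontend/src/components/logAssistant/
Import it into the main page (e.g. LogAssistant.vue)
Use defineProps + computed (Vue 3 composition API)
Run npm run build to verify that it passes

Wiring up a new backend field

The backend returns the new field (e.g. report.new_field)
Frontend: add the new prop in defineProps({ ... })
computed reads props.something?.new_field || default (prefer ?? over || when falsy values such as 0 or an empty string are valid data)
The template renders it

See Phase 69.E (frontend data_freshness_warning banner) and Phase 69.F (BusinessImpactCard expand/collapse) for worked cases.

Maintaining the plugin v0.1.2 phase-prompt-guard

Current state of the 5 checks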

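| Check | How it is enforced | Trigger |
|---|---|---|
| 1 sub-task ≤ 2 | automatic grep for `^- **sub-task` | whenever a phase prompt is created |
| 2 line count, mode-aware | self-contained ≤ 300, otherwise ≤ 150 | whenever a phase prompt is created |
| 3 red lines, verbatim or inherited | grep for the 3 red-line sentences | whenever a phase prompt is created |
| 4 exit-patch blacklist `67(h\|e\|p\|q\|r\|s)` ≥ 3 → BLOCK | automatic grep of the phase-series file count | whenever a phase prompt is created |
| 5 self-contained, zero context (4 probe classes V1-V4) | automatic grep | whenever a phase prompt is created |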

Updating when a new lineage is added

When a new phase series starts (e.g. phase70.{a,b,…}):

Is it a source-fix? Add it to the whitelist (like phase68/phase69)
Is it an exit-patch? Add it to the blacklist (like phase67)

How: edit the Check 4 section of .claude-plugins/monitor-phase-workflow/skills/phase-prompt-guard/SKILL.md.
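
The whitelist/blacklist decision amounts to a lookup on the series prefix. The sketch below only illustrates that logic; the real lists live in SKILL.md, and `classify_phase` and the set names are hypothetical.

```python
import re

# Series named in this guide; a new series must be added to one of the sets.
SOURCE_FIX_SERIES = {"phase68", "phase69"}
EXIT_PATCH_SERIES = {"phase67"}


def classify_phase(phase_name):
    """Classify a phase such as 'phase67h' or 'phase69.e' by its series."""
    match = re.match(r"^(phase[0-9]+)", phase_name)
    if match is None:
        raise ValueError(f"not a phase name: {phase_name!r}")
    series = match.group(1)
    if series in SOURCE_FIX_SERIES:
        return "source-fix"
    if series in EXIT_PATCH_SERIES:
        return "exit-patch"
    return "unclassified"


print(classify_phase("phase67h"))   # exit-patch
print(classify_phase("phase70.a"))  # unclassified
```

An "unclassified" result is the signal that Check 4 needs updating before the first prompt of the new series is written.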

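Maintaining the evaluation-agent template

R(N) evaluations use acceptance-evaluator-prompt.md. When adding a new endpoint or a new phase:

Update the 5 items under §输入 (inputs): add the new fixture path
Update §YAML schema: add the new phase fields
Do not alter the values in older R(N) evaluation docs (history stays untouched)

Raising the unittest baseline

When new tests are added, the baseline goes up:

Update the expected baseline number in CLAUDE.md §硬约束 4 (hard constraint 4)
Update the unittest_baseline value in acceptance-evaluator-prompt.md
Update the numbers in this doc and in 01-overview.md §项目快照 (project snapshot)

What not to do

The 10 bottom lines from feedback_bitter_lessons_r10_r37.md:

❌ Don't let same-origin exit-patch caveats reach ≥3 (the phase67 series is already BLOCKed)
❌ Don't write a single phase prompt with ≥3 sub-tasks
❌ Don't let the evaluator query design drift (it is pinned down in evaluator-query-design.md §2)
❌ Don't run an evaluation without committing the fixture (lesson 3)
❌ Don't adopt the evaluation agent's root-cause verdict without a second review (lesson 4)
❌ Don't change the backend without restarting it (lesson 6)
❌ Don't reference external docs from a self-contained prompt (lesson 8)
❌ Don't, after discovering a real leak, fix only the source without scrubbing history (lesson 10)
❌ Don't let a sub-agent do the coding (feedback_no_subagent_for_coding)
❌ Don't delete docs (including the historical records under phase-archive/)

What to do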

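✅ Keep sub-tasks ≤ 2
✅ Keep prompts self-contained
✅ Keep to the evaluation doc YAML schema
✅ Keep the three safety flags (raw_log_scan=false / preview_only=true / execution_enabled=false)
✅ Commit every round's fixture to git
✅ After any backend change: restart, verify mtime, clear .pyc
✅ On discovering a real leak, fix both sides immediately (filter-repo + fix the source)
✅ Source-fix first (fix the source, not the exit literal)
✅ The planning agent infers the root cause independently (it does not simply adopt the evaluation agent's conclusion)
✅ Investigate before writing a phase prompt (no jumping straight into code)

Next: 08-version-history.md (how the system evolved to its current state, plus lessons learned)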
