架构设计文档¶

本文档面向开发者，介绍 OpenClaw-Py 的整体架构、核心组件和设计决策。

1. 总览¶

OpenClaw-Py 是一个多通道 AI 网关，将 AI Agent 连接到 27+ 消息平台。核心设计目标：

通道无关：Agent 逻辑与通道解耦，新通道只需实现 ChannelPlugin 接口
Provider 无关：统一的流式接口屏蔽 LLM 差异
可扩展：通过 Python entry_points 和 SKILL.md 注入能力
跨平台：TypeScript 客户端在桌面、移动端、Web 复用同一套代码

系统架构图¶

flowchart TB
    subgraph Clients["客户端"]
        UI["TypeScript Client"]
        CLI["CLI (Typer)"]
        ACP["ACP Bridge"]
    end

    subgraph Gateway["Gateway (FastAPI)"]
        WS["WebSocket v3"]
        HTTP["HTTP API"]
        RPC["RPC Handlers"]
        OAI["OpenAI Compat"]
    end

    subgraph Runtime["Agent Runtime"]
        Runner["Runner Loop"]
        Session["Session DAG"]
        SubAgent["Sub-agents"]
    end

    subgraph Tools["工具层"]
        BuiltIn["内置工具"]
        MCP["MCP Tools"]
        Skills["Skills"]
    end

    subgraph LLM["LLM Providers"]
        OpenAI["OpenAI"]
        Anthropic["Anthropic"]
        Google["Google"]
        Ollama["Ollama"]
    end

    UI --> WS
    CLI --> WS
    ACP --> HTTP

    WS --> RPC
    HTTP --> OAI
    RPC --> Runner
    OAI --> Runner

    Runner --> Session
    Runner --> SubAgent
    Runner --> Tools

    BuiltIn --> Runner
    MCP --> Runner
    Skills --> Runner

    Runner --> LLM

数据流图¶

sequenceDiagram
    participant User as 用户
    participant Channel as 通道
    participant Gateway as Gateway
    participant Agent as Agent Runtime
    participant LLM as LLM Provider
    participant Tools as 工具

    User->>Channel: 发送消息
    Channel->>Gateway: on_message callback
    Gateway->>Agent: chat.send RPC
    Agent->>Agent: 追加到会话
    Agent->>Agent: 构建系统提示

    loop 推理循环
        Agent->>LLM: stream(messages, tools)
        LLM-->>Agent: text delta / tool_call

        alt 工具调用
            Agent->>Tools: execute(tool, args)
            Tools-->>Agent: result
            Agent->>Agent: 追加工具结果
        end
    end

    Agent-->>Gateway: chat.done event
    Gateway-->>Channel: send_reply
    Channel-->>User: 回复消息

2. 核心组件¶

2.1 Gateway Server¶

src/pyclaw/gateway/server.py

FastAPI 应用，暴露 WebSocket (/ws) 和 HTTP 端点
WebSocket 采用 JSON 帧协议 v3，支持请求/响应/事件三种帧类型
方法分发：按 method 字段路由到 methods/ 下的处理函数
支持多客户端并发连接
启动时加载插件 (_load_plugins) 和配置监控 (_start_config_watcher)

class GatewayServer:
    def register_handler(self, method: str, handler: MethodHandler) -> None: ...
    async def _dispatch(self, method: str, params: dict, conn: GatewayConnection) -> Any: ...

2.2 Agent Runtime¶

src/pyclaw/agents/runner.py

核心执行循环：

while not done:
    response = await llm.stream(messages, tools)
    for chunk in response:
        if chunk.is_tool_call:
            result = await tool_registry.execute(chunk.tool, chunk.args)
            messages.append(tool_result(result))
        elif chunk.is_text:
            yield chunk.text
    if no_tool_calls:
        done = True

设计特点：

多 Provider 统一流式：stream.py 将 OpenAI / Anthropic / Gemini / Ollama 的流式 API 归一化为相同的 chunk 格式
工具执行沙箱：通过 tool_guards.py 控制危险操作审批
会话持久化：JSONL DAG 格式，支持分支和压缩
子 Agent：subagents/ 管理 spawn / steer / kill 生命周期

2.3 Channels¶

src/pyclaw/channels/

每个通道是一个独立目录，包含 channel.py 实现 ChannelPlugin 基类：

class ChannelPlugin(ABC):
    async def start(self) -> None: ...
    async def stop(self) -> None: ...
    async def send_reply(self, reply: ChannelReply) -> None: ...
    def on_message(self, callback) -> None: ...

通道管理器 (manager.py) 统一管理所有通道的生命周期。

Plugin SDK (plugin_sdk/) 定义了 20 个 Protocol 接口：

Protocol	用途
`ConfigAdapter`	配置 schema + 验证
`AuthAdapter`	认证
`OutboundAdapter`	消息发送
`ActionsAdapter`	反应、置顶
`StreamingAdapter`	草稿/流式消息
`HeartbeatAdapter`	连接心跳
`DirectoryAdapter`	用户目录
...	共 20 个

运行时通过 detect_capabilities(plugin) 探测通道支持哪些能力。

2.4 配置系统¶

src/pyclaw/config/

Pydantic v2 schema (schema.py)：强类型配置模型，30+ 配置节
JSON5 格式：支持注释和尾逗号
环境变量替换 (env_substitution.py)：${VAR} / ${VAR:-default} 语法
$include 分拆 (includes.py)：大配置文件可拆分为多个 JSON5
版本迁移 (migrations.py)：v1 → v2 → v3 自动迁移
热重载 (runtime_overrides.py)：后台轮询检测文件变更，触发回调
原子写入 (backup.py)：先写临时文件再原子重命名，保留备份

2.5 MCP 客户端¶

src/pyclaw/mcp/

实现 MCP (Model Context Protocol) 客户端，连接外部工具服务器：

stdio transport：启动子进程，通过 stdin/stdout 通信
HTTP transport：通过 HTTP 连接远程 MCP 服务器
工具注册表：将发现的工具映射为 Agent 可用的 tool schema
配置格式兼容 Claude Desktop 和 Cursor

2.6 Memory¶

src/pyclaw/memory/

完整的记忆管理系统，支持上下文压缩、对话持久化、长期记忆和智能检索：

┌─────────────────────────────────────────────────────────────┐
│                    MemoryManager (统一接口)                   │
├─────────────────────────────────────────────────────────────┤
│  ┌────────────────┐  ┌────────────────┐  ┌───────────────┐  │
│  │  Compaction    │  │  Persistence   │  │   Long-term   │  │
│  │  上下文压缩     │  │  对话持久化     │  │   长期记忆    │  │
│  │  ContextChecker│  │  DialogStore   │  │   MEMORY.md   │  │
│  │  Summarizer    │  │  DailyLog      │  │               │  │
│  │  ToolCompactor │  │  LongTermMemory│  │               │  │
│  └───────┬────────┘  └───────┬────────┘  └───────┬───────┘  │
│          │                   │                   │          │
│  ┌───────▼───────────────────▼───────────────────▼───────┐  │
│  │                 MemoryStore (存储层)                   │  │
│  │  ┌─────────────┐            ┌─────────────┐           │  │
│  │  │ SQLite FTS5 │            │  LanceDB    │           │  │
│  │  │ (关键词搜索) │            │ (向量搜索)  │           │  │
│  │  └──────┬──────┘            └──────┬──────┘           │  │
│  │         └──────────┬───────────────┘                  │  │
│  │               ┌─────▼─────┐                            │  │
│  │               │  Hybrid   │  MMR + 时间衰减             │  │
│  │               │  Ranker   │                            │  │
│  │               └───────────┘                            │  │
│  └────────────────────────────────────────────────────────┘  │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                 Hooks (Agent 集成)                     │  │
│  │  pre_reasoning: 自动检查上下文并触发压缩               │  │
│  └────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

核心组件¶

模块	功能
`manager.py`	统一记忆管理接口，整合所有子模块
`store.py`	SQLite + FTS5 存储，支持全文检索
`lancedb_backend.py`	LanceDB 向量存储，语义搜索
`hybrid.py`	混合检索：关键词 + 向量融合
`mmr.py`	MMR 重排序，减少冗余结果
`temporal_decay.py`	时间衰减，近期记忆权重更高

上下文压缩 (`compaction/`)¶

模块	功能
`context_checker.py`	检查上下文长度，判断是否需要压缩
`summarizer.py`	使用 LLM 生成结构化摘要
`tool_compactor.py`	压缩长工具输出，保留关键信息

持久化 (`persistence/`)¶

模块	功能
`dialog_store.py`	JSONL 格式对话记录，支持按日期存储
`daily_log.py`	每日摘要日志管理
`long_term.py`	长期记忆管理 (MEMORY.md)

Agent 集成 (`hooks/`)¶

模块	功能
`pre_reasoning.py`	Agent 推理前的上下文检查 Hook

配置系统 (`types.py`)¶

Pydantic 模型定义：

MemoryConfig(
    compaction=CompactionConfig(max_tokens=128000, reserve_ratio=0.3),
    summary=SummaryConfig(model="gpt-4o-mini", max_summary_tokens=2000),
    dialog=DialogConfig(enabled=True, retention_days=30),
    long_term=LongTermMemoryConfig(file="MEMORY.md"),
    tool_result=ToolResultConfig(max_length=2000),
)

工作流程¶

Agent 推理前 (pre_reasoning hook)
  │
  ├─ ContextChecker.check_context(messages)
  │     │
  │     ├─ 计算总 token 数
  │     ├─ 判断是否超过阈值
  │     └─ 返回 ContextCheckResult
  │
  ├─ if needs_compaction:
  │     │
  │     ├─ Summarizer.generate_summary(messages_to_compact)
  │     │     └─ LLM 生成结构化摘要 (goals, decisions, progress, next_steps)
  │     │
  │     └─ 应用压缩：摘要替换旧消息
  │
  └─ Agent 继续推理

文件存储布局¶

workspace/
├── MEMORY.md                    # 长期记忆：用户偏好、重要决策
├── memory/
│   ├── 2026-03-24.md           # 每日摘要日志
│   └── ...
├── dialog/
│   ├── 2026-03-24.jsonl        # 原始对话记录
│   └── ...
├── tool_result/
│   └── <uuid>.txt              # 长工具输出缓存
└── memory.db                   # SQLite 索引数据库

2.7 UI (TypeScript 客户端)¶

src/pyclaw/ui/

基于 TypeScript 的跨平台 UI：

模块	功能
`app.py`	主应用 + ChatView + NavigationRail
`theme.py`	Light/Dark 主题
`agents_panel.py`	Agent 管理面板
`toolbar.py`	聊天工具栏
`menubar.py`	桌面菜单栏
`voice.py`	语音交互 (TTS + STT)
`onboarding.py`	4 步设置向导
`tray.py`	系统托盘
`i18n.py`	多语言 (en/zh-CN/ja)
`permissions.py`	移动端权限管理

3. 数据流¶

3.1 用户消息 → Agent 回复¶

User (Web/Desktop/Mobile/CLI/Channel)
  │
  ▼
Gateway.chat.send(sessionId, message)
  │
  ▼
AgentRunner.run(session, message)
  │
  ├─ append user message to session
  │
  ├─ build system prompt (AGENTS.md + skills + context)
  │
  ├─ LLM.stream(messages, tools)
  │     │
  │     ├─ text delta → push chat.delta event
  │     │
  │     └─ tool_call → ToolRegistry.execute()
  │           │
  │           ├─ built-in tool
  │           ├─ MCP tool (stdio/HTTP)
  │           └─ approval gate (if exec/patch)
  │
  ├─ append assistant message to session
  │
  └─ push chat.done event

3.2 通道消息路由¶

Channel (Telegram/Discord/...)
  │
  ▼
ChannelPlugin.on_message(callback)
  │
  ▼
ChannelManager → Routing (7-tier priority)
  │
  ▼
AgentRunner.run(...)  ← 选定的 Agent
  │
  ▼
Channel.send_reply(response)

路由优先级：

会话绑定 (session → agent)
线程绑定 (thread → agent)
通道绑定 (channel → agent)
群组绑定 (chat_id → agent)
用户绑定 (sender → agent)
全局绑定 (pattern match)
默认 Agent

4. 安全模型¶

4.1 命令执行审批¶

Agent 请求执行命令
  │
  ▼
ToolGuards.check_policy(command)
  │
  ├─ allow_list → 直接执行
  ├─ deny_list → 拒绝
  └─ approval_required → 推送审批请求到 UI/CLI
        │
        ▼
      用户审批/拒绝

4.2 工作区沙箱¶

工具操作限制在配置的工作区目录内
文件路径规范化防止目录遍历
环境变量过滤敏感信息

4.3 网络安全¶

SSRF 防护：阻止对内网 IP / DNS 重绑定的请求
Secret 扫描：检测配置中的明文密钥并警告
Gateway 绑定：默认绑定 127.0.0.1，防止未授权访问
TLS 指纹：验证远程连接的证书

5. 配置层次结构¶

配置文件采用 JSON5 格式，支持注释、尾逗号和环境变量引用。

graph TD
    subgraph Config["pyclaw.json 配置结构"]
        Models["models<br/>LLM Provider 与模型"]
        Agents["agents<br/>Agent 默认设置"]
        Channels["channels<br/>消息通道"]
        Tools["tools<br/>工具与 MCP"]
        Session["session<br/>会话管理"]
        Gateway["gateway<br/>网关设置"]
        Cron["cron<br/>定时任务"]
        Memory["memory<br/>记忆系统"]
        UI["ui<br/>界面设置"]
    end

    subgraph Providers["models.providers"]
        P1["openai"]
        P2["anthropic"]
        P3["google"]
        P4["ollama"]
        P5["自定义 Provider"]
    end

    subgraph ChannelList["channels"]
        C1["telegram"]
        C2["discord"]
        C3["slack"]
        C4["dingtalk"]
        C5["其他 21 个..."]
    end

    subgraph ToolsConfig["tools"]
        T1["exec (命令执行)"]
        T2["mcpServers"]
        T3["restrictToWorkspace"]
    end

    Models --> Providers
    Channels --> ChannelList
    Tools --> ToolsConfig

配置优先级¶

flowchart LR
    A["环境变量<br/>PYCLAW_*"] --> B["配置文件<br/>~/.pyclaw/pyclaw.json"]
    B --> C["$include 分拆文件"]
    C --> D["默认值"]

    style A fill:#f9f,stroke:#333
    style B fill:#bbf,stroke:#333
    style C fill:#bfb,stroke:#333
    style D fill:#ddd,stroke:#333

优先级说明：

环境变量：最高优先级，如 OPENAI_API_KEY 覆盖配置中的值
配置文件：主配置 pyclaw.json
$include 分拆：通过 $include 引入的子配置文件
默认值：代码中定义的安全默认值

环境变量替换¶

配置中可使用 ${VAR} 或 ${VAR:-default} 语法引用环境变量：

{
  models: {
    providers: {
      openai: { apiKey: "${OPENAI_API_KEY}" },
      anthropic: { apiKey: "${ANTHROPIC_API_KEY:-}" },
    },
  },
}

6. 扩展机制¶

6.1 Plugin (entry_points)¶

通过 Python entry_points 机制自动发现第三方插件：

# 第三方包的 pyproject.toml
[project.entry-points."pyclaw.plugins"]
my_plugin = "my_package.plugin:MyPlugin"

Gateway 启动时自动加载所有已安装的插件。

6.2 SKILL.md¶

Agent 能力注入：

~/.pyclaw/workspace/skills/
├── my-skill/
│   └── SKILL.md    # 包含系统提示和工具定义
└── ...

技能在 Agent 系统提示中被注入，可通过 ClawHub marketplace 安装。

6.3 Hook 系统¶

事件钩子 (HOOK.md)：

# HOOK.md
## on_message_received
Run sentiment analysis before processing.

支持 before_send / after_send / on_error 等生命周期钩子。

7. 设计决策¶

决策	选择	理由
Web 框架	FastAPI	原生 async，WebSocket 支持好
CLI 框架	Typer + Rich	类型安全 + 丰富输出
配置格式	JSON5	支持注释，兼容 JSON
会话存储	JSONL (文件)	无需数据库，便于调试
向量引擎	LanceDB	零配置，嵌入式
UI 框架	Next.js / Tauri / Expo	TypeScript 跨平台
LLM 流式	统一 chunk 格式	屏蔽 Provider 差异
进程管理	asyncio subprocess	与 async 架构一致
日志	stdlib logging	无额外依赖
类型系统	Pydantic v2	验证 + 序列化
记忆存储	SQLite + LanceDB	混合检索，兼顾精确与语义
上下文压缩	LLM 摘要	结构化摘要，保留关键上下文
对话持久化	JSONL + 按日期分片	便于管理和归档
长期记忆	Markdown 文件	人类可读，便于编辑

8. 目录结构¶

src/pyclaw/
├── __init__.py
├── main.py               # 入口
│
├── agents/               # Agent 运行时
│   ├── runner.py          # 核心循环
│   ├── stream.py          # 多 Provider 流式
│   ├── session.py         # 会话存储
│   ├── system_prompt.py   # 系统提示构建
│   ├── tokens.py          # Token 计数
│   ├── embedded_runner/   # 嵌入式运行器
│   ├── providers/         # LLM Provider 适配
│   ├── subagents/         # 子 Agent 管理
│   ├── skills/            # SKILL.md 系统
│   ├── tools/             # 20+ 内置工具
│   ├── progress.py        # 进度事件
│   ├── memory_integration.py  # 记忆系统集成
│   └── types.py           # 共享类型
│
├── gateway/              # Gateway 服务
│   ├── server.py          # FastAPI + WebSocket
│   ├── openai_compat.py   # OpenAI API 兼容层
│   ├── protocol/          # 帧定义
│   ├── methods/           # RPC 方法处理器
│   └── events.py          # 事件广播
│
├── channels/             # 25 个消息通道
│   ├── base.py            # ChannelPlugin 基类
│   ├── manager.py         # 通道管理器
│   ├── plugin_sdk/        # 20 个 Protocol 接口
│   ├── plugins/           # 通道增强 (onboarding/outbound/actions/normalize)
│   └── <channel>/         # 各通道实现
│
├── memory/               # 记忆系统
│   ├── __init__.py        # 公共接口导出
│   ├── manager.py         # 统一记忆管理器
│   ├── store.py           # SQLite + FTS5 存储
│   ├── lancedb_backend.py # LanceDB 向量存储
│   ├── hybrid.py          # 混合检索
│   ├── mmr.py             # MMR 重排序
│   ├── temporal_decay.py  # 时间衰减
│   ├── embeddings.py      # 向量嵌入
│   ├── types.py           # 类型定义
│   ├── compaction/        # 上下文压缩
│   │   ├── context_checker.py
│   │   ├── summarizer.py
│   │   └── tool_compactor.py
│   ├── persistence/       # 持久化存储
│   │   ├── dialog_store.py
│   │   ├── daily_log.py
│   │   └── long_term.py
│   ├── hooks/             # Agent 集成钩子
│   │   └── pre_reasoning.py
│   └── qmd/               # QMD 格式支持
│
├── config/               # 配置管理
├── mcp/                  # MCP 客户端
├── security/             # 安全策略
├── secrets/              # 密钥管理
├── hooks/                # 事件钩子
├── plugins/              # 扩展系统
├── routing/              # 消息路由
├── media/                # 媒体处理
├── social/               # Agent 社交网络
├── infra/                # 基础设施
├── cli/                  # CLI 命令
└── ui/                   # TypeScript 客户端