5 Technical Traps Behind Runaway Costs, and How to Monitor Them in Real Time

Warnings keep piling up on Hacker News:

"I set $10 on fire the other day as I was running through some tests."

"70 million tokens on 1 day and 30 million the next day by 9 AM"

Clawdbot's runaway costs are not a fluke. They are the predictable result of several technical traps stacking on top of each other.

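Trap 1: Unbounded Context Growth (O(N²) Complexity)

Symptoms

Day 1: 50 rounds of conversation, 2 million tokens consumed
Day 2: only 10 messages sent, 1.5 million tokens consumed

Day 2 had a fifth of the messages, so why did each one cost nearly four times as much (about 150,000 tokens per message versus 40,000)?

Root cause

Every API call includes the complete history: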

// History at the end of Day 1 (all sizes are token counts)
const historyDay1 = {
  rounds: 50,
  totalTokens: 200_000,
};

// Day 2, call 1
const firstCall = {
  system: 13_000,
  history: historyDay1.totalTokens,  // all of Day 1's history
  newMessage: 20,
};
// Total input: 213,020 tokens

// Day 2, call 10
const tenthCall = {
  system: 13_000,
  history: historyDay1.totalTokens + 50_000,  // Day 1 + the first 9 rounds of Day 2
  newMessage: 20,
};
// Total input: 263,020 tokens

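The cumulative effect:

# Python simulation
def calculate_cost(rounds, tokens_per_round=500):
    total = 0
    for i in range(1, rounds+1):
        # Each round resends all of the earlier history
        input_tokens = i * tokens_per_round
        total += input_tokens
    return total

print(f"Total for 100 rounds: {calculate_cost(100):,} tokens")
# Output: Total for 100 rounds: 2,525,000 tokens

This is O(N²) growth: total input tokens scale with the square of the number of rounds, which is not sustainable.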
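To make the quadratic growth concrete, here is a minimal TypeScript sketch of the same simulation, extended with a hypothetical `contextCap` parameter (not something the article defines) that bounds how many tokens are resent per call. Once the cap is reached, each extra round costs a fixed amount, so growth becomes linear.

```typescript
// Total input tokens billed over a conversation in which every call resends
// the history. `contextCap` is a hypothetical upper bound on resent tokens.
function cumulativeInputTokens(
  rounds: number,
  tokensPerRound: number,
  contextCap: number = Infinity,
): number {
  let total = 0;
  for (let i = 1; i <= rounds; i++) {
    // Round i resends i rounds' worth of history, unless the cap applies.
    total += Math.min(i * tokensPerRound, contextCap);
  }
  return total;
}

// Closed form for the uncapped case: t * N * (N + 1) / 2.
function closedForm(rounds: number, tokensPerRound: number): number {
  return (tokensPerRound * rounds * (rounds + 1)) / 2;
}

console.log(cumulativeInputTokens(100, 500));         // 2525000
console.log(cumulativeInputTokens(200, 500));         // 10050000
console.log(cumulativeInputTokens(200, 500, 50_000)); // 7525000
```

Doubling the rounds from 100 to 200 roughly quadruples the uncapped total, while the capped run adds only 50,000 tokens per extra round.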

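Mitigation

class ContextManager {
  async getContext(sessionId: string): Promise<Message[]> {
    const messages = await db.getMessages(sessionId);
    
    // Strategy 1: sliding window, keeping the newest messages that fit
    const maxTokens = 50000;
    let tokens = 0;
    const result: Message[] = [];
    
    for (let i = messages.length - 1; i >= 0; i--) {
      const msgTokens = estimateTokens(messages[i]);
      if (tokens + msgTokens > maxTokens) break;
      tokens += msgTokens;
      result.unshift(messages[i]);
    }
    
    // Strategy 2: keep key facts from whatever fell outside the window
    if (result.length < messages.length) {
      // Cache this per session in practice; re-summarizing on every call
      // would resend the old history each time.
      const keyFacts = await this.extractKeyFacts(
        messages.slice(0, messages.length - result.length)
      );
      // Internal role marker: the Anthropic Messages API takes system text
      // through its top-level `system` parameter, not as a message role.
      result.unshift({
        role: 'system',
        content: `Key facts from earlier history: ${keyFacts}`
      });
    }
    
    return result;
  }
  
  async extractKeyFacts(messages: Message[]): Promise<string> {
    // Use Haiku to extract the key facts
    const response = await claude.messages.create({
      model: 'claude-haiku-4-5',  // Anthropic API model IDs use hyphens
      max_tokens: 1024,           // required by the Messages API
      messages: [{
        role: 'user',
        content: `Extract the key facts (user preferences, important information) from the conversation below and ignore small talk:
        ${messages.map(m => m.content).join('\n---\n')}`
      }]
    });
    // Content blocks are a union type; only text blocks carry `.text`
    const block = response.content[0];
    return block.type === 'text' ? block.text : '';
  }
}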
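The sliding window above relies on an `estimateTokens` helper that the article does not define. A minimal sketch of one possible version follows. It is a rough heuristic rather than a real tokenizer: the ratios (about 4 characters per token for Latin text, about 1 token per CJK character) and the per-message overhead are assumptions, and the `Message` shape is assumed to have a string `content`.

```typescript
interface Message {
  role: string;
  content: string;
}

// Rough heuristic, NOT a real tokenizer. All three constants are assumptions.
function estimateTokens(message: Message): number {
  // Count kana, CJK ideographs, and Hangul syllables at ~1 token each.
  const cjkMatches = message.content.match(/[\u3040-\u30ff\u3400-\u9fff\uac00-\ud7af]/g) || [];
  const cjk = cjkMatches.length;
  // Everything else at ~4 characters per token.
  const other = message.content.length - cjk;
  const perMessageOverhead = 4; // assumed allowance for role and formatting
  return cjk + Math.ceil(other / 4) + perMessageOverhead;
}
```

Because the estimate only decides where the window is cut, erring high is the safer direction. For billing, the token counts reported in the API response are the numbers to trust.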

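Trap 2: Agent Loop That Never Ends

Symptoms

User: "Find me the latest news about AI"

[30 minutes later]
Clawdbot is still running...

[Checking the logs]
browser_navigate(news.ycombinator.com)
browser_click("More")  ← repeated 50 times
browser_click("More")
browser_click("More")
...

The agent has entered an infinite loop and is burning tokens at full speed.

Root cause

The model misjudges its stopping condition:

// The AI's internal logic (pseudocode)
while (!taskCompleted) {
  const page = await browser.getCurrentPage();
  
  if (page.includes('More')) {
    await browser.click('More');  // load more
    continue;
  }
  
  taskCompleted = true;
}

The problem: HN's "More" link is ordinary pagination, and there is almost always another page behind it.

The AI has no notion of which page it should stop at, so it keeps clicking.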

A real case

A user asked Clawdbot to "download the official Python documentation":

AI actions:
1. browser_navigate(docs.python.org)
2. browser_click("Download PDF") → 404 (no PDF there)
3. browser_navigate(docs.python.org/3/)
4. browser_click("Download") → element not found
5. browser_navigate(docs.python.org/3/download.html)
6. browser_click(...) → failed
7. Tried other approaches → failed
...
[retried 100 times]

Three hours later the user noticed the task was still running. It had consumed 70 million tokens (about $120).

Mitigation

class LoopDetector {
  private actions: Array<{ type: string; target: string; timestamp: number }> = [];
  
  checkLoop(action: Action): boolean {
    const now = Date.now();
    this.actions.push({
      type: action.type,
      target: action.target,
      timestamp: now
    });
    
    // Drop records older than 5 minutes
    this.actions = this.actions.filter(
      a => a.timestamp > now - 300000
    );
    
    const recent = this.actions.slice(-10);
    
    // Check for the exact same action repeated
    const lastAction = recent[recent.length - 1];
    const sameCount = recent.filter(
      a => a.type === lastAction.type && a.target === lastAction.target
    ).length;
    
    if (sameCount >= 5) {
      logger.error('Infinite loop detected', {
        action: lastAction,
        count: sameCount
      });
      return true;
    }
    
    // Low variety only means something once the window is full. Without
    // this guard the very first action (1 distinct action) would be flagged.
    if (recent.length < 10) return false;
    
    // If the last 10 actions contain only 2-3 distinct ones, it may be a loop
    const uniqueActions = new Set(recent.map(a => `${a.type}:${a.target}`));
    if (uniqueActions.size <= 3) {
      logger.warn('Possible infinite loop detected', {
        actions: recent
      });
      return true;
    }
    
    return false;
  }
}

// Usage
if (loopDetector.checkLoop(action)) {
  throw new Error('Task terminated: infinite loop detected');
}
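Pattern detection can miss loops whose actions vary slightly, as in the documentation-download case where every retry targeted a different URL. A complementary safeguard is a hard per-task ceiling. The sketch below is one way to express it; the class name, the three limits, and their example values are all illustrative assumptions, not Clawdbot settings.

```typescript
// Hard ceilings for a single task. All three limits are assumptions.
interface BudgetLimits {
  maxSteps: number;
  maxTokens: number;
  maxConsecutiveFailures: number;
}

class TaskBudget {
  private steps = 0;
  private tokens = 0;
  private consecutiveFailures = 0;
  private readonly limits: BudgetLimits;

  constructor(limits: BudgetLimits) {
    this.limits = limits;
  }

  // Call once after every tool invocation; throws when a ceiling is hit.
  recordStep(tokensUsed: number, succeeded: boolean): void {
    this.steps += 1;
    this.tokens += tokensUsed;
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;

    if (this.steps > this.limits.maxSteps) {
      throw new Error(`Step limit exceeded (${this.limits.maxSteps})`);
    }
    if (this.tokens > this.limits.maxTokens) {
      throw new Error(`Token limit exceeded (${this.limits.maxTokens})`);
    }
    if (this.consecutiveFailures >= this.limits.maxConsecutiveFailures) {
      throw new Error(`Too many consecutive failures (${this.consecutiveFailures})`);
    }
  }
}

// Illustrative limits: 40 tool calls, 500k tokens, 5 failures in a row.
const budget = new TaskBudget({ maxSteps: 40, maxTokens: 500_000, maxConsecutiveFailures: 5 });
```

Unlike the detector, this check needs no notion of what "the same action" means, so it still fires when the agent keeps inventing new ways to fail.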

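Trap 3: The Browser Tool Loading Huge Pages

Symptoms

User: "Summarize this paper arxiv.org/abs/2501.12345"

AI:
  browser_navigate(arxiv.org/abs/2501.12345)
  → loaded the full PDF (converted to Markdown)
  → 50-page paper = 150,000 tokens

A single page load consumes a large number of tokens.

Root cause

async function fetchPage(url: string) {
  await browser.navigate(url);  // assumed helper; the original never used `url`
  const html = await browser.getContent();
  const markdown = htmlToMarkdown(html);
  
  // No length check; returned as-is
  return markdown;
}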

Mitigation

async function fetchPage(url: string) {
  await browser.navigate(url);  // assumed helper; the original never used `url`
  const html = await browser.getContent();
  let markdown = htmlToMarkdown(html);
  
  const tokens = estimateTokens(markdown);
  const maxTokens = 10000;
  
  if (tokens > maxTokens) {
    logger.warn(`Page too large: ${tokens} tokens, summarizing...`);
    
    // Summarize chunk by chunk
    const chunks = splitIntoChunks(markdown, maxTokens);
    const summaries = await Promise.all(
      chunks.map(chunk => summarizeChunk(chunk))
    );
    
    markdown = summaries.join('\n\n');
  }
  
  return markdown;
}

async function summarizeChunk(text: string): Promise<string> {
  const response = await claude.messages.create({
    model: 'claude-haiku-4-5',  // the cheap model
    max_tokens: 1024,           // required by the Messages API
    messages: [{
      role: 'user',
      content: `Summarize the core points of the following content (under 200 words):\n${text}`
    }]
  });
  // Content blocks are a union type; only text blocks carry `.text`
  const block = response.content[0];
  return block.type === 'text' ? block.text : '';
}

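Effect:

Original: 150,000 tokens
Chunked summaries: 15 chunks × ~1,000 tokens = ~15,000 tokens
Savings: ~90% of what reaches the main model (the cheaper model still reads the full text once)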
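The mitigation calls a `splitIntoChunks` helper that the article leaves undefined. Below is a minimal sketch of one possible implementation. It prefers paragraph boundaries and approximates tokens as characters divided by `charsPerToken`, which is an assumed ratio rather than a real tokenizer.

```typescript
// Splits text into chunks of at most `maxTokens` estimated tokens, breaking
// on blank lines where possible. `charsPerToken` is an assumed ratio.
function splitIntoChunks(text: string, maxTokens: number, charsPerToken: number = 4): string[] {
  const maxChars = maxTokens * charsPerToken;
  const chunks: string[] = [];
  let current = '';

  for (const paragraph of text.split(/\n{2,}/)) {
    // Hard-split any single paragraph that is too long on its own.
    const pieces: string[] = [];
    for (let i = 0; i < paragraph.length; i += maxChars) {
      pieces.push(paragraph.slice(i, i + maxChars));
    }

    for (const piece of pieces) {
      const candidate = current ? current + '\n\n' + piece : piece;
      if (candidate.length > maxChars && current) {
        // Adding this piece would overflow: close the current chunk.
        chunks.push(current);
        current = piece;
      } else {
        current = candidate;
      }
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on paragraphs keeps each summary request coherent. A page with no blank lines at all falls back to fixed-size slices, which can cut mid-sentence.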

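Trap 4: Concurrent Request Explosion

Symptoms

A user @-mentions the bot in 3 WhatsApp groups and 5 Telegram channels

[At the same moment]
10 messages arrive at once

Clawdbot handles them concurrently → 10 API calls fire simultaneously

If each call uses 30k tokens:

30,000 × 10 = 300,000 tokens (consumed instantly)

If an Agent Loop is triggered, all 10 tasks may get stuck in loops and costs explode.

Root cause

// Wrong implementation: unbounded concurrency
async function handleMessage(message: Message) {
  const response = await agent.chat(message.content);
  await sendReply(message.from, response);
}

// Every message fires at the same time
incomingMessages.forEach(msg => handleMessage(msg));  // concurrent!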

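Mitigation

import PQueue from 'p-queue';

// Limit concurrency
const queue = new PQueue({
  concurrency: 2,  // at most 2 tasks run at the same time
  timeout: 300000  // 5-minute timeout per task; note that this does not
                   // abort the underlying API call by itself
});

async function handleMessage(message: Message) {
  await queue.add(async () => {
    const response = await agent.chat(message.content);
    await sendReply(message.from, response);
  });
}

// Add to the queue; throttling is automatic
incomingMessages.forEach(msg =>
  handleMessage(msg).catch(err => logger.error('Message handling failed', {err}))
);

Effect:

10 messages arrive → 2 are processed immediately, 8 wait in the queue

Instead of all 10 being processed at once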
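To show what the queue is doing under the hood, here is a dependency-free sketch of a concurrency limiter. It is a simplified illustration of the technique, not how p-queue is actually implemented, and the class and method names are made up for this example.

```typescript
// Minimal concurrency limiter: at most `limit` tasks run at once and the
// rest wait in FIFO order. Illustrative only; no timeouts or priorities.
class ConcurrencyLimiter {
  private running = 0;
  private waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  activeCount(): number {
    return this.running;
  }

  pendingCount(): number {
    return this.waiting.length;
  }

  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const release = () => {
        this.running -= 1;
        const next = this.waiting.shift();
        if (next) next(); // hand the freed slot to the oldest waiter
      };

      const start = () => {
        this.running += 1;
        // Promise.resolve().then(task) also catches synchronous throws.
        Promise.resolve().then(task).then(
          (value) => { release(); resolve(value); },
          (error) => { release(); reject(error); },
        );
      };

      if (this.running < this.limit) start();
      else this.waiting.push(start);
    });
  }
}
```

With a limit of 2, ten simultaneous messages leave two tasks active and eight pending, which matches the queueing behavior described above. A failed task still releases its slot, so one error cannot stall the queue.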

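Trap 5: Choosing the Wrong Model

Symptoms

User: "What is 1+1?"

Clawdbot using Claude Opus 4.5:
  Input: 13,020 tokens × $15/M = $0.195
  Output: 10 tokens × $75/M = $0.00075
  Total: $0.196

A piece of simple arithmetic costs $0.20.

Ask 100 questions like this in a day and you have spent $20.

Root cause

Clawdbot defaults to the most capable (and most expensive) model:

{
  "agent": {
    "model": "claude-opus-4.5"
  }
}

Every task goes to the same model, whether it is simple or complex.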

Mitigation

function selectModel(task: string): string {
  // Simple Q&A: Haiku ($1.25/M output)
  if (isSimpleQA(task)) {
    return 'claude-haiku-4.5';
  }
  
  // Code generation/analysis: Sonnet ($15/M output)
  if (isCodeTask(task)) {
    return 'claude-sonnet-4.5';
  }
  
  // Complex reasoning: Opus ($75/M output)
  if (isComplexReasoning(task)) {
    return 'claude-opus-4.5';
  }
  
  // Default to Sonnet
  return 'claude-sonnet-4.5';
}

function isSimpleQA(task: string): boolean {
  const patterns = [
    /^.{1,50}$/,  // very short questions (note: also matches short but hard requests)
    /天气|时间|日期|weather|time|date/i,
    /\d+\s*[+\-*/]\s*\d+/,  // arithmetic
    /^(什么是|who is|what is)/i
  ];
  
  return patterns.some(p => p.test(task));
}

Cost comparison:

100 simple questions:
  Opus: $20
  Haiku: ~$0.33
  Savings: ~98%
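The comparison can be reproduced with a few lines of arithmetic. The sketch below uses the per-million-token prices quoted in this article for Opus and the Haiku rates from its pricing table. Treat them as assumptions and check the provider's current price list before relying on them.

```typescript
// Prices in USD per million tokens, copied from the figures used in this
// article. They are assumptions, not an authoritative price list.
interface Price {
  input: number;
  output: number;
}

const PRICES: Record<string, Price> = {
  opus: { input: 15, output: 75 },
  haiku: { input: 0.25, output: 1.25 },
};

function requestCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// The "1+1" example: a 13,020-token input and a 10-token answer.
const opus = requestCost('opus', 13_020, 10);
const haiku = requestCost('haiku', 13_020, 10);

console.log(`Opus:  $${(opus * 100).toFixed(2)} per 100 questions`);
console.log(`Haiku: $${(haiku * 100).toFixed(2)} per 100 questions`);
console.log(`Saved: ${((1 - haiku / opus) * 100).toFixed(1)}%`);
```

Because the input dominates here, almost all of the saving comes from the input price. The output price barely matters for one-line answers.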

Real-Time Monitoring

Monitor 1: Token counter

type ModelName = 'opus' | 'sonnet' | 'haiku';

// USD per million tokens
const PRICING: Record<ModelName, {input: number, output: number}> = {
  opus: {input: 15, output: 75},
  sonnet: {input: 3, output: 15},
  haiku: {input: 0.25, output: 1.25}
};

class TokenCounter {
  private counts = new Map<string, {input: number, output: number, cost: number}>();
  
  async record(sessionId: string, input: number, output: number, model: ModelName = 'opus') {
    const cost = this.calculateCost(input, output, model);
    const current = this.counts.get(sessionId) || {input: 0, output: 0, cost: 0};
    this.counts.set(sessionId, {
      input: current.input + input,
      output: current.output + output,
      cost: current.cost + cost  // priced per call, so mixed models add up correctly
    });
    
    // Persist to the database (the model is stored for the per-model breakdown)
    await db.insertTokenUsage({
      sessionId,
      model,
      timestamp: Date.now(),
      inputTokens: input,
      outputTokens: output,
      cost
    });
  }
  
  calculateCost(input: number, output: number, model: ModelName = 'opus'): number {
    const price = PRICING[model];
    return (input * price.input + output * price.output) / 1000000;
  }
  
  getSessionCost(sessionId: string): number {
    const counts = this.counts.get(sessionId);
    return counts ? counts.cost : 0;
  }
}

Monitor 2: Real-time cost dashboard

// Expose an HTTP endpoint
app.get('/api/cost/dashboard', async (req, res) => {
  // `timestamp` is stored as epoch milliseconds (Date.now()), so convert it
  // with 'unixepoch' and compare against a UTC date in YYYY-MM-DD form.
  const today = new Date().toISOString().slice(0, 10);
  
  // Today's spend
  const todayCost = await db.query(`
    SELECT SUM(cost) as total
    FROM token_usage
    WHERE DATE(timestamp / 1000, 'unixepoch') = ?
  `, [today]);
  
  // This month's spend
  const monthCost = await db.query(`
    SELECT SUM(cost) as total
    FROM token_usage
    WHERE strftime('%Y-%m', timestamp / 1000, 'unixepoch') = strftime('%Y-%m', 'now')
  `);
  
  // Share per model
  const breakdown = await db.query(`
    SELECT model, SUM(cost) as cost
    FROM token_usage
    WHERE DATE(timestamp / 1000, 'unixepoch') = ?
    GROUP BY model
  `, [today]);
  
  // Top-spending sessions
  const topSessions = await db.query(`
    SELECT session_id, SUM(cost) as cost
    FROM token_usage
    WHERE DATE(timestamp / 1000, 'unixepoch') = ?
    GROUP BY session_id
    ORDER BY cost DESC
    LIMIT 10
  `, [today]);
  
  res.json({
    today: todayCost.total ?? 0,  // SUM() is NULL when there are no rows
    month: monthCost.total ?? 0,
    breakdown,
    topSessions
  });
});

Terminal view:

$ clawdbot cost dashboard

╔═══════════════════════════════════════╗
║        Clawdbot Cost Dashboard        ║
╠═══════════════════════════════════════╣
║ Today: $12.50 / $20.00 (budget)       ║
║ Month: $85.30 / $200.00 (budget)      ║
╠═══════════════════════════════════════╣
║ Model breakdown:                      ║
║   Opus:   $8.00 (64%)                 ║
║   Sonnet: $3.50 (28%)                 ║
║   Haiku:  $1.00 (8%)                  ║
╠═══════════════════════════════════════╣
║ Top sessions:                         ║
║   1. telegram:@user1  $4.20           ║
║   2. whatsapp:+123    $3.80           ║
║   3. discord:server1  $2.50           ║
╚═══════════════════════════════════════╝

Monitor 3: Real-time alerts

class CostAlertSystem {
  async checkAndAlert(sessionId: string, newCost: number) {
    const totalToday = await this.getTodayCost();
    const sessionTotal = await this.getSessionCost(sessionId);
    
    const config = {
      dailyBudget: 20.0,
      sessionBudget: 5.0,
      alertThreshold: 0.8
    };
    
    // Daily budget alert
    if (totalToday + newCost > config.dailyBudget * config.alertThreshold) {
      await this.sendAlert(
        `⚠️ Daily cost approaching the limit: $${(totalToday + newCost).toFixed(2)} / $${config.dailyBudget}\n` +
        `Current session: ${sessionId}\n` +
        `Projected total: $${(totalToday + newCost).toFixed(2)}`
      );
    }
    
    // Session budget alert
    if (sessionTotal + newCost > config.sessionBudget) {
      await this.sendAlert(
        `🛑 Session budget exhausted: ${sessionId}\n` +
        `Spent: $${(sessionTotal + newCost).toFixed(2)} / $${config.sessionBudget}\n` +
        `Suggestion: run 'clawdbot session clear ${sessionId}'`
      );
      
      throw new Error('Session budget exceeded');
    }
  }
  
  private async sendAlert(message: string) {
    // Send the alert through Telegram
    await telegram.sendMessage(ADMIN_CHAT_ID, message);
    
    // Also write it to the log
    logger.error('COST_ALERT', {message});
  }
}
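One practical wrinkle: once spending passes the 80% threshold, the daily-budget check fires on every subsequent call, which can flood the admin chat. A small cooldown per alert key is one way to handle that. The sketch below is an assumption-laden illustration: the class name, the key scheme, and the 15-minute default are not part of Clawdbot.

```typescript
// Suppresses repeats of the same alert key within a cooldown window.
// The 15-minute default is an arbitrary assumption.
class AlertThrottle {
  private lastSent = new Map<string, number>();
  private readonly cooldownMs: number;

  constructor(cooldownMs: number = 15 * 60 * 1000) {
    this.cooldownMs = cooldownMs;
  }

  // Returns true if the caller should send the alert now.
  // `now` is injectable so the logic can be tested without waiting.
  shouldSend(key: string, now: number = Date.now()): boolean {
    const last = this.lastSent.get(key);
    if (last !== undefined && now - last < this.cooldownMs) {
      return false; // still cooling down for this key
    }
    this.lastSent.set(key, now);
    return true;
  }
}
```

A caller would guard each `sendAlert` with something like `throttle.shouldSend('daily-budget')` or `throttle.shouldSend('session:' + sessionId)`, so distinct problems still alert independently.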

Monitor 4: Budget circuit breaker

class CircuitBreaker {
  async checkBudget(sessionId: string): Promise<boolean> {
    const config = {
      dailyLimit: 20.0,
      sessionLimit: 5.0
    };
    
    const todayCost = await getTodayCost();
    const sessionCost = await getSessionCost(sessionId);
    
    // Over the daily budget: reject outright
    if (todayCost >= config.dailyLimit) {
      throw new Error(
        `Daily budget exhausted: $${todayCost.toFixed(2)} / $${config.dailyLimit}\n` +
        `Service resumes automatically at 00:00 UTC tomorrow`
      );
    }
    
    // Over the session budget: reject this session
    if (sessionCost >= config.sessionLimit) {
      throw new Error(
        `Session budget exhausted: $${sessionCost.toFixed(2)} / $${config.sessionLimit}\n` +
        `Run 'session clear' or try again tomorrow`
      );
    }
    
    return true;
  }
}
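The breaker's message promises recovery at 00:00 UTC. If the caller wants to tell users how long that is, or to schedule a retry, the wait can be computed directly. This helper is a hypothetical addition for illustration, not an existing Clawdbot function.

```typescript
// Milliseconds from `now` until the next 00:00 UTC. At exactly midnight it
// returns a full day, since the current day's budget has just started.
function msUntilNextUtcMidnight(now: Date = new Date()): number {
  // Date.UTC normalizes day overflow, so "day + 1" also works at month end.
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1,
  );
  return nextMidnight - now.getTime();
}

const minutes = Math.ceil(msUntilNextUtcMidnight() / 60_000);
console.log(`Daily budget resets in about ${minutes} minutes`);
```

Using UTC here only makes sense if the daily spend query also buckets by UTC date. Mixing a local-time query with a UTC reset would make the breaker open and close at confusing moments.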

Monitor 5: Anomaly detection

class AnomalyDetector {
  async detectAnomaly(sessionId: string, tokens: number): Promise<boolean> {
    // Fetch this session's recent usage history
    const history = await db.getSessionHistory(sessionId, 100);
    
    // With too few samples the mean and standard deviation are meaningless
    // (an empty history would even divide by zero), so skip the check
    if (history.length < 10) return false;
    
    // Compute the statistics
    const mean = history.reduce((sum, h) => sum + h.tokens, 0) / history.length;
    const stdDev = Math.sqrt(
      history.reduce((sum, h) => sum + Math.pow(h.tokens - mean, 2), 0) / history.length
    );
    
    // Flag the call if it exceeds the mean by more than 3 standard deviations
    if (tokens > mean + 3 * stdDev) {
      logger.warn('Token usage anomaly detected', {
        sessionId,
        currentTokens: tokens,
        averageTokens: mean,
        threshold: mean + 3 * stdDev
      });
      
      await sendAlert(
        `Abnormal token usage: ${sessionId}\n` +
        `Current: ${tokens} tokens\n` +
        `Average: ${Math.round(mean)} tokens\n` +
        `Possible causes: agent loop, large web page, or an attack`
      );
      
      return true;
    }
    
    return false;
  }
}
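A mean-and-standard-deviation threshold has a known weakness: a few earlier spikes inflate the standard deviation and make later spikes look normal. A common alternative, swapped in here purely as an illustration, is the modified z-score based on the median and the median absolute deviation (MAD). The 0.6745 constant and the 3.5 cutoff are the conventional values for this score. The `minSamples` and `minMad` defaults are assumptions added to keep the sketch well behaved.

```typescript
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// One-sided check: only unusually HIGH usage is flagged.
function isAnomalous(
  history: number[],
  value: number,
  threshold: number = 3.5, // conventional cutoff for the modified z-score
  minSamples: number = 10, // assumption: below this, do not judge at all
  minMad: number = 1,      // assumption: floor that avoids division by zero
): boolean {
  if (history.length < minSamples) return false;

  const med = median(history);
  const mad = median(history.map((h) => Math.abs(h - med)));
  const score = (0.6745 * (value - med)) / Math.max(mad, minMad);
  return score > threshold;
}

const recent = [1000, 1100, 900, 1050, 950, 1000, 1020, 980, 1010, 990];
console.log(isAnomalous(recent, 1050));    // false
console.log(isAnomalous(recent, 150_000)); // true
```

Because the median ignores extreme values, one 150,000-token page load in the history does not raise the bar for detecting the next one.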

The complete monitoring architecture

// Wire all the monitoring components together
class CostMonitoringSystem {
  private tokenCounter = new TokenCounter();
  private loopDetector = new LoopDetector();
  private costAlert = new CostAlertSystem();
  private circuitBreaker = new CircuitBreaker();
  private anomalyDetector = new AnomalyDetector();
  
  async beforeAPICall(sessionId: string, estimatedTokens: number) {
    // 1. Check the budget (throws if it is exhausted)
    await this.circuitBreaker.checkBudget(sessionId);
    
    // 2. Check for anomalies; they raise an alert but do not block the call
    await this.anomalyDetector.detectAnomaly(sessionId, estimatedTokens);
    
    // 3. Early warning
    const cost = this.tokenCounter.calculateCost(estimatedTokens, 0);
    await this.costAlert.checkAndAlert(sessionId, cost);
  }
  
  async afterAPICall(sessionId: string, input: number, output: number, action: Action) {
    // 4. Record usage
    await this.tokenCounter.record(sessionId, input, output);
    
    // 5. Detect loops
    if (this.loopDetector.checkLoop(action)) {
      logger.error('Loop detected, stopping session', {sessionId});
      await stopSession(sessionId);
    }
  }
}

Command-line tools

Viewing costs

# Today's cost
$ clawdbot cost today
Today: $12.50 / $20.00 (62.5%)

# This month's cost
$ clawdbot cost month
Month: $85.30 / $200.00 (42.7%)

# Cost per session
$ clawdbot cost breakdown
telegram:@user1   $45.20
whatsapp:+123     $28.10
discord:server1   $12.00

Setting budgets

# Set the daily budget
$ clawdbot config set cost.dailyLimit 20.0

# Set the monthly budget
$ clawdbot config set cost.monthlyLimit 200.0

# Set the alert threshold
$ clawdbot config set cost.alertThreshold 0.8

Clearing history

# Show token usage per session
$ clawdbot session list

telegram:@user1
  Messages: 250
  Tokens: 2.5M (input) + 0.5M (output)
  Cost: $45.20
  
whatsapp:+123
  Messages: 180
  Tokens: 1.8M + 0.3M
  Cost: $28.10

# Clear a high-cost session
$ clawdbot session clear telegram:@user1
Cleared 250 messages
Freed ~2.5M tokens from context

# Verify
$ clawdbot cost breakdown
whatsapp:+123     $28.10
discord:server1   $12.00

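Final recommendations

The core principles of cost control:

1. Prevention beats remediation
   - Set hard budget limits (circuit breaker)
   - Monitor abnormal usage (alerts)
   - Clear history regularly (maintenance)
2. Use the right tool
   - Haiku for simple tasks
   - Opus only for complex tasks
   - Enable Prompt Caching
3. Limit concurrency
   - At most 2-3 tasks running at once
   - Avoid triggering the bot from many channels at the same time
4. Set timeouts
   - A single task runs for at most 5 minutes
   - Detect Agent Loops and terminate them immediately
5. Make costs visible
   - Real-time cost dashboard
   - Daily cost report
   - Alerts on abnormal usage

Without these measures, Clawdbot is a "money-burning machine".

$300 in 2 days is not the ceiling: some users have reported spending more than $1,000 in a single day.

In the age of AI agents, cost control is a core feature, not an optional extra.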
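Of the items above, Prompt Caching is the one the article names without showing. In the Anthropic Messages API, a system block can be marked with `cache_control` so that a large, stable prefix (such as the roughly 13,000-token system prompt used in the earlier examples) is billed at a reduced rate on later calls. The sketch below only builds the request parameters. The function name, the model name, and `max_tokens` are placeholders, and prompts below a model-specific minimum length are not cached.

```typescript
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Builds Messages API request parameters with the system prompt marked as
// cacheable. The model and max_tokens values are placeholders.
function buildCachedRequest(systemPrompt: string, messages: ChatMessage[]) {
  return {
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    system: [
      {
        type: 'text' as const,
        text: systemPrompt,
        // Marks the prefix up to and including this block as cacheable.
        cache_control: { type: 'ephemeral' as const },
      },
    ],
    messages,
  };
}

const request = buildCachedRequest('You are Clawdbot...', [
  { role: 'user', content: 'What is 1+1?' },
]);
console.log(JSON.stringify(request.system[0].cache_control));
```

Caching only helps while the prefix stays byte-for-byte identical, so anything that changes per call (timestamps, per-user details) belongs after the cached block rather than inside it.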
