Author: xiyu
Want to use Claude Opus 4.6 but don't want your bill to explode at the end of the month? This article will help you cut costs by 60-85%.

Do you think tokens are just "what you say + what the AI says in response"? They are actually much more than that.
Hidden costs of each conversation:
System prompt (~3000-5000 tokens): OpenClaw's core instructions, which cannot be modified.
Context file injection (~3000-14000 tokens): AGENTS.md, SOUL.md, MEMORY.md, etc., included in every conversation. This is the biggest hidden overhead.
Historical messages: these grow as the conversation gets longer.
Your input + AI output: this is the part you thought was "everything".
A simple "How's the weather today?" message actually consumes 8,000-15,000 input tokens. On Opus, the context alone costs $0.12-0.22 (these figures imply an input price of about $15 per million tokens).
Cron is even more expensive: each trigger starts a completely new conversation, which re-injects the entire context. A cron job that runs every 15 minutes fires 96 times a day and costs $10-20 per day in Opus fees.
Heartbeat works the same way: each beat is essentially a conversation call, so the shorter the interval, the higher the cost.
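The arithmetic behind these figures can be sketched in a few lines. This is only a rough model: the $15 per million input tokens is an assumption inferred from the numbers above, not a quoted price, and the function names are made up for illustration.

```python
# Assumed input price, inferred from "$0.12 for 8,000 tokens" above.
OPUS_INPUT_USD_PER_MILLION = 15.0


def input_cost_usd(tokens: int, usd_per_million: float = OPUS_INPUT_USD_PER_MILLION) -> float:
    """Cost of sending `tokens` input tokens in a single call."""
    return tokens * usd_per_million / 1_000_000


def cron_runs_per_day(interval_minutes: int) -> int:
    """Number of triggers per day for a job that fires every `interval_minutes`."""
    return (24 * 60) // interval_minutes


def cron_daily_cost_usd(context_tokens: int, interval_minutes: int) -> float:
    """Daily input cost when every trigger re-injects the full context."""
    return cron_runs_per_day(interval_minutes) * input_cost_usd(context_tokens)


if __name__ == "__main__":
    for tokens in (8_000, 15_000):
        per_call = input_cost_usd(tokens)
        per_day = cron_daily_cost_usd(tokens, interval_minutes=15)
        print(f"{tokens} tokens: ${per_call:.3f} per call, ${per_day:.2f} per day")
```

Output tokens and conversation history are left out, so real bills will be somewhat higher than this estimate.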
Switching the default model is the number one money-saving trick, and the most effective. Sonnet is priced at about 1/5 of Opus, and it is more than enough for 80% of daily tasks.
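Prompt:
Please change OpenClaw's default model to Claude Sonnet, and use Opus only when deep analysis or creative work is needed.
Specifically:
1) Set the default model to Sonnet
2) Use Sonnet by default for cron jobs
3) Specify Opus only for writing and deep-analysis tasks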
Opus scenarios: long article writing, complex code, multi-step reasoning, creative tasks.
Sonnet scenarios: casual conversation, simple Q&A, cron checks, heartbeat, file operations, translation.
Measured result: after switching, monthly costs fell by 65%, with almost no difference in user experience.
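As a rough illustration of the layering idea, the sketch below routes a task to a model tier and estimates the blended cost. The task labels and the tier names "opus" and "sonnet" are placeholders made up for this example, not OpenClaw settings or real model IDs; the 80% share and the 1/5 price ratio are the figures quoted above.

```python
# Hypothetical task labels, based on the scenario lists above.
OPUS_TASKS = {"long_form_writing", "complex_code", "multi_step_reasoning", "creative"}


def pick_model(task_type: str) -> str:
    """Return 'opus' only for tasks listed as needing it; default to 'sonnet'."""
    return "opus" if task_type in OPUS_TASKS else "sonnet"


def blended_cost_ratio(sonnet_share: float, price_ratio: float) -> float:
    """Cost relative to an all-Opus setup.

    sonnet_share: fraction of calls routed to Sonnet (0.0 to 1.0)
    price_ratio:  Sonnet price divided by Opus price
    """
    return (1 - sonnet_share) + sonnet_share * price_ratio


if __name__ == "__main__":
    print(pick_model("heartbeat"))
    print(pick_model("complex_code"))
    ratio = blended_cost_ratio(sonnet_share=0.8, price_ratio=0.2)
    print(f"remaining cost: {ratio:.0%}, saving: {1 - ratio:.0%}")
```

With 80% of calls on Sonnet at 1/5 of the price, the model estimates a saving of about 64%, which is in line with the 65% reported above. It assumes all calls are the same size, which is a simplification.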
The "noise floor" of each call can be 3,000-14,000 tokens of injected context files. Slimming down these files is the most cost-effective optimization.
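Prompt:
Help me slim down OpenClaw's context files to save tokens.
Specifically:
1) AGENTS.md: remove the parts I don't need (group chat rules, TTS, unused features) and compress it to under 800 tokens
2) SOUL.md: condense into concise bullet points, 300-500 tokens
3) MEMORY.md: clear out stale information and keep it under 2000 tokens
4) Check the workspaceFiles configuration and remove unnecessary injected files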
Rule of thumb: For every 1000 tokens less injected, assuming 100 Opus calls per day, you can save approximately $45 per month.
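The rule of thumb and the file budgets can be checked with a small script. This is a sketch under stated assumptions: the 4-characters-per-token estimate is a crude heuristic for English text (a real tokenizer will give different counts), the $15 per million input tokens is inferred from the article's own numbers, and the helper names are invented for illustration.

```python
# Token budgets from the prompt above (SOUL.md uses the upper end of 300-500).
BUDGETS = {"AGENTS.md": 800, "SOUL.md": 500, "MEMORY.md": 2000}


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (assumption)."""
    return len(text) // 4


def over_budget(files: dict[str, str]) -> dict[str, int]:
    """Map each over-budget file name to its estimated excess tokens."""
    excess = {}
    for name, text in files.items():
        budget = BUDGETS.get(name)
        if budget is not None and estimate_tokens(text) > budget:
            excess[name] = estimate_tokens(text) - budget
    return excess


def monthly_saving_usd(tokens_removed: int, calls_per_day: int,
                       usd_per_million: float = 15.0, days: int = 30) -> float:
    """Monthly input-cost saving from injecting fewer tokens on every call."""
    return tokens_removed * calls_per_day * days * usd_per_million / 1_000_000


if __name__ == "__main__":
    print(f"${monthly_saving_usd(1000, 100):.2f} saved per month")
```

To run the budget check on real files, pass a dictionary of file name to file contents, for example built with pathlib.Path(name).read_text().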
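Prompt: Help me optimize OpenClaw's cron jobs to save tokens.
Please:
1) List all cron jobs with their frequency and model
2) Downgrade all non-creative tasks to Sonnet
3) Merge tasks that run in the same time window (for example, combine several checks into one)
4) Lower unnecessarily high frequencies (system checks from every 10 minutes to every 30 minutes, version checks from 3 times a day to once a day)
5) Configure delivery to send notifications on demand, so that no message is sent under normal circumstances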
Core principle: more frequent is not necessarily better, and most "real-time" requirements are not real requirements. Merging 5 separate checks into a single call means the context is injected once instead of five times, which saves roughly 75-80% of the context injection cost.
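A quick way to see the effect of merging and lowering frequency is to count context injections per day. The sketch below assumes every trigger injects the same context and ignores output tokens; under that simplification, merging n calls into one saves 1 - 1/n (75% for four checks, 80% for five). The function names are illustrative only.

```python
def merge_saving_fraction(separate_calls: int, merged_calls: int = 1) -> float:
    """Fraction of context injections avoided by merging calls."""
    return 1 - merged_calls / separate_calls


def injections_per_day(intervals_minutes: list[int]) -> int:
    """Total context injections per day for jobs with the given intervals."""
    return sum((24 * 60) // minutes for minutes in intervals_minutes)


if __name__ == "__main__":
    before = injections_per_day([10] * 5)  # five separate checks, every 10 minutes
    after = injections_per_day([30])       # one merged check, every 30 minutes
    print(f"merge 5 into 1: {merge_saving_fraction(5):.0%} fewer injections")
    print(f"injections per day: {before} before, {after} after")
```

Merging and lowering the frequency compound: in this example the daily injection count drops from 720 to 48.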
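Prompt: Help me optimize the OpenClaw heartbeat configuration:
1) Set the interval during working hours to 45-60 minutes
2) Set 23:00-08:00 as a quiet period
3) Trim HEARTBEAT.md to as few lines as possible
4) Fold scattered check tasks into the heartbeat and run them as a batch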
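The quiet-period logic is easy to get wrong because the 23:00-08:00 window wraps past midnight. The sketch below shows one way to express it and to count how many heartbeats remain per day. It illustrates the idea only and is not OpenClaw's configuration format; all names are hypothetical.

```python
from datetime import time

QUIET_START = time(23, 0)
QUIET_END = time(8, 0)


def in_quiet_period(now: time, start: time = QUIET_START, end: time = QUIET_END) -> bool:
    """True if `now` falls inside the quiet window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 23:00 -> 08:00.
    return now >= start or now < end


def heartbeats_per_day(interval_minutes: int, quiet_hours: int = 9) -> int:
    """Heartbeats per day when none fire during the quiet hours."""
    active_minutes = (24 - quiet_hours) * 60
    return active_minutes // interval_minutes


if __name__ == "__main__":
    print(in_quiet_period(time(23, 30)))
    print(in_quiet_period(time(12, 0)))
    print(heartbeats_per_day(45), heartbeats_per_day(60))
```

With a 9-hour quiet period, a 45-minute interval gives 20 heartbeats a day and a 60-minute interval gives 15.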
When the agent looks up information, it defaults to reading the full text. A 500-line file contains 3000-5000 tokens, but the agent often needs only 10 lines, so roughly 90% of the input tokens are wasted.
qmd is a local semantic search tool that builds a full-text + vector index, so the agent can locate the relevant paragraphs instead of reading the entire file. All computation runs locally, with zero API cost.
Pair it with mq (Mini Query) to preview the directory structure, extract specific paragraphs, and search for keywords, reading only the 10-30 lines required each time.
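Prompt:
Help me set up qmd knowledge-base retrieval to save tokens.
GitHub: https://github.com/tobi/qmd
What I need:
1) Install qmd
2) Build an index for the working directory
3) Add retrieval rules to AGENTS.md that force the agent to search with qmd/mq first instead of reading full files
4) Schedule periodic index updates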
Actual results: The cost of each data lookup decreased from 15,000 tokens to 1,500 tokens, a reduction of 90%.
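To make the "read a few lines instead of the whole file" idea concrete, here is a minimal sketch using a plain keyword match. It is not how qmd or mq work internally (they use a full-text + vector index), and the function names are invented for illustration; it only shows why a targeted read is so much cheaper.

```python
def snippet_around(text: str, keyword: str, context: int = 5) -> str:
    """Return the first line containing `keyword` plus `context` lines on each side."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if keyword.lower() in line.lower():
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            return "\n".join(lines[lo:hi])
    return ""  # no match: nothing to send to the model


def reduction(full_tokens: int, targeted_tokens: int) -> float:
    """Fraction of input tokens saved by the targeted read."""
    return 1 - targeted_tokens / full_tokens


if __name__ == "__main__":
    document = "\n".join(f"line {n}" for n in range(1, 501))  # a 500-line file
    snippet = snippet_around(document, "line 250", context=5)
    print(len(document.splitlines()), "lines in file,", len(snippet.splitlines()), "lines read")
    print(f"{reduction(15_000, 1_500):.0%} fewer tokens per lookup")
```

A semantic index finds relevant passages even when the exact keyword is absent, which this simple match cannot do.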
The difference between memorySearch and qmd: memorySearch handles "recall" (MEMORY.md), while qmd handles "data search" (a custom knowledge base). The two do not affect each other.
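Prompt: Help me configure OpenClaw's memorySearch.
If I don't have many memory files (a few dozen .md files), do you recommend local embeddings or Voyage AI?
Please explain the differences in cost and retrieval quality.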
Simple conclusion: with only a few memory files, use local embeddings (zero cost); with strong multilingual requirements or many files, use Voyage AI (200 million free tokens per account).
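Prompt:
Please optimize my OpenClaw configuration in one pass to save as many tokens as possible, working through this checklist:
Change the default model to Sonnet; keep Opus only for creative/analysis tasks
Slim down AGENTS.md / SOUL.md / MEMORY.md
All cron jobs: downgrade to Sonnet + merge + lower frequency
Heartbeat: 45-minute interval + quiet period at night
Configure qmd targeted retrieval to replace full-file reads
workspaceFiles: keep only the necessary files
Trim memory files regularly; keep MEMORY.md under 2000 tokens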
1. Model layering: Sonnet for routine work, Opus for key tasks, saving 60-80%.
2. Context optimization: slimmer context files plus targeted qmd search, saving 30-90% of input tokens.
3. Fewer invocations: merge cron jobs, lengthen heartbeat intervals, and enable quiet periods.
Sonnet 4 is already very powerful; you won't notice the difference in everyday use. Switch to Opus only when you really need it.
The figures in this article are anonymized estimates based on practical experience with multi-agent systems.


