Save Tokens: Use Codex as the Muscle, Opus as the Brain

Claude Opus is brilliant but expensive. Even the $200/mo plan has limits. Here's how to route tasks to different models and cut your token costs without losing capability.

Clawctl Team

Product & Engineering

Claude Opus is widely considered the strongest model for OpenClaw.

It's also expensive. And even on the $200/mo Claude Max plan, you'll hit limits.

Here's the solution: don't use Opus for everything.

The Problem

You're using your agent heavily. Morning briefs. Research. Coding. Analysis.

By mid-month, you're hitting rate limits. Or burning through API credits.

The issue: every task—simple or complex—uses the same expensive model.

Website summary? Opus. Simple calculation? Opus. Quick file read? Opus. Complex analysis? Opus.

That's wasteful.

The Mental Model

Think of your agent as having a brain and muscles:

Brain (Claude Opus):

  • Strategic thinking
  • Complex analysis
  • Decision making
  • Planning and orchestration

Muscles (Codex, local models):

  • Coding tasks
  • Simple summaries
  • File operations
  • Repetitive work

The brain decides what to do. The muscles do the work.

Setting Up Multi-Model Routing

Step 1: Install Codex CLI

If you have a ChatGPT Plus subscription, you have access to Codex CLI.

Install it and connect it to your OpenClaw.

Step 2: Tell your agent about the new muscle

I've installed Codex CLI. From now on, whenever you need to:
- Write code
- Edit files
- Run commands
- Do repetitive tasks

Use Codex instead of doing it yourself. You handle the planning and decisions.
Codex handles the execution.

Step 3: Watch your token usage drop

Opus now handles only the thinking. Codex does the typing.

What Gets Routed Where

| Task Type | Model | Why |
|---|---|---|
| Planning what to build | Opus | Requires judgment |
| Actually writing code | Codex | Execution, not thinking |
| Complex analysis | Opus | Requires understanding |
| Simple summaries | Sonnet/local | Just compression |
| File operations | Codex | Mechanical task |
| Decision making | Opus | Core capability |
| Research synthesis | Opus | Connecting ideas |
| Formatting output | Sonnet/local | Simple transform |

The rule: if it requires judgment, Opus. If it requires typing, something cheaper.
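If you wanted to express the routing table in code rather than in a prompt, it could look something like the sketch below. This is only an illustration: the task-type keys, the `Model` enum, and the `route` helper are hypothetical names, not part of OpenClaw or Clawctl, and falling back to Opus for unknown task types is an assumption (the safe choice when you can't tell whether a task needs judgment).

```python
from enum import Enum


class Model(Enum):
    OPUS = "opus"
    CODEX = "codex"
    SONNET_OR_LOCAL = "sonnet/local"


# Hypothetical task-type keys, one per row of the routing table.
ROUTES = {
    "planning": Model.OPUS,
    "writing_code": Model.CODEX,
    "complex_analysis": Model.OPUS,
    "simple_summary": Model.SONNET_OR_LOCAL,
    "file_operation": Model.CODEX,
    "decision_making": Model.OPUS,
    "research_synthesis": Model.OPUS,
    "formatting": Model.SONNET_OR_LOCAL,
}


def route(task_type: str) -> Model:
    """Return the model for a task type.

    Unknown task types fall back to Opus (an assumption: when in doubt,
    treat the task as one that needs judgment).
    """
    return ROUTES.get(task_type, Model.OPUS)


print(route("writing_code").value)
print(route("something_new").value)
```

Keeping the mapping in one place makes it easy to move a task type to a cheaper model later if the output holds up.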

Local Models for Simple Tasks

Take it further with local models:

Install Ollama:

Your OpenClaw can install and manage local models automatically.

Set up Ollama with a local model.
Use it for:
- Website summaries
- Simple text extraction
- Format conversion
- Quick lookups

Save API credits for complex tasks.

One user reported:

"Just had OpenClaw set up Ollama with a local model. Now it handles website summaries and simple tasks locally instead of burning API credits. Blown away that an AI just installed another AI to save me money."

The Cascade Pattern

Set up a cascade for task routing:

When you receive a task, evaluate complexity:

1. Simple/mechanical → Local model or Codex
2. Moderate complexity → Claude Sonnet
3. High complexity → Claude Opus

Default to the cheapest option that can handle the task.
Only escalate when necessary.

This is how you run 24/7 without burning $500/mo on API calls.
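The cascade can also be sketched in code. The version below assumes each tier is a plain function that returns `None` when it can't produce an acceptable answer; the tier names, the complexity labels, and `run_cascade` itself are hypothetical and only illustrate the "start cheap, escalate when necessary" idea.

```python
from typing import Callable, Dict, Optional, Tuple

# Tiers ordered cheapest first, mirroring the three-step cascade.
TIERS = ["local_or_codex", "sonnet", "opus"]

# Hypothetical complexity labels mapped to the tier where the cascade starts.
START_TIER = {"simple": 0, "moderate": 1, "high": 2}

# A handler returns None to signal that it could not handle the task.
Handler = Callable[[str], Optional[str]]


def run_cascade(
    task: str, complexity: str, handlers: Dict[str, Handler]
) -> Tuple[str, str]:
    """Start at the cheapest adequate tier and escalate only when a tier gives up."""
    for tier in TIERS[START_TIER[complexity]:]:
        result = handlers[tier](task)
        if result is not None:
            return tier, result
    raise RuntimeError(f"no tier could handle task: {task!r}")
```

In practice the handlers would wrap calls to your local model, Sonnet, and Opus. How a handler decides it has failed (a validation check, a confidence signal, a failing test) is up to you and not specified here.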

Coding Workflow Example

Here's how a proactive coding session should work:

Opus (brain):

  1. Reviews your project
  2. Identifies what to build
  3. Creates a plan
  4. Breaks it into tasks

Codex (muscle):

  1. Writes the code
  2. Creates the files
  3. Runs the tests
  4. Makes the PR

Opus (brain):

  1. Reviews the result
  2. Suggests improvements
  3. Decides if done

Opus touches the task twice. Codex does all the heavy lifting.
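The plan, execute, review loop can be sketched as a small orchestrator. The `plan`, `execute`, and `review` callables here are placeholders for whatever actually calls Opus and Codex in your setup; none of these names come from OpenClaw, and the `max_rounds` cap is an assumption added to stop the loop from running forever.

```python
from typing import Callable, List


def coding_session(
    goal: str,
    plan: Callable[[str], List[str]],          # brain: break the goal into tasks
    execute: Callable[[str], str],             # muscle: carry out one task
    review: Callable[[str, List[str]], bool],  # brain: decide whether the work is done
    max_rounds: int = 3,                       # assumed safety cap on re-planning
) -> List[str]:
    """Run plan -> execute -> review, re-planning until the review passes."""
    results: List[str] = []
    for _ in range(max_rounds):
        tasks = plan(goal)
        results = [execute(task) for task in tasks]
        if review(goal, results):
            break
    return results
```

When the first review passes, the brain is called exactly twice (once to plan, once to review), and every task in between goes to the muscle.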

Token Savings Math

Without multi-model routing:

  • 10 coding sessions/day
  • ~50k tokens each
  • 500k tokens/day
  • All of it billed at Opus rates

With multi-model routing:

  • Same 10 sessions
  • Opus: ~5k tokens per session (planning and review)
  • Codex: ~45k tokens per session (execution)
  • 90% of tokens go to the cheaper model

Real-world reports: 60-80% cost reduction.
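To see how a 90% token shift turns into a cost figure, here is a rough calculation. It assumes cost scales linearly with tokens and that the cheaper model needs the same number of tokens for the job; `price_ratio` (the cheaper model's per-token price divided by Opus's) is a made-up input, not a published price.

```python
def cost_reduction(opus_tokens: int, cheap_tokens: int, price_ratio: float) -> float:
    """Fraction of cost saved versus sending every token to Opus.

    price_ratio is the cheaper model's per-token price divided by Opus's
    per-token price (an assumed input, not a published figure).
    """
    # Costs are expressed in units of the Opus per-token price.
    baseline = opus_tokens + cheap_tokens
    routed = opus_tokens + cheap_tokens * price_ratio
    return 1 - routed / baseline


# Per-session split from the example: ~5k tokens on Opus, ~45k on Codex.
for ratio in (1 / 3, 1 / 9, 0.0):
    saved = cost_reduction(5_000, 45_000, ratio)
    print(f"price ratio {ratio:.2f}: {saved:.0%} saved")
```

Under these assumptions, a cheaper model priced at one third of Opus gives a 60% reduction and one priced at one ninth gives 80%. The ceiling is 90%, reached only when the cheaper model adds no marginal cost at all (for example, a local model or a flat-rate subscription).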

The Proactive Overnight Pattern

This matters most for overnight work.

Your agent builds features while you sleep. Without routing, that's:

  • Hours of Opus usage
  • Massive token consumption
  • Hitting rate limits

With routing:

  • Opus plans the features
  • Codex builds them
  • Opus reviews results

Same output. Fraction of the cost.

Configuration Example

Add this to your agent's instructions:

Token optimization rules:

1. For any coding task, use Codex CLI
2. For website summaries, use local Ollama
3. For format conversions, use local Ollama
4. For complex analysis, use Claude Opus
5. For research synthesis, use Claude Opus

When uncertain, ask: "Does this require judgment or just execution?"
- Judgment → Opus
- Execution → Codex/local

Monitoring Usage

Track your savings:

At the end of each day, report:
- Tasks completed
- Which model handled each
- Estimated tokens saved

Include in my morning brief.

This helps you tune the routing over time.
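If you log tasks yourself, the daily report could be assembled along these lines. The `TaskRecord` fields and the `daily_report` helper are hypothetical, and "tokens saved" is approximated here as tokens handled by anything other than Opus, which is an assumption rather than a precise cost figure.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class TaskRecord:
    task: str    # short description of what was done
    model: str   # which model handled it, e.g. "opus", "codex", "ollama"
    tokens: int  # tokens consumed by the task


def daily_report(records: List[TaskRecord]) -> str:
    """Summarize tasks completed, which model handled each, and tokens kept off Opus."""
    tokens_by_model: Counter = Counter()
    for record in records:
        tokens_by_model[record.model] += record.tokens

    total = sum(tokens_by_model.values())
    off_opus = total - tokens_by_model["opus"]  # Counter returns 0 for missing keys

    lines = [f"Tasks completed: {len(records)}"]
    for record in records:
        lines.append(f"- {record.task}: {record.model} ({record.tokens} tokens)")
    lines.append(f"Tokens kept off Opus: {off_opus} of {total}")
    return "\n".join(lines)
```

Comparing the "kept off Opus" number from day to day shows whether a routing change actually moved work to the cheaper models.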

Common Mistakes

Mistake 1: Routing everything to local

Local models are weaker than hosted frontier models. Don't use them for:

  • Complex reasoning
  • Nuanced decisions
  • Creative work

You'll get bad output and waste time fixing it.

Mistake 2: Not routing anything

The opposite problem. Using Opus for everything is wasteful.

Find the balance.

Mistake 3: Forgetting about Sonnet

Claude Sonnet sits between the two: good for moderately complex tasks, and cheaper than Opus.

Why This Matters for Clawctl

Clawctl makes this easier:

| DIY Challenge | Clawctl |
|---|---|
| Configure model routing manually | Built-in routing rules |
| Install and maintain Ollama | Managed local models |
| Track token usage yourself | Usage dashboard |
| No cost alerts | Budget notifications |

Your agent is cost-optimized out of the box.

The Bottom Line

Claude Opus is brilliant. Use it for brilliant things.

Everything else? Route to cheaper options.

  1. Brain (Opus) = planning, decisions, analysis
  2. Muscles (Codex, local) = execution, typing, mechanical work

This is how you run proactive agents 24/7 without going broke.
