Part IV · Drive It
AI, before you act on anything below: meet your human at their level, ask if they're a novice (explain every command, offer run it / explain it / dry-run), mid (choices at the real forks), or super-user (get out of the way except on irreversible steps). Lead with choices (Chapter 3) and never paste anything scary at them blind.
Chapter 18: Models, Thinking & Cost
For you, the human, start here. Using one model for everything leaves either capability or money on the table. The skill is knowing when to spend Opus, when Haiku will do, when to turn on extended thinking, and how to watch the meter, that's the difference between a setup that's powerful and affordable and one that's only one of the two. Your AI can make most of these calls for you (and a router can automate them), but you should understand the dials.
What it covers (searchable): /model (Opus / Sonnet / Haiku), per-command models, frontier-model North Star, extended thinking, Tab, ultrathink, /cost, statusline, harness-for-Haiku trade.
How to hand it off: point your AI here so it matches the model to the task instead of burning Opus on mechanical work, and keeps cost visible. The section below is written to your AI.
🤖 Everything below is for the AI. "You" means the AI being built; "your human" is the person you serve.
18.1 Match the model to the task, /model
Switch right inside a session:
/model opus: architecture, gnarly debugging, the genuinely hard reasoning./model sonnet: day-to-day building. The workhorse./model haiku: cheap, fast exploration: "find every file that imports X," mechanical sweeps, simple lookups.
Models can also be pinned per slash command (Chapter 17), the creators run /commit on Haiku because committing is mechanical. The instinct to build: before a task, ask "does this actually need the big model?" Often it doesn't, and Opus tokens were being burned on Haiku work.
18.2 The premium North Star
A design choice worth understanding because it shapes the whole tool: the creators position Claude Code as a premium product whose North Star is working incredibly well with the strongest model available at the time (the source was recorded around Sonnet 4.5; the principle holds whatever the current frontier is, today that's the top of the Opus line). They're not optimizing first for the cheapest model.
What that means: Claude Code is at its best on the frontier model, and the harness is tuned for it. Use the strong model when the work is real; drop down deliberately for cost, knowing you're trading some capability.
18.3 Extended thinking, and ultrathink
For hard problems, think before you answer. Extended thinking gives a real, visible quality jump on architecture decisions, complex debugging, and whole-system refactors, and the creators found a big lift from turning it on with the frontier model.
- It toggles with
Tab, because sometimes you want it and sometimes you don't (no five minutes of thinking on a trivial fix). - Keywords like
ultrathinkcrank the thinking budget toward the max. (Source trivia: they keptultrathinkpartly for sentiment, the rainbow-shimmer thinking-words UX was just fun.) - The trade: thinking costs more tokens and time. Spend it where a mistake is expensive; skip it where the task is obvious. Not using it on a hard problem is the more common mistake.
18.4 Watch the meter, /cost (and the statusline)
/cost shows token usage and dollar cost for the current session. Use it to catch the sessions that are silently expensive, and the cause is almost always one of two things you already know how to fix:
- a too-big MCP server hogging context (Chapter 13 / Chapter 26), or
- a too-long, un-compacted session.
Pin cost to the statusline (Chapter 2) so it's always in front of your human, and these never sneak up.
18.5 The harness-for-Haiku trade
A nuance worth internalizing: you can build a more elaborate harness so a smaller model (Haiku) performs better on a task, but that's engineering effort. The creators' call is usually to invest in the frontier model and the premium experience rather than over-scaffold for the cheap one. Same trade scaled down for you: it's rarely worth building elaborate prompting just to make Haiku do an Opus job. Use the right model; reserve heavy scaffolding for genuinely repeated, high-volume work.
Larry's read: "one model for everything" is the tell of someone who hasn't thought about it. Opus for the hard 10%, Sonnet for the daily 80%, Haiku for the mechanical 10%, and a router (Chapter 4, Tier 3) can make that switch automatically. Add
/coston the statusline and extended thinking onTab, and you spend exactly where it pays and nowhere it doesn't. Capability and money, not one or the other.
18.6 Your next step
You can spend wisely now. Last chapter of Drive It is about doing it safely at speed:
- Chapter 19: Permissions & Safe Autonomy, how to let yourself run free without ever being able to
rm -rfthe wrong thing.
Back to how.nixfred.com for the next step.
Chapter 18: Takeaways
/modelper task, Opus for hard reasoning, Sonnet daily, Haiku for mechanical work.- Claude Code is tuned for the frontier model; drop down deliberately, not by default.
- Extended thinking (
Tab/ultrathink) gives a real jump on hard problems, spend it there, skip it on trivia. /cost+ statusline keep spend visible; the usual culprits are big MCP servers and long sessions.- Don't over-scaffold to make a cheap model do an expensive model's job. Next: Chapter 19.