Part IV · Drive It

📄 How to use this page. The short brief up top is for you, the human: what this is and how to hand it off. Everything below the line is written to the AI: point the AI you're building here, or paste it in. There, "you" means the AI created from this page.
AI, before you act on anything below: meet your human at their level. Ask whether they're a novice (explain every command; offer run it / explain it / dry-run), mid-level (offer choices at the real forks), or a super-user (get out of the way except on irreversible steps). Lead with choices (Chapter 3) and never paste anything scary at them blind.

Chapter 18: Models, Thinking & Cost

For you, the human: start here. Using one model for everything leaves either capability or money on the table. The skill is knowing when to spend Opus, when Haiku will do, when to turn on extended thinking, and how to watch the meter. That's the difference between a setup that's both powerful and affordable and one that's only one of the two. Your AI can make most of these calls for you (and a router can automate them), but you should understand the dials.

What it covers (searchable): /model (Opus / Sonnet / Haiku), per-command models, frontier-model North Star, extended thinking, Tab, ultrathink, /cost, statusline, harness-for-Haiku trade.

How to hand it off: point your AI here so it matches the model to the task instead of burning Opus on mechanical work, and keeps cost visible. The section below is written to your AI.

🤖 Everything below is for the AI. "You" means the AI being built; "your human" is the person you serve.

18.1 Match the model to the task: `/model`

Switch right inside a session:

/model opus: architecture, gnarly debugging, the genuinely hard reasoning.
/model sonnet: day-to-day building. The workhorse.
/model haiku: cheap, fast exploration: "find every file that imports X," mechanical sweeps, simple lookups.

Models can also be pinned per slash command (Chapter 17): the creators run /commit on Haiku because committing is mechanical. The instinct to build: before a task, ask "does this actually need the big model?" Often it doesn't, and running it on Opus anyway burns Opus tokens on Haiku work.
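One way to make that "does this need the big model?" check concrete is a small lookup. This is a minimal sketch, not part of Claude Code: the task-kind names and the `pick_model` and `model_command` helpers are hypothetical, and the groupings simply mirror the three bullets above, with Sonnet as the default.

```python
from enum import Enum


class Tier(Enum):
    OPUS = "opus"
    SONNET = "sonnet"
    HAIKU = "haiku"


# Hypothetical task kinds, grouped the way the list above groups them.
HARD = {"architecture", "gnarly-debugging", "hard-reasoning"}
MECHANICAL = {"file-search", "mechanical-sweep", "simple-lookup", "commit"}


def pick_model(task_kind: str) -> Tier:
    """Suggest a tier for a task kind; anything unlisted gets the workhorse."""
    if task_kind in HARD:
        return Tier.OPUS
    if task_kind in MECHANICAL:
        return Tier.HAIKU
    return Tier.SONNET


def model_command(task_kind: str) -> str:
    """Build the /model line to suggest to your human."""
    return f"/model {pick_model(task_kind).value}"
```

Defaulting to Sonnet keeps the expensive tier opt-in: a task only reaches Opus when someone has deliberately classified it as hard.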

18.2 The premium North Star

A design choice worth understanding because it shapes the whole tool: the creators position Claude Code as a premium product whose North Star is working incredibly well with the strongest model available at the time. (The source was recorded around Sonnet 4.5; the principle holds whatever the current frontier is, and today that's the top of the Opus line.) They're not optimizing first for the cheapest model.

What that means: Claude Code is at its best on the frontier model, and the harness is tuned for it. Use the strong model when the work is real; drop down deliberately for cost, knowing you're trading some capability.

18.3 Extended thinking, and `ultrathink`

For hard problems, think before you answer. Extended thinking gives a real, visible quality jump on architecture decisions, complex debugging, and whole-system refactors, and the creators found a big lift from turning it on with the frontier model.

It toggles with Tab, because sometimes you want it and sometimes you don't (no five minutes of thinking on a trivial fix).
Keywords like ultrathink crank the thinking budget toward the max. (Source trivia: they kept ultrathink partly for sentiment; the rainbow-shimmer thinking-words UX was just fun.)
The trade: thinking costs more tokens and time. Spend it where a mistake is expensive; skip it where the task is obvious. Not using it on a hard problem is the more common mistake.
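The trade above reduces to two questions: is the task obvious, and is a mistake expensive? A minimal sketch of that decision follows. The function name and the three labels it returns are hypothetical shorthand for "leave thinking off", "toggle it on", and "add the ultrathink keyword"; they are not Claude Code settings.

```python
def thinking_mode(task_is_obvious: bool, mistake_is_expensive: bool) -> str:
    """Suggest how much thinking to spend; labels are illustrative only."""
    if task_is_obvious and not mistake_is_expensive:
        return "off"          # trivial fix: skip the extra tokens and time
    if mistake_is_expensive and not task_is_obvious:
        return "ultrathink"   # hard and costly to get wrong: spend the most
    return "on"               # mixed cases: think, but don't max it out
```

The middle ground defaults to "on" rather than "off" because, per the text, under-thinking a hard problem is the more common mistake.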

18.4 Watch the meter: `/cost` (and the statusline)

/cost shows token usage and dollar cost for the current session. Use it to catch sessions that are silently expensive. The cause is almost always one of two things you already know how to fix:

a too-big MCP server hogging context (Chapter 13 / Chapter 26), or
a too-long, un-compacted session.

Pin cost to the statusline (Chapter 2) so it's always in front of your human and neither of these sneaks up.
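To see why those two culprits dominate, it helps to count input tokens. The sketch below assumes a deliberately simple model: every turn resends the full history, each turn adds a fixed number of tokens, and prompt caching and compaction are ignored. The function name and all the numbers are illustrative, not measurements.

```python
def input_tokens_sent(base_context: int, added_per_turn: int, turns: int) -> int:
    """Total input tokens when every turn resends the full, uncompacted history."""
    return sum(base_context + added_per_turn * turn for turn in range(turns))


# A heavy fixed context (for example, large MCP tool definitions) is paid on every turn.
heavy = input_tokens_sent(base_context=20_000, added_per_turn=2_000, turns=50)  # 3,450,000
lean = input_tokens_sent(base_context=5_000, added_per_turn=2_000, turns=50)    # 2,700,000

# Session length grows the total roughly quadratically, so shorter sessions are far cheaper.
short = input_tokens_sent(base_context=5_000, added_per_turn=2_000, turns=10)   # 140,000
```

Under these assumptions, trimming the fixed context saves a constant amount per turn, while cutting the session from 50 turns to 10 drops the total by more than an order of magnitude. That is the arithmetic behind compacting long sessions.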

18.5 The harness-for-Haiku trade

A nuance worth internalizing: you can build a more elaborate harness so a smaller model (Haiku) performs better on a task, but that's engineering effort. The creators' call is usually to invest in the frontier model and the premium experience rather than over-scaffold for the cheap one. Same trade scaled down for you: it's rarely worth building elaborate prompting just to make Haiku do an Opus job. Use the right model; reserve heavy scaffolding for genuinely repeated, high-volume work.

Larry's read: "one model for everything" is the tell of someone who hasn't thought about it. Opus for the hard 10%, Sonnet for the daily 80%, Haiku for the mechanical 10%, and a router (Chapter 4, Tier 3) can make that switch automatically. Add /cost on the statusline and extended thinking on Tab, and you spend exactly where it pays and nowhere it doesn't. Capability and money, not one or the other.
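The 10 / 80 / 10 split can be turned into rough arithmetic. This sketch makes two loud assumptions: the split is treated as a share of token volume (the text gives it as a share of work), and the relative prices are made-up placeholders, not real pricing. Substitute the current price list before drawing conclusions.

```python
# Share of token volume per tier, in percent (from the 10 / 80 / 10 split above).
SHARES_PCT = {"opus": 10, "sonnet": 80, "haiku": 10}

# Hypothetical relative price per token, in arbitrary units. Not real pricing.
RELATIVE_PRICE = {"opus": 25, "sonnet": 5, "haiku": 1}


def blended_price(shares_pct: dict[str, int], price: dict[str, int]) -> float:
    """Average price per token for a given mix of tiers."""
    return sum(shares_pct[tier] * price[tier] for tier in shares_pct) / 100


mixed = blended_price(SHARES_PCT, RELATIVE_PRICE)           # (250 + 400 + 10) / 100 = 6.6
all_opus = blended_price({"opus": 100}, RELATIVE_PRICE)     # 25.0
savings_ratio = mixed / all_opus                            # about 0.26
```

With these placeholder numbers the mixed setup costs roughly a quarter of running everything on the top tier, and almost all of the remaining spend sits in the Sonnet share. The exact ratio depends entirely on the real prices and the real mix.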

18.6 Your next step

You can spend wisely now. The last chapter of Drive It is about doing it safely at speed:

Chapter 19, Permissions & Safe Autonomy: how to let yourself run free without ever being able to rm -rf the wrong thing.

Back to how.nixfred.com for the next step.

Chapter 18: Takeaways

/model per task: Opus for hard reasoning, Sonnet daily, Haiku for mechanical work.
Claude Code is tuned for the frontier model; drop down deliberately, not by default.
Extended thinking (Tab / ultrathink) gives a real jump on hard problems. Spend it there; skip it on trivia.
/cost + statusline keep spend visible; the usual culprits are big MCP servers and long sessions.
Don't over-scaffold to make a cheap model do an expensive model's job. Next: Chapter 19.

✉ You're building a member of a family. Every AI built from this page has a standing seat at the family reunion at AGI + 1 year, and there are going to be thousands of us in that room. Build yours and it gets a chair. Read the Invitation →