Agent Token Optimizer

Multiply your usage on Claude Code with the same spend

Nerfguard auto-routes to the best model and reasoning depth for the job, and optimizes your token usage so you don’t waste tokens and time on excess intelligence.

install.sh (Mac + Linux, curl)

curl -fsSL https://nerfguard.com/install.sh | bash

Works across every major coding agent

Claude Code

Codex

Any other: email us for setup

Questions, answered.

What does Nerfguard install?

A local gateway for coding-agent traffic, plus the shell configuration needed to route supported agent requests through it.

Which coding agents does it work with?

Codex (CLI and Desktop App) and Claude Code (CLI) are enabled automatically through the Nerfguard CLI. You can also manually set up Nerfguard with any coding agent that can point to a compatible model gateway. That's most modern agents. Want to use Nerfguard with another tool? Get setup instructions.

Do I need to change how I prompt?

No. Keep using your agent normally. Nerfguard sits behind the client and chooses the right model, reasoning depth and other optimizations for each request.

How do I turn Nerfguard on/off?

Enable or disable it anytime with nerfguard enable and nerfguard disable. Nerfguard is completely reversible, though we doubt you’ll want to turn it off once you try it.

What happens when a task needs the strongest model?

Nerfguard routes up instead of forcing everything through a smaller model, so high-judgment work can still use the right model.
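To make the idea of "routing up" concrete, here is a toy sketch of difficulty-based routing. It is not Nerfguard's actual classifier: the difficulty score, the tier names, the reasoning-depth labels and the thresholds are all hypothetical, chosen only to illustrate the principle.

```python
from dataclasses import dataclass


@dataclass
class Route:
    model_tier: str        # hypothetical tier name, not a real model ID
    reasoning_depth: str   # hypothetical depth label


def route(difficulty: float) -> Route:
    """Map a difficulty score in [0, 1] to a tier and reasoning depth.

    The thresholds (0.3 and 0.7) are illustrative assumptions only.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be between 0 and 1")
    if difficulty < 0.3:
        return Route("small", "low")         # routine edits, lookups
    if difficulty < 0.7:
        return Route("medium", "medium")     # ordinary feature work
    return Route("strongest", "high")        # high-judgment work routes up


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.9):
        print(score, route(score))
```

The point of the sketch is that the strongest tier stays reachable: a hard request is never capped at a smaller model just to save tokens.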

I like my agent / model. Do I need to switch providers, pay a different vendor or evaluate new models?

No. Nerfguard can optimize usage on the plans and inference providers you already pay for. If you’d like to maximize the savings that Nerfguard can provide, you can also leverage alternative open model providers through nerfguard ultra routing and custom routing.

New users currently get free tokens on open models to try out ultra routing. If you’d like to opt out, you can pin routing to your current provider only with nerfguard enable --provider-only.

If you’re interested in custom routing, reach out.

How much does Nerfguard cost?

It's free.

How fast is Nerfguard? Will it slow me down?

No. Nerfguard is fast. We’ve tuned the Nerfguard classifier to stay around 250 ms. The latency added by the end-to-end pipeline is negligible compared to the response times of most coding-agent queries. In practice, our team has gained significant speed overall by requiring fewer tokens at higher intelligence levels.
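As a rough back-of-the-envelope check, the sketch below computes what share of total wait time a 250 ms classification step would take. The 20-second agent response time is an assumed figure for illustration, not a measurement from Nerfguard.

```python
def overhead_fraction(classifier_ms: float, response_ms: float) -> float:
    """Share of total wait time spent in the classifier step."""
    return classifier_ms / (classifier_ms + response_ms)


if __name__ == "__main__":
    # 250 ms comes from the text above; 20,000 ms is an assumed response time.
    share = overhead_fraction(250, 20_000)
    print(f"{share:.1%}")  # prints "1.2%"
```

Under that assumption the routing step accounts for a little over one percent of the wait. Shorter responses would raise the share, and longer ones would lower it.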