For context on what cloud API costs look like when running coding agents: With C...

jychang · 2026-02-04T02:45:54 1770173154

On the other hand, Deepseek V3.2 is $0.38 per million tokens output. And on openrouter, most providers serve it at 20 tokens/sec.

At 20t/s over 1 month, that's... $19something running literally 24/7. In reality it'd be cheaper than that.

I bet you'd burn more than $20 in electricity with a beefy machine that can run Deepseek.

The economics of batch>1 inference does not go in favor of consumers.

selcuka · 2026-02-04T06:41:45 1770187305

> At 20t/s over 1 month, that's... $19something running literally 24/7.

You can run agents in parallel, but yeah, that's a fair comparison.

taneq · 2026-02-03T23:55:53 1770162953

At this point isn’t the marginal cost based on power consumption? At 30c/kWh and with a beefy desktop pc pulling up to half a kW, that’s 15c/hr. For true zero marginal cost, maybe get solar panels. :P

EGreg · 2026-02-04T01:36:02 1770168962

This is an interesting question actually!

Marginal cost includes energy usage but also I burned out a MacBook GPU with vanity-eth last year so wear-and-tear is also a cost.

pstuart · 2026-02-04T01:27:25 1770168445

Might there be a way to leverage local models just to help minimize the retries -- doing the tool calling handling and giving the agent "perfect execution"?

I'm a noob and am asking as wishful thinking.

jermaustin1 · 2026-02-04T14:03:34 1770213814

> I'm a noob and am asking as wishful thinking.

Don't minimize your thoughts! Outside voices and naive questions sometimes provide novel insights that might be dismissed, but someone might listen.

I've not done this exactly, but I have setup "chains" that create a fresh context for tool calls so their call chains don't fill the main context. There is no reason why the Tool Calls couldn't be redirected to another LLM endpoint (local for instance). Especially with something like gpt-oss-20b, where I've found executing tools happens at a higher success than claude sonnet via openrouter.