I think they broke billing for new API users. I just signed up for the API yesterday, and I'm a Max customer. Even though I paid and waited 24 hours, I can't access the Claude API at all, and I never used any of the banned bots on this account.
- API key auth works (/v1/models returns full list)
- Console shows $20 credit balance
- Spend limit is $100/month, $0 used
- Two different API keys generated in this org both fail with the same error; I only have one org.
request_id req_011CaZLqnqEWcA8nGSVoB7dc
I emailed support and have not heard back from a human.
The OpenCode harness is so good I'm surprised one of the big players hasn't bought them outright. Essentially, their harness:
* Removes all the system prompt cruft and bullshit that CC pumps into the prompt and pollutes context with, like "adaptive thinking"
* Is extremely good at keeping the model aligned with AGENTS.MD and opencode.json and using all the features available there (parallel agents, sub-sub agents, etc)
For example, I'm working on a repo with 5 distinct components and I have a specialized agent for each component. CLAUDE.MD is just a markdown file where I say "Hey Claude, always use X agent for X component. X agent has this prompt blah blah" and then pray Claude remembers to use it. opencode.json is a structured file used by the harness, and it has ALWAYS coerced the model to use it, including letting agents delegate to subagents in parallel, etc.
This makes a massive difference. So if I have a feature that touches multiple components, OpenCode rips through it with the specialized subagents while Claude sits there spinning its wheels, occasionally remembering there's a specialized agent, and maybe once in a blue moon doing it in parallel.
With CC I feel like I need to do all these invocations and coercions. OpenCode, once you've got your opencode.json and agents defined, just works.
Is there a guide you can link for OpenCode usage like this? I just use Codex and find it's generally really good. What you are describing sounds like a bit of an unlock.
First things first: get OpenCode installed (https://opencode.ai/) and connected to your provider (works with almost everything except Claude Max).
Then the workflow is like this for each repo (both greenfield and existing):
1. Create an opencode.json in the repo root. opencode.json is the harness config. It tells the system what provider/model endpoint to use, which instruction files to load, what the default agent is, what specialist agents exist, and which slash commands route to which agent/workflow. For now, it can be very simple and barebones.
2. If you have any existing CLAUDE.MD or AGENTS.MD files you like to use, you can point to them in opencode.json via the "instructions" key. Here's a sample config (or look at https://opencode.ai/docs/config/):
{
  "$schema": "https://opencode.ai/config.json",
  "instructions": ["AGENTS.md"],
  "default_agent": "build",
  "agent": {
    "coordinator": {
      "description": "Lead coordinator",
      "prompt": "This will be the prompt for your coordinator agent",
      "agents": [
        "componentA_build",
        "componentA_meta"
      ]
    }
  }
}

(with "componentA_build", "componentA_meta", etc. each defined as their own entries under "agent" in the same way)
3. [the most crucial step] Create a plan of how the repo is structured to feed back into the harness so it can generate its own tailored config.
Of course, this is where your own critical thinking comes in. You're going to be prompting the default agent to start extending this opencode.json to fit your project. I found that most models are relatively poor at sussing out the "why" of an existing codebase - they are much, much more focused on the how and the what.
So much like how companies ship their org chart, with gen AI code you ship your architecture. If you don't have a good mental model of how the system should work (not that it HAS to work that way now, but how it should), then you won't go anywhere. If you can provide the "why", that's the connective tissue that really makes the difference.
For instance, let's say you are working on a railroad simulation game with a modular architecture - there's one central "GameController" which references submodules like "CityController", "TrainController" and so on. You'd then want to create agents that specialize in things like cities, trains, railroads, building, etc., with a high-level "coordinator" agent that has access to every subagent (as defined in the "agents" key). And then you get to the fun part of making higher-order agents - an agent that specializes in game balance with CityAgent, TrainAgent, GameAgent etc. as subagents, a "RouteAgent" with RailroadAgent, TrainAgent etc. that specializes in efficient routing calculations, and so on. A rough sketch of that agent tree follows.
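To make that concrete, the "agent" section for that railroad sim could look roughly like this (names are hypothetical and I'm only using the keys from the sample config above; check the docs for the full schema):

"agent": {
  "coordinator": {
    "description": "Lead coordinator with access to every specialist",
    "agents": ["city", "train", "railroad", "building", "balance", "route"]
  },
  "balance": {
    "description": "Higher-order agent for game balance",
    "prompt": "You tune the economy and difficulty. Delegate detail work to your subagents.",
    "agents": ["city", "train", "game"]
  },
  "route": {
    "description": "Higher-order agent for efficient routing calculations",
    "agents": ["railroad", "train"]
  },
  "city": { "description": "Everything touching CityController" },
  "train": { "description": "Everything touching TrainController" }
}

(remaining specialists elided)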
I found that in this step it helps to braindump into a scratch pad and write out, at a high level, as much as you can about how it all operates, the architecture, and the agents you want. Having a defined OUTPUT is the most important part of the agentic process and prompting. Key things I have in this braindump (a stripped-down sample follows the list):
a. Architecture overview, both "actual" and "idealized" versions (what do we want the end game to look like, e.g. a full-fledged train sim with stock market and city building - even if currently it's just an isometric map with some buttons on it).
b. Core components of the system
c. Key file locations, any workflows like processing images, audio etc for game
d. Any design decisions and whatnot you made that aren't captured in code or documented (e.g. "I'm basing the economy off of XY Game")
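For the railroad example, a stripped-down scratch pad might read like this (contents invented purely for illustration):

ARCHITECTURE (actual): isometric map with some buttons; GameController plus stub City/Train controllers
ARCHITECTURE (idealized): full train sim with stock market and city building
CORE COMPONENTS: GameController (tick loop), CityController (growth), TrainController (movement, cargo)
KEY FILES/WORKFLOWS: src/controllers/, asset pipeline for processing images and audio
DESIGN DECISIONS: economy loosely based on XY Game; not captured anywhere in code
AGENTS I WANT: one per controller, plus balance and route agents layered on top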
One thing that's VERY helpful at this phase is having the default agents ("Build" and "Plan" with OpenCode) go through and document as much of the code as they can into a docs folder, if you don't have one already.
4. Take that braindump, feed it back into OpenCode, and tell it to modify its opencode.json to have agents that handle all the components and architecture. Tell it to parallelize and delegate as much as possible.
5. It'll output a new opencode.json. Restart opencode and go wild.
As you work with the harness, you'll pick up the nuances of how the agents interact and what needs tweaking, but the key is to always keep the feedback loop going. Tell the agents to always update docs after committing code, and to always read the docs before doing anything (example standing rules below). This is key to making sure the agents don't go off the rails.
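Concretely, that means rules along these lines in AGENTS.md (wording is just illustrative):

- Before starting any task, read docs/architecture.md and the doc for each component you will touch.
- After every commit, update the matching doc under docs/ so it reflects the code.
- If a task spans components, delegate to the component agents, in parallel where possible.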
Eventually you'll see the meta-patterns you like and create scripts to, e.g., autogenerate this kind of harness for any repo you encounter. I don't have one "source of truth" opencode.json, but rather a base template and a Python script that does all of the above automagically for my workflows (skeleton below).
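My script is bespoke, but the skeleton is roughly this (the template path and the src/<component> convention are placeholders; adapt to your layout):

#!/usr/bin/env python3
"""Generate a tailored opencode.json from a base template plus the repo layout."""
import json
import sys
from pathlib import Path

def main(repo_root: str) -> None:
    repo = Path(repo_root)
    src = repo / "src"  # convention: one top-level dir per component
    if not src.is_dir():
        sys.exit(f"no {src} directory; adjust the component convention")

    # Base template holds the shared defaults: $schema, instructions, default_agent.
    base = json.loads((Path.home() / ".config/harness/base-opencode.json").read_text())
    agents = base.setdefault("agent", {})
    coordinator = agents.setdefault("coordinator", {
        "description": "Lead coordinator",
        "prompt": "Delegate each task to the matching component agent; "
                  "parallelize when tasks are independent.",
    })

    # One specialist agent per component directory.
    subagents = []
    for component in sorted(p.name for p in src.iterdir() if p.is_dir()):
        name = f"{component}_build"
        agents[name] = {
            "description": f"Specialist for the {component} component",
            "prompt": f"You own src/{component}. Keep changes inside it and "
                      f"update docs/{component}.md after committing.",
        }
        subagents.append(name)

    coordinator["agents"] = subagents
    (repo / "opencode.json").write_text(json.dumps(base, indent=2) + "\n")
    print(f"wrote opencode.json with {len(subagents)} component agents")

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else ".")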
The key insight here, I think, is like learning Lisp. The harness IS code, agents ARE code - they can be modified dynamically and adapted as needed. They are first-class citizens and can be composed like functions or chained together like graphs. The map is the territory. Once I realized that I could prompt the harness to modify itself, and do everything to itself that it does to the codebase, things really took off for me.
Anecdata disclaimer: This workflow might not fit everyone's mental coding model. Adapt as needed. I use literally dozens of these kinds of harness configs every day at work, including meta-harnesses with code review of harnesses and EvalOps, and have been doing so for about six months now. I'll also note we are very serious about performance and feature degradation, but with this approach we've had fewer rollbacks and regressions than in the previous five years.
As long as everyone is here: have you seen token usage for the $100 plan go up remarkably recently? It lasts a lot less time than it used to. Might be related to the recent Claude releases.
Yes, it’s extremely obvious. The recent “we give you $100/$200 extra credit for a month” is clearly just “you’re supposed to pay extra for the same usage from now on” dressed up as a “bonus”, just like giving “bonus” usage off-peak shortly before announcing a faster burn rate during peak.
And the recent “Investigating usage limits hitting faster than expected” [1] is probably them intentionally gauging how much they can push it without too much of an uproar.
I'm on the basic £18/month plan and with Sonnet 4.6 I literally get 20 maybe 30 minutes of use out of it per day. It's borderline useless now. I was using it for some Home Assistant changes yesterday and it used up my entire daily allowance after 8 prompts.
I guess 2026 is the last year AI is widely available to anyone who isn't willing to shell out hundreds if not thousands for a monthly subscription. I guess all that's left is to thank all the investors for the free ride LOL
Hard to square that with how good open-weights models are getting? I'm doing stuff with Qwen3.5-4b that required a frontier hosted model less than a year ago.
The problem is you're still a year behind with this approach, and it isn't at all clear locally hosted models can keep the gap from widening. We need more turboquant-like algorithmic boosts for that to happen.
Maybe you're experiencing normal usage rates now that the 2x March promotion is over?
> From March 13, 2026 through March 28, 2026, your five-hour usage is doubled during off-peak hours (outside 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT on weekdays). Usage remains unchanged from 8 AM-2 PM ET / 5-11 AM PT / 12-6 PM GMT on weekdays.
Source: https://support.claude.com/en/articles/14063676-claude-march...
We discovered a bug in AWS Bedrock that is double counting cache writes when thinking/reasoning is enabled for the Anthropic models. It’s not clear to me whether this is limited to AWS Bedrock or affects all providers. AWS Support is aware.
We’ve also observed a much higher cache miss rate in the past few weeks. Combine both together and your usage consumption can be greatly increased.
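Back of the envelope, if I have Sonnet's list prices right (input $3/MTok, cache write at 1.25x = $3.75/MTok, cache read at 0.1x = $0.30/MTok), then for a 200k-token cached prompt:

normal:          0.2 MTok x $3.75 = $0.75 to write the cache, then ~$0.06 per cache hit
double-counted:  that write bills as $1.50 instead of $0.75
plus cache miss: a would-be hit re-bills as a fresh write ($0.75, or $1.50 double-counted) instead of $0.06

So the two effects compound rather than merely add up.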
I'm on the max 20 plan, and yes, it's the same for me. The week before last it used to last all week for me, but now it's Wednesday and it's already at 40% usage.
Like any company, they will squeeze usage as much as they possibly can. There is a real chance prices reach $1k+, so that only enterprises can afford coding subs. Those who have ROI will pay for it.
The current phase of usage/pricing is just testing the waters, especially considering they are the market leader in this category.
I read this on Reddit daily; we have usage monitoring running and collect all stats, and we have seen no difference at all. Guess they are split testing or something, maybe?
Could you elaborate on what these usage monitors look like? I collect data locally and can easily show that cost per token has gone up in some of my sessions.
All our people run a cron script which counts token use (from the jsonl transcripts) and runs a scripted CLI /usage (sending keyboard input to the running Claude Code), then sends that to a central system where we can see this (rough sketch of the counting half below). We see no real changes on any of the accounts, or averaged. I have to note here that we only use Sonnet 4.6; Opus has always run over limits since it came out unless continuously monitored and switched over to Sonnet, and it's useless to us for that reason.
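The counting half is basically just summing usage fields out of the session jsonl files, something like this (assuming Claude Code keeps its transcripts as jsonl under ~/.claude/projects with a message.usage object per entry - check your own layout):

#!/usr/bin/env python3
"""Sum token usage across Claude Code session transcripts (jsonl)."""
import json
from collections import Counter
from pathlib import Path

FIELDS = ("input_tokens", "output_tokens",
          "cache_creation_input_tokens", "cache_read_input_tokens")

totals = Counter()
for path in (Path.home() / ".claude" / "projects").rglob("*.jsonl"):
    for line in path.read_text().splitlines():
        try:
            usage = json.loads(line).get("message", {}).get("usage") or {}
        except (json.JSONDecodeError, AttributeError):
            continue  # skip partial or non-object lines
        for field in FIELDS:
            totals[field] += usage.get(field) or 0

for field, count in sorted(totals.items()):
    print(f"{field}: {count}")
# From here we POST these totals, plus the scripted /usage output, to the central system.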
Unless you’re somehow on a different quota system, or maybe using Haiku, there’s no way you can sustain five continuous hours of parallel agents running without hitting the 5h quota limit, even on the 20x max plan. But maybe your company is flagged as VIP or something.
This is going to sound maybe a bit out there, but you are in college: go find a dance class - ballroom, tango, etc. They have tons, and they usually lack male participants. Give it a try; it may help you quite a bit. Don't overthink this, just go; the rest will take care of itself in short order.
Your pricing page disregards the single analyst who wants a non-team plan, but with Agents and workflows and advanced integrations. I didn't even want to sign up if you are going to make me call before I can even look at the features.
That's good feedback. Honestly, it's so early (we are working with design partners) that we haven't really thought about pricing for orgs and enterprises yet. So we left it as 'Talk to Us'.
Agree that an individual analyst should get all those features, and the team tier should be more about sharing and admin controls.
"Technical guarantee: Processing happens in secure, isolated server instances that immediately purge all data. Your financial information never touches persistent storage." - May I ask how are you doing this?