Not necessarily; I would very much like to use those features on a Linux server. As far as I understand it, the current Anthropic implementation forces a desktop (or worse, a laptop) to stay turned on instead of running headless.
I’ll give clappie a go, love the theme for the landing page!
Clappie looks much more fabulous than CC though. I'll have to give it a try. I like how you put the requests straight into an already running CC session instead of calling `claude -p` every time like the claws.
Plus it gives a little ASCII dog to Claude Code terminal.
The ability to spawn independent CLIs is awesome. It was a no-brainer they'd add it eventually, given the great threaded functionality it brings; it's essentially a more controlled version of OpenClaw, IMO.
Imagine if a model ever does get scary good, would the big labs even release it for general use? You couldn't even buy it if you wanted to. Exceptions would be enterprise deals, e.g. niche $AMZN super-contracts.
Very true. Also, even what I get out of Claude Code right now is absolutely phenomenal, but sometimes it takes minutes; I just had it spend 15 minutes on one task. What if you had access to the hardware to run it basically instantly?
Just think how these big companies will use that kind of power for themselves to get even more extreme uses out of it.
AI ran a git clean on me and wiped out a bunch of untracked changes.
I just asked Claude Code to help recover it. It eventually found everything by replaying its own jsonl session files. I never had to install anything or leave the session.
Claude Code can certainly recover files from the session files, yes. In my case I had to recover 80 files spread across 20+ sessions from the last month. Recovering all of those in one context window, without a deterministic script that tracks what has been extracted and what hasn't, seemed too challenging to me. claude-file-recovery can index all available files and extract a file as it existed at a certain point in time, without relying on the LLM to correctly parse 20+ sessions that won't fit in one context window.
Never used 4o in an unhealthy way, but the audio was so much fun (especially for cooking help). I've essentially quit using AI audio since; nothing compares.
This is cool, and I'm glad Cloudflare is offering AI options to everyone (tools to block it, tools to better enable it).
This is probably fast, but FWIW I would bet a simple string replace of HTML tags with '' would yield mostly the same result. Structured content (like markdown) isn't really even needed for an LLM. Make it messy and super fast and don't accidentally lose anything; it's an LLM.
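A throwaway sketch of that idea (my own regexes, not any particular library's). One tweak to the plain-'' replace: substituting a space instead of '' avoids gluing the text of adjacent elements into one word:

```python
import re

# Crude "just strip the tags" converter: drop <script>/<style> bodies,
# then turn every remaining tag into a space. Real HTML has edge cases
# (comments, CDATA, '>' inside attributes) this deliberately ignores.
def strip_tags(html: str) -> str:
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    html = re.sub(r"(?s)<[^>]+>", " ", html)   # tag -> space, not ''
    return re.sub(r"\s+", " ", html).strip()   # collapse whitespace
```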
If compression were really the goal, you could take it further and remove common words like "the" and "and", punctuation, maybe even spaces
GitHub Issues as a customer support funnel is horrible. It's easy for them, but it hides all the important bugs and only surfaces "wanted features" that get thumbs-upped a lot. So you see "Highlight text X" as the top-requested feature; meanwhile, 10% of users hit a critical bug, but they don't all find the one GitHub issue a user poorly wrote about it, so it sits at like 7 upvotes.
GitHub Codespaces has a critical bug that makes the Copilot terminal integration unusable after one prompt, but the company has no idea, because there is no clear way to report it from the product, no customer support funnel, etc. There are 10 upvotes on a poorly written, sorta-related GH issue and no company response. People are paying for this feature and it's just broken.
I too think A/B testing is the prime suspect: context window limits, system prompts, MAYBE some other questionable things that should be disclosed.
Either way, if true, given the cost I wish I could opt out, or that it were more transparent.
Put out variants you can select and see which one people flock to. I and many others would probably test constantly and provide detailed feedback.
Whenever I see new behaviors and suspect I’m being tested on I’ll typically see a feedback form at some point in that session. Well, that and dropping four letter words.
I know it’s more random sampling than not. But they are definitely using our codebases (and in some respects our livelihoods) as their guinea pigs.
If that's the case, then as a benchmark operator you'd want to run the benchmark through multiple different accounts on different machines to average over A/B test noise.
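A toy sketch of that aggregation, assuming you already have per-account score lists out of whatever harness you run (all names here are mine): report the per-account means plus the spread between accounts, so one account landing in a bad A/B bucket can't silently skew the headline number.

```python
import statistics

# Hypothetical aggregation over benchmark runs from several accounts.
# scores_by_account maps account name -> list of per-run scores.
def aggregate(scores_by_account):
    per_account = {a: statistics.mean(s) for a, s in scores_by_account.items()}
    means = list(per_account.values())
    return {
        "mean": statistics.mean(means),     # headline number
        "stdev": statistics.stdev(means) if len(means) > 1 else 0.0,
        "per_account": per_account,         # spot outlier buckets here
    }
```

A large between-account stdev relative to within-account variance would be a hint that accounts really are being served different variants.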
People simply want Opus without fear of a billing nightmare.
That’s like 99% of it.