MillionOClock's comments | Hacker News

I wonder why there aren't more open-weights models with support for prompt caching on OpenRouter.

It is tricky to build good infrastructure for prompt caching.
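To give a sense of why it's tricky, here is a toy sketch of the prefix-matching bookkeeping such infrastructure needs (all names are made up; real inference servers cache KV tensors for blocks of tokens, pinned in GPU memory, not strings):

```python
import hashlib

class PrefixCache:
    """Toy prompt cache: map hashes of prompt prefixes to precomputed state.

    In a real inference server the cached value would be KV-cache tensors
    for a block of tokens; here it is just a placeholder string.
    """

    def __init__(self, block_size=4):
        self.block_size = block_size  # tokens per cacheable block
        self.store = {}               # prefix hash -> opaque "state"

    def _key(self, tokens):
        return hashlib.sha256(" ".join(map(str, tokens)).encode()).hexdigest()

    def longest_cached_prefix(self, tokens):
        """Return the number of leading tokens whose state is already cached."""
        best = 0
        for end in range(self.block_size, len(tokens) + 1, self.block_size):
            if self._key(tokens[:end]) in self.store:
                best = end
            else:
                break
        return best

    def insert(self, tokens):
        """Cache every full-block prefix of this prompt."""
        for end in range(self.block_size, len(tokens) + 1, self.block_size):
            self.store[self._key(tokens[:end])] = f"state[:{end}]"

cache = PrefixCache()
cache.insert(list(range(10)))  # first request: a 10-token prompt
hit = cache.longest_cached_prefix(list(range(10)) + [99, 100])
print(hit)  # 8 -> only the tokens past the cached blocks need recomputing
```

The hard parts that this toy omits, such as evicting cached state under memory pressure and routing a follow-up request to the replica that holds its prefix, are exactly where the infrastructure cost lives.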

It's as simple as telling your Claude Code to implement prompt caching!

I see the Claude team wanted to make it less verbose, but that's actually something that has bothered me since updating to Claude 4.7. What is the most recommended way to change it back to being as verbose as before? This is probably a matter of preference, but I have a harder time with compact explanations and lists of points, and the verbosity was originally one of the things I preferred about Claude.

Peter, while we are on the subject of clarifying what is and isn't allowed, I have a question: has OpenAI clearly communicated where exactly one is supposed to be able to use their Codex quota? For instance, as far as I understand, it is allowed to use it with OpenClaw, but does that extend to any other coding harness? Say I have an app (potentially a paid one) and want my users to use their Codex quota in it: is that permitted? As you can probably imagine, that would unlock a lot of use cases, given that smaller actors can't subsidize token costs to the same extent, but unfortunately, and maybe expectedly due to the nature of subscriptions, I have not been able to find any answer regarding this.

I'm not sure they have "officially" said anything, but they do allow Codex OAuth login for 3rd-party coding agents: pi, opencode, etc. Employees on Twitter have explicitly approved this.

That matches what I have seen, but I think I remember reading a tweet that mentioned those "developing in the open" (not an exact citation, just what I remember). That made me wonder whether they consider this allowed only for open-source software, or whether they intend to be much more permissive, essentially letting users spend their quotas wherever they want, or maybe they have completely different rules in mind. Again, I feel there could be more transparency regarding all of that.

I had a conversation right around the launch, so I'm not fully sure whether it was Opus 4.7, but I also noticed the same behavior of asking questions that did not seem particularly useful to me, though I still prefer that to not asking enough.

Say someone uses AI, treating it as if it were a developer (probably not recommended today due to the risk of errors), working and speaking with it as if they were some kind of product manager or senior engineer who only makes architectural decisions. I wonder what kind of difference it would really make. Sure, the person might not be as good a developer anymore, but how is this different from being a regular product manager, come the day AI truly is good enough for a developer role? I'm not saying I know the answer to this question, but it is something I genuinely wonder about, and I think the same kind of questioning can apply to broader domains.

Why and how do you think it applies to broader domains?

Children learning in schools should not become product managers. If they are, what exactly is the "product" that they are "managing"? Reducing everything to, and looking at everything from, a corporate viewpoint is bizarre.


I'm not saying this should apply to every single domain. This isn't about products or management; instead I would frame it like this: I notice that multiple cases where we are worried about the impact of AI are basically just about the replacement of certain activities that some humans already aren't doing in today's society. If we are worried we will be less good at doing job X once we don't do job X anymore, why are we not worried about people who never did job X in the first place? If we are worried about people not doing jobs anymore, why are we not worried about the human development of people wealthy enough to never work again for the rest of their days? I would not assume someone who won the lottery is going to see their life become uninteresting or suffer cognitive decline. It could happen, but you can also see a path where the person just chooses to do the activities they always wanted to do, where they keep learning and exploring without the burden of usual life constraints. People still play chess even though machines have beaten us for decades, just because they enjoy it.

Regarding education, I think AI is a huge revolution waiting to happen. Usual courses have become boring? Have future, super-powerful AI generate highly personalized per-student programs, create bespoke video games where succeeding is only possible once the student has mastered all the notions you wanted them to validate, etc.


>If we are worried we will be less good at doing job X once we don't do job X anymore, why are we not worried about people who never did job X in the first place? If we are worried about people not doing jobs anymore, why are we not worried about the human development of people wealthy enough to never work again for the rest of their days?

None of this is equivalent to the topic of discussion. The point is that even in a world of division of labour and shared expertise, there is no atrophy in the general populace, because each person is trying to become an expert in something. The whole point is that the brain is being put to use on something: if not X, then Y. If no letter of the alphabet is left available, what do you put your brain to use on?

>I would not assume someone who won the lottery is going to see their life become uninteresting or suffer cognitive decline. It could happen, but you can also see a path where the person just chooses to do the activities they always wanted to do, where they keep learning and exploring without the burden of usual life constraints. People still play chess even though machines have beaten us for decades, just because they enjoy it.

Again, please pay attention to the main idea of the linked article. Most cognitive development happens in the early formative years. Yes, learning itself never stops, but the primary period for it is perhaps the first 25 years of someone's life. You NEED to make mistakes and learn from them during this period. If you are offloading the work your brain was supposed to do here, it's extremely worrying.

>Regarding education, I think AI is a huge revolution waiting to happen. Usual courses have become boring? Have future, super-powerful AI generate highly personalized per-student programs, create bespoke video games where succeeding is only possible once the student has mastered all the notions you wanted them to validate, etc.

I think there is some truth to that, but you need to regulate how much AI can assist a student. It can be a patient teacher, but it shouldn't replace their cognitive abilities. That is the whole point.


What is your app doing? Just LLM inference?

It's a custom agent harness with on-device models and the ability to swap between models.

Basically, a "toy" app to showcase where we are with coding agents on-device.
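For readers curious what "swapping between models" means concretely, a minimal sketch of one possible harness shape (all names are hypothetical, not the poster's actual code; the lambdas stand in for real on-device models):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelBackend:
    """One on-device model the harness can route prompts to."""
    name: str
    generate: Callable[[str], str]  # prompt -> completion

class AgentHarness:
    """Tiny harness that lets the user hot-swap the active model."""

    def __init__(self, backends):
        self.backends = {b.name: b for b in backends}
        self.active = next(iter(self.backends))  # default to the first one

    def swap(self, name):
        if name not in self.backends:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def run(self, prompt):
        return self.backends[self.active].generate(prompt)

# Stub backends standing in for real on-device models.
harness = AgentHarness([
    ModelBackend("small", lambda p: f"small:{p}"),
    ModelBackend("large", lambda p: f"large:{p}"),
])
print(harness.run("hi"))  # small:hi
harness.swap("large")
print(harness.run("hi"))  # large:hi
```

The point of the indirection is that the agent loop only ever sees the `generate` callable, so swapping models never touches the rest of the harness.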


I hope some company trains its models so that expert switches are needed less often, just for these use cases.


A model "where expert switches are less necessary" is hard to tell apart from a model that just has fewer total experts, so I'm not sure that would be a good approach. "How often to switch" also depends on how much excess RAM is available in the system to keep layers opportunistically cached from the previous token(s). There's no one-size-fits-all decision.
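The RAM trade-off above can be made concrete with a toy LRU cache of expert weights under a memory budget (all sizes invented; real runtimes cache per-layer tensors, not strings):

```python
from collections import OrderedDict

class ExpertCache:
    """LRU cache of MoE expert weights under a RAM budget.

    Fewer distinct experts routed to per token means more hits here; a
    model with fewer total experts would trivially fit, which is why
    "switch less" and "have fewer experts" can look alike to the cache.
    """

    def __init__(self, budget_mb, expert_mb):
        self.capacity = budget_mb // expert_mb  # experts that fit in RAM
        self.resident = OrderedDict()           # expert id -> weights
        self.loads = 0                          # slow loads from disk/flash

    def fetch(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)   # mark as recently used
        else:
            self.loads += 1
            if len(self.resident) >= self.capacity:
                self.resident.popitem(last=False)  # evict least recent
            self.resident[expert_id] = f"weights[{expert_id}]"
        return self.resident[expert_id]

cache = ExpertCache(budget_mb=8_000, expert_mb=2_000)  # 4 experts fit
for expert_id in [0, 1, 2, 3, 0, 1]:  # stable routing: hits after warm-up
    cache.fetch(expert_id)
print(cache.loads)  # 4
```

With a bigger RAM budget the same routing pattern produces no extra loads at all, which is the sense in which there is no one-size-fits-all switching policy.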


Very interesting! On what platforms can this run? If it can run on iOS, how would you handle attempts to access the file system or networking? Is this already wired in somehow? If not, is it easy to add custom handlers for these actions?


Yes, it could run on iOS (using JavaScriptCore, V8 in jitless mode, or QuickJS), although we don't have a prototype app yet.

It should probably take a few hours with AI to get a demo for it :)
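Not speaking for the authors, but one common shape for the "custom handlers" question above is a capability registry that the embedder binds into the JS engine as native functions, with everything denied by default. A language-agnostic sketch in Python (all handler names are illustrative):

```python
class HostBridge:
    """Registry of host capabilities exposed to an embedded JS sandbox.

    On iOS the embedder (JavaScriptCore/QuickJS/jitless V8) would bind
    each granted handler as a native function visible to scripts; any
    capability not explicitly granted raises instead of reaching the OS.
    """

    def __init__(self):
        self.handlers = {}

    def allow(self, capability, handler):
        self.handlers[capability] = handler

    def call(self, capability, *args):
        handler = self.handlers.get(capability)
        if handler is None:
            raise PermissionError(f"capability not granted: {capability}")
        return handler(*args)

bridge = HostBridge()
# Grant read-only access to an in-memory "file system"; leave networking denied.
fake_fs = {"/tmp/note.txt": "hello"}
bridge.allow("fs.read", lambda path: fake_fs[path])

print(bridge.call("fs.read", "/tmp/note.txt"))  # hello
try:
    bridge.call("net.fetch", "https://example.com")
except PermissionError as e:
    print(e)  # capability not granted: net.fetch
```

The deny-by-default direction matters: the sandbox only ever sees the handlers the app chose to expose, rather than the app trying to intercept everything the engine might do.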


Awesome! Are you planning on setting a license soon? I might have missed it but I don't see it on the GitHub repo.


Just set it to MIT :)


It is definitely not foolproof, but IMHO, to some extent, it is easier to describe what you expect to see than to implement it, so I don't find it unreasonable to think it might provide some advantages in terms of correctness.


That definitely depends upon the situation. More often than not, properly testing a component takes me more time than writing it.


In my experience, this tends to be more related to instrumentation / architecture than a lack of ability to describe correct results. TDD is often suggested as a solution.
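As a concrete illustration of "describing what you expect" being shorter than implementing it, here is a property-style check that specifies sorting without implementing it (plain asserts, no test framework assumed):

```python
def satisfies_sort_spec(sort_fn, data):
    """Check the *description* of sorting: ordered output, same items.

    Writing this spec is a few lines; writing (or reviewing an
    AI-generated) correct sort is where the real effort goes.
    """
    out = sort_fn(data)
    is_ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_items = sorted(data) == sorted(out)  # no items dropped or invented
    return is_ordered and same_items

# The spec catches plausible-looking but wrong implementations:
print(satisfies_sort_spec(sorted, [3, 1, 2]))         # True
print(satisfies_sort_spec(lambda xs: xs, [3, 1, 2]))  # False (not ordered)
print(satisfies_sort_spec(lambda xs: [], [3, 1, 2]))  # False (items dropped)
```

Whether this saves time in practice depends, as the parent says, on how much instrumentation a given component needs before such properties are even checkable.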


I think both should be done, they don't really serve the same purpose.

