Hacker News | Lwerewolf's comments

This.

They recently made "efficient" even more verbose, my custom instructions can't suppress it properly anymore.

These "little" changes are incredibly annoying.


they are trying to burn your tokens on purpose to make you spend more... like introducing limits but making it so API requests continue, at cost...

Ehh... can't really hit "chatbot" limits on the $20 plan. Pretty sure the limits are not token based for that in the first place, and if it spews out a ton of stuff, it takes me longer to go through it and I end up asking it follow-up questions in a way where it replies... _relatively_ concisely. Still, gimme robot back. On a good note, it almost managed to call me stupid.

Codex has also been fine, but I'm guessing they know better than to tweak it like that, given their target users.


I have hit chatbot limits with the $20 a month plan. During the day I use it with Codex and at night I use it to study Spanish. I don’t know if the two are correlated.

But then I just switch to another OpenAI account, and strangely enough, chat forces me into “thinking mode” when that happens and won’t let me use instant mode.


Just dust and echoes.

(:


I wonder what ECC is for. So, unless you're Google and you're having to deal with "mercurial cores"...

Also, sorry, but what did I just actually attempt to read?


Okay but if you aren’t using RAIM or a TMR system then is he really wrong?

And if you weren’t being snarky I’m sure you could understand. Generate 100 answers. Compare them. You’ll find ~90% the same. Choose that one.
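The "generate many, pick the majority" idea above is essentially self-consistency voting. A minimal Python sketch, assuming answers are comparable as normalized strings (the `majority_answer` helper and the sample data are illustrative, not from the comment):

```python
from collections import Counter

def majority_answer(answers):
    """Pick the most common answer among independent samples.

    If ~90% of samples agree, the mode is that ~90% answer; disagreeing
    samples only matter if no single answer dominates.
    """
    counts = Counter(a.strip() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)

# e.g. 100 sampled answers, 90 of which agree:
samples = ["42"] * 90 + ["41"] * 6 + ["43"] * 4
ans, frac = majority_answer(samples)  # → ("42", 0.9)
```

In practice the hard part is deciding when two free-form answers count as "the same"; exact string matching only works for short, canonical outputs.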


Re: $1k/day on tokens - you can also build a local rig, nothing "fancy". There was a recent thread here re: the utility of local models, even on not-so-fancy hardware. Agents were a big part of it - you just set a task and it's done at some point, while you sleep or you're off to somewhere or working on something else entirely or reading a book or whatever. Turn off notifications to avoid context switches.

Check it: https://news.ycombinator.com/item?id=46838946


On that note, I could also comfortably fit a couple of chat windows (skype) on a 17'' CRT (1024x768) back in those days. It's not just the "browser-based resource hog" bit that sucks - non-touch UIs have generally become way less space-efficient.


FoundationDB's approach - look up their testing framework.

I've worked in a company that, for all intents and purposes, had the same thing - single thread & multi process everything (i.e. process per core), asserts in prod (like why tf would you not), absurdly detailed in-memory ring buffer binary logs & good tooling to access them plus normal logs (journalctl), telemetry, graphing, etc.

So basically - it's about making your software debuggable and resilient in the first place. These two kind of go hand-in-hand, and absolutely don't have to cost you performance. They might even add performance, actually :P
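The in-memory ring-buffer logging mentioned above keeps recent history bounded and cheap, so you can afford to log everything and inspect it after a crash or failed assert. A minimal sketch of the idea in Python (the comment describes binary logs with dedicated tooling; `RingLog` here is a hypothetical stand-in):

```python
import time
from collections import deque

class RingLog:
    """Fixed-capacity in-memory log: appends are O(1) and the oldest
    entries are evicted automatically, so memory use stays bounded
    no matter how long the process runs."""

    def __init__(self, capacity=4096):
        self.buf = deque(maxlen=capacity)

    def log(self, event, **fields):
        self.buf.append((time.monotonic(), event, fields))

    def dump(self):
        # On crash / failed assert, this is your recent history.
        return [{"t": t, "event": e, **f} for t, e, f in self.buf]

log = RingLog(capacity=3)
for i in range(5):
    log.log("tick", i=i)
# Only the 3 most recent entries survive: i = 2, 3, 4.
```

A production version would write fixed-size binary records into a preallocated buffer (no allocation on the hot path) and dump it from a signal or assert handler.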


Re: "yes men" - critical thinking always helps. I kind of treat their responses like a random written down shower thought - malicious without scrutiny. Same with anything that you haven't gone over properly, really.

The advantages that you listed make them worth it.


The output of the prompts always needs peer review and scrutiny. The longer the context, the more it will deviate, as if a magnet were brought nearer and nearer to a navigation compass.

This is not new: LLMs are rooted in statistics, in lossy data compression. It is statistically indexed data with a text interface.

The problem is that some people are deliberately selling this as the artificial intelligence they watched in movies: calling errors "hallucinations", calling keyword matching "thinking", and so on.

There is a price for society to pay for those fast queries when people do not verify the outputs/responses, and, unfortunately, people are not verifying them.

I mean, it is difficult to say. When I hear that some governments are considering using LLMs within their administrations I get really concerned, as I know those outputs/responses/actions will be neither reviewed nor questioned.


This kind of works for me, GPT 5.2:

Base style & tone - Efficient

Characteristics - Defaults (they must've appeared recently, haven't played with them)

Custom instructions: "Be as brief and direct as possible. No warmth, no conversational tone. Use the least amount of words, don't explain unless asked."

I basically tried to emulate the... old... "robot" tone, this works almost too well sometimes.


Same with doing things in RAM as well. Sequential writes and cache-friendly reads, which b-trees tend to achieve for any definition of cache. Some compaction/GC/whatever step at some point. Nothing's fundamentally changed, right?


Google "K-series Cam lobe pitting".

Anyways, nice engines, but you don't need something to be exceptionally reliable to keep it in production for 25 years.

