
Er yeah same. I can believe it's a PITA to self-host because why would they care to make it easy. It's open source, good luck.

$10/year seems pretty fair to avoid all that.

The clients are fine, could be smoother, but I've internalised the quirks by now.


This OS doesn't say it's maintenance-free! But it skips a whole load of maintenance you'd need to think about with a traditional base system, because 1) there's almost nothing there, and 2) upgrading that base is easy: you just reboot and restart your containers.

Obviously the software you run needs upgrades, but (again, a layer down) it's based on Docker and probably someone else is maintaining it. So you pull the new container, restart, and the OS just makes sure your data lands in the same place for the new container.

If you're happy with all your software running from Docker this seems like a step up from Debian or Red Hat, and it has a lot less bureaucracy than something like CoreOS.

Whether it's _usable_ I'm not sure (especially around storage management) but it's a really clear pitch.


The internet of 20 years ago was awash with info for running dedicated servers, fragmented and badly-written in places but it was all there. I can absolutely believe LLMs would enable more people to find that knowledge more easily.

I founded a hosting company 25 years ago, when User-Mode Linux was the hot new virtualisation tech. We aspired to just replicate the dedicated server experience, because that was obviously how you deploy services with the most flexibility, and UML made it so cheap! Through the 2010s I (extremely wrongly) assumed that most developers wouldn't choose to be metered on each little part of their stack for the sake of a little convenience.

Does a regular 20-something software engineer still know how to turn some eBay servers & routers into a platform for hosting a high-traffic web application? Because that is still a thing you can do! (I did it last year to build a 50PiB+ data store.) I'm genuinely curious how popular it is for medium-to-big projects.

And Hetzner gives you almost all of that economic upside while taking away much of the physical hassle! Why are they not kings of the hosting world, rather than turning over a modest €367M (2021)?

I find it hard to believe that the knowledge to manage a bunch of dedicated servers is that arcane that people wouldn't choose it for this kind of gigantic saving.


> I find it hard to believe that the knowledge to manage a bunch of dedicated servers is that arcane that people wouldn't choose it for this kind of gigantic saving.

Managing servers is fine. Managing servers well is hard for the average person. Many hand-rolled hosting setups I've encountered include fun gems such as:

- undocumented config drift

- one unit of availability (downtime required for offline upgrades, resizing or maintenance)

- very out of date OS/libraries (usually due to the first two issues)

- generally awful security configurations. The easiest setup being open ports for SSH and/or database connections, which probably rely on passwords (if they didn't, you'd be pwned immediately)

Cloud architecture might be annoying and complex for many use-cases, but if you've ever been the person who had to pick up someone else's "pet" and start making changes or just maintaining it, you'll know why it can be nice to have cloud architecture put some constraints on how infra is provisioned, and why people are willing to pay for that.


For the record, I have seen every one of those in cloud-based hosting multiple times. None of those issues requires any more special work to avoid in traditional hosting than in the cloud.

> And Hetzner gives you almost all of that economic upside while taking away much of the physical hassle! Why are they not kings of the hosting world, rather than turning over a modest €367M (2021).

Hetzner is an old-school German company, so it is not surprising to see them act this way. They are very profitable (€165M in 2024) and have very little debt. They also seem to be mostly bootstrapped and are not VC-funded:

https://www.northdata.com/Hetzner%20Online%20GmbH,%20Gunzenh...


There was a time before Google when various mailing lists of grumpy sysadmins in key institutions could decide the fate of a new mail sender, internet-wide. But yes that "internet community" is small fry now, and can only cut off their own noses if they don't like Google's mail policies.

Before Google, AOL were the previous big-beast mail host, and they did provide some tools to help diagnose why you couldn't get through to their users. It still felt like there was more of a balance of power towards the grumpy sysadmins.


Or just switch your browser to Reader Mode and it's free.


The law (OK, well, British law) does recognise that many terms can be Unfair, especially when one of the parties is an individual, and especially when it relates to employment. Courts can nullify terms on that basis.


Pasting a big batch of new code and asking Claude "what have I forgotten? Where are the bugs?" is a very persuasive on-ramp for developers new to AI. It spots threading & distributed system bugs that would have taken hours to uncover before, and where there isn't any other easy tooling.

I bet there's loads of cryptocurrency implementations being pored over right now - actual money on the table.


I like biasing it towards the fact that there is a bug, so it can't just say "no bugs! all good!" without looking into it very hard.

Usually I ask something like this:

"This code has a bug. Can you find it?"

Sometimes I also tell it that "the bug is non-obvious"

Which I've anecdotally found to have a higher rate of success than just asking for a spot check


Do you not run into too many false positives around "ah, this thing you used here is known to be tricky, the issue is..."?

I've seen that when prompting it to look for concurrency issues vs saying something more like "please inspect this rigorously to look for potential issues..."


What's more useful is to have it attempt not only to find such bugs but to prove them with a regression test. In Rust, for example, that means writing concurrency tests with Shuttle or Loom.


It would be generally good if most code made setting up such tests as easy as possible, but in most corporate codebases this second step is gonna require a huge amount of refactoring or boilerplate crap to get the things interacting in the test env in an accurate, well-controlled way. You can quickly end up fighting to understand "is the bug not actually there, or is the attempt to repro it not working correctly?"

(Which isn't to say don't do it: I think this is a huge benefit you can gain from being able to refactor more quickly. Just to say that you're gonna short-term give yourself a lot more homework to make sure you don't fix things that aren't bugs, or break other things in your quest to make them more provable/testable.)


That is an unfortunate case you described, but also, git gud and write tests in the first place so you don't need to refactor things down the road.


yes but i can identify those easily. i know that if it flags something that is obviously a non issue, i can discard it.

...because false positives are good errors. false negatives are what i'm worried about.

i feel massively more sure that something has no big oversights if multiple runs (or even multiple different models) cannot find anything but false positives


Just in case you didn't read the full article, this is how they describe finding the bugs in the Linux kernel as well.

Since it's a large codebase, they go even more specific and hint that the bug is in file A, then try again with a hint that the bug is in file B, and so on.


very interesting. i think "verbal biasing" and "knowing how to speak" in general is a really important thing with LLMs. it seems to massively affect output. (interestingly, somewhat less with Opus than with GPT-5.4 and Composer 2. Opus seems to intuit a little better. but still important.)

it's like the idea behind the book _The Mom Test_ suddenly got very important for programming


As a meta activity, I like to run different codebases through the same bug-hunt prompt and compare the number found as a barometer of quality.

I was very impressed when the top three AIs all failed to find anything other than minor stylistic nitpicks in a huge blob of what to me looked like “spaghetti code” in LLVM.

Meanwhile at $dayjob the AI reviews all start with “This looks like someone’s failed attempt at…”


> so it can't just say "no bugs! all good!"

If anyone, or anything, ever answers a question like that, you should stop asking it questions.


You just have to be careful, because it will sometimes spot bugs you could never uncover because they're not real. You can really see the pattern matching at work with twisted code: it tends to look at things like lock-free algorithms and declare them full of bugs regardless of whether they are or not.


I have seen it start on a sentence, get lost and finish it with something like "Scratch that, actually it's fine."

And if it's not giving me a reason I can understand for a bug, I'm not listening to it! Mostly it is showing me I've mixed up two parameters, forgotten to initialise something, or referenced a variable from a thread that I shouldn't have.

The immediate feedback means the bug usually gets a better-quality fix than it would if I had got fatigued hunting it down! So variables get renamed to make sure I can't get them mixed up, a function gets broken out. It puts me in the mind of "well make sure this idiot can't make that mistake again!"
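
To make that last one concrete, here's a minimal Go sketch (a toy example of mine, assuming pre-1.22 loop variable semantics) of the "referenced a variable from a thread that I shouldn't have" class of mistake:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var wg sync.WaitGroup
        for _, name := range []string{"a", "b", "c"} {
            wg.Add(1)
            go func() {
                defer wg.Done()
                // Before Go 1.22 all three goroutines share one loop
                // variable, so this can print "c" three times.
                fmt.Println(name)
            }()
        }
        wg.Wait()
    }

And the better-quality fix is exactly the renaming kind: pass the value in as an argument (go func(name string) { ... }(name)) so the closure can't touch the shared variable at all.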


> Pasting a big batch of new code and asking Claude "what have I forgotten? Where are the bugs?"

It's actually the main way I use CC/codex.


I find Codex sufficiently better for it that I’ve taught Claude how to shell out to it for code reviews


Ditto, I made a "/codex-review" skill in Claude Code that reviews the last git commit and writes an analysis of it for Claude Code to then work from. I've had very good luck with it.

One particularly striking example: I had CC do some work, then kicked off a "/codex-review" and went to test the changes while it was running. I found a deadlock, but when I switched back to CC, the Codex review had found the same deadlock and Claude Code was already working on a fix.


I think OpenAI has actually released an official version of exactly this: https://community.openai.com/t/introducing-codex-plugin-for-...

https://github.com/openai/codex-plugin-cc

I actually work the other way around. I have Codex write "packets" to give to Claude. I have Claude write the code. Then I have Codex review it and find all the problems (there are usually lots of them).

Only because this month I have the $100 Claude Code and the $20 Codex. I did not renew Anthropic though.


Yeah and it comes with the blood of children included


> It spots threading & distributed system bugs that would have taken hours to uncover before, and where there isn't any other easy tooling.

Go has a built-in race detector which may be useful for this too: https://go.dev/doc/articles/race_detector

Unsure if it's suitable for inclusion in CI, but seems like something worth looking into for people using Go.
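
For what it's worth, running the test suite under the detector in CI is common (go test -race ./...), at the cost of a noticeably slower, more memory-hungry run. Here's a minimal sketch of the kind of race it flags (my own toy example, not from the linked docs):

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        counter := 0
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                counter++ // unsynchronized read-modify-write from many goroutines
            }()
        }
        wg.Wait()
        fmt.Println(counter) // may print less than 100 due to lost updates
    }

Run it as go run -race main.go and the detector prints a "WARNING: DATA RACE" report with stack traces for the conflicting accesses.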


I usually do several passes of "review our work. Look for things to clean up, simplify, or refactor." It does usually improve the quality quite a lot; then I rewind history to before, but keep the changes, and submit the same prompt again, until it reaches the point of diminishing returns.


ive gone down this rabbit hole and i dunno, sometimes claude chases a smoking gun that just isn't a smoking gun at all. if you ask him to help find a vulnerability he's not gonna come back empty handed even if there's nothing there, he might frame a nice-to-have as a critical problem. in my exp you have to build tests that prove vulnerabilities in some way. otherwise he's just gonna rabbithole while failing to look at everything.

ive had some remarkable successes with claude and quite a few "well that was a total waste of time" efforts with claude. for the most part i think trying to do uncharted/ambitious work with claude is a huge coinflip. he's great for guardrailed and well understood outcomes though, but im a little burnt out and unexcited at hearing about the gigantic-claude exercises.


> "Codex wrote this, can you spot anything weird?"


Absolutely the opposite here, after reading a few paragraphs I was a bit bored. Then I saw the length of the piece, noticed the AI imagery, quit, came here. I read your comment and it makes sense. I'm not reading a story that somebody couldn't be bothered to write.


"I'm sorry to ask, but have you forwarded me unedited output from an LLM? I'd rather hear what you think!"


That's about as polite as you can get, and it's still risky: people get defensive, the output might NOT be from an LLM, etc.

That's the asymmetry of the problem: writing with AI delegates the thinking to the reader, as well as all the risk of correcting it.



Damn, then we automated bullshit generation

