And it's running as a single process on a single server, with no database, storing everything on the file system. That it handles the traffic it does, I find quite impressive.
You know, I'd love to see a writeup of the issues and challenges you've faced with this persistence strategy as the site grew: what you decided to lazy-load, why and how, and the impact it had.
Technically speaking, I find Hacker News's persistence strategy one of the most interesting things about the implementation.
I use a similar strategy for my blog, and just from playing with larger datasets I've certainly run into hard limits on what seems acceptable.
I'd have thought that most of HN's traffic hits the front page and its associated comments. I could well be wrong, but given that, I'd have thought 2GB was plenty and wasn't the bottleneck.
Also, unless your DB tables are in-memory tables, a database is just a very fancy interface to a file-system store (one that is usually in a nontrivial format).
PLT Scheme is not "interpreting" code. More to the point, Arc adds a bunch of huge overheads on top of it; removing them could make things substantially faster. I had some patches lying around that made things around 4-5 times faster (including the news server).
33k users per day is piddling. I think a platform change is called for; the current recipe is clearly no longer up to the task. Which is a pity, because the Lisp code really is quite elegant. It's just that "the web", "high performance", and "Lisp" are not usually used in a single sentence (other than this one...).
33k/day doesn't sound like a lot of traffic, honestly. But I also can't imagine Lisp being that slow. Maybe add some caching?
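To make the caching suggestion concrete: for a read-heavy page like the front page, even a short TTL cache on the rendered output collapses per-request work to at most one regeneration per interval. A sketch (hypothetical helper in Python, not the HN codebase):

```python
import time

def make_cached(render, ttl_seconds=30, clock=time.monotonic):
    """Wrap a zero-argument render function in a simple TTL cache.

    The wrapped function returns the cached result until the TTL
    expires, then regenerates it once. Hypothetical sketch, not
    from any real server's code.
    """
    state = {"value": None, "expires": float("-inf")}

    def cached():
        now = clock()
        if now >= state["expires"]:
            state["value"] = render()      # regenerate at most once per TTL
            state["expires"] = now + ttl_seconds
        return state["value"]

    return cached
```

With a 30-second TTL, a day of front-page hits costs at most a few thousand regenerations regardless of user count, which is why this kind of cache tends to be the first fix tried before any platform change.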