> There's definitely going to be cheap or open source models
What makes you think your "cheap or open source model" running on your piddling desktop cluster will be able to compete against a SOTA one running in a billion-dollar datacenter?
It's a cyberpunk fantasy. It won't work out that way.
Local models that run on a laptop (not even needing a "cluster") are already better than ChatGPT from a couple of years ago. Yes, Claude and ChatGPT today are certainly better than these local models, but they can't keep getting better indefinitely -- there's only so much info to scrape. When they hit a plateau, it is only a matter of time before consumer hardware catches up.
While that's most likely true, it rests on the assumption that consumer hardware stays affordable enough, and isn't locked down to disallow running "untrusted" models. I would have never believed that these assumptions could ever turn out false, but the recent developments have shown that even if unlikely, it's not impossible.
Maybe? We don't really know this, right? People have been saying this for 5 years now and the models are still getting better. The companies running the frontier models have already scraped everything on the web, but the models are still getting better with each release, even if only marginally. Maybe eventually some company will actually achieve AGI/ASI, who knows.
I think the parent is speculating that there may be an order of magnitude improvement in the cheap / OSS model space such that one running on a piddling desktop cluster could match or exceed the capabilities of the current SOTA on a billion-dollar datacenter.
> I think the parent is speculating that there may be an order of magnitude improvement in the cheap / OSS model space such that one running on a piddling desktop cluster could match or exceed the capabilities of the current SOTA on a billion-dollar datacenter.
And then they take that model, put it in a billion-dollar datacenter, and kick your desktop cluster's ass with it.
For those of us who care about the answers to these questions, rather than who gets credit for doing it, we will welcome any faster means of solving these problems.
The trick is, at that time most of the possible mass range was excluded experimentally, so it is a bit less impressive. I'm not sure how much tuning went into it (possibly none)
Interestingly, there is some neuroscience research suggesting that the transformer architecture resembles "cue-based retrieval" in the human brain in some important ways.
"But internal study found users who stopped using Facebook and Instagram for a week showed lower rates of anxiety, depression, and loneliness."
This isn't causal though. The users who quit were not randomly selected. Maybe they were receiving some kind of mental health treatment, and as part of that they stopped. Then the recovery could have been from the treatment or it could have been from stopping.
So this argument you've made, you've just constructed a strawman.
> The users who quit were not randomly selected. Maybe they were receiving some kind of mental health treatment
You don't know that. You don't know anything about the selection process, since Facebook did not share their research. Your whole argument hinges on a selection process you know nothing about. I'd find it very difficult to believe that researchers could not anticipate and control for situations like that. Researchers are, after all, experts in research.
Facebook does not typically do academic-level research - they do quick studies to verify product direction.
From what I have seen, the actual academic studies on this are mixed. It is hard to say one way or the other, and it can affect different teens differently depending on how they use it.
My point is if the people in the study were not randomly selected, there are any number of confounding factors that could influence why their anxiety changed.
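To make the confounding point concrete, here is a toy simulation. Every number in it is made up for illustration (the share of people in treatment, the size of the treatment effect, the quit probabilities), and `anxiety_gap` is a hypothetical helper, not anything from the actual study. In this toy world quitting has zero effect by construction, yet self-selected quitters still look much better off, because people in treatment are more likely to quit.

```python
import random
from statistics import mean

TREATMENT_EFFECT = -2.0  # made-up: treatment lowers the anxiety score by 2 points
QUITTING_EFFECT = 0.0    # made-up: quitting itself does nothing in this toy world


def anxiety_gap(randomized, n=20_000, seed=0):
    """Return mean anxiety change of quitters minus that of non-quitters."""
    rng = random.Random(seed)
    quitters, stayers = [], []
    for _ in range(n):
        # Assumption: 20% of users happen to be in mental health treatment.
        in_treatment = rng.random() < 0.2
        if randomized:
            # Coin flip decides who quits, independent of treatment.
            quits = rng.random() < 0.5
        else:
            # Self-selection: people in treatment are far more likely to quit.
            quits = rng.random() < (0.8 if in_treatment else 0.1)
        change = rng.gauss(0, 1)  # background noise in anxiety change
        if in_treatment:
            change += TREATMENT_EFFECT
        if quits:
            change += QUITTING_EFFECT
        (quitters if quits else stayers).append(change)
    return mean(quitters) - mean(stayers)


if __name__ == "__main__":
    print("self-selected gap:", round(anxiety_gap(randomized=False), 2))
    print("randomized gap:   ", round(anxiety_gap(randomized=True), 2))
```

With self-selection the gap should come out clearly negative even though quitting does nothing here; with random assignment it should sit close to zero. That doesn't say which situation the internal study was in, only why the selection process matters.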
How? Other than calling utility functions that C++ doesn't have, you can't just skip understanding what you are coding by using Python. If you are importing libraries that do stuff for you, that wouldn't be any different than if someone wrote those libs in C++.
Are you saying I was incorrect for feeling that way?
The reason is that you no longer really know what's going on. (And yes, that feeling would be the same if C++ had as rich a library of packages as Python for numerical analysis.)
If you are doing something that requires precision you need to know everything that is happening in that library. Also IIRC, I think not knowing what type something is bothered me at the time.
> Are you saying I was incorrect for feeling that way?
I think they just wanted clarification. If a program is just "make lines of code do thing" then it wouldn't be different.
But if you are used to unmanaged code and to considering the hardware architecture and memory management when you make a high-performance program, working in Python can feel like a black box. Things will slow down because there's a lot of "magic" weighing down the program. But not everyone works in that space.
Unlike LLMs, at least this box can be peered inside of if you really want to.
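As a rough sketch of what peering inside can look like, the standard library's `inspect` and `dis` modules will show you the source and the bytecode of a function. `peek` is just a hypothetical helper name, and `json.dumps` is an arbitrary example target.

```python
import dis
import inspect
import json


def peek(func):
    """Return (source text, list of bytecode opnames) for a pure-Python function."""
    source = inspect.getsource(func)  # the actual Python source of the function
    opnames = [ins.opname for ins in dis.Bytecode(func)]  # what the interpreter runs
    return source, opnames


if __name__ == "__main__":
    source, opnames = peek(json.dumps)
    print(source.splitlines()[0])  # first line of the function definition
    print(len(opnames), "bytecode instructions")
```

This only works for functions written in Python. For anything implemented as a C extension (which covers a lot of the numerical stack), `inspect.getsource` raises an error and you are back to reading the C source.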