Hacker News

What do you (or anyone else, feel free to chime in) do with other LLMs that makes them useable for anything that is not strictly tinkering?

Here is my premise: We are past the wonder stage. I want to actually get stuff done efficiently. From what I have tested so far, the only model that allows me to do that halfway reliably is GPT-4.

Am I incompetent or are we really just wishfully thinking in HN spirit that other LLMs are a lot better at being applied to actual tasks that require a certain level of quality, consistency and reliability?



I still wonder what makes GPT-4 so much better than its contemporaries. That's why I find the tons of people trying to explain how GPT-4 works starting from a simple neural network distasteful: plenty of people already understand and do exactly that, yet none of their models comes anywhere close to GPT-4.


> I still wonder what makes GPT-4 so much better than its contemporaries.

OpenAI has had many years to refine their dataset down from the noisy public datasets, and GPT-4 is (supposedly) a mixture of 8 "expert models", each of which is 220B parameters (5x+ larger than Falcon 40B), for a total of roughly 1.7T parameters (3x+ Google's huge 540B PaLM). The hardware and software needed to train networks at that scale are also a deep moat. Relatively speaking, the model architecture ("GPT from scratch") is the easiest piece.
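For anyone unfamiliar with the "mixture of experts" idea mentioned above, here is a toy sketch of what such a layer does. This is a generic illustration of MoE routing, not OpenAI's actual (undisclosed) implementation; all dimensions and names are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class TinyMoE:
    """Toy mixture-of-experts layer: a learned router scores the experts
    for each input, and only the top-k experts actually run."""

    def __init__(self, d_model=8, n_experts=8, top_k=2):
        self.top_k = top_k
        self.router = rng.normal(size=(d_model, n_experts))
        # each "expert" is just a single linear map in this toy version
        self.experts = [rng.normal(size=(d_model, d_model))
                        for _ in range(n_experts)]

    def forward(self, x):
        scores = softmax(x @ self.router)           # routing probabilities
        top = np.argsort(scores)[-self.top_k:]      # indices of top-k experts
        weights = scores[top] / scores[top].sum()   # renormalize over chosen experts
        # only the chosen experts compute, so per-token cost scales with
        # top_k, not with the total number of experts
        return sum(w * (x @ self.experts[i]) for w, i in zip(weights, top))

moe = TinyMoE()
out = moe.forward(rng.normal(size=8))
print(out.shape)  # (8,)
```

The point of the design is that total parameter count (all experts) can be far larger than the compute spent per token (only top-k experts run), which is how a ~1.7T-parameter model can stay affordable to serve.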


From my understanding, GPT-4 is the biggest, or one of the biggest. It was trained on low-quality internet datasets, like the others. What makes it different is post-training on custom data with human supervision; we know they even outsourced some of that labeling work to Africa. Second, they integrated it with external tools, like a Python interpreter and an internet browser. But the first point is the most important. They have also most likely experimented and found some tricks that make it a bit better.
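The "external tools" part is conceptually simple: the model emits a tool request, a harness executes it, and the result is fed back into the context. A minimal sketch of that loop, with an entirely made-up text protocol and a safe stand-in for the Python-interpreter tool (none of this is OpenAI's actual mechanism):

```python
import ast
import operator as op

# stand-in "Python interpreter" tool: evaluates simple arithmetic safely
SAFE_OPS = {ast.Add: op.add, ast.Sub: op.sub,
            ast.Mult: op.mul, ast.Div: op.truediv}

def safe_eval(expr):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in SAFE_OPS:
            return SAFE_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

TOOLS = {"python": safe_eval}

def run_turn(model_output):
    """If the model's output requests a tool (here the hypothetical format
    'TOOL:<name>:<arg>'), run it and return the observation to append to
    the context; otherwise pass the output through as the final answer."""
    if model_output.startswith("TOOL:"):
        _, name, arg = model_output.split(":", 2)
        return f"OBSERVATION: {TOOLS[name](arg)}"
    return model_output

print(run_turn("TOOL:python:2*(3+4)"))  # OBSERVATION: 14
```

The real systems use structured function-calling rather than string prefixes, but the control flow (model proposes, harness executes, observation goes back in) is the same shape.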


They pay tons of people to type out conversations that they can feed into it. It's just a lot of people doing a lot of work.


This line of thinking only works if it's impossible to imagine a world where OpenAI isn't the leader. If in two years the non-OpenAI models are better, it will serve us much better to have built these tools so they can work with other models as well.


Since OpenAI is all just APIs with simple interfaces, I don't think that plugging a different, capable model into whatever tool you are building is going to be an issue.
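To make that concrete, the usual way to keep a product model-agnostic is to code against a tiny interface of your own and plug vendors in behind it. A sketch, where the class and method names are illustrative, not any vendor's real SDK:

```python
from abc import ABC, abstractmethod

class ChatModel(ABC):
    """The only surface the rest of the product is allowed to see."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIBackend(ChatModel):
    def complete(self, prompt):
        # would call the real API here; stubbed for the sketch
        return f"[openai] {prompt}"

class LocalBackend(ChatModel):
    def complete(self, prompt):
        # e.g. a self-hosted open-weights model behind the same interface
        return f"[local] {prompt}"

def build_app(model: ChatModel):
    # product code depends on ChatModel, never on a specific vendor
    return lambda q: model.complete(f"Answer concisely: {q}")

app = build_app(LocalBackend())
print(app("What is a moat?"))  # [local] Answer concisely: What is a moat?
```

Swapping vendors is then a one-line change at the composition root, which is why the commenters above expect products to outlive any particular model.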


You are correct in this assessment. A majority of individuals and startups playing around with turning LLMs into products aim to be prepared for the arrival of the subsequent generation of models. When that occurs, they'll already have a product or company in place and can simply integrate the new models.

Models are getting commoditized, well executed ideas are not.



