Hacker News

The reason language models require large amounts of data is that they lack grounding. When humans write a sentence about, say, "fire", we can relate that word to visual, auditory, and kinesthetic experiences built from a coherent world model. Without this world model the LM needs a lot of examples: essentially it has to memorize all the different contexts in which the word "fire" appears and figure out when it's appropriate to use the word in a sentence. A perfect language model is literally impossible, because you can always contrive a novel context that the LM has never seen before.
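To make the "memorize all the contexts" point concrete, here's a toy sketch (my own illustration, not anyone's actual model): an ungrounded model can only distinguish uses of "fire" by tabulating the surrounding words, and the table grows with every new context.

```python
from collections import defaultdict

# Toy illustration: count the distinct (previous word, next word)
# contexts in which "fire" appears in a tiny made-up corpus.
corpus = [
    "the fire spread quickly",
    "hold your fire soldier",
    "they decided to fire him",
    "the camp fire burned low",
]

contexts = defaultdict(int)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        if w == "fire":
            prev = words[i - 1] if i > 0 else "<s>"
            nxt = words[i + 1] if i + 1 < len(words) else "</s>"
            contexts[(prev, nxt)] += 1

# Four sentences already produce four distinct contexts; real usage
# is open-ended, so the table can never be complete.
print(len(contexts))  # 4
```

A grounded model could collapse many of these contexts into a few underlying concepts (combustion, shooting, dismissal); a purely distributional one has to see each of them.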

I suspect that the more data modalities we add, the less data would be required, but that's not the whole picture either. For example, text-to-image generators often make weird mistakes that look "unphysical", like objects that appear to flow into each other. The reason is that these models (including DALL-E) use a simple U-Net, which basically only sees textures. What they lack is the human inductive bias that 2D images are typically representations of a 3D world, a world that largely consists of discrete objects and physics. They make these mistakes because they don't know what objects are, and need to brute-force this idea from a ton of observations. Even simple cognitive abilities like object persistence require time perception, which these models lack.
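The "2D images are projections of a 3D world" bias can be made concrete with a pinhole projection (a standard formula, used here purely as an illustration of what the texture-level model never sees):

```python
# Pinhole camera projection: a 3D point (x, y, z) maps to the 2D
# image point (f*x/z, f*y/z). Depth z is divided out, which is
# exactly the structure a texture-only model has to rediscover.
def project(point3d, focal=1.0):
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

# Two points with identical (x, y) but different depths land at
# different image positions: the 2D appearance encodes 3D structure.
near = project((1.0, 1.0, 2.0))
far = project((1.0, 1.0, 4.0))
print(near, far)  # (0.5, 0.5) (0.25, 0.25)
```

A human viewer applies this mapping implicitly; a U-Net trained from scratch has to infer it from pixel statistics alone.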

I think the fact that these models can make up for this deficit with a ton of data is very telling. There is a lot of low hanging fruit in integrating more data modalities.



> Even simple cognitive abilities like object persistence requires time perception, which these models lack.

What do you mean? If we send a robot to explore its environment and train it by having it constantly predict the next video frame, wouldn't it eventually learn the physics and therefore gain "time perception"?
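A minimal version of this idea (my own toy example, not a real robot): predict the next "frame" of a 1D ball moving at constant velocity. Fitting next = a*current + b*previous by gradient descent recovers a ≈ 2, b ≈ -1, which is exactly the constant-velocity rule x(t+1) = 2x(t) - x(t-1), i.e. a crude notion of time learned purely from prediction.

```python
# 1D "video": ball position over time, constant velocity x(t) = t.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Fit next = a * current + b * previous with plain SGD on squared
# prediction error. The data is exactly linear, so SGD converges to
# the unique solution a = 2, b = -1.
a, b = 0.0, 0.0
lr = 0.01
for _ in range(20000):
    for t in range(2, len(positions)):
        pred = a * positions[t - 1] + b * positions[t - 2]
        err = pred - positions[t]
        a -= lr * err * positions[t - 1]
        b -= lr * err * positions[t - 2]

print(round(a, 2), round(b, 2))  # converges to a ≈ 2, b ≈ -1
```

The learned rule needs two past frames, not one: velocity only exists across time, which is the sense in which single-image models have no time perception.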


Yeah, that would work. I'm talking specifically about LLMs, DALL-E, and diffusion image generators.


This is exactly what CLIP already does, and there will be massive improvements in this area in the coming years, I promise.
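For context, CLIP trains image and text encoders so that matching image/caption pairs have high similarity. Here's a rough sketch of that matching step with made-up toy embeddings (not the real model or its API):

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

images = [(1.0, 0.0), (0.0, 1.0)]     # toy image embeddings
captions = [(0.9, 0.1), (0.1, 0.9)]   # toy caption embeddings

# For each image, softmax over its similarity to every caption.
# CLIP-style contrastive training pushes the correct (diagonal)
# pair toward probability 1.
best = []
for img in images:
    sims = [cosine(img, cap) for cap in captions]
    exps = [math.exp(s) for s in sims]
    probs = [e / sum(exps) for e in exps]
    best.append(probs.index(max(probs)))

print(best)  # [0, 1]: each image matches its own caption
```

The point upthread is that this shared embedding space is exactly the kind of cross-modal grounding the parent comment says text-only models are missing.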




