
Are there any good resources related to expanding context windows, or even just the mechanics of how they actually work as properties of a model?


Lots. Llama 2 was trained with a 4K-token context window. It can still run on arbitrarily long inputs, but the results turn to garbage as you go past the length it was trained on.

I refer you to https://blog.gopenai.com/how-to-speed-up-llms-and-use-100k-c... for an "easy to digest" summary.
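
To make the "runs but turns to garbage" point a bit more concrete, here is a rough sketch of the rotary position embedding (RoPE) angles that Llama 2 uses. The head size of 128, the base of 10000, and the helper names are my assumptions for illustration, not taken from the linked post. The formula is happy to produce an angle for any token offset, which is why longer inputs still run, but for the slowest-rotating dimension pair, offsets past 4K give angles larger than anything the model saw in training.

```python
import math

TRAINED_CONTEXT = 4096  # Llama 2's training context length, in tokens


def rope_angles(offset, head_dim=128, base=10000.0):
    """Rotation angle for each pair of dimensions at a given token offset.

    head_dim and base are assumed values for illustration.
    """
    return [offset / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


def slowest_angle(offset, head_dim=128, base=10000.0):
    """Angle of the lowest-frequency (slowest-rotating) dimension pair."""
    return rope_angles(offset, head_dim, base)[-1]


# Largest slow-dimension angle that can occur inside the trained window.
max_seen = slowest_angle(TRAINED_CONTEXT - 1)

for offset in (1000, 4095, 8000, 16000):
    angle = slowest_angle(offset)
    status = "within trained range" if angle <= max_seen else "never seen in training"
    print(offset, round(angle, 4), status)
```

Nothing in the math stops you at 4K. The model just has no experience with the rotations it gets beyond that point, which is the gap that context-extension tricks try to close.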



