With the caveat that I don't know a lot about Linux GUI infrastructure, I would bucket GTK and the X11 client library into "toolkit", and the compositor would live in the X server and probably to some extent in the display and HID drivers. If you wanted to dig in, I would look at analyzing everything that happens between a physical finger movement and that movement being reflected on the physical screen, and at how to reduce the latency and jitter along that path. Anything about the Firefox APZ implementation [1], and anything John Carmack has written about VR latency [2], would give a feel for the problem.
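To make "latency and jitter" a bit more concrete, here is a rough sketch of how the measurements might be summarized once you have them. Everything in it is an assumption on my part: the function name `latency_stats`, the idea that you have timestamp pairs (say, from a high-speed camera filming the finger and the screen), the made-up sample values, and treating jitter as the standard deviation of latency, which is only one of several reasonable definitions.

```python
from statistics import mean, pstdev


def latency_stats(samples):
    """Summarize end-to-end input latency.

    samples: list of (input_time_ms, display_time_ms) pairs, where the first
    value is when the finger moved and the second is when the screen showed it.
    """
    latencies = [shown - touched for touched, shown in samples]
    return {
        "mean_ms": mean(latencies),      # typical finger-to-screen delay
        "jitter_ms": pstdev(latencies),  # assumption: jitter = std deviation
        "worst_ms": max(latencies),      # the frame the user actually notices
    }


# Hypothetical measurements, not real data.
samples = [(0.0, 48.0), (16.0, 70.0), (32.0, 82.0), (48.0, 104.0)]
stats = latency_stats(samples)
print(stats)
```

Looking at the worst case as well as the mean seems worthwhile, since an occasional late frame can feel worse than a uniformly slower pipeline.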