Intel do make GPUs. They're relatively new to the high-performance space, but they have been making iGPUs for a very long time. Their success in the dGPU market is far from certain, but they're making a credible effort.
More to the point, we use GPUs because that's what we have, not because they're optimal. Intel have a lot of experience in designing special-purpose accelerators, with perhaps the best example being Quick Sync, the fixed-function video block on their CPUs: with it, even a low-end Intel CPU outperforms pretty much anything at video encoding and decoding. Inferencing is a very different workload to training, and there is substantial potential for on-CPU accelerators to massively improve the performance and efficiency of inferencing tasks.