In this case the CPU wasn’t really saturated with useful work but with contention on global locks. Reducing the number of mounts performed concurrently lessens that contention.
I wonder whether simply capping the number of concurrent mounts in the code, or letting containerd think there are only half as many cores, would have reduced the contention to the same degree.
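To make the first idea concrete, here is a minimal sketch of what a cap on concurrent mounts could look like, using a buffered channel as a counting semaphore. This is not containerd code: `mountLimiter`, `newMountLimiter`, `do`, and `peakConcurrency` are hypothetical names made up for illustration, and the "mount" is just a short sleep standing in for the real syscall.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// mountLimiter caps how many mount operations run at the same time.
// Hypothetical helper for illustration; not part of containerd.
type mountLimiter struct {
	slots chan struct{} // buffered channel used as a counting semaphore
}

func newMountLimiter(max int) *mountLimiter {
	return &mountLimiter{slots: make(chan struct{}, max)}
}

// do blocks until a slot is free, runs mount, then releases the slot.
func (l *mountLimiter) do(mount func() error) error {
	l.slots <- struct{}{}        // acquire a slot
	defer func() { <-l.slots }() // release it when mount returns
	return mount()
}

// peakConcurrency runs n fake mounts through a limiter capped at max
// and reports the highest number that were in flight simultaneously.
func peakConcurrency(n, max int) int64 {
	limiter := newMountLimiter(max)
	var inFlight, peak int64
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = limiter.do(func() error {
				cur := atomic.AddInt64(&inFlight, 1)
				// Record a new peak if this is the highest count seen so far.
				for {
					old := atomic.LoadInt64(&peak)
					if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
						break
					}
				}
				time.Sleep(time.Millisecond) // stand-in for the real mount call
				atomic.AddInt64(&inFlight, -1)
				return nil
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrent mounts:", peakConcurrency(32, 4))
}
```

The cap could, for example, be derived from `runtime.NumCPU() / 2` to approximate the "half the cores" variant, though whether either approach would relieve the lock contention as much is exactly the open question.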