But on hardware that the hash function's implementation is optimized for, you can set the cost factors higher than on comparable hardware it is not optimized for. So a different hash whose implementation is better optimized for ARM could give more protection than Argon2 on ARM, because you could use higher cost factors within the same wall-clock time. But I don't think such a hash function exists, and if it doesn't, you might as well write a more ARM-optimized implementation of Argon2.
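
To make the wall-clock argument concrete, here is a rough sketch of how a cost factor is usually calibrated against a time budget. It uses PBKDF2 from Python's standard library purely as a stand-in, since Argon2 is not in the standard library and also has memory and parallelism parameters that this ignores. The function names `time_hash` and `calibrate` and the sample password and salt are made up for illustration.

```python
import hashlib
import time


def time_hash(iterations: int) -> float:
    """Return the seconds one PBKDF2-HMAC-SHA256 call takes at this cost.

    PBKDF2 is only a stand-in here; the same idea applies to Argon2's
    time cost, but this sketch does not model memory or parallelism.
    """
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark password", b"benchmark salt", iterations)
    return time.perf_counter() - start


def calibrate(target_seconds: float, measure=time_hash, start: int = 1000,
              limit: int = 2**30) -> int:
    """Double the cost until one hash takes at least target_seconds.

    `measure` maps a cost factor to elapsed seconds. It is a parameter so
    that different implementations (or fake timings) can be compared.
    """
    cost = start
    while cost < limit and measure(cost) < target_seconds:
        cost *= 2
    return cost


if __name__ == "__main__":
    # The cost factor this machine can afford for a 100 ms budget.
    print(calibrate(0.1))
```

Running this on two machines, or with two implementations of the same hash, gives different cost factors for the same time budget. The faster implementation ends up with the higher cost factor, and that is where the extra protection would come from.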