
Author here!

I wanted to share a follow-up to this post. https://bluuewhale.github.io/posts/further-optimizing-my-jav...

This time I went back with a profiler and optimized the actual hot path.

A huge chunk of time was going to Objects.equals() because of profile pollution / missed devirtualization.
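For readers unfamiliar with the problem: Objects.equals(a, b) is one shared method, so the type profile for the a.equals(b) call inside it is fed by every caller in the JVM, not just the map's lookup path. A rough sketch of one common workaround is below, assuming the comparison is moved into a helper that only the map calls; the names KeyEq and sameKey are made up for illustration, and this is not necessarily the exact change described in the post.

```java
// Hypothetical helper, used only by the map's probe loop (illustrative names).
final class KeyEq {
    private KeyEq() {}

    // Same result as Objects.equals(probe, stored), but the virtual
    // equals() call site lives in this method, so its type profile is
    // collected only from keys that actually pass through the map.
    static boolean sameKey(Object probe, Object stored) {
        return probe == stored || (probe != null && probe.equals(stored));
    }
}
```

Whether this actually lets the JIT devirtualize and inline equals() depends on the workload and the JVM, so it is worth confirming with a profiler rather than assuming.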

After fixing that, the next bottleneck was ARM/NEON “movemask” pain (VectorMask.toLong()), so I tried SWAR (SIMD within a register) instead… and it ended up faster (even on x86, which I did not expect).
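To make the SWAR idea concrete, here is a minimal sketch of matching one control byte against a group of 8 packed into a long, with no vector API involved. It assumes lane 0 is the least-significant byte; the class and method names are illustrative and not taken from the post or from Carbon.

```java
// Illustrative SWAR byte matching over an 8-byte control group.
final class SwarMatch {
    private static final long LSB  = 0x0101010101010101L; // 0x01 in every lane
    private static final long LOW7 = 0x7F7F7F7F7F7F7F7FL; // low 7 bits of every lane

    private SwarMatch() {}

    // Returns a word with bit 7 set in each lane whose byte equals b,
    // and all other bits clear. This variant is exact per lane.
    static long matchByte(long group, byte b) {
        long x = group ^ ((b & 0xFFL) * LSB); // matching lanes become 0x00
        long t = (x & LOW7) + LOW7;           // bit 7 set if low 7 bits non-zero; no cross-lane carry
        return ~(t | x | LOW7);               // bit 7 set only where the lane was 0x00
    }

    // Index (0..7) of the lowest matching lane, or -1 if there is none.
    static int firstMatch(long mask) {
        return mask == 0 ? -1 : Long.numberOfTrailingZeros(mask) >>> 3;
    }
}
```

For example, with group = 0x1122334455227788L the byte 0x22 sits in lanes 2 and 6, so matchByte returns 0x0080000000800000L and firstMatch returns 2. The shorter (x - LSB) & ~x & 0x80… form also works for probing, but it can flag extra lanes above a real match, which costs a wasted key comparison.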



FYI, we ended up implementing a _really_ nice SWAR version in the Carbon derivative of SwissTable that might be worth looking at for inspiration: https://github.com/carbon-language/carbon-lang/blob/trunk/co...

You can look at the rest of that file and the adjacent `raw_hashtable.h` for the remainder of the SwissTable-like implementation, and at `hashing.h` for the hash function.

FWIW, it consistently outperforms SwissTable in some respects, but it relies on a weaker, faster hash function that is good enough for the hash table and not suitable for other use cases.



