Excellent library. The only thing that's a bit odd is that the instruction set requirements seem arbitrary. For example, aes-gcm requires sse3, blake hashing has full support for various instruction sets and chooses the best one, while sha256 doesn't have optimized code for any instruction set. I'm genuinely interested in the reasoning.
You shouldn't look at libsodium as something that provides implementations of AES, SHA256, or any other specific primitive.
It was designed to provide high-level APIs for common operations, abstracting away implementation details, and that is what it gets optimized for. The Hydrogen API, which libsodium 2.0 will be based on, makes this even more obvious.
Low-level primitives are available (or partially available, in minimal builds) only to provide backward compatibility with NaCl, or because the final construction hasn't been chosen yet.
That said, SHA-2 is about 20% faster in version 1.0.12 while remaining portable.
I understand the design of libsodium for sure, but because libsodium is well written, I look at it for reference implementations of specific algorithms. In this particular case I needed a zero-dependency sha256 implementation, so I turned to libsodium (and was a bit disappointed by the lack of an sse3 optimization). Would you accept a pull request that folds in Intel's reference implementation with sse3 for sha256?
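For readers who land here with the same need, a zero-dependency SHA-256 can be sketched in portable C roughly as follows. This is not libsodium's code and not Intel's reference implementation; the names `sketch_sha256`, `sketch_sha256_block`, `K256` and `ROTR32` are made up for illustration, the function is one-shot only (no streaming API), and it has not been tuned or reviewed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROTR32(x, n) (((x) >> (n)) | ((x) << (32 - (n))))

/* Round constants from FIPS 180-4, section 4.2.2. */
static const uint32_t K256[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
};

/* Process one 64-byte block, updating the 8-word state in place. */
static void sketch_sha256_block(uint32_t st[8], const uint8_t blk[64])
{
    uint32_t w[64], v[8], t1, t2;
    int i;

    /* Load 16 big-endian words, then expand the schedule to 64 words. */
    for (i = 0; i < 16; i++)
        w[i] = ((uint32_t) blk[4 * i] << 24) | ((uint32_t) blk[4 * i + 1] << 16) |
               ((uint32_t) blk[4 * i + 2] << 8) | (uint32_t) blk[4 * i + 3];
    for (i = 16; i < 64; i++)
        w[i] = w[i - 16] + w[i - 7] +
               (ROTR32(w[i - 15], 7) ^ ROTR32(w[i - 15], 18) ^ (w[i - 15] >> 3)) +
               (ROTR32(w[i - 2], 17) ^ ROTR32(w[i - 2], 19) ^ (w[i - 2] >> 10));

    memcpy(v, st, sizeof v); /* v[0..7] are the working variables a..h */
    for (i = 0; i < 64; i++) {
        t1 = v[7] + (ROTR32(v[4], 6) ^ ROTR32(v[4], 11) ^ ROTR32(v[4], 25)) +
             ((v[4] & v[5]) ^ (~v[4] & v[6])) + K256[i] + w[i];
        t2 = (ROTR32(v[0], 2) ^ ROTR32(v[0], 13) ^ ROTR32(v[0], 22)) +
             ((v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]));
        memmove(v + 1, v, 7 * sizeof v[0]); /* h=g, g=f, ..., b=a */
        v[4] += t1;                         /* e = d + t1 */
        v[0] = t1 + t2;                     /* a = t1 + t2 */
    }
    for (i = 0; i < 8; i++)
        st[i] += v[i];
}

/* One-shot SHA-256 of in[0..len), writing the 32-byte digest to out. */
void sketch_sha256(uint8_t out[32], const uint8_t *in, size_t len)
{
    uint32_t st[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                       0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 };
    uint8_t  tail[128] = { 0 };
    uint64_t bits = (uint64_t) len * 8;
    size_t   i, rem = len % 64, tail_len;

    assert(out != NULL && (in != NULL || len == 0));

    for (i = 0; i + 64 <= len; i += 64)
        sketch_sha256_block(st, in + i);

    /* Padding: leftover bytes, 0x80, zeros, 64-bit big-endian bit length. */
    if (rem > 0)
        memcpy(tail, in + i, rem);
    tail[rem] = 0x80;
    tail_len = (rem < 56) ? 64 : 128;
    for (i = 0; i < 8; i++)
        tail[tail_len - 1 - i] = (uint8_t) (bits >> (8 * i));
    sketch_sha256_block(st, tail);
    if (tail_len == 128)
        sketch_sha256_block(st, tail + 64);

    for (i = 0; i < 8; i++) {
        out[4 * i]     = (uint8_t) (st[i] >> 24);
        out[4 * i + 1] = (uint8_t) (st[i] >> 16);
        out[4 * i + 2] = (uint8_t) (st[i] >> 8);
        out[4 * i + 3] = (uint8_t) st[i];
    }
}
```

Calling `sketch_sha256(out, msg, msg_len)` fills `out` with the digest. It is worth checking any such sketch against the published FIPS 180-4 test vectors before relying on it.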
Fun fact: sha256 in openssl pulls in asn1 parsing even after section garbage collection. Yup.
Some embedded systems don't have sse3. But I guess if you're doing crypto on such hardware and using TLS, then you'll want to use chachapoly and not aes-gcm. My question really focuses on how bizarre it is that sha256 doesn't have a specialization for sse3.
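To make "chooses the best one" concrete, here is a rough sketch of runtime dispatch with a portable fallback. It is not how libsodium does it. It assumes GCC or Clang on x86 for the `__builtin_cpu_supports` check, uses a trivial byte sum as a stand-in for a real primitive, and every name (`sum_fn`, `sum_portable`, `sum_ssse3_placeholder`, `pick_sum_impl`) is hypothetical. The "ssse3" feature string is just an example level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature shared by every implementation of one primitive. */
typedef uint32_t (*sum_fn)(const uint8_t *in, size_t len);

/* Portable version: always available, works on any target. */
static uint32_t sum_portable(const uint8_t *in, size_t len)
{
    uint32_t s = 0;
    size_t   i;

    assert(in != NULL || len == 0);
    for (i = 0; i < len; i++)
        s += in[i];
    return s;
}

/* Placeholder for a vectorized version. A real build would compile the
 * SIMD code in its own translation unit with the matching compiler flags;
 * here it simply forwards to the portable code. */
static uint32_t sum_ssse3_placeholder(const uint8_t *in, size_t len)
{
    return sum_portable(in, len);
}

/* Pick the best implementation the current CPU supports. */
static sum_fn pick_sum_impl(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("ssse3"))
        return sum_ssse3_placeholder;
#endif
    /* Non-x86 targets and older CPUs end up here. */
    (void) sum_ssse3_placeholder;
    return sum_portable;
}
```

The point of the pattern is that hardware without the extension (the embedded case above) still gets a working, if slower, code path, and callers only ever see the one function pointer returned by `pick_sum_impl()`.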
For something new, you may want to look at its little brother instead: https://github.com/jedisct1/libhydrogen/wiki
Anyway, version 1.0.12 will be released soon.