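CPU byte order definitely matters to device drivers reading and writing across the I/O bus: they must perform wide, aligned reads and writes using single CPU loads and stores, so the value crosses the bus in whatever byte order the CPU uses.
Rob's approach simply won't work there, because assembling a value one byte at a time turns a single wide access into several narrow ones. Similarly, OS-bypass networking and video, which expose hardware device interfaces in user space, require CPU-endian-aware libraries.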
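To make the driver case concrete, here is a minimal sketch in C. It assumes a hypothetical device whose 32-bit registers are defined as little-endian; the names reg_read32, reg_write32, le32_to_cpu and cpu_to_le32 are illustrative helpers written for this example, not any particular kernel's API. The point is that each access is exactly one aligned 32-bit load or store, and the only byte-order work is a swap that depends on the CPU.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}

/* Runtime probe of the host byte order (real code would normally
   decide this at compile time). */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Assumption: the device's registers are little-endian. */
static uint32_t le32_to_cpu(uint32_t v)
{
    return host_is_big_endian() ? bswap32(v) : v;
}

static uint32_t cpu_to_le32(uint32_t v)
{
    return host_is_big_endian() ? bswap32(v) : v;
}

/* One aligned 32-bit load from the register, then fix up the order. */
static uint32_t reg_read32(const volatile uint32_t *reg)
{
    return le32_to_cpu(*reg);
}

/* Fix up the order, then one aligned 32-bit store to the register. */
static void reg_write32(volatile uint32_t *reg, uint32_t v)
{
    *reg = cpu_to_le32(v);
}
```

A real driver would also need whatever memory barriers and mapping calls its platform requires, and would normally use the operating system's own accessors (Linux's readl() and writel(), for example) rather than raw pointers; this sketch leaves all of that out.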
That said, use Rob's portable approach whenever you don't have a compelling reason not to, if only to avoid worrying about alignment and portability. Doing otherwise is premature optimization and a maintenance headache.
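For reference, a sketch of what the portable approach looks like, assuming it means assembling the value from individual bytes with shifts and ORs. The function names load_le32 and load_be32 are made up for this example. Nothing here depends on the host's byte order, and the pointer can have any alignment.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from a byte stream. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode a 32-bit big-endian value from a byte stream. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

The same source compiles unchanged on little-endian and big-endian machines, which is the whole appeal: the only byte order the code knows about is that of the data.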