libbb/sha1: use SSE2 in unrolled x86-64 code. ~10% faster

function                                             old     new   delta
.rodata                                           108241  108305     +64
sha1_process_block64                                3502    3495      -7
------------------------------------------------------------------------------
(add/remove: 5/0 grow/shrink: 1/1 up/down: 64/-7)              Total: 57 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2 files changed