commit | 205042c07a3bf6c8e685c434713f2a9e46630cd0 | |
---|---|---|
author | Denys Vlasenko <vda.linux@googlemail.com> | Tue Jan 25 17:00:57 2022 +0100 |
committer | Denys Vlasenko <vda.linux@googlemail.com> | Tue Jan 25 17:21:45 2022 +0100 |
tree | e28ec9b7f1b2e922258029befea1432212334323 | |
parent | 99e22d230ded676ab53dfa8ab276c1301c2955a0 | |

libbb/sha1: in unrolled x86-64 code, pass initial W[] in registers, not on stack

This can be faster on some CPUs. On Skylake, evidently load latency from L1
(or store-to-load forwarding in LSU) is fast enough to completely hide
memory reference latencies here.

function                                             old     new   delta
sha1_process_block64                                3495    3514     +19

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>