x86 extensions are useless

Author:Wojciech Muła
Added on:2013-12-12

Intel announced new extension to SSE: instructions accelerating calculating hashes SHA-1 and SHA256. As everything else added recently to the x86 ISA, these new instructions address special cases of "something". The number of instructions, encoding modes, etc. is increasing, but do not help in general.

Let see what sha1msg1 xmm1, xmm2 does (type of arguments is packed dword):

result[0] := xmm1[0] xor xmm1[2]
result[1] := xmm1[1] xor xmm1[3]
result[2] := xmm1[2] xor xmm2[0]
result[3] := xmm1[3] xor xmm2[1]

Such generic instruction would be saved as generic_op xmm1, xmm2, imm_1, imm_2, imm_3 and execute following algorithm:

for i := 0 to 3 do
        arg1_indice := imm_1[2*i:2*i + 1]
        arg2_indice := imm_2[2*i:2*i + 1]

        if imm_3[2*i] = 1 then
                arg1 := xmm1
        else
                arg1 := xmm2
        end if

        if imm_3[2*i + 1] = 1 then
                arg2 := xmm2
        else
                arg2 := xmm1
        end if

        result[i] := arg1[arg1_indice] op arg2[arg2_indice]
end for

Then sha1msg1 is just a special case:

generic_xor xmm1, xmm2, 0b11100100, 0b01001110, 0b01010000

Maybe this example is "too generic", too complex, and would be hard to express in hardware. I just wanted to show that we will get shine new instructions useful in few cases. Compilers can vectorize loops and make use of SSE, but SHA is used in drivers, OS and is encapsulated in libraries — sha1msg1 and friends will never appear in ordinary programs.