The rounds value is constant and can be one of three hardcoded values, so
instead of checking it on every loop, just split the function into three
different implementations for each value.
Before:
aes_decrypt_128_aesni: 93.8 (47.58x)
aes_decrypt_192_aesni: 106.9 (49.30x)
aes_decrypt_256_aesni: 109.8 (56.50x)
aes_encrypt_128_aesni: 93.2 (47.70x)
aes_encrypt_192_aesni: 111.1 (48.36x)
aes_encrypt_256_aesni: 113.6 (56.27x)
After:
aes_decrypt_128_aesni: 71.5 (63.31x)
aes_decrypt_192_aesni: 96.8 (55.64x)
aes_decrypt_256_aesni: 106.1 (58.51x)
aes_encrypt_128_aesni: 81.3 (55.92x)
aes_encrypt_192_aesni: 91.2 (59.78x)
aes_encrypt_256_aesni: 109.0 (58.26x)
Signed-off-by: James Almer <jamrial@gmail.com>