git.sesse.net Git - stockfish/commitdiff
AVX-512 for smaller affine and feature transforms.
author Tomasz Sobczyk <tomasz.sobczyk1997@gmail.com>
Tue, 3 Nov 2020 21:49:10 +0000 (22:49 +0100)
committer Joost VandeVondele <Joost.VandeVondele@gmail.com>
Sat, 7 Nov 2020 15:49:49 +0000 (16:49 +0100)
For the feature transformer, the code is analogous to the AVX2 path, since it adapts easily to the wider SIMD registers.

For the smaller affine transforms, which have a 32-byte stride, we keep 2 columns in one zmm register. We also unroll more aggressively, so that in the end we have to do 16 parallel horizontal additions on ymm slices, each consisting of 4 32-bit integers. The slices are embedded in 8 zmm registers.
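
The packing described above can be sketched with a portable scalar model. This is not the code from the commit. All names (`affine16`, `PackedWeights`, the `k...` constants) are hypothetical, and the unsigned 8-bit inputs and signed 8-bit weights are an assumption. Each 64-byte chunk stands in for one zmm register and holds two 32-byte weight columns, so 8 chunks yield 16 outputs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model only; hypothetical names, no intrinsics.
constexpr std::size_t kStride        = 32;  // bytes per weight column (from the text)
constexpr std::size_t kZmmBytes      = 64;  // width of one 512-bit register
constexpr std::size_t kColumnsPerZmm = kZmmBytes / kStride;       // 2 columns per register
constexpr std::size_t kNumZmm        = 8;   // registers used after unrolling
constexpr std::size_t kNumOutputs    = kNumZmm * kColumnsPerZmm;  // 16 outputs

// Assumption: inputs are unsigned 8-bit, weights are signed 8-bit.
using Input         = std::array<std::uint8_t, kStride>;
using PackedWeights = std::array<std::int8_t, kNumZmm * kZmmBytes>;
using Output        = std::array<std::int32_t, kNumOutputs>;

Output affine16(const Input& input, const PackedWeights& weights, const Output& biases) {
    Output out = biases;
    for (std::size_t reg = 0; reg < kNumZmm; ++reg) {
        // Each 32-byte half of the modelled register is one weight column.
        for (std::size_t half = 0; half < kColumnsPerZmm; ++half) {
            const std::size_t column = reg * kColumnsPerZmm + half;
            const std::int8_t* w = weights.data() + reg * kZmmBytes + half * kStride;
            // The horizontal sum over one half produces one output value.
            for (std::size_t i = 0; i < kStride; ++i)
                out[column] += static_cast<std::int32_t>(input[i]) * w[i];
        }
    }
    return out;
}
```

In a vectorized version, the inner loop would be replaced by multiply-accumulate instructions followed by the horizontal additions, with the 16 sums reduced in parallel rather than one at a time.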

These changes provide about 1.5% speedup for AVX-512 builds.

Closes https://github.com/official-stockfish/Stockfish/pull/3218

No functional change.


No differences found