Reduce SIMD register count from 32 to 16
author mstembera <MissingEmail@email>
Fri, 22 Sep 2023 02:26:11 +0000 (19:26 -0700)
committer Joost VandeVondele <Joost.VandeVondele@gmail.com>
Fri, 22 Sep 2023 17:15:34 +0000 (19:15 +0200)
commit 95fe2b9a9d33811a7fcad1cdfea79c54e8fdb074
tree 5349ef845e53aebb68c50ad567a6750df215dcbc
parent fce4cc1829f25fd52c5dd637ab54d867eec065fb
Reduce the SIMD register count from 32 to 16 for the avx512 and vnni512 archs.

Speedup of up to 17%, depending on the compiler. For example:

```
AMD pro 7840u (zen4 phoenix apu 4nm)
bash bench_parallel.sh ./stockfish_avx512_gcc13 ./stockfish_avx512_pr_gcc13 20 10
sf_base =  1077737 +/-   8446 (95%)
sf_test =  1264268 +/-   8543 (95%)
diff    =   186531 +/-   4280 (95%)
speedup =  17.308% +/- 0.397% (95%)
```
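
The figures above can be cross-checked by hand. The sketch below assumes the script's `diff` is the test mean minus the base mean and that `speedup` is that difference relative to the base; the function names are hypothetical and are not part of `bench_parallel.sh`.

```cpp
#include <cassert>
#include <cmath>

// Difference in mean nodes per second between the test and base builds.
constexpr double nps_diff(double base_nps, double test_nps) {
    return test_nps - base_nps;
}

// Relative speedup of the test build over the base build, in percent.
constexpr double speedup_percent(double base_nps, double test_nps) {
    return nps_diff(base_nps, test_nps) / base_nps * 100.0;
}

// Re-derives the reported diff and speedup from sf_base and sf_test.
inline void check_reported_figures() {
    const double base = 1077737.0;  // sf_base
    const double test = 1264268.0;  // sf_test
    assert(nps_diff(base, test) == 186531.0);
    // Reported speedup is 17.308%; allow for rounding in the report.
    assert(std::fabs(speedup_percent(base, test) - 17.308) < 0.01);
}
```

Under these assumptions the recomputed values agree with the reported `diff` and `speedup` lines to within rounding.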

Before this patch, gcc appears to spill registers.
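
As a rough illustration of why the register count matters: AVX-512 exposes 32 vector registers of 512 bits each, so a loop that keeps accumulators in all 32 leaves none free for temporaries. The sketch below is not taken from `nnue_feature_transformer.h`. The constant and function names are hypothetical, and it assumes 16-bit accumulator lanes and a made-up accumulator width of 2048 lanes, only to show how the amount of data held in registers per pass scales with the register count.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constant for illustration: bytes in one 512-bit register.
constexpr std::size_t kVectorBytes = 64;

// Number of int16 lanes held in registers per pass (one "tile")
// when num_regs vector registers are used as accumulators.
constexpr std::size_t tile_lanes(std::size_t num_regs) {
    return num_regs * kVectorBytes / sizeof(std::int16_t);
}

// Number of passes needed to cover total_lanes accumulator lanes.
// Assumes total_lanes is a multiple of the tile size.
constexpr std::size_t num_tiles(std::size_t total_lanes, std::size_t num_regs) {
    return total_lanes / tile_lanes(num_regs);
}

static_assert(tile_lanes(32) == 1024, "32 registers hold 1024 int16 lanes");
static_assert(tile_lanes(16) == 512, "16 registers hold 512 int16 lanes");
static_assert(num_tiles(2048, 32) == 2, "2048 lanes in 2 passes");
static_assert(num_tiles(2048, 16) == 4, "2048 lanes in 4 passes");
```

With 16 accumulator registers each pass covers half as many lanes, so twice as many passes are needed, but the other 16 registers remain available to the compiler for loads and intermediate values.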

closes https://github.com/official-stockfish/Stockfish/pull/4796

No functional change
src/nnue/nnue_feature_transformer.h