]> git.sesse.net Git - stockfish/commitdiff
Reduce SIMD register count from 32 to 16
authormstembera <MissingEmail@email>
Fri, 22 Sep 2023 02:26:11 +0000 (19:26 -0700)
committerJoost VandeVondele <Joost.VandeVondele@gmail.com>
Fri, 22 Sep 2023 17:15:34 +0000 (19:15 +0200)
in the case of avx512 and vnni512 archs.

Up to 17% speedup, depending on the compiler, e.g.

```
AMD pro 7840u (zen4 phoenix apu 4nm)
bash bench_parallel.sh ./stockfish_avx512_gcc13 ./stockfish_avx512_pr_gcc13 20 10
sf_base =  1077737 +/-   8446 (95%)
sf_test =  1264268 +/-   8543 (95%)
diff    =   186531 +/-   4280 (95%)
speedup =  17.308% +/- 0.397% (95%)
```

Prior to this patch, it appears gcc spills registers.

closes https://github.com/official-stockfish/Stockfish/pull/4796

No functional change

src/nnue/nnue_feature_transformer.h

index 56442bac9b17818ba74610c1213d21e853c512d8..902918b2c6ff433656a110c9714481a7b72eac77 100644 (file)
@@ -69,7 +69,7 @@ namespace Stockfish::Eval::NNUE {
   #define vec_add_psqt_32(a,b) _mm256_add_epi32(a,b)
   #define vec_sub_psqt_32(a,b) _mm256_sub_epi32(a,b)
   #define vec_zero_psqt() _mm256_setzero_si256()
-  #define NumRegistersSIMD 32
+  #define NumRegistersSIMD 16
   #define MaxChunkSize 64
 
   #elif USE_AVX2