Use popcount intrinsic with Intel compiler
author erbsenzaehler <erbsenzaehler@users.noreply.github.com>
Sun, 1 May 2016 08:57:50 +0000 (10:57 +0200)
committer Marco Costalba <mcostalba@gmail.com>
Sun, 1 May 2016 12:18:16 +0000 (14:18 +0200)
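It seems that icc used our fallback version of popcount, because the
intrinsic path required both _MSC_VER and __INTEL_COMPILER to be defined.
Now use the intrinsic when either one is defined.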

icc version 16.0.2 (gcc version 5.3.0 compatibility)
bmi2 compile
uname -r 4.5.1-1-ARCH

Running bench 20 times gives a nice speedup:
./stockfish-icc-master 2161515 +- 34462
./stockfish-icc-sse42 2260857 +- 50349
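
To put a number on the two results above: taking the reported means at face value and ignoring the +- spread, the gain works out to roughly 4.6%. A minimal sketch of that arithmetic follows; the helper name `speedup_percent` is hypothetical and not part of the code base.

```cpp
// Hypothetical helper: relative gain of `after` over `before`, in percent.
// It only compares the two means; the +- spread is not taken into account.
inline double speedup_percent(double before, double after) {
    return (after / before - 1.0) * 100.0;
}

// Means copied from the bench lines above.
const double master_mean = 2161515.0; // ./stockfish-icc-master
const double sse42_mean  = 2260857.0; // ./stockfish-icc-sse42
```

Calling `speedup_percent(master_mean, sse42_mean)` gives a value just under 4.6.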

src/bitboard.h
src/types.h

diff --git a/src/bitboard.h b/src/bitboard.h
index a704edb..c4fc26e 100644
--- a/src/bitboard.h
+++ b/src/bitboard.h
@@ -268,7 +268,7 @@ inline int popcount(Bitboard b) {
   union { Bitboard bb; uint16_t u[4]; } v = { b };
   return PopCnt16[v.u[0]] + PopCnt16[v.u[1]] + PopCnt16[v.u[2]] + PopCnt16[v.u[3]];
 
-#elif defined(_MSC_VER) && defined(__INTEL_COMPILER)
+#elif defined(_MSC_VER) || defined(__INTEL_COMPILER)
 
   return _mm_popcnt_u64(b);
 
diff --git a/src/types.h b/src/types.h
index 6f62c77..e45d267 100644
--- a/src/types.h
+++ b/src/types.h
@@ -64,7 +64,7 @@
 #  define IS_64BIT
 #endif
 
-#if defined(USE_POPCNT) && defined(__INTEL_COMPILER) && defined(_MSC_VER)
+#if defined(USE_POPCNT) && (defined(__INTEL_COMPILER) || defined(_MSC_VER))
 #  include <nmmintrin.h> // Intel header for _mm_popcnt_u64() intrinsic
 #endif
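
The compile-time selection touched by the two hunks above can be sketched as a standalone snippet. This is an illustration under stated assumptions, not the project's actual code. The outer `#if !defined(USE_POPCNT)` guard and the final `__builtin_popcountll` branch are not visible in the diff and are assumed here. The loop-based `popcount_fallback` is a swapped-in stand-in for the `PopCnt16` table lookup. `USE_POPCNT` is left undefined so the sketch builds on any compiler.

```cpp
#include <cstdint>

// USE_POPCNT is normally set by the build system; it is left undefined here,
// so the portable branch is the one that gets compiled by default.
#if defined(USE_POPCNT) && (defined(__INTEL_COMPILER) || defined(_MSC_VER))
#  include <nmmintrin.h> // Intel header for _mm_popcnt_u64() intrinsic
#endif

using Bitboard = std::uint64_t;

// Stand-in for the table-based fallback: clears the lowest set bit
// on each iteration and counts the iterations.
inline int popcount_fallback(Bitboard b) {
    int n = 0;
    for (; b; b &= b - 1)
        ++n;
    return n;
}

inline int popcount(Bitboard b) {
#if !defined(USE_POPCNT)            // assumed outer guard, not shown in the diff
    return popcount_fallback(b);
#elif defined(_MSC_VER) || defined(__INTEL_COMPILER)
    // With "||", icc on Linux (no _MSC_VER) now reaches the intrinsic.
    return static_cast<int>(_mm_popcnt_u64(b));
#else                               // assumed: gcc or a compatible compiler
    return __builtin_popcountll(b);
#endif
}
```

With the previous `&&` condition, a compiler that defines `__INTEL_COMPILER` but not `_MSC_VER` would skip both the `<nmmintrin.h>` include and the intrinsic branch. That matches the behaviour the commit message describes.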