In particular on ARM processors. Original patch by
Jean-Francois, sligtly modified by me to preserve
the meaning of NO_PREFETCH flag.
Verified with gcc, clang and icc that prefetch instruction
is correctly created.
No functional change.
+# if defined(__INTEL_COMPILER) || defined(__ICL) || defined(_MSC_VER)
_mm_prefetch(addr, _MM_HINT_T2);
_mm_prefetch(addr+64, _MM_HINT_T2); // 64 bytes ahead
_mm_prefetch(addr, _MM_HINT_T2);
_mm_prefetch(addr+64, _MM_HINT_T2); // 64 bytes ahead
+# else
+ __builtin_prefetch(addr);
+ __builtin_prefetch(addr+64);
+# endif