Marco Costalba [Mon, 2 Mar 2015 07:11:39 +0000 (08:11 +0100)]
Disable spinlocks
Now that the c++11 branch has been merged into master,
unconditionally disable the spinlocks and use a mutex
instead. This allows running fishtest even on HT
machines without changes.
In the future we will reintroduce spinlocks, once
we have taken care of fishtest.
Marco Costalba [Sun, 1 Mar 2015 16:12:09 +0000 (17:12 +0100)]
Allow to disable spinlocks
And use a mutex instead. You would normally never want to do this.
It is a workaround to run c++11 on fishtest, where many
machines have HT enabled, and this can be a problem when the
number of cores set is higher than the number of physical cores.
To disable spinlocks, just compile with the -DNO_SPINLOCK flag.
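A minimal sketch of how such a compile-time switch could look. Only the
NO_SPINLOCK macro comes from the message above; the class layout, the
UsesMutex constant and the guarded_increment() helper are assumptions made
for illustration, and the busy-wait branch uses a plain test-and-set loop
for brevity.

```cpp
#include <atomic>
#include <mutex>

#if !defined(NO_SPINLOCK)

// Default build: busy-wait lock (simple test-and-set variant).
class Spinlock {
  std::atomic_flag flag = ATOMIC_FLAG_INIT;

public:
  void acquire() { while (flag.test_and_set(std::memory_order_acquire)) {} }
  void release() { flag.clear(std::memory_order_release); }
};
constexpr bool UsesMutex = false; // hypothetical constant, for inspection only

#else

// Built with -DNO_SPINLOCK: same interface, backed by a std::mutex.
class Spinlock {
  std::mutex mutex;

public:
  void acquire() { mutex.lock(); }
  void release() { mutex.unlock(); }
};
constexpr bool UsesMutex = true;

#endif

// Hypothetical helper: callers use the same code with either variant.
int guarded_increment(Spinlock& lock, int value) {
  lock.acquire();
  ++value;
  lock.release();
  return value;
}
```

Because both variants expose the same acquire()/release() interface, the
calling code does not change when the flag is toggled.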
Marco Costalba [Mon, 23 Feb 2015 18:22:37 +0000 (19:22 +0100)]
Improve spinlock implementation
Calling lock.test_and_set() in a tight loop creates expensive
memory synchronizations among processors and penalizes other
running threads. So synchronize only once at the beginning
with fetch_sub() and then loop on a simple load() that puts much
less pressure on the system.
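The idea can be sketched roughly as follows. This is an illustration and not
necessarily the committed code: the counter convention (1 means free, any
value <= 0 means held), the yield() call inside the wait loop and the
count_with_spinlock() helper are assumptions.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Illustrative spinlock: 1 means free, any value <= 0 means held.
class Spinlock {
  std::atomic<int> lock{1};

public:
  void acquire() {
    // One read-modify-write attempt. If it fails, wait on plain loads
    // until the lock looks free, then attempt again.
    while (lock.fetch_sub(1, std::memory_order_acquire) != 1)
      while (lock.load(std::memory_order_relaxed) <= 0)
        std::this_thread::yield(); // assumption: give up the CPU while waiting
  }

  void release() { lock.store(1, std::memory_order_release); }
};

// Hypothetical helper: several threads bump a shared counter under the lock.
long count_with_spinlock(int threads, int iterations) {
  Spinlock sl;
  long counter = 0;
  std::vector<std::thread> pool;

  for (int t = 0; t < threads; ++t)
    pool.emplace_back([&] {
      for (int i = 0; i < iterations; ++i) {
        sl.acquire();
        ++counter;
        sl.release();
      }
    });

  for (auto& th : pool)
    th.join();

  return counter;
}
```

The inner loop only reads the variable, so waiting threads do not keep
issuing read-modify-write operations while the lock is held.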
Marco Costalba [Sun, 22 Feb 2015 13:59:55 +0000 (14:59 +0100)]
Use spinlock instead of mutex for Threads and SplitPoint
It is reported to be definitely faster with an increasing
number of threads: we go from +3.5% with 4 threads
to +15% with 16 threads.
The only drawback is that now, when testing with more
threads than physically available cores, the speed slows
down to a crawl. This is expected and is similar to what
we had when setting the old sleepingThreads to false.
Marco Costalba [Thu, 19 Feb 2015 09:08:29 +0000 (10:08 +0100)]
Add a couple of asserts to late join
Document and clarify that we cannot rejoin on ourselves
and that we never late join if we are master and all
slaves have finished; indeed, in this case we exit idle_loop.
Marco Costalba [Thu, 19 Feb 2015 08:51:17 +0000 (09:51 +0100)]
Remove useless condition in late join
In case of Threads.size() == 2 we have that sp->allSlavesSearching
is always false (because we have finished our search), bestSp is
always NULL and we never late join, so there is no need to special
case here.
Tested with dbg_hit_on(sp && sp->allSlavesSearching) and
verified it never fires.
Marco Costalba [Tue, 17 Feb 2015 09:10:58 +0000 (10:10 +0100)]
Compute SplitPoint::spLevel on the fly
And retire a redundant field. This is also important
from a conceptual point of view because we want to keep
SMP structures as simple as possible, with only the
strictly necessary data.
Verified with
dbg_hit_on(sp->spLevel != level)
that the values are 100% the same out of more than 50K samples.
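As a rough illustration of computing the level on the fly, the value can be
derived by walking a chain of parent pointers instead of storing it. The
struct below is a stripped-down stand-in; the field name parentSplitPoint,
the sp_level() helper and the convention that a root split point has level 1
are assumptions.

```cpp
// Stand-in for the real structure: only the parent link matters here.
struct SplitPoint {
  const SplitPoint* parentSplitPoint; // nullptr for a root split point
};

// Count the split points on the chain from sp up to the root.
int sp_level(const SplitPoint* sp) {
  int level = 0;
  for (const SplitPoint* p = sp; p; p = p->parentSplitPoint)
    ++level;
  return level;
}
```

The walk is linear in the nesting depth, so dropping the stored field trades
a small amount of computation for one less piece of state to keep consistent.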
Joona Kiiski [Sat, 14 Feb 2015 20:46:00 +0000 (20:46 +0000)]
Improve smp performance for high number of threads
Balance threads between split points.
There are huge differences between different machines and autopurging makes it very difficult to measure the improvement in fishtest, but the following was recorded for 16 threads at 15+0.05:
For Bravone (1000 games): 0 ELO
For Glinscott (1000 games): +20 ELO
For bKingUs (1000 games): +50 ELO
For fastGM (1500 games): +50 ELO
The change was a regression for no one and a big improvement for some, so it should be fine to commit it.
Also for 8 threads at 15+0.05 we measured a statistically significant improvement:
ELO: 6.19 +-3.9 (95%) LOS: 99.9%
Total: 10325 W: 1824 L: 1640 D: 6861
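The ELO figure can be reproduced from the W/L/D counts with the standard
logistic Elo model. That this is the model used to produce the number above
is an assumption, and the function names are illustrative; the error bar and
LOS are not computed here.

```cpp
#include <cmath>

// Fraction of points scored: a win counts 1, a draw counts 1/2.
double score_fraction(int wins, int losses, int draws) {
  return (wins + 0.5 * draws) / (wins + losses + draws);
}

// Logistic Elo model: rating difference implied by an expected score.
double elo_from_score(double score) {
  return -400.0 * std::log10(1.0 / score - 1.0);
}
```

With W: 1824, L: 1640, D: 6861 the score fraction is 5254.5 / 10325, about
0.5089, which maps to roughly +6.19 ELO.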
Finally it was verified that there was no (significant) regression for
mstembera [Tue, 3 Feb 2015 03:09:37 +0000 (11:09 +0800)]
Profile build options
I went through all the individual compile options that differ between
-fprofile-generate/-fprofile-use and -fprofile-arcs/-fbranch-probabilities
and distilled the speed difference down to only turning off
-fno-peel-loops and -fno-tracer. Using this we still get the full speedup
(maybe a bit more because other optimizations stay on) and it's also much cleaner
because we can get rid of the "@rm -f ucioption.gc*" hack for all versions of gcc.
Marco Costalba [Sat, 31 Jan 2015 17:39:51 +0000 (18:39 +0100)]
Implicit conversion from ExtMove to Move
Verified with perft that there is no speed regression,
and the code is simpler. It is also conceptually correct
because an extended move is just a move that happens
to also have a score.
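A minimal sketch of the idea, with simplified stand-in types: the Move enum,
the field names and the raw_move() helper are assumptions, not the engine's
actual definitions.

```cpp
// Simplified stand-ins for the engine types.
enum Move : int { MOVE_NONE = 0 };

struct ExtMove {
  Move move;
  int value; // score attached to the move

  // Implicit conversion: an ExtMove can be used wherever a Move is expected.
  operator Move() const { return move; }
};

// Hypothetical helper that only knows about plain moves.
int raw_move(Move m) { return static_cast<int>(m); }
```

With the conversion operator in place, an ExtMove can be passed straight to
raw_move() or assigned to a Move without spelling out the member access.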
Marco Costalba [Sun, 25 Jan 2015 18:22:43 +0000 (19:22 +0100)]
Simplify skill level and reduce ELO
This patch has two positive effects:
- Retire a hackish formula and leave
just a natural, simple and plain one.
- Reduce strength at very low levels, but
don't impact medium/high levels.
Actually even at level 0, SF is still too
strong for many beginners (this was reported
many times for instance on Droidfish user
comments on Google Play).
Tests on fishtest show that the ELO drop is around
170 ELO at level 0 (good!), 130 ELO at level 1
and smoothly reduces (as expected) until level
10, where the drop is just 8 ELO.
Joona Kiiski [Sun, 25 Jan 2015 22:03:57 +0000 (22:03 +0000)]
Stockfish 6 Release Candidate 3
- Fix a skill level problem: Don't allow move pruning at root node
- Revert "Fix profile build for gcc on Mac OSX". Results in a faster binary on x86-64.
- Fix a MSVC warning
Marco Costalba [Tue, 20 Jan 2015 21:17:22 +0000 (22:17 +0100)]
Don't use _pext_u64() directly
This intrinsic, which calls the BMI2 PEXT instruction, is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is to not use _pext_u64 directly.
This patch fixes a compile error with some mingw64
packages when compiling with x86-64.
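One way to avoid naming the intrinsic directly is to go through a wrapper
macro, sketched below. The wrapper name pext and the HasPext constant are
assumptions; USE_PEXT, _pext_u64 and immintrin.h come from the message above.

```cpp
#include <cstdint>

#if defined(USE_PEXT)
#  include <immintrin.h> // declares _pext_u64 (requires BMI2 support)
#  define pext(b, m) _pext_u64(b, m)
constexpr bool HasPext = true;
#else
// _pext_u64 itself is never redefined, so a header that drags in
// immintrin.h behind our back can no longer clash with this fallback.
#  define pext(b, m) (0)
constexpr bool HasPext = false;
#endif
```

Code then calls pext(occupied, mask) everywhere; whether that expands to the
real instruction or to the no-op 0 is decided solely by USE_PEXT.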
Marco Costalba [Tue, 20 Jan 2015 08:13:30 +0000 (09:13 +0100)]
Try hard to retrieve a ponder move
In case we stop the search during a fail-high,
it is possible that we return to the GUI without a ponder
move. This patch tries harder to find a ponder move
by retrieving it from the TT. This is important in games
played with 'ponder on'.
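A toy sketch of the fallback, under heavy simplification: the transposition
table is modelled as a hash map from position key to stored move, and all
names (ToyTT, RootMove, try_ponder_from_table) are hypothetical. A real
engine would also have to make the best move on the board to obtain the key
and check that the stored move is legal before using it.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using Move = int;          // hypothetical move encoding
using Key = std::uint64_t; // hypothetical position hash key
constexpr Move MOVE_NONE = 0;

struct RootMove {
  std::vector<Move> pv; // pv[0] is the best move, pv[1] the ponder move
};

// Toy transposition table: position key -> move stored for that position.
using ToyTT = std::unordered_map<Key, Move>;

// If the PV holds only the best move, try to append a ponder move taken
// from the table entry of the position reached after the best move.
// Returns true when a ponder move is available afterwards.
bool try_ponder_from_table(RootMove& rm, const ToyTT& tt, Key keyAfterBestMove) {
  if (rm.pv.size() != 1)
    return rm.pv.size() > 1;

  auto it = tt.find(keyAfterBestMove);
  if (it == tt.end() || it->second == MOVE_NONE)
    return false;

  rm.pv.push_back(it->second);
  return true;
}
```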
Marco Costalba [Wed, 21 Jan 2015 18:53:26 +0000 (19:53 +0100)]
Document how to enable PEXT with MSVC
When not using Makefile, e.g. with MSVC, if hardware
supports BMI2 instructions, then USE_PEXT should be
added in project configuration to enable pext support.
Marco Costalba [Tue, 20 Jan 2015 21:17:22 +0000 (22:17 +0100)]
Don't use _pext_u64() directly
This intrinsic, which calls the BMI2 PEXT instruction, is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is to not use _pext_u64 directly.
This patch fixes a compile error with some mingw64
packages when compiling with x86-64.