From: Stéphane Nicolet
Date: Tue, 5 Dec 2017 06:25:42 +0000 (+0100)
Subject: A better contempt implementation for Stockfish (#1325)
X-Git-Url: https://git.sesse.net/?p=stockfish;a=commitdiff_plain;h=be382bb0cf5927dc10ff9be882f6980a78d1484a

A better contempt implementation for Stockfish (#1325)

* A better contempt implementation for Stockfish

Round 2 of TCEC season 10 demonstrated the benefit of having a good
contempt implementation: it gives the strongest programs in the tournament
the ability to slow down the game when they feel the position is slightly
worse, preferring to stay in a complicated (even if slightly risky)
middlegame rather than simplifying by force into a drawn endgame.

The current contempt implementation of Stockfish is inadequate, and this
patch is an attempt to provide a better one.

Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370

This contempt implementation is showing promising results in certain
situations. For instance, it obtained a gain of about +30 Elo when playing
with contempt=40 against Stockfish 7, compared to current master:

• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo

This was the result of real cooperative work from the Stockfish team, with
key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo),
while most of the community helped with feedback and computer time.

In this commit the bench is unchanged by default, but you can test the new
contempt at home via the UCI options. The style of play will change a lot
when using a contempt value different from zero (to repeat: this is not
enabled by default in this version)!

The Stockfish team is still deliberating over the best default contempt
value in self-play and the best contempt modeling strategy, to help users
choose a contempt value when playing against much weaker programs. This
information will be given in future commits when available :-)

Bench: 5051254

* Remove the prefetch

No functional change.
---

diff --git a/src/evaluate.cpp b/src/evaluate.cpp
index 8df58609..9f50ded6 100644
--- a/src/evaluate.cpp
+++ b/src/evaluate.cpp
@@ -840,7 +840,7 @@ namespace {
   // Initialize score by reading the incrementally updated scores included in
   // the position object (material + piece square tables) and the material
   // imbalance. Score is computed internally from the white point of view.
-  Score score = pos.psq_score() + me->imbalance();
+  Score score = pos.psq_score() + me->imbalance() + Eval::Contempt;
 
   // Probe the pawn hash table
   pe = Pawns::probe(pos);
@@ -903,6 +903,7 @@ namespace {
 
 } // namespace
 
+Score Eval::Contempt = SCORE_ZERO;
 
 /// evaluate() is the evaluator for the outer world. It returns a static evaluation
 /// of the position from the point of view of the side to move.
diff --git a/src/evaluate.h b/src/evaluate.h
index 95a1f19b..d9e03255 100644
--- a/src/evaluate.h
+++ b/src/evaluate.h
@@ -31,6 +31,8 @@ namespace Eval {
 
 const Value Tempo = Value(20); // Must be visible to search
 
+extern Score Contempt;
+
 std::string trace(const Position& pos);
 
 Value evaluate(const Position& pos);
diff --git a/src/search.cpp b/src/search.cpp
index 4ad1eebb..ed01e0de 100644
--- a/src/search.cpp
+++ b/src/search.cpp
@@ -96,8 +96,6 @@ namespace {
     Move best = MOVE_NONE;
   };
 
-  Value DrawValue[COLOR_NB];
-
   template <NodeType NT>
   Value search(Position& pos, Stack* ss, Value alpha, Value beta, Depth depth, bool cutNode, bool skipEarlyPruning);
 
@@ -202,8 +200,9 @@ void MainThread::search() {
   TT.new_search();
 
   int contempt = Options["Contempt"] * PawnValueEg / 100; // From centipawns
-  DrawValue[ us] = VALUE_DRAW - Value(contempt);
-  DrawValue[~us] = VALUE_DRAW + Value(contempt);
+
+  Eval::Contempt = (us == WHITE ?  make_score(contempt, contempt / 2)
+                                : -make_score(contempt, contempt / 2));
 
   if (rootMoves.empty())
   {
@@ -444,7 +443,7 @@ void Thread::search() {
           int improvingFactor = std::max(229, std::min(715, 357 + 119 * F[0] - 6 * F[1]));
 
           Color us = rootPos.side_to_move();
-          bool thinkHard = DrawValue[us] == bestValue
+          bool thinkHard =    bestValue == VALUE_DRAW
                            && Limits.time[us] - Time.elapsed() > Limits.time[~us]
                            && ::pv_is_draw(rootPos);
 
@@ -532,8 +531,7 @@ namespace {
     {
         // Step 2. Check for aborted search and immediate draw
         if (Threads.stop.load(std::memory_order_relaxed) || pos.is_draw(ss->ply) || ss->ply >= MAX_PLY)
-            return ss->ply >= MAX_PLY && !inCheck ? evaluate(pos)
-                                                  : DrawValue[pos.side_to_move()];
+            return ss->ply >= MAX_PLY && !inCheck ? evaluate(pos) : VALUE_DRAW;
 
         // Step 3. Mate distance pruning. Even if we mate at the next move our score
         // would be at best mate_in(ss->ply+1), but if alpha is already bigger because
@@ -1074,7 +1072,7 @@ moves_loop: // When in check search starts from here
     if (!moveCount)
         bestValue = excludedMove ? alpha
-                   : inCheck ? mated_in(ss->ply) : DrawValue[pos.side_to_move()];
+                   : inCheck ? mated_in(ss->ply) : VALUE_DRAW;
 
     else if (bestMove)
     {
         // Quiet best move: update move sorting heuristics
@@ -1142,8 +1140,7 @@ moves_loop: // When in check search starts from here
 
     // Check for an instant draw or if the maximum ply has been reached
     if (pos.is_draw(ss->ply) || ss->ply >= MAX_PLY)
-        return ss->ply >= MAX_PLY && !InCheck ? evaluate(pos)
-                                              : DrawValue[pos.side_to_move()];
+        return ss->ply >= MAX_PLY && !InCheck ? evaluate(pos) : VALUE_DRAW;
 
     assert(0 <= ss->ply && ss->ply < MAX_PLY);
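
For readers who want to experiment, here is a minimal, self-contained C++ sketch of the
centipawn-to-Score conversion performed in MainThread::search() above. It is not part of the
patch: PawnValueEg is a placeholder constant here (the real value lives in types.h), and
Stockfish's Score/make_score are reduced to a simple pair so the snippet compiles on its own.

// Standalone sketch, not part of the patch. It mirrors the formula in the
// MainThread::search() hunk above; all constants below are placeholders.
#include <iostream>
#include <utility>

using Score = std::pair<int, int>;                 // (midgame, endgame) components
Score make_score(int mg, int eg) { return {mg, eg}; }

int main() {
    const int PawnValueEg = 240;                   // placeholder; real value comes from types.h
    int  contemptCp   = 40;                        // e.g. "setoption name Contempt value 40"
    bool whiteToMove  = true;                      // 'us' at the root of the search

    // From centipawns to internal units, as in the patch
    int contempt = contemptCp * PawnValueEg / 100;

    // The patch gives the endgame component half the weight of the midgame one,
    // and flips the sign so that, from White's point of view (the convention
    // used inside evaluate.cpp), the bonus goes to the root side to move.
    Score c = whiteToMove ? make_score(contempt, contempt / 2)
                          : make_score(-contempt, -contempt / 2);

    std::cout << "Eval::Contempt ~ mg " << c.first << ", eg " << c.second << '\n';
}

In a build containing this patch, the effect can be tried over UCI by sending, for example,
"setoption name Contempt value 40" before starting a search; the default value and the contempt
modeling strategy were still under discussion at the time of this commit, as the message above notes.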