git.sesse.net Git - stockfish/log

Explicitly zero TT upon resize.

as discussed in issue #1349, the way pages are allocated with calloc might imply some overhead on first write.
This overhead can be large and slow down the first search after a TT resize significantly, especially for large TT.
Using an explicit clear of the TT on resize fixes this problem.

Not implemented, but possibly useful for large TT, is to do this zero-ing using all search threads. Not only would this be faster, it could also lead to a more favorable memory allocation on numa systems with a first touch policy.

No functional change.

Include x-ray attacks through all queens independently of the color.

When calculating attacks from rooks/bishops current master includes
x-rays through own queen. This patch includes also x-rays through
opponent queen.

Credits go to Brian who inspired for this idea
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/Z3APRYpQeMU

STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 41549 W: 7544 L: 7244 D: 26761
Elo 2.05 [-0.29,4.19] (95%)
http://tests.stockfishchess.org/tests/view/5a3b5fe50ebc590ccbb8bf9a

LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 110010 W: 14208 L: 13739 D: 82063
Elo 1.20 [-0.27,2.55] (95%)
http://tests.stockfishchess.org/tests/view/5a3b8e7c0ebc590ccbb8bfad

bench: 5544445

Upon changing the number of threads, make sure all threads are bound

The heuristic to avoid thread binding if less than 8 threads are requested resulted in the first 7 threads not being bound.
The branch was verified to yield a roughly 13% speedup by @CoffeeOne on the appropriate hardware and OS, and an earlier version of this patch tested well on his machine:

http://tests.stockfishchess.org/tests/view/5a3693480ebc590ccbb8be5a
ELO: 9.24 +-4.6 (95%) LOS: 100.0%
Total: 5000 W: 634 L: 501 D: 3865

To make sure all threads (including mainThread) are bound as soon as the total number exceeds 7, recreate all threads on a change of thread number.
To do this, unify Threads::init, Threads::exit and Threads::set are unified in a single Threads::set function that goes through the needed steps.
The code includes several suggestions from @joergoster.

Fixes issue #1312

No functional change

Allow for general transposition table sizes. (#1341)

For efficiency reasons current master only allows for transposition table sizes that are N = 2^k in size, the index computation can be done efficiently as (hash % N) can be written instead as (hash & 2^k - 1). On a typical computer (with 4, 8... etc Gb of RAM), this implies roughly half the RAM is left unused in analysis.

This issue was mentioned on fishcooking by Mindbreaker:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04

Recently a neat trick was proposed to map a hash into the range [0,N[ more efficiently than (hash % N) for general N, nearly as efficiently as (hash % 2^k):

https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/

namely computing (hash * N / 2^32) for 32 bit hashes. This patch implements this trick and now allows for general hash sizes. Note that for N = 2^k this just amounts to using a different subset of bits from the hash. Master will use the lower k bits, this trick will use the upper k bits (of the 32 bit hash).

There is no slowdown as measured with [-3, 1] test:

http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 128498 W: 23332 L: 23395 D: 81771

There are two (smaller) caveats:

1) the patch is implemented for a 32 bit hash (so that a 64 bit multiply can be used), this effectively limits the number of clusters that can be used to 2^32 or to 128Gb of transpostion table. That's a change in the maximum allowed TT size, which could bother those using 256Gb or more regularly.

2) Already in master, an excluded move is hashed into the position key in rather simple way, essentially only affecting the lower 16 bits of the key. This is OK in master, since bits 0-15 end up in the index, but not in the new scheme, which picks the higher bits. This is 'fixed' by shifting the excluded move a few bits up. Eventually a better hashing scheme seems wise.

Despite these two caveats, I think this is a nice improvement in usability.

Bench: 5346341

Enhanced verify search (#1338)

by disabling null-move-pruning for the side to move for first part of
the remaining search tree. This helps to better recognize zugzwang.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

For further detail see commit notes and discussion at
https://github.com/pb00068/Stockfish/commit/6401a80ab91df5c54390ac357409fef2e51ff5bb

bench: 5776193

Remove QueenMinorsImbalance array #1340

Remove QMI array and adjust bishop, knight and queen coefficients
in QuadraticOurs and QuadraticTheirs arrays in compensation of
this removal.

STC : http://tests.stockfishchess.org/tests/view/5a21d8350ebc590ccbb8b5fe
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 49659 W: 9029 L: 8957 D: 31673

LTC : http://tests.stockfishchess.org/tests/view/5a33c0dd0ebc590ccbb8bd7e
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 45905 W: 5834 L: 5745 D: 34326

Bench: 5176807

Make staticEval independent of the search path

Current master can yield different staticEvals depending on the path
used to reach the position. The reason for this is that the evaluation after a
null move is always computed subtracting 2 * Eval::Tempo, while this is not
the case for lazy or specialized evals. This patch always adds tempo to evals,
which doesn't affect playing strength:

LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 59911 W: 7616 L: 7545 D: 44750

STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 104947 W: 18897 L: 18919 D: 67131

Fixes issue #1335

Bench: 5208264

Simplify other checks (#1337)

Replace an intricate definition with a more natural one.

Master was excluding squares occupied by a pawn which was blocked by a pawn.
This version excludes any squares occupied by a pawn which is blocked by "something"

Passed STC
http://tests.stockfishchess.org/tests/view/5a2f557b0ebc590ccbb8bc0d
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 44211 W: 8009 L: 7928 D: 28274

and LTC
http://tests.stockfishchess.org/tests/view/5a301d440ebc590ccbb8bc80
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 31958 W: 4108 L: 4002 D: 23848

Bench 5000136

Simplify other checks #1334

Simplify the other check penalty computation. Compared to current master,

a) it uses a 143 kingDanger penalty instead of S(10, 10) for the "otherCheck"
(credits to ElbertoOne for finding a suitable kingDanger range to replace the score
and to Guardian for showing this could also be a neutral change at LTC).
This makes our king safety model more consistent and simpler.

b) it might also score more than one "otherCheck" penalty for a given piece type instead of just one

c) it might score many pinned penalties instead of just one.

d) It also remove 3 conditionals and uses simpler expressions.
So it was tested as a SPRT[-3, 1]

Passed STC
http://tests.stockfishchess.org/tests/view/5a2b560b0ebc590ccbb8ba6b
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11705 W: 2217 L: 2080 D: 7408

And LTC
http://tests.stockfishchess.org/tests/view/5a2bfd0d0ebc590ccbb8bab0
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 26812 W: 3575 L: 3463 D: 19774

Trying to improve on b) another attempt was made to score also the
"otherchecks" for piece types which had some safe checks, but this
failed STC http://tests.stockfishchess.org/tests/view/5a2c79e60ebc590ccbb8badd

bench: 5149133

Add Resources to understand code base (#1332)

No functional change.

Don't consider defending queen as check blocker (#1328)

STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 22636 W: 4212 L: 3990 D: 14434
http://tests.stockfishchess.org/tests/view/5a2506140ebc590ccbb8b75a

LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 63448 W: 8287 L: 7965 D: 47196
http://tests.stockfishchess.org/tests/view/5a253a610ebc590ccbb8b776

bench: 5767699

A better contempt implementation for Stockfish (#1325)

* A better contempt implementation for Stockfish

The round 2 of TCEC season 10 demonstrated the benefit of having a nice contempt implementation: it gives the strongest programs in the tournament the ability to slow down the game when they feel the position is slightly worse, prefering to stay in a complicated (even if slightly risky) middle game rather than simplifying by force into a drawn endgame.

The current contempt implementation of Stockfish is inadequate, and this patch is an attempt to provide a better one.

Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370

This contempt implementation is showing promising results in certains situations. For instance, it obtained a nice +30 Elo gain when playing with contempt=40 against Stockfish 7, compared to current master:

• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo

This was the result of real cooperative work from the Stockfish team, with key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo) while most of the community helped with feedback and computer time.

In this commit the bench is unchanged by default, but you can test at home with the new contempt in the UCI options. The style of play will change a lot when using contempt different of zero (I repeat: not done in this version by default, however)!

The Stockfish team is still deliberating over the best default contempt value in self-play and the best contempt modeling strategy, to help users choosing a contempt value when playing against much weaker programs. These informations will be given in future commits when available :-)

Bench: 5051254

* Remove the prefetch

No functional change.

Pawn endgames directly skip early pruning.

Instead of checking individual steps. Idea by @Stefano80.

passed STC
http://tests.stockfishchess.org/tests/view/5a23e5d20ebc590ccbb8b6d5
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 37445 W: 6866 L: 6773 D: 23806

passed LTC
http://tests.stockfishchess.org/tests/view/5a24260c0ebc590ccbb8b716
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 38780 W: 4946 L: 4848 D: 28986

Bench: 5466219

Avoid warnings by the Clang compiler

Clang gave a couple of warnings for unused parameters after the recnet commit "Use constexpr when makes sense".

No functional change.

Use a Direction enum for Square deltas

Currently the NORTH/WEST/SOUTH/EAST values are of type Square, but conceptually they are not squares but directions. This patch separates these values into a Direction enum and overloads addition and subtraction to allow adding a Square to a Direction (to get a new Square).

I have also slightly trimmed the possible overloadings to improve type safety. For example, it would normally not make sense to add a Color to a Color or a Piece to a Piece, or to multiply or divide them by an integer. It would also normally not make sense to add a Square to a Square.

This is a non-functional change.

Use bool(Bitboard b) instead of !!b (#1321)

The idiom !!b is confusing newcomers (e.g. Stefan needs explaining here https://groups.google.com/d/msg/fishcooking/vYqnsRI4brY/Gaf60QuACwAJ).

No functional change.

Use constexpr when makes sense

No functional change.

Compile without exceptions

Add the -fno-exceptions flag to the Makefile to avoid the unecessary exceptions support in the executable (we do not use any exception in Stockfish at the moment).

This change gives a 9.2% reduction in size for the executable binary.

Before : executable size = 376956 bytes
After: executable size = 347652 bytes

No functional change.

Minor cleanup of search.cpp

Four very minor edits. Note that tte->save() uses posKey and
not pos.key() in other places.

Originally I also added a futility_move_counts() function to
make things more consistent with the futility_margin() and
reduction() functions. But then razor_margin[] should probably
also be turned into a function, etc. Maybe a good idea, maybe not.
So I did not include it.

Non functional change.

Attack threats

Give bonus for safe attack threats from bishops and rooks on opponent queen

STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 8629 W: 1599 L: 1438 D: 5592
http://tests.stockfishchess.org/tests/view/5a1ad4490ebc590ccbb8b30d

LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 7093 W: 987 L: 846 D: 5260
http://tests.stockfishchess.org/tests/view/5a1aec5d0ebc590ccbb8b317

Bench: 5051254

OpenBSD friendly start.

Simplify good/bad capture detection. bench 5336313

Fix comments. Bench: 5109559.

Simplify away the PawnSet[] imbalance array (#1308)

Simplify away the PawnSet[] imbalance array

STC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 52977 W: 9550 L: 9484 D: 33943
http://tests.stockfishchess.org/tests/view/5a06b4780ebc590ccbb8a833

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83717 W: 10599 L: 10559 D: 62559
http://tests.stockfishchess.org/tests/view/5a0aa36a0ebc590ccbb8aa99

Bench: 5340212

Simplify some kingring penalties expressions

The new "weak" expression helps simplify the safe check calculations for rooks or minors, (but the end result for all the safe checks is the exactly the same as in current master)

The only functional change is for the "outer king ring" (for example, squares f3 g3 h3 when white king is on g1). In current master, there was a 191 penalty if any of these was not defended at all.
With this pr, there is this 191 penalty if any of these is not defended at all or is only defended by a white queen.

Tested as a simplification
STC
http://tests.stockfishchess.org/tests/view/59fb03d80ebc590ccbb89fee
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 66167 W: 12015 L: 11971 D: 42181
(against master (Update Copyright year inMakefile))

LTC
http://tests.stockfishchess.org/tests/view/5a0106ae0ebc590ccbb8a55f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15790 W: 2095 L: 1968 D: 11727
(against master (Handle BxN trade as good capture when history scor))

same as #1296 but rebased on latest master
bench: 5109559

Add comments to pos.see_ge()

In terms of technical changes this patch eliminates the return
statements from the main loop of pos.see_ge() and replaces two conditional
computations with a single bitwise negation.

No functional change

Capture Stat Simplification- Bench: 5363761

Always do MaxCardinality checks.

Stockfish currently relies on the "filter_root_moves" function also
having the side effect of clamping Cardinality against MaxCardinality
(the actual piece count in the tablebases). So if we skip this function,
we will end up probing in the search even without tablebases installed.

We cannot bail out of this function before this check is done, so move
the MultiPV hack a few lines below.

Simplify Null Move Search condition

Removes depth condition, adjust parameters.

passed STC:
http://tests.stockfishchess.org/tests/view/5a008cbc0ebc590ccbb8a512
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 29282 W: 5317 L: 5210 D: 18755

passed LTC:
http://tests.stockfishchess.org/tests/view/5a00d8530ebc590ccbb8a541
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 26893 W: 3458 L: 3345 D: 20090

Bench: 5015773

Handle BxN trade as good capture when history score is good

STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19374 W: 3499 L: 3294 D: 12581
http://tests.stockfishchess.org/tests/view/59fc23f50ebc590ccbb8a0bf

LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 91030 W: 11680 L: 11274 D: 68076
http://tests.stockfishchess.org/tests/view/59fc43ad0ebc590ccbb8a0d0

Bench: 5482249

Introduce capture history table for capture move sorting

Introduce capture move history table indexed by moved piece,
target square and captured piece type for sorting capture moves.

STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 11374 W: 2096 L: 1924 D: 7354
http://tests.stockfishchess.org/tests/view/59fac8dc0ebc590ccbb89fc5

LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 24791 W: 3196 L: 3001 D: 18594
http://tests.stockfishchess.org/tests/view/59fae4d20ebc590ccbb89fd9

Bench: 5536775

Replace easyMove with simple scheme

Reduces time for a stable bestMove, giving some of the won time for the next move.

the version before the pvDraw passed both STC and LTC

passed STC:
http://tests.stockfishchess.org/tests/view/59e98d5a0ebc590ccbb896ec
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 78561 W: 13945 L: 13921 D: 50695
elo =    0.106 +-    1.445 LOS:   55.716%

passed LTC:
http://tests.stockfishchess.org/tests/view/59eb9df90ebc590ccbb897ae
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 29056 W: 3640 L: 3530 D: 21886
elo =    1.315 +-    1.982 LOS:   90.314%

This version, rebased on pvDrawPR with the obvious change, was verified again on STC:

http://tests.stockfishchess.org/tests/view/59ee104e0ebc590ccbb89899
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19890 W: 3648 L: 3525 D: 12717
elo =    2.149 +-    2.895 LOS:   92.692%

and LTC:
http://tests.stockfishchess.org/tests/view/59f9673a0ebc590ccbb89ea0
Total             :    17966
Win               :     2273 (  12.652%)
Loss              :     2149 (  11.961%)
Draw              :    13544 (  75.387%)
Score             :   50.345%
Sensitivity       :    0.014%
2*(W-L)/(W+L)     :    5.608%

LLR  [-3.0,  1.0] :     2.95

BayesElo range    : [  -1.161,   4.876,  10.830] (DrawElo:  341.132)
LogisticElo range : [  -0.501,   2.105,   4.677]
LOS               :   94.369 %

LTC again:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17966 W: 2273 L: 2149 D: 13544
LogisticElo range : [ -0.501, 2.105, 4.677]
LOS : 94.369 %

unchanged bench: 5234652

Update Copyright year inMakefile

No functional change.

Extra thinking before accepting draw PVs.

If the PV leads to a draw (3-fold / 50-moves) position
and we're ahead of time, think a little longer, possibly
finding a better way.

As this is most likely effective at higher draw rates,
tried speculative LTC after a yellow STC:

STC:
http://tests.stockfishchess.org/tests/view/59eb173a0ebc590ccbb8975d
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 56095 W: 10013 L: 9902 D: 36180
elo = 0.688 +- 1.711 LOS: 78.425%

LTC:
http://tests.stockfishchess.org/tests/view/59eba1670ebc590ccbb897b4
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 59579 W: 7577 L: 7273 D: 44729
elo = 1.773 +- 1.391 LOS: 99.381%

bench: 5234652

Fix premature using of all available time in x/y TC

In x/y time controls there was a theoretical possibility
to use all available time few moves before the clock will
be updated with new time. This patch fixes that issue.

Tested at 60/15 time control:

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 113963 W: 20008 L: 20042 D: 73913

The test was done without adjudication rules!

Bench 5234652

Don't filter root moves if MultiPV mode is enabled

A band-aid patch to workaround current TB code
limitations with multi PV.

Hopefully this will be removed after committing the
big update of TB impementation, now under discussion.

No functional change.

Add initiative to trace

No functional change

Fix issue #1268

If the search is quit before skill.pick_best is called,
skill.best_move might be MOVE_NONE.

Ensure skill.best is always assigned anyhow.

Also retire the tricky best_move() and let the underlying
semantic to be clear and explicit.

No functional change.

Simplify bonus for bishop on long diagonal

Removing 2 conditions, and increase the ThreatbyPawn to compensate.

Passed STC
http://tests.stockfishchess.org/tests/view/59dbde900ebc5916ff64be6d
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 14236 W: 2615 L: 2483 D: 9138

Passed LTC
http://tests.stockfishchess.org/tests/view/59dc26470ebc5916ff64be92
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16552 W: 2136 L: 2010 D: 12406

Bench: 5234652

WLDEntryPiece -> WDLEntryPiece for consistency

No functional change.

Good bishops on the main diagonals

Bonus in midgame for bishops on long diagonals when the central squares are not occupied by pawns.

Author: ElbertoOne

Passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 10801 W: 1955 L: 1786 D: 7060
http://tests.stockfishchess.org/tests/view/59cf5c1d0ebc5916ff64b9da

and LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 83978 W: 10685 L: 10303 D: 62990
http://tests.stockfishchess.org/tests/view/59cf6f6c0ebc5916ff64b9e4

Bench: 5620312

Decrease reduction for exact PV nodes

STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 59004 W: 10621 L: 10249 D: 38134

LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 25801 W: 3306 L: 3108 D: 19387

Bench: 5742466

Measure nodes after search finished.

Only affects nmpsec in the multithreaded case.

No functional change.

Tweak statScore condition

The first change (ss->statScore >= 0) does nothing.

The second change ((ss-1)->statScore >= 0 ) has a massive change.
(ss-1)->statScore is not set until (ss-1) begins to apply LMR to moves.
So we now increase the reduction for bad quiets when our opponent is
running through the first captures and the hash move.

STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57762 W: 10533 L: 10181 D: 37048

LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19973 W: 2662 L: 2480 D: 14831

Bench: 5037819

Let ss->ply denote the number of plies from the root to the current node

This patch lets ss->ply be equal to 0 at the root of the search.

Currently, the root has ss->ply == 1, which is less intuitive:

- Setting the rootNode bool has to check (ss-1)->ply == 0.

- All mate values are off by one: the code seems to assume that mated-in-0
  is -VALUE_MATE, mate-1-in-ply is VALUE_MATE-1, mated-in-2-ply is VALUE_MATE+2, etc.
  But the mate_in() and mated_in() functions are called with ss->ply, which is 1 in
  at the root.

- The is_draw() function currently needs to explain why it has "ply - 1 > i" instead
  of simply "ply > i".

- The ss->ply >= MAX_PLY tests in search() and qsearch() already assume that
  ss->ply == 0 at the root. If we start at ss->ply == 1, it would make more sense to
  go up to and including ss->ply == MAX_PLY, so stop at ss->ply > MAX_PLY. See also
  the asserts testing for 0 <= ss->ply && ss->ply < MAX_PLY.

The reason for ss->ply == 1 at the root is the line "ss->ply = (ss-1)->ply + 1" at
the start for search() and qsearch(). By replacing this with "(ss+1)->ply = ss->ply + 1"
we keep ss->ply == 0 at the root. Note that search() already clears killers in (ss+2),
so there is no danger in accessing ss+1.

I have NOT changed pv[MAX_PLY + 1] to pv[MAX_PLY + 2] in search() and qsearch().
It seems to me that MAX_PLY + 1 is exactly right:

- MAX_PLY entries for ss->ply running from 0 to MAX_PLY-1, and 1 entry for the
  final MOVE_NONE.

I have verified that mate scores are reported correctly. (They were already reported
correctly due to the extra ply being rounded down when converting to moves.)

The value of seldepth output to the user should probably not change, so I add 1 to it.
(Humans count from 1, computers from 0.)

A small optimisation I did not include: instead of setting ss->ply in every invocation
of search() and qsearch(), it could be set once for all plies at the start of
Thread::search(). This saves a couple of instructions per node.

No functional change (unless the search searches a branch MAX_PLY deep), so bench
does not change.

Score unopposed weak pawns only if majors

Do not use the opposed flag for scoring backward and isolated pawns
in pawns.cpp, instead give a S(5,25) bonus for each opponent unopposed
weak pawns when we have a rook or a queen on the board.

STC run stopped after 113188 games:
LLR: 1.63 (-2.94,2.94) [0.00,5.00]
Total: 113188 W: 20804 L: 20251 D: 72133
http://tests.stockfishchess.org/tests/view/59b58e4d0ebc5916ff64b12e

LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 66673 W: 8672 L: 8341 D: 49660
http://tests.stockfishchess.org/tests/view/59b902580ebc5916ff64b231

This is Alain Savard's idea, just with a different bonus.
Original patch there:

green STC, http://tests.stockfishchess.org/tests/view/597dcd2b0ebc5916ff64a09b
yellow LTC, http://tests.stockfishchess.org/tests/view/597ea69e0ebc5916ff64a0e6

Bench: 6259498

Higher Move Overhead

This shoudl reduce time losses experienced by
users after new time management code.

Verified for no regression in very short TC (4sec + 0.1)
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 35262 W: 7426 L: 7331 D: 20505

Bench 5322108

Extend ShelterWeakness array by dimension isKingFile

Use different penalties for weaknesses in the pawn shelter
depending on whether it is on the king's file or not.

STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 71617 W: 13471 L: 13034 D: 45112

LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48708 W: 6463 L: 6187 D: 36058

Bench: 5322108

Streamlline reduction based on movecount

Use MoveCount History only at quiet moves and simply reduce
reduction by one depth instead of increasing moveCount in formula.

STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 27511 W: 5171 L: 4919 D: 17421

LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 92337 W: 12135 L: 11748 D: 68454

Bench: 6351176

Small simplication of see_ge()

Two simplifications:

- Remove the initialisation to 0 of occupied, which is now unnecessary.
- Remove the initial check for nextVictim == KING

If nextVictim == KING, then PieceValue[MG][nextVictim] will be 0, so that
balance >= threshold is true. So see_ge() returns true anyway.

No functional change.

Travis CI: Make all warnings into errors

Compile with -Werror flag. To make debugging easier
also show compile ourput.

This flag is enabled only in Travis CI, not in the shipped
Makefile becuase we can't test on every possible platform.

Remove unneeded compile options.

In light of issue #1232, a test was performed about the value of '-fno-exceptions' and a second one of the combination '-fno-exceptions -fno-rtti'. It turns out these options are can be removed without introducing slowdown.

STC for removing '-fno-exceptions'
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 13678 W: 2572 L: 2439 D: 8667

STC for removing '-fno-exceptions -fno-rtti' (current patch)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32557 W: 6074 L: 5973 D: 20510

No functional change.

Prevent Stockfish from exiting if DTZ table is not present

During TB initialisation, Stockfish checks for the presence of WDL
tables but not for the presence of DTZ tables. When attempting to probe
a DTZ table, it is therefore possible that the table is not present.
In that case, Stockfish should neither exit nor report an error.

To verify the bug:
$ ./stockfish
setoption name SyzygyTable value <path_to_WDL_dir>
position fen 8/8/4r3/4k3/8/1K2P3/3P4/6R1 w - -
go infinite
Could not mmap() /opt/tb/regular/KRPPvKR.rtbz
$

(On my system, the WDL tables are in one directory and the DTZ tables
in another. If they are in the same directory, it will be difficult
to trigger the bug.)

The fix is trivial: check the file descriptor/handle after opening
the file.

No functional change.

Fix a warning with MSVC

warning C4244: '*=': conversion from 'double' to 'int', possible loss of data

No functional change.

Multi-threaded search testing with valgrind

Also check with valgrind the multi-threaded search.

On top of the fix for issue #1227 (PR #1235).

No functional change.

Fix uninitialized memory usage

After increasing the number of threads, the histories were not cleared,
resulting in uninitialized memory usage.

This patch fixes this by clearing threads histories in Thread c'tor as
is the idomatic way.

This fixes issue 1227

No functional change.

Adjust moveCount history only at LMR

STC:
LLR: 3.32 (-2.94,2.94) [-3.00,1.00]
Total: 17584 W: 3277 L: 3131 D: 11176

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26412 W: 3447 L: 3334 D: 19631

Bench: 5417521

Simplify away non-normal moves in SEE

credit goes to @mstembera for suggesting this approach.
SEE now deals with castling, promotion and en passant in a similar way.

passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32902 W: 6079 L: 5979 D: 20844

passed LTC
LLR: 3.92 (-2.94,2.94) [-3.00,1.00]
Total: 110698 W: 14198 L: 14145 D: 82355

Bench: 5713905

Appveyor: do a Debug and Release build

And set x86 and x64 platforms for real.

Currently this is broken and the same binary is compiled for all platforms.

This is becuase we use a custom build step. OTH the default
build step seems not compatible with cmake generated *sln file.

No functional change.

Improve multi-threaded mate finding

If any thread found a 'mate in x' stop the search. Previously only
mainThread would do so. Requires the bestThread selection to be
adjusted to always prefer mate scores, even if the search depth is less.

I've tried to collect some data for this patch. On 30 cores, mate finding
seems 5-30% faster on average. It is not so easy to get numbers for this,
as the time to find a mate fluctuates significantly with multi-threaded runs,
so it is an average over 100 searches for the same position. Furthermore,
hash size and position make a difference as well.

Bench: 5965302

Count all weak squares in the king ring with a single popcount

Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 26966 W: 4993 L: 4745 D: 17228
http://tests.stockfishchess.org/tests/view/599e798a0ebc5916ff64aa8c

and LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 39570 W: 5104 L: 4857 D: 29609
http://tests.stockfishchess.org/tests/view/599ee5230ebc5916ff64aabe

Bench: 5965302

Use moveCount history for reduction

Use less reduction for moves with larger moveCount if your
opponent did an unexpected (== high moveCount) move in the
previous ply... unexpected moves might need unexpected answers.

passed STC:
http://tests.stockfishchess.org/tests/view/599f08cc0ebc5916ff64aace
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9638 W: 1889 L: 1720 D: 6029

passed LTC:
http://tests.stockfishchess.org/tests/view/599f1e5c0ebc5916ff64aadc
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 28308 W: 3742 L: 3533 D: 21033

Bench: 5747429

Avoid constructing an empty tuple in qsearch

Avoid constructing, passing as a parameter and binding a useless empty tuple of pointers in the qsearch move picker constructor.

Also reformat the scoring function in movepicker.cpp and do some cleaning in evaluate.cpp while there.

No functional change.

Improve appeyor build

Check bench number and do not
hard-code *.cpp file names.

No functional change.

Restore safety margin of 60ms

What this patch does is:
* increase safety margin from 40ms to 60ms. It's worth noting that the previous
  code not only used 60ms incompressible safety margin, but also an additional
  buffer of 30ms for each "move to go".
* remove a whart, integrating the extra 10ms in Move Overhead value instead.
  Additionally, this ensures that optimumtime doesn't become bigger than maximum
  time after maximum time has been artificially discounted by 10ms. So it keeps
  the code more logical.

Tested at 3 different time controls:

Standard 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58008 W: 10674 L: 10617 D: 36717

Sudden death 16+0
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59664 W: 10945 L: 10891 D: 37828

Tournament 40/10
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16371 W: 3092 L: 2963 D: 10316

bench: 5479946

Fix some Clang warnings

Found by Clang in extra verbose mode :-)

No functional change.

Wide bench coverage

Add tests for:

- Positions with move list
- Chess960 positions

Now bench covers almost all cases, only few endgames
are still out of reach (verified with lcov)

It is a non functionality patch, but bench
changed because we added new test positions.

bench: 5479946

Restore perft

Rewrite perft to be placed naturally inside new
bench code. In particular we don't have special
custom code to run perft anymore but perft is
just a new parameter of 'go' command.

So user API is now changed, old style command:

$perft 5

becomes

$go perft 4

No functional change.

Rewrite benchmark

First step in improving bench to handle
arbitrary UCI commands so to test many
more code paths.

This first patch just set the new code
structure.

No functional change.

Reformat time manager code

In particular clarify that 'sd'
parameter is used only in !movesToGo
case.

Verified with Ivan's check tool it is
equivalent to original code.

No functional change.

Collect more corrections to optimum/maximum

The only call site of Time.maximum() corrected by 10.
Do this directly in remaining().

Ponder increased Time.optimum by 25% in init(). Idem.
Delete unused includes.

No functional change.

Speed up Trevis CI

Avoid a couple of redundant rebuilds and compile
with 2 threads since travis gives 2vCPUs.

Also enable -O1 optimization for valgrind and
sanitizers, it should be safe withouth false
positives and it gives a very sensible speed
up, especially with valgrind.

The spee dup allow us to increase testing to
depth 10, useful for thread sanitizer.

No functional change.

Clarify stats range

Current update formula ensures that the
possible value range is [-32 * D, 32 * D].

So we never overflow if abs(32 * D) < INT16_MAX

Thanks to Joost and mstembera to clarify this.

No functional change.

Time management simplification

STC (http://tests.stockfishchess.org/tests/view/598188a40ebc5916ff64a21b):
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25363 W: 4658 L: 4545 D: 16160

LTC (http://tests.stockfishchess.org/tests/view/5981d59a0ebc5916ff64a229):
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 75356 W: 9690 L: 9640 D: 56026

40/10 TC (http://tests.stockfishchess.org/tests/view/5980c5780ebc5916ff64a1ed):
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19377 W: 3650 L: 3526 D: 12201

15+0 TC (http://tests.stockfishchess.org/tests/view/5982cb730ebc5916ff64a25d):
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 5913 W: 1217 L: 1069 D: 3627

This time management handles base time and movestogo cases separatelly. One can test one case without affecting the other. Also, increment usage can be tested separately without (necessarily) affecting sudden death or x moves in y seconds performance.

On stable machines there are no time losses on 0.1+0.001 time control (tested on i7 + Windows 10 platform).

Bench 5608839

Fix involuntary conversions of ExtMove to Move

The trick is to create an ambiguity for the
compiler in case an unwanted conversion to
Move is attempted like in:

    ExtMove m1{Move(17),4}, m2{Move(4),17};

    std::cout << (m1 < m2) << std::endl; // 1
    std::cout << (m1 > m2) << std::endl; // 1(!)

This fixes issue #1204

No functional change.

Unify stats update()

Now that is a bit bigger makes sense to
unify some duplicated code.

No functional change.

Use int16_t in History values

Reduces memory footprint by ~1.2MB (per thread).

Strong pressure: small but mesurable gain
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510

Low pressure: no regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458

Strong pressure + LTC: elo gain confirmed
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662

Tested for crashing on overflow and after 70K
games at STC we have only 4 time losses,
possible candidate for an overflow.

No functional change.

Fix incorrect StateInfo

We use Position::set() to set root position across
threads. But there are some StateInfo fields (previous,
pliesFromNull, capturedPiece) that cannot be deduced
from a fen string, so set() clears them and to not lose
the info we need to backup and later restore setupStates->back().
Note that setupStates is shared by threads but is accessed
in read-only mode.

This fixes regression introduced by df6cb446eaf21

Tested with 3 threads at STC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14436 W: 2304 L: 2196 D: 9936

Bench: 5608839

Run clang-tidy 'modernize'

Some warnings after a run of:

$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11

I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.

I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.

Thanks to erbsenzaehler for the original idea.

No functional change.

Thread code reformat

Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.

Also some renaming and better comment while
there.

No functional change.

Retire States global variable

And other small touches in uci.cpp

No functional change.

Fix the handling of opposite bishops in KXK endgame evaluation

The case of three or more bishops against a long king must look at all of the
bishops and not just the first two in the piece lists. This patch makes sure
that the position is treated as a win when there are bishops on opposite
colors. This functional change is very small because bench remains the same.

LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 24249 W: 4349 L: 4275 D: 15625
http://tests.stockfishchess.org/tests/view/598186530ebc5916ff64a218

Bench: 5608839

Simplify finished search in ponder/infinite mode.

In this rare case (e.g. go infinite on a stalemate),
just spin till ponderhit/stop comes.

The Thread::wait() is a renmant of the old YBWC
code, today with lazy SMP, threads don't need to
wait when outside of their idle loop.

No functional change.

Re-apply the fix for Limits::ponder race

But this time correctly set Threads.ponder

We avoid using 'limits' for passing pondering
flag because we don't want to have 2 ponder
variables in search scope: Search::Limits.ponder
and Threads.ponder. This would be confusing also
because limits.ponder is set at the beginning of
the search and never changes, instead Threads.ponder
can change value asynchronously during search.

No functional change.

Revert "Fix a race on Limits::ponder"

This reverts commit 5410424e3d036b43715c7989aa99e449cdcde18e.

After the commit pondering is broken, so revert for now. I will
resubmit with a proper fix.

The issue is mine, Joost original code is correct.

No functional change.

Fix a race on Limits::ponder

Limits::ponder was used as a signal between uci and search threads,
but is not an atomic variable, leading to the following race as
flagged by a sanitized binary.

Expect input:
```
spawn  ./stockfish
send "uci\n"
expect "uciok"
send "setoption name Ponder value true\n"
send "go wtime 4000 btime 4000\n"
expect "bestmove"
send "position startpos e2e4 d7d5\n"
send "go wtime 4000 btime 4000 ponder\n"
sleep 0.01
send "ponderhit\n"
expect "bestmove"
send "quit\n"
expect eof
```

Race:
```
WARNING: ThreadSanitizer: data race (pid=7191)
  Read of size 4 at 0x0000005c2260 by thread T1:

  Previous write of size 4 at 0x0000005c2260 by main thread:

  Location is global 'Search::Limits' of size 88 at 0x0000005c2220 (stockfish+0x0000005c2260)
```

The reason of teh race is that ponder is not just set in UCI go()
assignment but also is signaled by an async ponderhit in uci.cpp:

      else if (token == "ponderhit")
          Search::Limits.ponder = 0; // Switch to normal search

The fix is to add an atomic bool to the threads structure to
signal the ponder status, letting Search::Limits to reflect just
what was passed to 'go'.

No functional change.

Fix some races and clarify the code

Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.

Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.

We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.

No functional change.

Make variable naming consistent

moved_piece is the only variable in search not using camel case

Unify scoring functions in MovePicker

No functional change.

Remove Stack/thread dependence in movepick

as a lower level routine, movepicker should not depend on the
search stack or the thread class, removing a circular dependency.
Instead of copying the search stack into the movepicker object,
as well as accessing the thread class for one of the histories,
pass the required fields explicitly to the constructor (removing
the need for thread.h and implicitly search.h in movepick.cpp).
The signature is thus longer, but more explicit:

Also some renaming of histories structures while there.

passed STC [-3,1], suggesting a small elo impact:

LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
elo = -0.438 +- 0.660 LOS: 9.7%

No functional change.

Tweak connected pawns seed[] array values

Raise a little bit the values in the connected pawns seed[] array.

STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 99033 W: 17939 L: 17448 D: 63646
http://tests.stockfishchess.org/tests/view/597355630ebc5916ff649e3e

LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 48044 W: 6371 L: 6099 D: 35574
http://tests.stockfishchess.org/tests/view/597596610ebc5916ff649eba

Bench: 5608839

Closes #1182

Rework the "unsupported" penalty into a "supported" bonus

A pawn (according to all the searched positions of a bench run) is not supported 85% of the time,
(in current master it is either isolated, backward or "unsupported").

So it made sense to try moving the S(17, 8) "unsupported" penalty value into the base pawn value hoping for a more representative pawn value, and accordingly
a) adjust backward and isolated so that they stay more or less the same as master
b) increase the mg connected bonus in the supported case by S(17, 0) and let the Connected formula find a suitable eg value according to rank.

Tested as a simplification SPRT(-3, 1)

Passed STC
http://tests.stockfishchess.org/tests/view/5970dbd30ebc5916ff649dd6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19613 W: 3663 L: 3540 D: 12410

Passed LTC
http://tests.stockfishchess.org/tests/view/597137780ebc5916ff649de3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24721 W: 3306 L: 3191 D: 18224

Bench: 5581946

Closes #1179

Remove redundant if-statements

No functional change

Closes #1173

Tuned PSQT using a custom tuner.

bench: 5878420

Closes #1177

Simplify aspiration window

Don't modify alpha window on fail-high

Bench: 5875983

Closes #1172

Faster travis checks

in the last month a couple of timeouts have been seen in travis valgrind testing, leading to undesired false positives. The precise cause of this is unclear: a normal valgrind instrumented run is about 6min, the timeout is 10min. Either there are rare hangs (not reproduced locally), or maybe the actual runtime fluctuates on the travis infrastructure (which uses VMs on AWS as far as I know). This patch leads to roughly a 2x speedup of the instrumented testing by reducing the depth from 10 to 9. If timeouts persist, it needs further analysis.

No functional change.

Closes #1171

Move game_phase() to material.cpp

For some reason, although game phase is used
only in material, it is computed in Position.

Move computation to material, where it belongs,
and remove the useless call chain.

No functional change.

Revert "Remove questionable gcc flags from profile-build"

This reverts commit 0371a8f8c4a043cb3e7d08b5b8e7d08d49f28324.

Provide selective search depth info for each pv move

No functional change

Closes #1166

Move stop signal to Threads

Instead of having Signals in the search namespace,
make the stop variables part of the Threads structure.
This moves more of the shared (atomic) variables towards
the thread-related structures, making their role more clear.

No functional change

Closes #1149