We recently added a bonus for double pawn attacks on unsupported enemy pawns,
on June 27. However, it is possible that the unsupported pawn may become a passer
by simply pushing forward out of the double attack. By rewarding double attacks,
we may inadvertently reward the creation of enemy passers, by encouraging both of
our would-be stoppers to attack the enemy pawn even if there is no opposing
friendly pawn on the same file.
Here, we revise this term to exclude passed pawns. In order to simplify the code
with this change included, we non-functionally rewrite Attacked2Unsupported to
be a penalty for enemy attacks on friendly pawns, rather than a bonus for our
attacks on enemy pawns. This allows us to exclude passed pawns with a simple
& ~e->passedPawns[Us], while passedPawns[Them] is not yet defined in this part
of the code.
This dramatically reduces the proportion of positions in which Attacked2Unsupported
is applied, to about a third of the original. To compensate, maintaining the same
average effect across our bench positions, we nearly triple Attacked2Unsupported
from S(0, 20) to S(0, 56). Although this pawn formation is rare, it is worth more
than half a pawn in the endgame!
Daniel Axtens [Wed, 10 Jul 2019 23:12:54 +0000 (09:12 +1000)]
Enable popcount and prefetch for ppc-64
PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.
Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.
On my POWER8 VM, this leads to a ~4.27% speedup.
Fir prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).
Smoothly change playing strength with skill level. (#2142)
The current skill levels (1-20) allow for adjusting playing strengths, but
do so in big steps (e.g. level 10 vs level 11 is a ~143 Elo jump at STC).
Since the 'Skill Level' input can already be a floating point number, this
patch uses the fractional part of the input to provide the user with
fine control, allowing for varying the playing strength essentially
continuously.
The implementation internally still uses integer skill levels (needed since they pick Depths),
but non-deterministically rounds up or down the used skill level such that the average integer
skill corresponds to the input floating point one. As expected, intermediate
(fractional) skill levels yield intermediate playing strenghts.
Tested at STC, playing level 10 against levels between 10 and 11 for 10000 games
Introduce coordination between searching threads (#2204)
this patch improves threading performance by introducing some coordination between threads.
The observation is that threading is an area where a lot of Elo can potentially be gained:
https://github.com/glinscott/fishtest/wiki/UsefulData#elo-from-threading
At STC, 8 threads gain roughly 320 Elo, vs sequential at the same time,
however, loses 66 Elo against a single thread with 8x more time.
This 66 Elo should be partially recoverable with improved threading.
To improve threading, this patch introduces some LMR at nodes that are already being searched by other threads.
This requires some coordination between threads, avoiding however synchronisation.
To do so, threads leave a trail of breadcrumbs to mark the nodes they are searching.
These breadcrumbs are stored in a small hash table, which is only probed at low plies (currently ply < 8).
A couple of variants of this patch passed both STC and LTC threaded tests.
I picked the simpler, more robust version.
I expect that further tests can find further improvements.
For the sequential code there is no change in bench, and an earlier version of this patch passed a non-regression test.
STC (10+0.1 @ 1 thread)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 10471 W: 2364 L: 2220 D: 5887
http://tests.stockfishchess.org/tests/view/5d087ee20ebc5925cf0a6381
passed the additional non-regression tests at 2 and 4 threads 20+0.2 TC. The code was rebased on master prior to testing.
Jörg Oster [Sun, 30 Jun 2019 13:16:20 +0000 (15:16 +0200)]
Try to get a more precise bench time (#2211)
Initialization of larger hash sizes can take some time.
Don't include this time in the bench by resetting the timer after Search::clear().
Also move 'ucinewgame' command down in the list, so that it is processed
after the configuration of Threads and Hash size.
interesting result ! I would have expected that search would resolve such positions
correctly on the very next move. This is not a very common pattern, and when it happens,
it will quickly disappear. So I'm quite surprised that it passed LTC.
I would be even more surprised if this would resist a simplification.
Anyway, let's try to imagine a few cases.
a) If you have White d5 f5 against Black e6, and White to move
last move by Black was probably a capture on e6 and White is about to recapture on e6
b) If you have White d5 f5 against e6, and Black to move
last move by White was possibly a capture on d5 or f5
or the pawn on e6 was pinned or could not move for some reason.
and white wants to blast open the position and just pushed d4-d5 or f4-f5
Some possible follow-ups
a) Motif is so rare that the popcount() can be safely replaced with a bool()
But this would not pass a SPRT[0,4],
So try a simplification with bool() and also without the & ~theirAttacks
b) If it works, we probably can simply have this in the loop
if (lever) score += S(0, 20);
c) remove all this and tweak something in search for pawn captures (priority, SEE, extension,..)
Vizvezdenec [Thu, 27 Jun 2019 07:21:50 +0000 (09:21 +0200)]
Introduce attacks on space area
This patch introduces a small malus for every square in our space mask
that is attacked by enemy. The value of the malus is completely arbitrary
and is something we can tweak, also maybe we can gain some elo with tweaking
space threshold after this addition.
joergoster [Thu, 27 Jun 2019 06:56:35 +0000 (08:56 +0200)]
Improve multiPV mode
Skip all moves during the Non-PV (zero-window) search which will be
searched as PV moves later anyways. We also wake sure the moves will
be reported to the GUI despite they're not being searched — some GUIs
may get confused otherwise, and it would unnecessarily complicate the
code.
Vizvezdenec [Fri, 21 Jun 2019 08:04:31 +0000 (10:04 +0200)]
Rewrite "More bonus for free passed pawn"
-removes wideUnsafeSquares bitboard
-removes a couple of bitboard operations
-removes one if operator
-updates comments so they actually represent what this part of code is doing now.
Normally I would think such changes are risky, specially for PvNodes,
but after trying a few other versions, it seems this version is more
sound than I initially thought.
Aside from this, a small funtional change is made to return
singularBeta instead of beta to be more consistent with the fail-soft
logic used in other parts of search.
=============================
How to continue from there ?
We could try to audit other parts of the search where the "cutNode"
variable is used, and try to use dynamic info based on heuristic
eval rather than on this variable, to check if the idea behind this
patch could also be applied successfuly.
Fixes issues #2126 and #2189 where no progress in rootDepth is made for particular fens:
8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46
the cause are the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, and progess can not be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.
Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:
VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a
syzygy1 [Tue, 18 Jun 2019 21:27:34 +0000 (23:27 +0200)]
Partial revert of "Assorted trivial cleanups 5/2019".
Since root_probe() and root_probe_wdl() do not reset all tbRank values if they fail,
it is necessary to do this in rank_root_move(). This fixes issue #2196.
Alternatively, the loop could be moved into both root_probe() and root_probe_wdl().
VoyagerOne [Fri, 14 Jun 2019 17:59:17 +0000 (13:59 -0400)]
Simplify SEE Pruning (#2191)
Simplify SEE Pruning
Note this should also be a speedup...
If givesCheck is extended we know (except for DC) that it will have a positive SEE. So this new logic will be triggered before doing another expensive SEE function.
Increase size of the pawns table by the factor 8. This decreases the number of recalculations of pawn structure information significantly (at least at LTC).
I have done measurements for different depths and pawn cache sizes.
First are given the number of pawn entry calculations are done (in parentheses is the frequency that a call to probe triggers a pawn entry calculation). The delta% are the percentage of less done pawn entry calculations in comparison to master
STC: (non-regression test because at STC the effect is smaller than at LTC)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22767 W: 5160 L: 5040 D: 12567
http://tests.stockfishchess.org/tests/view/5d00f6040ebc5925cf09c3e2
original idea from early 2018 by @jerrydonaldwatson
Code slightly rewritten to be shorter and more logical, no functinal changes
compared to passed patch.
protonspring [Mon, 3 Jun 2019 13:16:33 +0000 (07:16 -0600)]
Simplify Outposts #2176
This is a functional simplification. This is NOT the exact version that was tested. Beyond the testing, an assignment was removed and a piece changes for consistency.
Instead of rewarding ANY square past an opponent pawn as an "outpost," only use squares that are protected by our pawn. I believe this is more consistent with what the chess world calls an "outpost."
31m059 [Sun, 9 Jun 2019 12:19:07 +0000 (08:19 -0400)]
Simplify k-value for passers. Bench: 3854907 (#2182)
Stockfish evaluates passed pawns in part based on a variable k, which shapes the passed pawn bonus based on the number of squares between the current square and promotion square that are attacked by enemy pieces, and the number defended by friendly ones. Prior to this commit, we gave a large bonus when all squares between the pawn and the promotion square were defended, and if they were not, a somewhat smaller bonus if at least the pawn's next square was. However, this distinction does not appear to provide any Elo at STC or LTC.
Where do we go from here? Many promising Elo-gaining patches were attempted in the past few months to refine passed pawn calculation, by altering the definitions of unsafe and defended squares. Stockfish uses these definitions to choose the value of k, so those tests interact with this PR. Therefore, it may be worthwhile to retest previously promising but not-quite-passing tests in the vicinity of this patch.
protonspring [Wed, 29 May 2019 08:00:32 +0000 (02:00 -0600)]
Simplify semiopen_file (#2165)
This is a non-functional simplification. Since our file_bb handles either Files or Squares, using Square here removes some code. Not likely any performance difference despite the test.
We evaluate defended and unsafe squares for a passed pawn push based on friendly and enemy rooks and queens on the passed pawn's file. Prior to this patch, we further required that these rooks and queens be able to directly attack the passed pawn. However, this restriction appears unnecessary and worth almost exactly 0 Elo at LTC.
The simplified code allows rooks and queens to attack/defend the passed pawn through other pieces of either color.
protonspring [Thu, 16 May 2019 12:13:16 +0000 (06:13 -0600)]
Score and Select Best Thread in same loop (#2125)
This is a non-functional simplification that combines vote counting and thread selecting in the same loop.
It is possible that the best thread would be updated more frequently than master, but I'm not sure it matters here. Perhaps "mostVotes" is a better name than "bestVote?"
xoto10 [Sun, 31 Mar 2019 15:33:32 +0000 (16:33 +0100)]
Update failedHighCnt rule #2063
Treat all threads the same as main thread and increment
failedHighCnt on fail highs. This makes the search try
again at lower depth.
@vondele suggested also changing the reset of failedHighCnt
when there is a fail low. Tests including this passed so the
branch has been updated to include both changes. failedHighCnt
is now handled exactly the same in helper threads and the main
thread. Thanks vondele :-)
mstembera [Wed, 15 May 2019 08:41:58 +0000 (01:41 -0700)]
Remove per thread instances of Endgames. (#2056)
Similar to PSQT we only need one instance of the Endgames resource. The current per thread copies are identical and read only(after initialization) so from a design point of view it doesn't make sense to have them.
Tested for no slowdown.
http://tests.stockfishchess.org/tests/view/5c94377a0ebc5925cfff43ca
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17320 W: 3487 L: 3359 D: 10474
This is a functional simplification that simplifies
some of the math for connected pawns. The bench is
different because I moved a /2 from opposed into
the connected array.
svivanov72 [Wed, 15 May 2019 08:22:21 +0000 (11:22 +0300)]
Precompute repetition info (#2132)
Store repetition info in StateInfo instead of recomputing it in
three different places. This saves some work in has_game_cycle()
where this info is needed for positions before the root.
Tested for non-regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34104 W: 7586 L: 7489 D: 19029
http://tests.stockfishchess.org/tests/view/5cd0676e0ebc5925cf044b56
xoto10 [Sun, 5 May 2019 13:22:40 +0000 (14:22 +0100)]
Remove pawn count in space() calculation #2139
Simplification. Various attempts to optimise the pawn
count bonus showed little effect, so remove pawn count
altogether and compensate by subtracting 1 instead of 4.
High rootDepths, selDepths and generally searches are increasingly
common with long time control games, analysis, and improving hardware.
In this case, depths of MAX_DEPTH/MAX_PLY (128) can be reached,
and the search tree is truncated.
In principle MAX_PLY can be easily increased, except for a technicality
of storing depths in a signed 8 bit int in the TT. This patch increases
MAX_PLY by storing the depth in an unsigned 8 bit, after shifting by the
most negative depth stored in TT (DEPTH_NONE).
No regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42235 W: 9565 L: 9484 D: 23186
http://tests.stockfishchess.org/tests/view/5cdb35360ebc5925cf0595e1
Verified to reach high depths on
k1b5/1p1p4/pP1Pp3/K2pPp2/1P1p1P2/3P1P2/5P2/8 w - -
info depth 142 seldepth 154 multipv 1 score cp 537 nodes 26740713110 ...
Miguel Lahoz [Tue, 7 May 2019 15:55:56 +0000 (23:55 +0800)]
Remove PvNode template from reduction
This functional simplification removes the PvNode reduction and adjusts
the ttPv lmr condition accordingly. Their definitions only differ by the
inclusions of ttPv. Aside from this, shallow move pruning definition
will be the only other functional difference, but this does not seem to
matter too much.
Sergei Ivanov [Sun, 14 Apr 2019 12:50:37 +0000 (15:50 +0300)]
Fix cycle detection in presence of repetitions
In master search() may incorrectly return a draw score in the following
corner case: there was a 2-fold repetition during the game, and the
current position can be reached by a move from a repeated one. This case
is treated as an upcoming 3-fold repetition, which it is not.
Here is a testcase demonstrating the issue (note that the moves
after FEN are required). The input:
position fen 8/8/8/8/8/8/p7/2k4K b - - 0 1 moves c1b1 h1g1 b1c1 g1h1 c1b1 h1g1 b1a1 g1h1
go movetime 1000
saying that the game will be drawn by repetion. However the other possible
move for black, Kb2, avoids repetitions and wins. The patch fixes this behavior.
In particular it finds mate in 10 in the above position.
this patch proposes to simplify back to this basic and easier to
understand version. In case there is a need to run a [-3, 1] VLTC on
this one, it can be done, but it is resource intensive, and not needed
IMO.
Looking at this 2 ideas being recent clean elo gainers I have a feeling that we can add also rook and queen protection bonuses or overall move this stuff in pieces loop in the same way as we do pieces attacking bonuses on their kingring... :) Thx fisherman for original idea.
The "move" class variable is Movepick is removed (removes some abstraction) which saves a few assignment operations, and the effects of "filter" is limited to the current move (movePtr). The resulting code is a bit more verbose, but it is also more clear what is going on. This version is NOT tested, but is substantially similar to:
We can remove the values in Pawns if we just use the piece arrays in Position. This reduces the size of a pawn entry. This simplification passed individually, and in concert with ps_passedcount100 (removes passedCount storage in pawns.).
This is very dangerous code by itself because triggers **at the leafs**
and in the above position keeps extending endlessly. In normal games
time deadline makes the search to stop sooner or later, but in fixed
seacrch we just hang possibly for a very long time. This is not acceptable
because 'go depth 13' shall not be a surprise for any position.
Introduce a new search extension when pushing an advanced passed pawn is
also suggested by the first killer move. There have been previous tests
which have similar ideas, mostly about pawn pushes, but it seems to be
overkill to extend too many moves. My idea is to limit the extension to
when a move happens to be noteworthy in some other way as well, such as
in this case, when it is also a killer move.
For future tests, it looks like this will interact heavily with passed
pawn evaluation. It may be good to try more variants of some of the more
promising evaluations tests/tweaks.
Instead of looping through kfrom,kto, rfrom, rto, we can use BetweenBB. This is less lines of code and it is more clear what castlingPath actually is. Personal benchmarks are all over the place. However, this code is only executed when loading a position, so performance doesn't seem that relevant.
Raise kingDanger threshold and adjust constant term #2087
The kingDanger term is intended to give a penalty which increases rapidly in the middlegame but less so in the endgame. To this end, the middlegame component is quadratic, and the endgame component is linear. However, this produces unintended consequences for relatively small values of kingDanger: the endgame penalty will exceed the middlegame penalty. This remains true up to kingDanger = 256 (a S(16, 16) penalty), so some of these inaccurate penalties are actually rather large.
In this patch, we increase the threshold for applying the kingDanger penalty to eliminate some of this unintended behavior. This was very nearly, but not quite, sufficient to pass on its own. The patch was finally successful by integrating a second kingDanger tweak by @Vizvezdenec, increasing the kingDanger constant term slightly and improving both STC and LTC performance.
Where do we go from here? I propose that in the future, any attempts to tune kingDanger coefficients should also consider tuning the kingDanger threshold. The evidence shows clearly that it should not be automatically taken to be zero.
Special thanks to @Vizvezdenec for the kingDanger constant tweak. Thanks also to all the approvers and CPU donors who made this possible!
Marco Costalba [Sat, 6 Apr 2019 10:43:41 +0000 (12:43 +0200)]
Fix sed for OS X (#2080)
The sed command is a bit different in Mac OS X (why not!).
The ‘-i’ option required a parameter to tell what extension to add for the
backup file. To fix it, just add extension for backup file, for example ‘.bak’
Use average bestMoveChanges across all threads #2072
The current update only by main thread depends on the luck of
whether main thread sees any/many changes to the best move or not.
It then makes large, lumpy changes to the time to be
used (1x, 2x, 3x, etc) depending on that sample of 1.
Use the average across all threads to get a more reliable
number with a smoother distribution.
Further work:
Similar changes might be possible for the fallingEval and timeReduction calculations (and elsewhere?), using either the min, average or max values across all threads.
protonspring [Mon, 25 Mar 2019 19:04:14 +0000 (13:04 -0600)]
Use simple array for Pawns Connected bonus #2061
Simplification which removes the pawns connected array.
Instead of storing the values in an array, the values are
calculated real-time. This is about 1.6% faster on my machines.
Moez Jellouli [Sun, 31 Mar 2019 08:51:08 +0000 (10:51 +0200)]
Shuffle detection #2064
Shuffle detection procedure :
Shuffling positions are detected if
the last 36 moves are reversible (rule50_count() > 36),
the position have been already in the TT,
there is a still a pawn on the board (to avoid special endings like KBN vs K).
The position is then judged as a draw.
An extension is realized if we already made 14 successive reversible moves in PV to accelerate the detection of the eventual draw.
To go further : we can still improve the idea. The length of the tests need a lot of ressources.
the limit of 36 is logic but must be checked again for special zugzwang positions,
this limit can be decreased in special positions,
the limit of 14 moves for extension has not been tuned.
protonspring [Sun, 31 Mar 2019 08:43:20 +0000 (02:43 -0600)]
Accessor for SquareBB #2067
This is a non-functional code style change.
If we add an accessor function for SquareBB we can consolidate all of the asserts. This is also a bit cleaner because all SquareBB accesses go through this method making future changes easier to manage.
This is covered by the line just before. If we would like to protect
against the piece value of e.g. a N == B, this could be done by an
assert, no need to do this at runtime.
protonspring [Sun, 24 Mar 2019 16:37:38 +0000 (10:37 -0600)]
Simplify Passed Pawns (#2058)
This is a non-functional simplification/speedup.
The truth-table for popcount(support) >= popcount(lever) - 1 is:
------------------lever
------------------0-------1---------2
support--0------X-------X---------0
-----------1------X-------X---------X
-----------2------X-------X---------X
Thus, it is functionally equivalent to just do: support || !more_than_one(lever) which removes the expensive popcounts and the -1.
Result of 20 runs:
base (...h_master.exe) = 1451680 +/- 8202
test (./stockfish ) = 1454781 +/- 8604
diff = +3101 +/- 931
xoto10 [Sun, 10 Mar 2019 21:57:48 +0000 (21:57 +0000)]
Remove !extension check #2045
While looking at pruning using see_ge() (which is very valuable)
it became apparent that the !extension test is not adding any
value - simplify it away.
Further work could be to optimize the remaining see_ge() test. The idea of less pruning at higher depths is valuable, but perhaps the test (-PawnValueEg * depth) can be improved.
CoffeeOne [Wed, 20 Mar 2019 13:50:41 +0000 (14:50 +0100)]
Skip skipping thread scheme (#1972)
Several simplification tests (all with the bounds [-3,1]) were run:
5+0.05 8 threads, failed very quickly:
http://tests.stockfishchess.org/tests/view/5c439a020ebc5902bb5d3970
20+0.2 8 threads, also failed, but needed a lot more games:
http://tests.stockfishchess.org/tests/view/5c44b1b70ebc5902bb5d4e34
Marco Costalba [Tue, 12 Mar 2019 07:35:10 +0000 (08:35 +0100)]
Increase thread stack for OS X (#2035)
On OS X threads other than the main thread are created with a reduced stack
size of 512KB by default, this is dangerously low for deep searches, so
adjust it to TH_STACK_SIZE. The implementation calls pthread_create() with
proper stack size parameter.
Verified for no regression at STC enabling the patch on all platforms where
pthread is supported.