This patch is a simplification and a fix to dealing with null moves scores that returns proven mates or TB scores by preventing 'null move pruning' if the nullvalue is in that range.
Current solution downgrades nullValues on the non-PV node but the value can be used in a transposed PV-node to the same position afterwards (Triangulation), the later is prone to propagate a wrong score (96.05) to root that will not be refuted unless we search further.
Score of (96.05) can be obtained be two methods,
maxim static-eval returned on Pv update (mostly qSearch)
this downgrade (clamp) in NMP
and theoretically can happen with or without TBs but the second scenario is more dangerous than the first.
This fixes the reproducible case in very common scenarios with TBs as shown in the debugging at discord.
Michael Chaly [Sat, 14 Oct 2023 14:41:41 +0000 (17:41 +0300)]
Use more continuation histories.
This patch allows stats updates and movepicker bonuses for continuation history 3 plies deep - so counter counter move.
Updates and movepicker usage are done with 1/4 multiplier compared to other histories.
Simplify collection of bad moves for history updates.
1. collect only the first 32 moves searched and ignore the rest. So late bad moves get no further negative history updates.
2. collect now for quiet moves also at most 32 bad moves
mstembera [Sun, 1 Oct 2023 06:12:02 +0000 (23:12 -0700)]
Optimize the most common update accumalator cases w/o tiling
In the most common case where we only update a single state
it's faster to not use temporary accumulation registers and tiling.
(Also includes a couple of small cleanups.)
A simpler version
https://tests.stockfishchess.org/tests/view/65190dfacff46e538ee00155
also passed but this version is stronger still
https://tests.stockfishchess.org/tests/view/6519b95fcff46e538ee00fa2
This is a later epoch from the same experiment that led to the previous
master net. In training stage 6, max-epoch was raised to 1,200 near the
end of the first 1,000 epochs.
For more details, see https://github.com/official-stockfish/Stockfish/pull/4795
Local elo at 25k nodes per move (vs. L1-2048 nn-1ee1aba5ed4c.nnue)
ep1079 : 15.6 +/- 1.2
Linmiao Xu [Fri, 12 May 2023 22:07:20 +0000 (18:07 -0400)]
Update NNUE architecture to SFNNv8: L1-2560 nn-ac1dbea57aa3.nnue
Creating this net involved:
- a 6-stage training process from scratch. The datasets used in stages 1-5 were fully minimized.
- permuting L1 weights with https://github.com/official-stockfish/nnue-pytorch/pull/254
A strong epoch after each training stage was chosen for the next. The 6 stages were:
```
1. 400 epochs, lambda 1.0, default LR and gamma
UHOx2-wIsRight-multinet-dfrc-n5000 (135G)
nodes5000pv2_UHO.binpack
data_pv-2_diff-100_nodes-5000.binpack
wrongIsRight_nodes5000pv2.binpack
multinet_pv-2_diff-100_nodes-5000.binpack
dfrc_n5000.binpack
5. 960 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995, skip 28
Increased max-epoch to 960 near the end of the first 800 epochs 5af11540bbfe dataset: https://github.com/official-stockfish/Stockfish/pull/4635
6. 1000 epochs, end-lambda 0.7, LR 4.375e-4, gamma 0.995, skip 28
Increased max-epoch to 1000 near the end of the first 800 epochs 1ee1aba5ed dataset: https://github.com/official-stockfish/Stockfish/pull/4782
```
After removing classic evaluation VALUE_KNOWN_WIN is not anymore returned explicit evaluation. So remove and replace it with VALUE_TB_WIN_IN_MAX_PLY.
Measurement on my big bench (bench 16 1 16 pos1000.fen) verifies that at least with current net the calculated evaluation lies always in the open interval (-VALUE_KNOWN_WIN, VALUE_KNOWN_WIN).
So i consider this a non-functional change. But to be safe i tested this also at LTC as requested by Stephane Nicolet.
The commit adds a CI workflow that uses the included-what-you-use (IWYU)
tool to check for missing or superfluous includes in .cpp files and
their corresponding .h files. This means that some .h files (especially
in the nnue folder) are not checked yet.
The CI setup looks like this:
- We build IWYU from source to include some yet unreleased fixes.
This IWYU version targets LLVM 17. Thus, we get the latest release
candidate of LLVM 17 from LLVM's nightly packages.
- The Makefile now has an analyze target that just build the object
files (without linking)
- The CI uses the analyze target with the IWYU tool as compiler to
analyze the compiled .cpp file and its corresponding .h file.
- If IWYU suggests a change the build fails (-Xiwyu --error).
- To avoid false positives we use LLVM's libc++ as standard library
- We have a custom mappings file that adds some mappings that are
missing in IWYU's default mappings
We also had to add one IWYU pragma to prevent a false positive in
movegen.h.
uses a posix compatible script to find the native arch.
(based on ppigazzini's https://github.com/ppigazzini/stockfish-downloader )
use that native arch by default, no changes if ARCH is specified explicitly.
SF can now be compiled in an optimal way simply using
Now that qsearch has its own repetition detection we can flip the order of lines and remove the guard of depth < 0 which is not needed after reordering (i.e. it was there to prevent checking repetition again at depth ==0).
Cleanup code after dropping ICC support in favor of ICX
The commit removes all uses of ICC's __INTEL_COMPILER macro and other
references to ICC. It also adds ICX info to the compiler command and
fixes two typos in Makefile's help output.
Created by retraining the master net on a dataset composed by:
- adding Leela data from T60 jul-dec 2020, T77 nov 2021, T80 jun-jul 2023
- deduplicating and unminimizing parts of the dataset before interleaving
Trained initially with max epoch 800, then increased near the end of training
twice. First to 960, then 1200. After training, post-processing involved:
- greedy permuting L1 weights with https://github.com/official-stockfish/Stockfish/pull/4620
- greedy 2- and 3- cycle permuting with https://github.com/official-stockfish/Stockfish/pull/4640
In the list of datasets below, periods in the filename represent the sequence of
steps applied to arrive at the particular binpack. For example:
test77-dec2021-16tb7p.filter-v6-dd.min-mar2023.unminimized.binpack
1. test77 dec2021 data rescored with 16 TB of syzygy tablebases during data conversion
2. filtered with csv_filter_v6_dd.py - v6 filtering and deduplication in one step
3. minimized with the original mar2023 implementation of `minimize_binpack` in
the tools branch
4. unminimized by removing all positions with score == 32002 (`VALUE_NONE`)
Michael Chaly [Mon, 11 Sep 2023 12:37:18 +0000 (15:37 +0300)]
Do more futility pruning in qsearch
This patch introduces a third futility pruning heuristic in qsearch. The idea is
that the static exchange evaluation is much worse than the difference between
futility base and alpha. Thus we can assume that the probability of the move
being good enough to beat alpha is low so it can be pruned.
Fixes a bug introduced in 2f2f45f, where with AVX-512 the weights and input to
the last layer were being read out of bounds. Now AVX-512 is only used for the
layers it can be used for. Additional static assertions have been added to
prevent more errors like this in the future.
This is a cleanup PR that prepares the automatic checking of missing or
superfluous #include directives via the include-what-you-use (IWYU) tool
on the CI. Unfortunately, IWYU proposes additional includes for
"namespace std" although we don't need them.
To avoid the problem, the commit removes all "using namespace std"
statements from the code and directly uses the std:: prefix instead.
Alternatively, we could add specific usings (e.g. "using std::string")
foreach used type. Also, a mix of both approaches would be possible.
I decided for the prefix approach because most of the files were already
using the std:: prefixes despite the "using namespace std".
This patch implements the pure materialistic evaluation called simple_eval()
to gain a speed-up during Stockfish search.
We use the so-called lazy evaluation trick: replace the accurate but slow
NNUE network evaluation by the super-fast simple_eval() if the position
seems to be already won (high material advantage). To guard against some
of the most obvious blunders introduced by this idea, this patch uses the
following features which will raise the lazy evaluation threshold in some
situations:
- avoid lazy evals on shuffling branches in the search tree
- avoid lazy evals if the position at root already has a material imbalance
- avoid lazy evals if the search value at root is already winning/losing.
Moreover, we add a small random noise to the simple_eval() term. This idea
(stochastic mobility in the minimax tree) was worth about 200 Elo in the pure
simple_eval() player on Lichess.
Overall, the current implementation in this patch evaluates about 2% of the
leaves in the search tree lazily.
FauziAkram [Fri, 25 Aug 2023 12:42:44 +0000 (15:42 +0300)]
Rename one variable
To enhance code clarity and prevent potential confusion with the
'r' variable assigned to reduction later in the code, this pull
request renames it to 'reductionScale' when we use the same name
in the reduction() function.
Using distinct variable names for separate functions improves code
readability and maintainability.
Disservin [Sat, 26 Aug 2023 07:49:04 +0000 (09:49 +0200)]
Simplify README
The UCI protocol is rather technical and has little value in our README. Instead
it should be explained in our wiki. "Contributing" is moved above "Compiling
Stockfish" to make it more prominent.
Also move the CONTRIBUTING.md into the root directory and include it in the
distributed artifacts/releases.
Disservin [Wed, 23 Aug 2023 17:36:55 +0000 (19:36 +0200)]
Cleanup includes
Reorder a few includes, include "position.h" where it was previously missing
and apply include-what-you-use suggestions. Also make the order of the includes
consistent, in the following way:
1. Related header (for .cpp files)
2. A blank line
3. C/C++ headers
4. A blank line
5. All other header files
Play turbulent when defending, simpler when attacking
This patch decays a little the evaluation (up to a few percent) for
positions which have a large complexity measure (material imbalance,
positional compensations, etc).
This may have nice consequences on the playing style, as it modifies
the search differently for attack and defense, both effects being
desirable:
- to see the effect on positions when Stockfish is defending, let us
suppose for instance that the side to move is Stockfish and the nnue
evaluation on the principal variation is -100 : this patch will decay
positions with an evaluation of -103 (say) to the same level, provided
they have huge material imbalance or huge positional compensation.
In other words, chaotic positions with an evaluation of -103 are now
comparable in our search tree to stable positions with an evaluation
of -100, and chaotic positions with an evaluation of -102 are now
preferred to stable positions with an evaluation of -100.
- the effect on positions when Stockfish is attacking is the opposite.
Let us suppose for instance that the side to move is Stockfish and the
nnue evaluation on the principal variation is +100 : this patch will
decay the evaluation to +97 if the positions on the principal variation
have huge material imbalance or huge positional compensation. In other
words, stable positions with an evaluation of +97 are now comparable
in our search tree to chaotic positions with an evaluation of +100,
and stable positions with an evaluation of +98 are now preferred to
chaotic positions with an evaluation of +100.
So the effect of this small change of evaluation on the playing style
is that Stockfish should now play a little bit more turbulent when
defending, and choose slightly simpler lines when attacking.
Shahin M. Shahin [Sat, 19 Aug 2023 22:15:22 +0000 (01:15 +0300)]
Reduce repetitions branches
Increase reduction on retrying a move we just retreated that falls in a repetition:
if current move can be the same move from previous previous turn then we retreated
that move on the previous turn, this patch increases reduction if retrying that move
results in a repetition.
How to continue from there? Maybe we some variants of this idea could bring Elo too
(only testing the destination square, or triangulations, etc.)
cj5716 [Sat, 12 Aug 2023 06:56:23 +0000 (14:56 +0800)]
Do more full window searches
Remove the value < beta condition for doing full window searches.
As an added bonus the condition for full-window search is now much
more similar to other fail-soft engines.
Squared numbers are never negative, so barring any wraparound there
is no need to clamp to 0. From reading the code, there's no obvious
way to get wraparound, so the entire operation can be simplified
away. Updated original truncated code comments to be sensible.
Verified by running ./stockfish bench 128 1 24 and by the following test:
Matthies [Wed, 16 Aug 2023 09:11:27 +0000 (11:11 +0200)]
Allow compilation on Raspi (for ARMv8)
Current master fails to compile for ARMv8 on Raspi cause gcc (version 10.2.1)
does not like to cast between signed and unsigned vector types. This patch
fixes it by using unsigned vector pointer for ARM to avoid implicite cast.
Disservin [Mon, 14 Aug 2023 11:49:41 +0000 (13:49 +0200)]
Add -funroll-loops to CXXFLAGS
Optimize profiling data accuracy by enabling -funroll-loops during the profile
generation phase, in addition to its default activation by -fprofile-use.
This seems to produce a slightly faster binary, for most compilers.
Besides rebasing I replaced PawnValueMg w/ 126 explicitly to decouple from https://tests.stockfishchess.org/tests/view/64d212de5b17f7c21c0debbb by @peregrineshahin which also passed. #4734
Stéphane Nicolet [Thu, 10 Aug 2023 04:31:48 +0000 (06:31 +0200)]
Simplify SEE pruning for captures
It seems that the current search is smart enough to allow us to remove
(again) the block of code that checks for discovered attacks after the
first pruning condition for captures.
cj5716 [Tue, 8 Aug 2023 09:06:31 +0000 (17:06 +0800)]
Simplify a depth condition
As the negative extension term has sensitive scaling, it would make more sense to allow more negative extension also at lower depth, and not just a region between low and high depth.
a) Add further tests to CI to cover most features. This uncovered a potential race
in case setoption was sent between two searches. As the UCI protocol requires
this sent to be went the engine is not searching, setoption now ensures that
this is the case.
Tomasz Sobczyk [Mon, 7 Aug 2023 11:11:09 +0000 (13:11 +0200)]
Fix Makefile for incorrect nnue file
If an incorrect network file is present at the start of the compilation stage, the
Makefile script now correctly removes it before trying to download a clean version.
Michael Chaly [Sun, 6 Aug 2023 23:32:38 +0000 (02:32 +0300)]
Adjust futility pruning base in qsearch
Current master used value from transposition table there if it existed,
this patch uses minimum between this tt value and the static eval instead
(this thus is closer to the main search function, which uses the static eval).
Based on vondele's deletepsqt branch:
https://github.com/vondele/Stockfish/commit/369f5b051
This huge simplification uses a weighted material differences instead of
the positional piece square tables (psqt) in the semi-classical complexity
calculation. Tuned weights using spsa at 45+0.45 with:
int pawnMult = 100;
int knightMult = 325;
int bishopMult = 350;
int rookMult = 500;
int queenMult = 900;
TUNE(SetRange(0, 200), pawnMult);
TUNE(SetRange(0, 650), knightMult);
TUNE(SetRange(0, 700), bishopMult);
TUNE(SetRange(200, 800), rookMult);
TUNE(SetRange(600, 1200), queenMult);
The values obtained via this tuning session were for a model where
the psqt replacement formula was always from the point of view of White,
even if the side to move was Black. We re-used the same values for an
implementation with a psqt replacement from the point of view of the side
to move, testing the result both on our standard book on positions with
a strong White bias, and an alternate book with positions with a strong
Black bias.
We note that with the patch the last use of the venerable "Score" type
disappears in Stockfish codebase (the Score type was used in classical
evaluation to get a tampered eval interpolating values smoothly from the
early midgame stage to the endgame stage). We leave it to another commit
to clean all occurrences of Score in the code and the comments.
AndrovT [Tue, 1 Aug 2023 12:43:37 +0000 (14:43 +0200)]
Implement AffineTransformSparseInput for armv8
Implements AffineTransformSparseInput layer for the NNUE evaluation
for the armv8 and armv8-dotprod architectures. We measured some nice
speed improvements via 10 runs of our benchmark:
Malus during move ordering for putting pieces en prise
The original idea is the reverse of a previous patch [1] which added bonuses
in our move picker to moves escaping threats. In this patch, in addition to
bonuses for evading threats, we apply penalties to moves moving to threatened
squares.
Further tweaks of that basic idea resulted in this specific version which
further increases the penalty of moves moving to squares threatend depending
on the piece threatening it. So for example a queen moving to a square attacked
by a pawn would receive a larger penalty than a queen moving to square attacked
by a rook.
Also make two get_weight_index() static methods constexpr, for
consistency with the other static get_hash_value() method right above.
Tested for speed by user Torom (thanks).
This patch changes the frequency with which the time is checked, changing
frequency from every 1024 counted nodes to every 512 counted nodes. The
master value was tuned for the old classical eval, the patch takes the
roughly 2x slowdown in nps with SFNNUEv7 into account. This could reduce
a bit the losses on time on fishtest, but they are probably unrelated.
Michael Chaly [Wed, 19 Jul 2023 08:58:02 +0000 (11:58 +0300)]
Do more futility pruning for cutNodes that are not in TT
This is somewhat similar to IIR for cutnodes but instead of reducing
depth for cutnodes that don't have tt move we reduce margin multiplier
in futility pruning for cutnodes that are not in TT.
Explicitly describe the architecture as deprecated,
it remains available as its current alias x86-64-sse41-popcnt
CPUs that support just this instruction set are now years old,
any few years old Intel or AMD CPU supports x86-64-avx2. However,
naming things 'modern' doesn't age well, so instead use explicit names.
Adjust CI accordingly. Wiki, fishtest, downloader done as well.
use a fixed compiler on Linux and Windows (right now gcc 11).
build avxvnni on Windows (Linux needs updated core utils)
build x86-32 on Linux (Windows needs other mingw)
fix a Makefile issue where a failed PGOBENCH would not stop the build
reuse the WINE_PATH for SDE as we do for QEMU
use WINE_PATH variable also for the signature
verify the bench for each of the binaries
do not build x86-64-avx2 on macos