Malus during move ordering for putting pieces en prise
The original idea is the reverse of a previous patch [1] which added bonuses
in our move picker to moves escaping threats. In this patch, in addition to
bonuses for evading threats, we also apply penalties to moves that put a piece
on a threatened square.
Further tweaks of that basic idea resulted in this specific version, which
scales the penalty for moving to a threatened square according to the piece
threatening it. So, for example, a queen moving to a square attacked by a pawn
receives a larger penalty than a queen moving to a square attacked by a rook.
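The sketch below illustrates the scoring idea in a simplified, self-contained form; the piece values, the scaling factor, and the helper names are placeholders, not the actual movepick.cpp code or tuned constants.
```
// Minimal sketch: the malus grows with the value of the moving piece and with
// how cheap the attacker of the destination square is, so a queen moving to a
// pawn-attacked square is penalised more than a queen moving to a rook-attacked
// square. All constants are illustrative placeholders.
#include <optional>

enum PieceType { PAWN, KNIGHT, BISHOP, ROOK, QUEEN };

int threat_malus(PieceType mover, std::optional<PieceType> cheapestAttacker)
{
    if (!cheapestAttacker)
        return 0;                                          // destination not attacked
    constexpr int value[] = {100, 300, 300, 500, 900};     // illustrative piece values
    int gain = value[mover] - value[*cheapestAttacker];    // material the opponent could win
    return gain > 0 ? 20 * gain : 0;                       // 20 is a placeholder scale
}

// Inside quiet-move scoring (schematically):
//   m.value = history_score(m) + evasion_bonus(m) - threat_malus(mover, cheapestAttacker);
```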
Also make two get_weight_index() static methods constexpr, for
consistency with the other static get_hash_value() method right above.
Tested for speed by user Torom (thanks).
This patch changes the frequency with which the time is checked, from every
1024 counted nodes to every 512 counted nodes. The master value was tuned for
the old classical eval; the patch takes the roughly 2x slowdown in nps with
SFNNUEv7 into account. This could slightly reduce the time losses seen on
fishtest, but those are probably unrelated.
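As a rough sketch of the mechanism (names are illustrative, not the actual Stockfish code), the clock query is amortised over counted nodes like this:
```
// Called once per counted node; only reads the clock every 512th call,
// where 512 is the new interval (previously 1024).
#include <chrono>

constexpr int TimeCheckInterval = 512;

bool should_stop(int& countdown,
                 std::chrono::steady_clock::time_point start,
                 std::chrono::milliseconds limit)
{
    if (--countdown > 0)
        return false;                  // skip the relatively costly clock read
    countdown = TimeCheckInterval;     // reset the countdown
    return std::chrono::steady_clock::now() - start >= limit;
}
```
Halving the interval roughly keeps the wall-clock spacing of time checks similar after the ~2x nps slowdown of the larger net.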
Michael Chaly [Wed, 19 Jul 2023 08:58:02 +0000 (11:58 +0300)]
Do more futility pruning for cutNodes that are not in TT
This is somewhat similar to IIR for cutNodes, but instead of reducing the depth
for cutNodes that don't have a TT move, we reduce the margin multiplier in
futility pruning for cutNodes that are not in TT.
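A rough sketch of the shape of the change (the constants are placeholders, not the tuned values):
```
// The futility margin is built from a per-depth multiplier; that multiplier is
// reduced when the node is a cutNode without a TT entry, so such nodes are
// futility-pruned more aggressively.
int futility_margin(int depth, bool improving, bool cutNodeNotInTT)
{
    int multiplier = 140;          // placeholder base multiplier
    if (cutNodeNotInTT)
        multiplier -= 40;          // placeholder reduction for TT-less cutNodes
    return multiplier * (depth - improving);
}

// In search (schematically):
//   if (depth < someLimit && eval - futility_margin(depth, improving, cutNode && !ttHit) >= beta)
//       return eval;
```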
Explicitly describe the architecture as deprecated; it remains available
under its current alias x86-64-sse41-popcnt.
CPUs that support just this instruction set are now years old; any Intel or
AMD CPU from the last few years supports x86-64-avx2. However,
naming things 'modern' doesn't age well, so explicit names are used instead.
Adjust CI accordingly. Wiki, fishtest, downloader done as well.
- use a fixed compiler on Linux and Windows (right now gcc 11)
- build avxvnni on Windows (Linux needs updated core utils)
- build x86-32 on Linux (Windows needs other mingw)
- fix a Makefile issue where a failed PGOBENCH would not stop the build
- reuse the WINE_PATH for SDE as we do for QEMU
- use the WINE_PATH variable also for the signature
- verify the bench for each of the binaries
- do not build x86-64-avx2 on macOS
Use Intel's Software Development Emulator (SDE) in the actions that build the binaries.
This allows building Windows and Linux binaries for
- x86-64-avx512
- x86-64-vnni256
- x86-64-vnni512
(x86-64-avxvnni needs more recent gcc in the actions)
Since the introduction of NNUE (first released with Stockfish 12), we
have maintained the classical evaluation as part of SF in frozen form.
The idea that this code could lead to further inputs to the NN or
search did not materialize. Now, after five releases, this PR removes
the classical evaluation from SF. Even though this evaluation is
probably the best of its class, it has become unimportant for the
engine's strength, and there is little need to maintain this
code (roughly 25% of SF) going forward, or to expend resources on
trying to improve its integration in the NNUE eval.
Indeed, it still had a very limited use in the current SF, namely
for the evaluation of positions that are nearly decided based on
material difference, where the speed of the classical evaluation
outweighs its inaccuracies. Its impact on strength is small,
roughly 2 Elo, and probably decreasing in importance as the TC grows.
Potentially, removal of this code could lead to the development of
techniques to have faster, but less accurate NN evaluation,
for certain positions.
peregrineshahin [Tue, 27 Jun 2023 03:07:20 +0000 (06:07 +0300)]
Fix pruning to in-TB-loss scores in Null move pruning.
The current logic can apply Null move pruning to a dead-lost position,
returning an unproven loss (i.e. an in-TB-loss score or a mated score)
at nonPv nodes.
On a default bench, this can be observed by adding this debugging line:
```
if (nullValue >= beta)
{
    // Do not return unproven mate or TB scores
    nullValue = std::min(nullValue, VALUE_TB_WIN_IN_MAX_PLY - 1);
    dbg_hit_on(nullValue <= VALUE_TB_LOSS_IN_MAX_PLY); // Hit #0: Total 73983 Hits 1 Hit Rate (%) 0.00135166
    if (thisThread->nmpMinPly || depth < 14)
        return nullValue;
```
This fixes this very rare issue (it happens ~0.00135166% of the time) by
not attempting Null Move Pruning at all in dead-lost positions and
leaving the result to be determined by the normal search flow.
The previous attempt at a fix was not as safe, because it capped the
returned value to just outside the TB range, thus reviving the dead-lost
position with an artificial clamp (i.e. the in-TB score/mate score could be
lost at that nonPv node):
https://tests.stockfishchess.org/tests/view/649756d5dc7002ce609cd794
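The shape of the new guard, in a much simplified sketch (the constant's value is a placeholder and the real null-move precondition has several more terms):
```
// Null move pruning is only attempted when beta is above the TB-loss range;
// in a dead-lost position (beta <= VALUE_TB_LOSS_IN_MAX_PLY) the normal
// search decides the outcome instead.
constexpr int VALUE_TB_LOSS_IN_MAX_PLY = -31000;   // placeholder magnitude

bool may_try_null_move(int eval, int beta)
{
    return eval >= beta
        && beta > VALUE_TB_LOSS_IN_MAX_PLY;   // added guard: skip NMP when dead lost
}
```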
In CI, it is typical for the process to halt immediately when an error
is encountered. However, with our `shell: bash {0}` configuration,
the process continues despite errors for posix shells.
This commit updates the behavior of posix and msys2 shells to ensure
consistency in terms of pipeline exit codes and stop conditions.
We adopt the most appropriate default behavior as recommended
by the GitHub documentation.
Update the code that searches for the bench value in the git log:
- to be compatible with the new shell settings
- to retrieve the value from the first line that contains
only the template and spaces/tabs/newlines
Linmiao Xu [Sun, 25 Jun 2023 21:44:28 +0000 (17:44 -0400)]
Update NNUE architecture to SFNNv7 with larger L1 size of 2048
Creating this net involved:
- a 5-step training process from scratch
- greedy permuting L1 weights with https://github.com/official-stockfish/Stockfish/pull/4620
- leb128 compression with https://github.com/glinscott/nnue-pytorch/pull/251
- greedy 2- and 3- cycle permuting with https://github.com/official-stockfish/Stockfish/pull/4640
The 5 training steps were:
1. 400 epochs, lambda 1.0, lr 9.75e-4
UHOx2-wIsRight-multinet-dfrc-n5000-largeGensfen-d9.binpack (178G)
nodes5000pv2_UHO.binpack
data_pv-2_diff-100_nodes-5000.binpack
wrongIsRight_nodes5000pv2.binpack
multinet_pv-2_diff-100_nodes-5000.binpack
dfrc_n5000.binpack
large_gensfen_multipvdiff_100_d9.binpack
ep399 chosen as start model for step2
2. 800 epochs, end-lambda 0.75, skip 16
LeelaFarseer-T78juntoaugT79marT80dec.binpack (141G)
T60T70wIsRightFarseerT60T74T75T76.binpack
test78-junjulaug2022-16tb7p.no-db.min.binpack
test79-mar2022-16tb7p.no-db.min.binpack
test80-dec2022-16tb7p.no-db.min.binpack
ep559 chosen as start model for step3
3. 800 epochs, end-lambda 0.725, skip 20
leela96-dfrc99-v2-T80dectofeb-sk20-mar-v6-T77decT78janfebT79apr.binpack (223G)
leela96-filt-v2.min.binpack
dfrc99-16tb7p-eval-filt-v2.min.binpack
test80-dec2022-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-jan2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-feb2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-mar2023-2tb7p-filter-v6.min.binpack
test77-dec2021-16tb7p.no-db.min.binpack
test78-janfeb2022-16tb7p.no-db.min.binpack
test79-apr2022-16tb7p.no-db.min.binpack
ep499 chosen as start model for step4
4. 800 epochs, end-lambda 0.7, skip 24, 0dd1cebea57 dataset (https://github.com/official-stockfish/Stockfish/pull/4606)
ep599 chosen as start model for step5
5. 800 epochs, end-lambda 0.7, skip 28
same dataset as step4
ep619 became nn-1b951f8b449d.nnue
SF training data components for the step1 dataset:
https://drive.google.com/drive/folders/1yLCEmioC3Xx9KQr4T7uB6GnLm5icAYGU
Leela training data for steps 2-5 can be found at:
https://robotmoon.com/nnue-training-data/
Due to the larger L1 size and slower inference, the resulting speed penalty
loses Elo at STC. Measurements from 100 bench runs at depth 13 with x86-64-modern
on Intel Core i5-1038NG7 2.00GHz:
cj5716 [Mon, 26 Jun 2023 11:40:22 +0000 (19:40 +0800)]
Negative extension on cutNodes based on depth
This patch was inspired by candirufish's original attempt at negative extensions,
which failed yellow here: https://tests.stockfishchess.org/tests/view/6486529065ffe077ca124f32
I tested some variations of the idea and tuned a depth condition for
a modified version of it here https://tests.stockfishchess.org/tests/view/648db80a91c58631ce31fe00
after noticing abnormal scaling (i.e. many versions passed STC but not LTC).
After some small tweaks I got the final version here.
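An illustrative sketch of the general shape of such a depth-based negative extension (the depth threshold and extension amounts are placeholders, not the tuned values from the patch):
```
// cutNodes are searched with a negative extension, and the reduction is made
// larger once a depth condition is met.
int cutnode_extension(bool cutNode, int depth)
{
    if (!cutNode)
        return 0;
    return depth >= 9 ? -2 : -1;   // deeper cutNodes get reduced more (placeholder values)
}
```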
Activation data taken from https://drive.google.com/drive/folders/1Ec9YuuRx4N03GPnVPoQOW70eucOKngQe?usp=sharing
Permutation found using https://github.com/Ergodice/nnue-pytorch/blob/836387a0e5e690431d404158c46648710f13904d/ftperm.py
See also https://github.com/glinscott/nnue-pytorch/pull/254
The algorithm greedily selects 2- and 3-cycles that can be permuted to increase the number of runs of zeroes. The percent of zero runs from the master net increased from 68.46 to 70.11 from 2-cycles and only increased to 70.32 when considering 3-cycles. Interestingly, allowing both halves of L1 to intermix when creating zero runs can give another 0.5% zero-run density increase with this method.
Measured speedup:
```
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
Result of 50 runs
base (./stockfish.master ) = 1561556 +/- 5439
test (./stockfish.patch ) = 1575788 +/- 5427
diff = +14231 +/- 2636
```
A new major release of Stockfish is now available at
https://stockfishchess.org/download/
*Quality of chess play*
Stockfish continues to demonstrate its ability to discover superior moves
with remarkable speed. In self-play against Stockfish 15, this new
release gains up to 50 Elo[1] and wins up to 12 times more game pairs[2]
than it loses. In major chess engine tournaments, Stockfish reliably tops
the rankings[3], winning the TCEC season 24 Superfinal, Swiss, Fischer
Random, and Double Random Chess tournaments and the CCC 19 Bullet,
20 Blitz, and 20 Rapid competitions. Leela Chess Zero[4] was the
challenger in most finals, putting top-engine chess now firmly in the
hands of teams embracing free and open-source software.
*Progress made*
This updated version of Stockfish introduces several enhancements,
including an upgraded neural net architecture (SFNNv6)[5], improved
implementation, and refined parameterization. The ongoing utilization
of Leela’s data combined with a novel inference approach exploiting
sparsity[6], and network compression[7] ensure a speedy evaluation and
modest binary sizes while allowing for more weights and higher accuracy.
The search has undergone more optimization, leading to improved
performance, particularly in longer analyses[8]. Additionally,
the Fishtest framework has been improved and is now able to run the
tests needed to validate new ideas with 10000s of CPU cores.
*Usability improvements*
Stockfish now comes with documentation, found in the wiki folder when
downloading it or on GitHub[9]. Additionally, Stockfish now includes
a clear and consistent forced tablebase win score, displaying a value
of 200 minus the number of plies required to reach a tablebase win[10].
Furthermore, the UCI_Elo option, which reduces its playing strength, has been
calibrated[11]. It is worth noting that the evaluation system remains
consistent with Stockfish 15.1[12], maintaining the choice that 100cp
means a 50% chance of winning the game against an equal opponent[13].
Finally, binaries of our latest development version are now provided
continuously as pre-releases on GitHub, making it easier for
enthusiasts to download the latest and strongest version of
the program[14]. We thank Roman Korba for having provided a similar
service for a long time.
*Thank you*
The success of the Stockfish project relies on the vibrant community
of passionate enthusiasts (we appreciate each and every one of you!)
who generously contribute their knowledge, time, and resources.
Together, this dedicated community works towards the common goal of
developing a powerful, freely accessible, and open-source chess engine.
We invite all chess enthusiasts to join the Fishtest testing framework
and contribute to the project[15].
disservin [Tue, 13 Jun 2023 17:30:01 +0000 (19:30 +0200)]
create prereleases upon push to master
Using GitHub Actions, create a prerelease for the latest commit to master.
As such, a development version will be available on GitHub in addition to the latest release.
Permutation found using https://gist.github.com/AndrovT/359c831b7223c637e9156b01eb96949e.
Uses a greedy algorithm that goes sequentially through the output positions and
chooses a neuron for that position such that the number of nonzero quartets is the smallest.
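A simplified, self-contained sketch of that greedy idea (the data layout and function name are assumptions, not the actual script): given boolean activation data per sample and neuron, each output position is filled with the unused neuron that keeps the current quartet of four consecutive outputs zero in as many samples as possible.
```
#include <algorithm>
#include <cstddef>
#include <vector>

// activations[s][n] == true if neuron n was nonzero in sample s.
std::vector<std::size_t> greedy_permutation(const std::vector<std::vector<bool>>& activations)
{
    const std::size_t samples = activations.size();
    const std::size_t neurons = activations.empty() ? 0 : activations[0].size();

    std::vector<bool>        used(neurons, false);
    std::vector<std::size_t> perm;
    perm.reserve(neurons);

    // quartetNonzero[s] is true once the quartet currently being filled
    // already contains a nonzero activation in sample s.
    std::vector<bool> quartetNonzero(samples, false);

    for (std::size_t pos = 0; pos < neurons; ++pos)
    {
        if (pos % 4 == 0)   // a new quartet of output positions starts
            std::fill(quartetNonzero.begin(), quartetNonzero.end(), false);

        std::size_t best     = 0;
        std::size_t bestCost = samples + 1;

        for (std::size_t n = 0; n < neurons; ++n)
        {
            if (used[n])
                continue;

            // Cost: number of samples in which the quartet would be nonzero
            // if neuron n were placed at this position.
            std::size_t cost = 0;
            for (std::size_t s = 0; s < samples; ++s)
                cost += (quartetNonzero[s] || activations[s][n]);

            if (cost < bestCost)
            {
                bestCost = cost;
                best     = n;
            }
        }

        used[best] = true;
        perm.push_back(best);
        for (std::size_t s = 0; s < samples; ++s)
            quartetNonzero[s] = quartetNonzero[s] || activations[s][best];
    }
    return perm;
}
```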
Andreas Matthies [Tue, 13 Jun 2023 04:24:04 +0000 (06:24 +0200)]
Fix for MSVC compilation.
MSVC needs two more explicit casts to compile the new affine_transform_sparse_input.
See https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm256_castsi256_ps
and https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_castsi128_ps
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
Guenther Demetz [Wed, 31 May 2023 09:48:18 +0000 (11:48 +0200)]
Simplify away SEE verification
After 4 simplifications over PR#4453 the idea does not yield significant
improvement anymore. Maybe also
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5 was
a fluke.
Muzhen Gaming [Fri, 2 Jun 2023 11:55:25 +0000 (19:55 +0800)]
Search tuning at very long time control with new net
The most significant change would be the singularBeta formula.
It was first tested by cj5716 (see https://tests.stockfishchess.org/tests/view/647317c9d29264e4cfa74ec7),
and I took much inspiration from that idea.
Linmiao Xu [Fri, 12 May 2023 22:07:20 +0000 (18:07 -0400)]
Update NNUE architecture to SFNNv6 with larger L1 size of 1536
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
windfishballad [Tue, 23 May 2023 00:13:44 +0000 (20:13 -0400)]
Removed quadratic term in optimism
Remove the term which is quadratic in optimism from the eval.
This simplifies the eval and should also remove the bias towards the side to move, making the eval better for analysis.
Update the CPU contributors list; the previous update was a couple of months ago
and, unfortunately, was not quite accurate regarding the number of games played.
This version is based on a clean calculation from the DB and
an updated script that tracks things (see https://github.com/glinscott/fishtest/pull/1702).
xoto10 [Fri, 19 May 2023 18:58:18 +0000 (19:58 +0100)]
Simplify optimism calculation
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer):
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
Remove the constant term of the history threshold, which lowers the chance that pruning occurs.
As compensation, allow pruning at a slightly higher depth.
Passed LTC: retested on top of VLTC tuning PR 4571, because that PR changes the history depth factor (the new factor is used here)
https://tests.stockfishchess.org/tests/view/6467300e165c4b29ec0afd3f
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 99270 W: 26840 L: 26707 D: 45723
Ptnml(0-2): 36, 9753, 29928, 9878, 40
Michael Chaly [Sun, 7 May 2023 20:33:04 +0000 (23:33 +0300)]
Refine deeper post-lmr searches
This patch improves the logic conditions for performing a deeper search after a reduced LMR search exceeds alpha.
Instead of requiring the value to exceed alpha by some margin, it now requires exceeding the
current best value, which may be lower than alpha (but never higher, since we
update alpha with bestValue whenever it exceeds alpha).
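A simplified sketch of the condition after this change (the margin constants are placeholders, not the tuned values):
```
// After a reduced LMR search returns a value above alpha, the re-search is
// done at a greater depth only if the value also clears the current best
// value by a margin that grows with the amount of reduction.
bool do_deeper_search(int value, int bestValue, int newDepth, int reducedDepth)
{
    int margin = 64 + 11 * (newDepth - reducedDepth);   // placeholder constants
    return value > bestValue + margin;                  // was: value > alpha + margin
}
```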