peregrineshahin [Tue, 27 Jun 2023 03:07:20 +0000 (06:07 +0300)]
Fix pruning to in-TB-loss scores in null move pruning.
The current logic can apply null move pruning to a dead-lost position,
returning an unproven loss
(i.e. an in-TB-loss score or a mated-in score) on nonPv nodes.
On a default bench, this can be observed by adding this debugging line:
```
if (nullValue >= beta)
{
    // Do not return unproven mate or TB scores
    nullValue = std::min(nullValue, VALUE_TB_WIN_IN_MAX_PLY - 1);

    dbg_hit_on(nullValue <= VALUE_TB_LOSS_IN_MAX_PLY); // Hit #0: Total 73983 Hits 1 Hit Rate (%) 0.00135166

    if (thisThread->nmpMinPly || depth < 14)
        return nullValue;
```
This fixes that very rare issue (it occurs ~0.00135166% of the time, per the hit rate above) by
not attempting Null Move Pruning in dead-lost positions at all,
leaving them to be resolved by the normal search flow.
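A minimal sketch of the adopted guard, assuming the fix adds one extra condition to the null-move-pruning preconditions in search.cpp (the real condition set contains many more terms):
```
// Never try null move pruning when beta is already at or below a
// TB-loss score: a dead-lost position falls through to normal search.
if (   !PvNode
    && (ss-1)->currentMove != MOVE_NULL
    && beta > VALUE_TB_LOSS_IN_MAX_PLY)  // new guard: skip dead-lost positions
{
    // ... null move search ...
}
```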
The previous attempt at a fix was not as safe, because it capped
the returned value to just outside the TB range, thus reviving the dead-lost position
with an artificial clamp (i.e. the in-TB score/mate score could be lost on that nonPv node):
https://tests.stockfishchess.org/tests/view/649756d5dc7002ce609cd794
In CI, it is typical for the process to halt immediately when an error
is encountered. However, with our `shell: bash {0}` configuration,
the process continues despite errors for posix shells.
This commit updates the behavior of posix and msys2 shells to ensure
consistency in terms of pipeline exit codes and stop conditions.
We adopt the most appropriate default behavior as recommended
by the GitHub documentation.
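As a sketch of the difference (the invocation strings follow the GitHub documentation; the surrounding workflow syntax is illustrative):
```
# previous override: no fail-fast, no pipefail
shell: bash {0}   # runs: bash {0}

# recommended default behavior for posix shells
shell: bash       # runs: bash --noprofile --norc -eo pipefail {0}
```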
Update the code that searches for the bench value in the git log:
- to be compatible with the new shell settings
- to retrieve the value from the first line that contains
only the template and spaces/tabs/newlines
Linmiao Xu [Sun, 25 Jun 2023 21:44:28 +0000 (17:44 -0400)]
Update NNUE architecture to SFNNv7 with larger L1 size of 2048
Creating this net involved:
- a 5-step training process from scratch
- greedy permuting L1 weights with https://github.com/official-stockfish/Stockfish/pull/4620
- leb128 compression with https://github.com/glinscott/nnue-pytorch/pull/251
- greedy 2- and 3-cycle permuting with https://github.com/official-stockfish/Stockfish/pull/4640
The 5 training steps were:
1. 400 epochs, lambda 1.0, lr 9.75e-4
UHOx2-wIsRight-multinet-dfrc-n5000-largeGensfen-d9.binpack (178G)
nodes5000pv2_UHO.binpack
data_pv-2_diff-100_nodes-5000.binpack
wrongIsRight_nodes5000pv2.binpack
multinet_pv-2_diff-100_nodes-5000.binpack
dfrc_n5000.binpack
large_gensfen_multipvdiff_100_d9.binpack
ep399 chosen as start model for step2
2. 800 epochs, end-lambda 0.75, skip 16
LeelaFarseer-T78juntoaugT79marT80dec.binpack (141G)
T60T70wIsRightFarseerT60T74T75T76.binpack
test78-junjulaug2022-16tb7p.no-db.min.binpack
test79-mar2022-16tb7p.no-db.min.binpack
test80-dec2022-16tb7p.no-db.min.binpack
ep559 chosen as start model for step3
3. 800 epochs, end-lambda 0.725, skip 20
leela96-dfrc99-v2-T80dectofeb-sk20-mar-v6-T77decT78janfebT79apr.binpack (223G)
leela96-filt-v2.min.binpack
dfrc99-16tb7p-eval-filt-v2.min.binpack
test80-dec2022-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-jan2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-feb2023-16tb7p-filter-v6-sk20.min-mar2023.binpack
test80-mar2023-2tb7p-filter-v6.min.binpack
test77-dec2021-16tb7p.no-db.min.binpack
test78-janfeb2022-16tb7p.no-db.min.binpack
test79-apr2022-16tb7p.no-db.min.binpack
ep499 chosen as start model for step4
4. 800 epochs, end-lambda 0.7, skip 24, 0dd1cebea57 dataset (https://github.com/official-stockfish/Stockfish/pull/4606)
ep599 chosen as start model for step5
5. 800 epochs, end-lambda 0.7, skip 28
same dataset as step4
ep619 became nn-1b951f8b449d.nnue
SF training data components for the step1 dataset:
https://drive.google.com/drive/folders/1yLCEmioC3Xx9KQr4T7uB6GnLm5icAYGU
Leela training data for steps 2-5 can be found at:
https://robotmoon.com/nnue-training-data/
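For illustration, a step like step 5 above would typically be launched through the nnue-pytorch easy_train.py wrapper roughly as follows; the flag names and paths here are assumptions based on that script's interface, not the exact command used for this net:
```
python3 easy_train.py \
    --experiment-name L1-2048-step5 \
    --training-dataset /data/step4-dataset.binpack \
    --early-fen-skipping 28 \
    --start-lambda 1.0 \
    --end-lambda 0.7 \
    --max_epoch 800 \
    --gpus 0
```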
Due to the larger L1 size and slower inference, the speed penalty loses Elo
at STC. Measurements from 100 bench runs at depth 13 with x86-64-modern
on an Intel Core i5-1038NG7 2.00GHz:
cj5716 [Mon, 26 Jun 2023 11:40:22 +0000 (19:40 +0800)]
Negative extension on cutNodes based on depth
This patch was inspired by candirufish's original attempt at negative extensions,
which failed yellow: https://tests.stockfishchess.org/tests/view/6486529065ffe077ca124f32
I tested some variations of the idea and tuned a depth condition for
a modified version of it here: https://tests.stockfishchess.org/tests/view/648db80a91c58631ce31fe00
after noticing abnormal scaling (i.e. many versions passed STC but not LTC).
After some small tweaks I got the final version here
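As a rough illustration of the shape of such a rule (the depth threshold and extension amount below are placeholders, not the tuned values from the final version):
```
// On expected cutNodes, extend less (a negative extension) once the
// remaining depth is large enough; placeholder threshold shown.
if (cutNode && depth >= 10)
    extension = -1;
```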
Activation data taken from https://drive.google.com/drive/folders/1Ec9YuuRx4N03GPnVPoQOW70eucOKngQe?usp=sharing
Permutation found using https://github.com/Ergodice/nnue-pytorch/blob/836387a0e5e690431d404158c46648710f13904d/ftperm.py
See also https://github.com/glinscott/nnue-pytorch/pull/254
The algorithm greedily selects 2- and 3-cycles that can be permuted to increase the number of runs of zeroes. The percentage of zero runs increased from 68.46 in the master net to 70.11 with 2-cycles, and only to 70.32 when also considering 3-cycles. Interestingly, allowing both halves of L1 to intermix when creating zero runs can give another 0.5% zero-run density increase with this method.
Measured speedup:
```
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
Result of 50 runs
base (./stockfish.master ) = 1561556 +/- 5439
test (./stockfish.patch  ) = 1575788 +/- 5427
diff = +14231 +/- 2636
```
A new major release of Stockfish is now available at
https://stockfishchess.org/download/
*Quality of chess play*
Stockfish continues to demonstrate its ability to discover superior moves
with remarkable speed. In self-play against Stockfish 15, this new
release gains up to 50 Elo[1] and wins up to 12 times more game pairs[2]
than it loses. In major chess engine tournaments, Stockfish reliably tops
the rankings[3] winning the TCEC season 24 Superfinal, Swiss, Fischer
Random, and Double Random Chess tournaments and the CCC 19 Bullet,
20 Blitz, and 20 Rapid competitions. Leela Chess Zero[4] was the
challenger in most finals, putting top-engine chess now firmly in the
hands of teams embracing free and open-source software.
*Progress made*
This updated version of Stockfish introduces several enhancements,
including an upgraded neural net architecture (SFNNv6)[5], improved
implementation, and refined parameterization. The ongoing utilization
of Leela’s data combined with a novel inference approach exploiting
sparsity[6], and network compression[7] ensure a speedy evaluation and
modest binary sizes while allowing for more weights and higher accuracy.
The search has undergone more optimization, leading to improved
performance, particularly in longer analyses[8]. Additionally,
the Fishtest framework has been improved and is now able to run the
tests needed to validate new ideas with 10000s of CPU cores.
*Usability improvements*
Stockfish now comes with documentation, found in the wiki folder when
downloading it or on GitHub[9]. Additionally, Stockfish now includes
a clear and consistent forced tablebase win score, displaying a value
of 200 minus the number of plies required to reach a tablebase win[10].
Furthermore, the UCI_Elo option, to reduce its strength, has been
calibrated[11]. It is worth noting that the evaluation system remains
consistent with Stockfish 15.1[12], maintaining the choice that 100cp
means a 50% chance of winning the game against an equal opponent[13].
Finally, binaries of our latest development version are now provided
continuously as pre-releases on GitHub, making it easier for
enthusiasts to download the latest and strongest version of
the program[14]. We thank Roman Korba for having provided a similar
service for a long time.
*Thank you*
The success of the Stockfish project relies on the vibrant community
of passionate enthusiasts (we appreciate each and every one of you!)
who generously contribute their knowledge, time, and resources.
Together, this dedicated community works towards the common goal of
developing a powerful, freely accessible, and open-source chess engine.
We invite all chess enthusiasts to join the Fishtest testing framework
and contribute to the project[15].
disservin [Tue, 13 Jun 2023 17:30:01 +0000 (19:30 +0200)]
create prereleases upon push to master
Using GitHub Actions, create a prerelease for the latest commit to master.
As such, a development version will be available on GitHub in addition to the latest release.
Permutation found using https://gist.github.com/AndrovT/359c831b7223c637e9156b01eb96949e.
Uses a greedy algorithm that goes sequentially through the output positions and
chooses a neuron for that position such that the number of nonzero quartets is the smallest.
Andreas Matthies [Tue, 13 Jun 2023 04:24:04 +0000 (06:24 +0200)]
Fix for MSVC compilation.
MSVC needs two more explicit casts to compile the new affine_transform_sparse_input.
See https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm256_castsi256_ps
and https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_castsi128_ps
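For illustration, the casts in question reinterpret integer vectors as float vectors without emitting any instruction; a minimal sketch (the surrounding affine_transform_sparse_input code is omitted):
```
#include <immintrin.h>

// MSVC does not implicitly convert between __m256i/__m256 or
// __m128i/__m128, so the reinterpretation must be spelled out.
__m256 as_ps_256(__m256i v) { return _mm256_castsi256_ps(v); }
__m128 as_ps_128(__m128i v) { return _mm_castsi128_ps(v); }
```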
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
Guenther Demetz [Wed, 31 May 2023 09:48:18 +0000 (11:48 +0200)]
Simplify away SEE verification
After 4 simplifications over PR#4453 the idea does not yield significant
improvement anymore. Maybe
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5 was
also a fluke.
Muzhen Gaming [Fri, 2 Jun 2023 11:55:25 +0000 (19:55 +0800)]
Search tuning at very long time control with new net
The most significant change would be the singularBeta formula.
It was first tested by cj5716 (see https://tests.stockfishchess.org/tests/view/647317c9d29264e4cfa74ec7),
and I took much inspiration from that idea.
Linmiao Xu [Fri, 12 May 2023 22:07:20 +0000 (18:07 -0400)]
Update NNUE architecture to SFNNv6 with larger L1 size of 1536
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
windfishballad [Tue, 23 May 2023 00:13:44 +0000 (20:13 -0400)]
Removed quadratic term in optimism
Remove the term that is quadratic in optimism from the eval.
This simplifies the code and should also remove the bias towards the side to move, making the eval better for analysis.
Update the CPU contributors list; the previous update was a couple of months ago
and, unfortunately, was not quite accurate in the number of games played.
This version is based on a clean calculation from the DB and
an updated script that tracks things (see https://github.com/glinscott/fishtest/pull/1702).
xoto10 [Fri, 19 May 2023 18:58:18 +0000 (19:58 +0100)]
Simplify optimism calculation
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer):
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
Remove the constant term of the history threshold, which lowers the chance that pruning occurs.
As compensation, allow pruning at a slightly higher depth.
Passed LTC (retested on top of VLTC tuning PR 4571, because that PR changes the history depth factor; the new factor is used here):
https://tests.stockfishchess.org/tests/view/6467300e165c4b29ec0afd3f
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 99270 W: 26840 L: 26707 D: 45723
Ptnml(0-2): 36, 9753, 29928, 9878, 40
Michael Chaly [Sun, 7 May 2023 20:33:04 +0000 (23:33 +0300)]
Refine deeper post-lmr searches
This patch improves the conditions for performing a deeper search after a passed LMR search.
Instead of requiring the value to exceed alpha by some margin, it now requires it to exceed the
current best value - which may be lower than alpha (but never greater, since we
update alpha with bestValue whenever it exceeds alpha).
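A sketch of the changed trigger (the margin constants are placeholders rather than the patch's tuned values; d is the reduced depth used by the LMR search):
```
// Before: re-search deeper when the reduced search beat alpha by a margin.
// After:  compare against bestValue instead, which may be below alpha.
const bool doDeeperSearch = value > (bestValue + 50 + 10 * (newDepth - d));
```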
Michael Chaly [Fri, 5 May 2023 04:54:12 +0000 (07:54 +0300)]
Reduce more if current node has a lot of refuted moves.
This patch refines the idea of the cutoff count: in master we reduce more if the current node has at least 4 moves that were refuted by search; this patch increases that count by one more when a refutation happens without a tt move.
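A sketch of the refined counter, assuming the increment sits in the beta-cutoff bookkeeping and is consumed by the parent's reduction logic (the exact context in search.cpp differs):
```
// At a beta cutoff: weight refutations found without a tt move more heavily.
ss->cutoffCnt += 1 + !ttMove;

// In the parent's reduction logic: reduce more once the child node
// has refuted enough moves.
if ((ss + 1)->cutoffCnt > 3)
    r++;
```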
Created by retraining nn-dabb1ed23026.nnue with a dataset composed of:
* The previous best dataset (nn-1ceb1a57d117.nnue dataset)
* Adding de-duplicated T80 data from feb2023 and the last 10 days of jan2023, filtered with v6-dd
Initially trained with the same options as the recent master net (nn-1ceb1a57d117.nnue).
Around epoch 890, training was manually stopped and max epoch increased to 1000.
Created by retraining the master net with these changes to the dataset:
* Extending v6 filtering to data from T77 dec2021, T79 may2022, and T80 nov2022
* Reducing the number of duplicate positions, prioritizing position scores seen later in time
* Using a binpack minimizer to reduce the overall data size
Trained the same way as the previous master net, aside from the dataset changes:
The new v6-dd filtering reduces duplicate positions by iterating over hourly data files within leela test runs, starting with the most recent, then keeping positions the first time they're seen and ignoring positions that are seen again. This ordering was done with the assumption that position scores seen later in time are generally more accurate than scores seen earlier in the test run. Positions are de-duplicated based on piece orientations, the first token in fen strings.
The binpack minimizer was run with default settings after first merging monthly data into single binpacks.
This idea is a result of my second condition combination tuning for reductions:
https://tests.stockfishchess.org/tests/view/643ed5573806eca398f06d61
Two parameters were used per combination: one for the 'sign' of the first condition and one for the sign of the second condition in the combination. Values >= 50 indicate using a condition directly, and values <= -50 mean using the negation of the condition.
Each condition pair (X,Y) had two occurrences, depending on the order of the two conditions:
- if X < Y the parameters used for more reduction
- if X > Y the parameters used for less reduction
- if X = Y then only one condition is present and A[X][X][0]/A[X][X][1] stands for using more/less reduction for only this condition.
The parameter pair A[7][2][0] (value = -94.70) and A[7][2][1] (value = 93.60) was one of the strongest signals with values near 100/-100.
Here condition nr. 7 was '(ss+1)->cutoffCnt > 3' and condition nr. 2 'move == ttMove'. For condition nr. 7 the negation is used because A[7][2][0] is negative.
This finally translates (because 7 > 2) into less reduction for tt moves when the child node's cutoff count is <= 3.
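A sketch of the resulting rule (illustrative; the surrounding reduction logic in search.cpp carries more conditions):
```
// Reduce tt moves less when the child node produced few cutoffs.
if (move == ttMove && (ss + 1)->cutoffCnt <= 3)
    r--;
```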
Michael Chaly [Tue, 11 Apr 2023 16:53:36 +0000 (19:53 +0300)]
Simplify stats assignment for Pv nodes
This patch is a simplification of my recent Elo gainer.
Logically, the Elo gainer didn't make much sense, so this patch simplifies it into something more logical.
Instead of assigning negative bonuses to all non-first moves that enter PV nodes,
we assign positive bonuses in the full-depth search after LMR only to moves that
fail high - thus not assigning positive bonuses
to moves that will go on to the PV search, doing "almost" the same as master does for them.
The logic does differ for some other moves, though, and this removes some lines of code.
peregrineshahin [Thu, 16 Mar 2023 21:04:41 +0000 (00:04 +0300)]
Fix capturing underpromotions issue
Capturing underpromotions were generated among the quiet moves, although search deals with them as part of the capture stage. This fix makes not skipping them when move-count pruning kicks in consistent with updating their histories among the captures.
Reduce the number of overloads for pieces() by using a more general template implementation.
Secondly, simplify some code in search.cpp using the new general functionality.
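A minimal sketch of the idea, with names following position.h but the exact signature treated as illustrative: one variadic template replaces several fixed-arity overloads.
```
Bitboard pieces(PieceType pt) const { return byTypeBB[pt]; }

// Generalized accessor: pieces(KNIGHT, BISHOP, ROOK) folds into the
// bitwise-or of the single-type bitboards.
template<typename... PieceTypes>
Bitboard pieces(PieceType pt, PieceTypes... pts) const {
    return pieces(pt) | pieces(pts...);
}
```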
The current implementation generates warnings on MSVC. However, we have
no real use cases for double-typed UCI option values now. Also, parameter
tuning only accepts the following three types:
The calculation of rootComplexity can't call eval when in check.
Doing so triggers an assert if compiled in debug mode when
the rootpos is evaluated using classical eval.
If the position is not in TT, decrease depth by 2
or by 4 if the TT entry for the current position was hit
and the stored depth is greater than or equal to the current depth.
Many thanks to Vizvezdenec as the main idea was his.
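A sketch matching the description, reading "not in TT" as having no tt move (the real code carries additional guards around this reduction):
```
// Decrease depth by 2 without a tt move, and by 4 when there was a
// TT hit whose stored depth already covers the current depth.
if (!ttMove)
    depth -= 2 + 2 * (ss->ttHit && tte->depth() >= depth);
```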
Maxim Masiutin [Fri, 31 Mar 2023 15:16:50 +0000 (18:16 +0300)]
Replace deprecated icc with icx
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang and yields good performance.
As before, only Linux is currently supported.
MinetaS [Thu, 30 Mar 2023 11:14:30 +0000 (11:14 +0000)]
Simplify away complexityAverage
Instead of tracking the average of complexity values, calculate
complexity of root position at the beginning of the search and use it as
a scaling factor in time management.
Maxim Masiutin [Sun, 12 Mar 2023 13:16:51 +0000 (15:16 +0200)]
Improve compatibility
This makes it easier to compile under MSVC, even though we recommend gcc/clang for production compiles at the moment.
In the Win32 API, by default, most null-terminated character string arguments are of wchar_t (UTF-16, formerly UCS-2) type, i.e. at least 2 bytes per character. So src/misc.cpp should use the proper type. Likewise, in src/syzygy/tbprobe.cpp, on Windows, file paths should be std::wstring rather than std::string. However, this would require a very large number of changes, since the config files also keep 8-bit-per-character std::string strings. Therefore, a single change, using the 8-bit-per-character CreateFileA, makes it compile under MSVC.
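For illustration, a minimal sketch of opening a file through the explicitly-ANSI Win32 entry point, so the narrow std::string paths work without wide-string conversion (see src/syzygy/tbprobe.cpp for the real code):
```
#include <windows.h>
#include <string>

HANDLE open_file_narrow(const std::string& path) {
    // CreateFileA takes a narrow (char*) path; plain CreateFile would
    // resolve to CreateFileW in UNICODE builds and require a wide path.
    return CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
}
```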