From 85f8ee6199f8578fbc082fc0f37e1985813e637a Mon Sep 17 00:00:00 2001 From: Joost VandeVondele Date: Fri, 1 Jul 2022 15:07:49 +0200 Subject: [PATCH] Update default net to nn-3c0054ea9860.nnu MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit First things first... this PR is being made from court. Today, Tord and Stéphane, with broad support of the developer community are defending their complaint, filed in Munich, against ChessBase. With their products Houdini 6 and Fat Fritz 2, both Stockfish derivatives, ChessBase violated repeatedly the Stockfish GPLv3 license. Tord and Stéphane have terminated their license with ChessBase permanently. Today we have the opportunity to present our evidence to the judge and enforce that termination. To read up, have a look at our blog post https://stockfishchess.org/blog/2022/public-court-hearing-soon/ and https://stockfishchess.org/blog/2021/our-lawsuit-against-chessbase/ This PR introduces a net trained with an enhanced data set and a modified loss function in the trainer. A slight adjustment for the scaling was needed to get a pass on standard chess. passed STC: https://tests.stockfishchess.org/tests/view/62c0527a49b62510394bd610 LLR: 2.94 (-2.94,2.94) <0.00,2.50> Total: 135008 W: 36614 L: 36152 D: 62242 Ptnml(0-2): 640, 15184, 35407, 15620, 653 passed LTC: https://tests.stockfishchess.org/tests/view/62c17e459e7d9997a12d458e LLR: 2.94 (-2.94,2.94) <0.50,3.00> Total: 28864 W: 8007 L: 7749 D: 13108 Ptnml(0-2): 47, 2810, 8466, 3056, 53 Local testing at a fixed 25k nodes resulted in Test run1026/easy_train_data/experiments/experiment_2/training/run_0/nn-epoch799.nnue localElo: 4.2 +- 1.6 The real strength of the net is in FRC and DFRC chess where it gains significantly. Tested at STC with slightly different scaling: FRC: https://tests.stockfishchess.org/tests/view/62c13a4002ba5d0a774d20d4 Elo: 29.78 +-3.4 (95%) LOS: 100.0% Total: 10000 W: 2007 L: 1152 D: 6841 Ptnml(0-2): 31, 686, 2804, 1355, 124 nElo: 59.24 +-6.9 (95%) PairsRatio: 2.06 DFRC: https://tests.stockfishchess.org/tests/view/62c13a5702ba5d0a774d20d9 Elo: 55.25 +-3.9 (95%) LOS: 100.0% Total: 10000 W: 2984 L: 1407 D: 5609 Ptnml(0-2): 51, 636, 2266, 1779, 268 nElo: 96.95 +-7.2 (95%) PairsRatio: 2.98 Tested at LTC with identical scaling: FRC: https://tests.stockfishchess.org/tests/view/62c26a3c9e7d9997a12d6caf Elo: 16.20 +-2.5 (95%) LOS: 100.0% Total: 10000 W: 1192 L: 726 D: 8082 Ptnml(0-2): 10, 403, 3727, 831, 29 nElo: 44.12 +-6.7 (95%) PairsRatio: 2.08 DFRC: https://tests.stockfishchess.org/tests/view/62c26a539e7d9997a12d6cb2 Elo: 40.94 +-3.0 (95%) LOS: 100.0% Total: 10000 W: 2215 L: 1042 D: 6743 Ptnml(0-2): 10, 410, 3053, 1451, 76 nElo: 92.77 +-6.9 (95%) PairsRatio: 3.64 This is due to the mixing in a significant fraction of DFRC training data in the final training round. The net is trained using the easy_train.py script in the following way: ``` python easy_train.py \ --training-dataset=../Leela-dfrc_n5000.binpack \ --experiment-name=2 \ --nnue-pytorch-branch=vondele/nnue-pytorch/lossScan4 \ --additional-training-arg=--param-index=2 \ --start-lambda=1.0 \ --end-lambda=0.75 \ --gamma=0.995 \ --lr=4.375e-4 \ --start-from-engine-test-net True \ --tui=False \ --seed=$RANDOM \ --max_epoch=800 \ --auto-exit-timeout-on-training-finished=900 \ --network-testing-threads 8 \ --num-workers 12 ``` where the data set used (Leela-dfrc_n5000.binpack) is a combination of our previous best data set (mix of Leela and some SF data) and DFRC data, interleaved to form: The data is available in https://drive.google.com/drive/folders/1S9-ZiQa_3ApmjBtl2e8SyHxj4zG4V8gG?usp=sharing Leela mix: https://drive.google.com/file/d/1JUkMhHSfgIYCjfDNKZUMYZt6L5I7Ra6G/view?usp=sharing DFRC: https://drive.google.com/file/d/17vDaff9LAsVo_1OfsgWAIYqJtqR8aHlm/view?usp=sharing The training branch used is https://github.com/vondele/nnue-pytorch/commits/lossScan4 A PR to the main trainer repo will be made later. This contains a revised loss function, now computing the loss from the score based on the win rate model, which is a more accurate representation than what we had before. Scaling constants are tweaked there as well. closes https://github.com/official-stockfish/Stockfish/pull/4100 Bench: 5186781 --- src/evaluate.cpp | 2 +- src/evaluate.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/evaluate.cpp b/src/evaluate.cpp index 6d2e8da8..53613794 100644 --- a/src/evaluate.cpp +++ b/src/evaluate.cpp @@ -1102,7 +1102,7 @@ Value Eval::evaluate(const Position& pos, int* complexity) { if (useNNUE && !useClassical) { int nnueComplexity; - int scale = 1092 + 106 * pos.non_pawn_material() / 5120; + int scale = 1064 + 106 * pos.non_pawn_material() / 5120; Value optimism = pos.this_thread()->optimism[stm]; Value nnue = NNUE::evaluate(pos, true, &nnueComplexity); diff --git a/src/evaluate.h b/src/evaluate.h index e25bd52e..36d3b2b2 100644 --- a/src/evaluate.h +++ b/src/evaluate.h @@ -39,7 +39,7 @@ namespace Eval { // The default net name MUST follow the format nn-[SHA256 first 12 digits].nnue // for the build process (profile-build and fishtest) to work. Do not change the // name of the macro, as it is used in the Makefile. - #define EvalFileDefaultName "nn-3c0aa92af1da.nnue" + #define EvalFileDefaultName "nn-3c0054ea9860.nnue" namespace NNUE { -- 2.39.2