Guenther Demetz [Wed, 31 May 2023 09:48:18 +0000 (11:48 +0200)]
Simplify away SEE verification
After 4 simplificatons over PR#4453 the idea does not yield significant
improvement anymore. Maybe also
https://tests.stockfishchess.org/tests/view/640c88092644b62c3394c1c5 was
a fluke.
Muzhen Gaming [Fri, 2 Jun 2023 11:55:25 +0000 (19:55 +0800)]
Search tuning at very long time control with new net
The most significant change would be the singularBeta formula.
It was first tested by cj5716 (see https://tests.stockfishchess.org/tests/view/647317c9d29264e4cfa74ec7),
and I took much inspiration from that idea.
Linmiao Xu [Fri, 12 May 2023 22:07:20 +0000 (18:07 -0400)]
Update NNUE architecture to SFNNv6 with larger L1 size of 1536
Created by training a new net from scratch with L1 size increased from 1024 to 1536.
Thanks to Vizvezdenec for the idea of exploring larger net sizes after recent
training data improvements.
A new net was first trained with lambda 1.0 and constant LR 8.75e-4. Then a strong net
from a later epoch in the training run was chosen for retraining with start-lambda 1.0
and initial LR 4.375e-4 decaying with gamma 0.995. Retraining was performed a total of
3 times, for this 4-step process:
1. 400 epochs, lambda 1.0 on filtered T77+T79 v6 deduplicated data
2. 800 epochs, end-lambda 0.75 on T60T70wIsRightFarseerT60T74T75T76.binpack
3. 800 epochs, end-lambda 0.75 and early-fen-skipping 28 on the master dataset
4. 800 epochs, end-lambda 0.7 and early-fen-skipping 28 on the master dataset
In the training sequence that reached the new nn-8d69132723e2.nnue net,
the epochs used for the 3x retraining runs were:
1. epoch 379 trained on T77T79-filter-v6-dd.min.binpack
2. epoch 679 trained on T60T70wIsRightFarseerT60T74T75T76.binpack
3. epoch 799 trained on the master dataset
windfishballad [Tue, 23 May 2023 00:13:44 +0000 (20:13 -0400)]
Removed quadratic term in optimism
Remove term which is quadratic in optimism in the eval.
Simplifies and should also remove the bias towards side to move making the eval better for analysis.
update CPU contributors list, the previous update was a couple of months ago,
and unfortunately, was not quite accurate for the number of games played.
This version is based clean calculation from the DB and
an updated script that tracks things (see https://github.com/glinscott/fishtest/pull/1702).
xoto10 [Fri, 19 May 2023 18:58:18 +0000 (19:58 +0100)]
Simplify optimism calculation
This change removes one of the constants in the calculation of optimism. It also changes the 2 constants used with the scale value so that they are independent, instead of applying a constant to the scale and then adjusting it again when it is applied to the optimism. This might make the tuning of these constants cleaner and more reliable in the future.
STC 10+0.1 (accidentally run as an Elo gainer:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 154080 W: 41119 L: 40651 D: 72310
Ptnml(0-2): 375, 16840, 42190, 17212, 423
https://tests.stockfishchess.org/tests/live_elo/64653eabf3b1a4e86c317f77
Remove the constant term of the history threshold which lowers the chance that pruning occurs.
As compensation allow pruning at a slightly higher depth.
Passed LTC: retest on top of VLTC tuning PR 4571 because this changes the history depth factor (use this new factor here)
https://tests.stockfishchess.org/tests/view/6467300e165c4b29ec0afd3f
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 99270 W: 26840 L: 26707 D: 45723
Ptnml(0-2): 36, 9753, 29928, 9878, 40
Michael Chaly [Sun, 7 May 2023 20:33:04 +0000 (23:33 +0300)]
Refine deeper post-lmr searches
This patch improves logic conditions for performing deeper searches after passed LMR.
Instead of exceeding alpha by some margin now it requires to exceed the
current best value - which may be lower than alpha (but never bigger since we
update alpha with bestvalue if it exceeds alpha).
Michael Chaly [Fri, 5 May 2023 04:54:12 +0000 (07:54 +0300)]
Reduce more if current node has a lot of refuted moves.
This patch refines idea of cutoff count - in master we reduce more if current node has at least 4 moves that are refuted by search, this patch increases this count by 1 if refutation happened without having a tt move.
Created by retraining nn-dabb1ed23026.nnue with a dataset composed of:
* The previous best dataset (nn-1ceb1a57d117.nnue dataset)
* Adding de-duplicated T80 data from feb2023 and the last 10 days of jan2023, filtered with v6-dd
Initially trained with the same options as the recent master net (nn-1ceb1a57d117.nnue).
Around epoch 890, training was manually stopped and max epoch increased to 1000.
Created by retraining the master net with these changes to the dataset:
* Extending v6 filtering to data from T77 dec2021, T79 may2022, and T80 nov2022
* Reducing the number of duplicate positions, prioritizing position scores seen later in time
* Using a binpack minimizer to reduce the overall data size
Trained the same way as the previous master net, aside from the dataset changes:
The new v6-dd filtering reduces duplicate positions by iterating over hourly data files within leela test runs, starting with the most recent, then keeping positions the first time they're seen and ignoring positions that are seen again. This ordering was done with the assumption that position scores seen later in time are generally more accurate than scores seen earlier in the test run. Positions are de-duplicated based on piece orientations, the first token in fen strings.
The binpack minimizer was run with default settings after first merging monthly data into single binpacks.
This idea is a result of my second condition combination tuning for reductions:
https://tests.stockfishchess.org/tests/view/643ed5573806eca398f06d61
There were used two parameters per combination: one for the 'sign' of the first and the second condition in a combination. Values >= 50 indicate using a condition directly and values <= -50 means use the negation of a condition.
Each condition pair (X,Y) had two occurances dependent of the order of the two conditions:
- if X < Y the parameters used for more reduction
- if X > Y the parameters used for less reduction
- if X = Y then only one condition is present and A[X][X][0]/A[X][X][1] stands for using more/less reduction for only this condition.
The parameter pair A[7][2][0] (value = -94.70) and A[7][2][1] (value = 93.60) was one of the strongest signals with values near 100/-100.
Here condition nr. 7 was '(ss+1)->cutoffCnt > 3' and condition nr. 2 'move == ttMove'. For condition nr. 7 the negation is used because A[7][2][0] is negative.
This translates finally to less reduction (because 7 > 2) for tt moves if child cutoffs <= 3.
Michael Chaly [Tue, 11 Apr 2023 16:53:36 +0000 (19:53 +0300)]
Simplify stats assignment for Pv nodes
This patch is a simplification of my recent elo gainer.
Logically the Elo gainer didn't make much sense and this patch simplifies it into smth more logical.
Instead of assigning negative bonuses to all non-first moves that enter PV nodes
we assign positive bonuses in full depth search after LMR only for moves that
will result in a fail high - thus not assigning positive bonuses
for moves that will go to pv search - so doing "almost" the same as we do in master now for them.
Logic differs for some other moves, though, but this removes some lines of code.
peregrineshahin [Thu, 16 Mar 2023 21:04:41 +0000 (00:04 +0300)]
Fix capturing underpromotions issue
Fix underpromotion captures are generated amongst quiets although dealt with as a capture_stage in search, this makes not skipping them when move count pruning kicks-in consistent with updating their histories amongst captures.
Reduce the number of overloads for pieces() by using a more general template implementation.
Secondly simplify some code in search.cpp using the new general functionality.
The current implementation generates warnings on MSVC. However, we have
no real use cases for double-typed UCI option values now. Also parameter
tuning only accepts following three types:
The calculation of rootComplexity can't call eval when in check.
Doing so triggers an assert if compiled in debug mode when
the rootpos is evaluated using classical eval.
If the position is not in TT, decrease depth by 2
or by 4 if the TT entry for the current position was hit
and the stored depth is greater than or equal to the current depth.
Many thanks to Vizvezdenec as the main idea was his.
Maxim Masiutin [Fri, 31 Mar 2023 15:16:50 +0000 (18:16 +0300)]
Replace deprecated icc with icx
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang, and yields good performance.
As before, currently only linux is supported.
MinetaS [Thu, 30 Mar 2023 11:14:30 +0000 (11:14 +0000)]
Simplify away complexityAverage
Instead of tracking the average of complexity values, calculate
complexity of root position at the beginning of the search and use it as
a scaling factor in time management.
Maxim Masiutin [Sun, 12 Mar 2023 13:16:51 +0000 (15:16 +0200)]
Improve compatibility
this makes it easier to compile under MSVC, even though we recommend gcc/clang for production compiles at the moment.
In Win32 API, by default, most null-terminated character strings arguments are of wchar_t (UTF16, formerly UCS16-LE) type, i.e. 2 bytes (at least) per character. So, src/misc.cpp should have proper type. Respectively, for src/syzygy/tbprobe.cpp, in Widows, file paths should be std::wstring rather than std::string. However, this requires a very big number of changes, since the config files are also keeping the 8-bit-per-character std::string strings. Therefore, just one change of using 8-byte-per-character CreateFileA make it compile under MSVC.
Miguel Lahoz [Mon, 27 Mar 2023 16:06:24 +0000 (00:06 +0800)]
Clean up repetitive declarations for see_ge
The occupied bitboard is only used in one place and is otherwise thrown away.
To simplify use, see_ge function can instead be overloaded.
Repetitive declarations for occupied bitboard can be removed.
Created by retraining the master net with these modifications:
* New filtering methods for existing data from T80 sep+oct2022, T79 apr2022, T78 jun+jul+aug+sep2022, T77 dec2021
* Adding new filtered data from T80 aug2022 and T78 apr+may2022
* Increasing early-fen-skipping from 28 to 30
The v3 filtering used for data from T77dec 2021 differs from v2 filtering in that:
* To improve binpack compression, positions after ply 28 were skipped during training by setting position scores to VALUE_NONE (32002) instead of removing them entirely
* All early-game positions with ply <= 28 were removed to maximize binpack compression
* Only bestmove captures at d6pv2 search were skipped, not 2nd bestmove captures
* Binpack compression was repaired for the remaining positions by effectively replacing bestmoves with "played moves" to maintain contiguous sequences of positions in the training game data
After improving binpack compression, The T77 dec2021 data size was reduced from 95G to 19G.
The v6 filtering used for data from T80augsepoctT79aprT78aprtosep 2022 differs from v2 in that:
* All positions with only one legal move were removed
* Tighter score differences at d6pv2 search were used to remove more positions with only one good move than before
* d6pv2 search was not used to remove positions where the best 2 moves were captures
MinetaS [Wed, 22 Mar 2023 07:50:55 +0000 (16:50 +0900)]
Remove non_pawn_material in NNUE::evaluate
After "Use NNUE complexity in search, retune related parameters" commit,
the effect of non-pawn material adjustment has been nearly diminished.
This patch removes pos.non_pawn_material as a simplification, which
passed non-regression tests with both STC and LTC.
Michael Chaly [Fri, 24 Mar 2023 04:45:22 +0000 (07:45 +0300)]
Simplify statScore initialization
This patch simplifies initialization of statScore to "always set it up to 0" instead of setting it up to 0 two plies deeper.
Reason for why it was done in previous way partially was because of LMR usage of previous statScore which was simplified long time ago so it makes sense to make in more simple there.
FauziAkram [Tue, 21 Mar 2023 20:58:25 +0000 (23:58 +0300)]
Update Elo estimates for terms in search
Setting the Elo value of some functions which were not set before.
All tests run at 10+0.1 (STC), 25000 games (Same as #4294).
Book used: UHO_XXL_+0.90_+1.19.epd
Values are rounded to the nearest non-negative integer.
pb00067 [Mon, 20 Mar 2023 07:56:44 +0000 (08:56 +0100)]
Verified SEE pruning for capturing and checking moves.
Patch analyzes field after SEE exchanges concluded with a recapture by
the opponent:
if opponent Queen/Rook/King results under attack after the exchanges, we
consider the move sharp and don't prune it.
Important note:
By accident I forgot to adjust 'occupied' when the king takes part in
the exchanges.
As result of this a move is considered sharp too, when opponent king
apparently can evade check by recapturing.
Surprisingly this seems contribute to patch's strength.
pb00067 [Mon, 13 Mar 2023 17:32:40 +0000 (18:32 +0100)]
Remove 'si' StateInfo variable/parameter.
Since st is a member of position we don't need to pass it separately as
parameter.
While being there also remove some line in pos_is_ok, where
a copy of StateInfo was made by using default copy constructor and
then verified it's correctedness by doing a memcmp.
There is no point in doing that.
The clang 16 release will remove the -fexperimental-new-pass-manager
flag (see https://github.com/llvm/llvm-project/commit/69b2b7282e92a1b576b7bd26f3b16716a5027e8e).
Thus, the commit adapts the Makefile to use this flag only for older
clang versions.
Michael Chaly [Mon, 13 Mar 2023 12:57:58 +0000 (15:57 +0300)]
Do more singular extensions
This patch continues trend of last VLTC tuning - as measured by dubslow
most of it gains was in lowering marging in calculation of singularBeta.
This patch is a manual adjustment on top of it - it lowers multiplier
of depth in calculation of singularBeta even further, from 2/3 to 1,5/2,5.
Dubslow [Fri, 10 Mar 2023 00:33:13 +0000 (18:33 -0600)]
More negative extensions on nonsingular nonpv nodes.
Following up the previous gainer also in this nonsingular node
section of code. Credit shared with @FauziAkram for realizing
this nonsingular node stuff had some potential, and @XInTheDark
for reminding us that !PvNodes better handle extensions/reductions
than Pv.
Michael Chaly [Mon, 6 Mar 2023 22:54:20 +0000 (01:54 +0300)]
Do more negative extensions
This patch does negatively extend transposition table move if singular search failed and tt value is not bigger than alpha.
Logic is close to what we had before recent simplification of negative extensions but uses or condition instead of and condition.
https://github.com/official-stockfish/Stockfish/commit/5c75c1c2fbb7bb4f0bf7c44fb855c415b788cbf7
introduced a capture_stage() function, but TB usage needs a pure capture() function.
disservin [Sat, 4 Mar 2023 15:34:34 +0000 (16:34 +0100)]
Add wiki to artifacts
snapshot the wiki https://github.com/official-stockfish/stockfish/wiki as part of the artifacts generated.
This will allow future release to include the wiki pages as a form of documentation
Michael Chaly [Fri, 24 Feb 2023 09:09:45 +0000 (12:09 +0300)]
Fix duplicated moves generation in movepicker
in a some of cases movepicker returned some moves more than once which lead
to them being searched more than once. This bug was possible because of how
we use queen promotions - they are generated as a captures but are not
included in position function which checks if move is a capture. Thus if
any refutation (killer or countermove) was a queen promotion it was
searched twice - once as a capture and one as a refutation.
This patch affects various things, namely stats assignments for queen promotions
and other moves if best move is queen promotion,
also some heuristics in search and qsearch.
With this patch every queen promotion is now considered a capture.
After this patch number of found duplicated moves is 0 during normal 13 depth bench run.
Created by retraining the master net with modifications to the previous best dataset:
* Improving T80 oct+nov 2022 endgame lambda accuracy by rescoring with 12-16tb of syzygy 7p tablebases
* Filtering T78 jun+jul+aug 2022 with d6pv2 search to remove positions with bestmove captures or one good move
* Adding T80 sep 2022 data, rescored with 16tb of 7p tablebases, unfiltered
Trained with max-epoch 900, end-lambda 0.7, and early-fen-skipping 28.
Dubslow [Wed, 22 Feb 2023 11:45:43 +0000 (05:45 -0600)]
Late counter bonus: boost underestimated moves
The idea here is very intuitive: since we've just proven that the move is good, then if it previously had poor stats, boost those stats more than otherwise.
Call the recently added hint function for NNUE accumulator update after a failed probcut search.
In this case we already searched at least some captures and tt move which, however, is not sufficient for a cutoff.
So it seems we have a greater chance that the full search will also have no cutoff and hence all moves have to be searched.
pb00067 [Sun, 26 Feb 2023 08:59:35 +0000 (09:59 +0100)]
Use common_parent_position hint also at PVNodes TT hits.
Credits to Stefan Geschwentner (locutus2) showing that the hint
is useful on PvNodes. In contrast to his test,
this version avoids to use the hint when in check.
I believe checking positions aren't good candidates for the hint
because:
- evasion moves are rather few, so a checking pos. has much less childs
than a normal position
- if the king has to move the NNUE eval can't use incremental updates,
so the child nodes have to do a full refresh anyway.
Michael Chaly [Fri, 24 Feb 2023 15:25:24 +0000 (18:25 +0300)]
Search tuning at very long time control
This patch is a result of tuning session of approximately 100k games at 120+1.2.
Biggest changes are in extensions, stat bonus and depth reduction for nodes without a tt move.
Linmiao Xu [Tue, 21 Feb 2023 16:17:59 +0000 (11:17 -0500)]
Reintroduce nnue pawn scaling with lower lazy thresholds
Params found with the nevergrad TBPSA optimizer via nevergrad4sf modified to:
* use SPRT LLR with fishtest STC elo gainer bounds [0, 2] as the objective function
* increase the game batch size after each new optimal point is found
The params were the optimal point after TBPSA iteration 7 and 160 nevergrad evaluations with:
* initial batch size of 96 games per evaluation
* batch size increase of 64 games after each iteration
* a budget of 512 evaluations
* TC: fixed 1.5 million nodes per move, no time limit
nevergrad4sf enables optimizing stockfish params with TBPSA:
https://github.com/vondele/nevergrad4sf
Using pentanomial game results with smaller game batch sizes was inspired by:
Use of SPRT LLR calculated from pentanomial game results as the objective function was an experiment at maximizing the information from game batches to reduce the computational cost for TBPSA to converge on good parameters.
For the exact code used to find the params:
https://github.com/linrock/tuning-fork
This patch introduces `hint_common_parent_position()` to signal that potentially several child nodes will require an NNUE eval. By populating explicitly the accumulator, these subsequent evaluations can be performed more efficiently.
This was based on the observation that calculating the evaluation in an excluded move position yielded a significant Elo gain, even though the evaluation itself was already available (work by pb00067).
Sopel wrote the code to perform just the accumulator update. This PR is based on cleaned up code that
The sdot instruction computes (and accumulates) a signed dot product,
which is quite handy for Stockfish's NNUE code. The instruction is
optional for Armv8.2 and Armv8.3, and mandatory for Armv8.4 and above.
The commit adds a new 'arm-dotprod' architecture with enabled dot
product support. It also enables dot product support for the existing
'apple-silicon' architecture, which is at least Armv8.5.
The following local speed test was performed on an Apple M1 with
ARCH=apple-silicon. I had to remove CPU pinning from the benchmark
script. However, the results were still consistent: Checking both
binaries against themselves reported a speedup of +0.0000 and +0.0005,
respectively.
```
Result of 100 runs
==================
base (...ish.037ef3e1) = 1917997 +/- 7152
test (...fish.dotprod) = 2159682 +/- 9066
diff = +241684 +/- 2923
MinetaS [Mon, 13 Feb 2023 02:54:59 +0000 (11:54 +0900)]
Fix overflow in add_dpbusd_epi32x2
This patch fixes 16bit overflow in *_add_dpbusd_epi32x2 functions,
that can be triggered in rare cases depending on the NNUE weights.
While the code leads to some slowdown on affected architectures
(most notably avx2), the fix is simpler than some of the other
options discussed in
https://github.com/official-stockfish/Stockfish/pull/4394