The unusual combined result of +12.0 +- 3.7 in the two VVLTC simplification SPRTs was caused by base having only 64MB of hash instead of 512MB (asymmetric hash).
Vizvezdenec was the one to notice this.
FauziAkram [Fri, 17 May 2024 22:22:41 +0000 (01:22 +0300)]
Early Exit in Bitboards::sliding_attack()
The original code checks for occupancy within the loop condition. By moving this check inside the loop and adding an early exit condition, we can avoid unnecessary iterations if a blocking piece is encountered.
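A minimal sketch of the early-exit pattern, not the actual Stockfish code. It assumes a square mapping of a1 = 0 to h8 = 63 and handles rook rays only; `on_board` and `rook_attack` are hypothetical names chosen for illustration.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// True if the (file, rank) pair is still inside the 8x8 board.
bool on_board(int file, int rank) {
    return file >= 0 && file < 8 && rank >= 0 && rank < 8;
}

// Rook attacks from sq (0 = a1, 63 = h8) for a given occupancy.
// The blocker test sits inside the loop body, so each ray stops right
// after the first occupied square has been added to the attack set.
Bitboard rook_attack(int sq, Bitboard occupied) {
    const int dirs[4][2] = {{0, 1}, {0, -1}, {1, 0}, {-1, 0}};
    Bitboard  attacks    = 0;
    for (const auto& d : dirs) {
        int file = sq % 8 + d[0];
        int rank = sq / 8 + d[1];
        while (on_board(file, rank)) {
            const Bitboard target = Bitboard(1) << (rank * 8 + file);
            attacks |= target;
            if (occupied & target)
                break;  // early exit: a blocker ends this ray
            file += d[0];
            rank += d[1];
        }
    }
    return attacks;
}
```

The blocking square itself is still included in the attack set, since a slider attacks the first piece it meets.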
Created by first retraining the spsa-tuned main net `nn-ae6a388e4a1a.nnue` with:
- using v6-dd data without bestmove captures removed
- addition of T80 mar2024 data
- increasing loss by 20% when Q is too high
- torch.compile changes for marginal training speed gains
And then SPSA tuning weights of epoch 899 following methods described in:
https://github.com/official-stockfish/Stockfish/pull/5149
This net was reached at 92k out of 120k steps in this 70+0.7 th 7 SPSA tuning run:
https://tests.stockfishchess.org/tests/view/66413b7df9f4e8fc783c9bbb
Thanks to @Viren6 for suggesting usage of:
- c value 4 for the weights
- c value 128 for the biases
Scripts for automating applying fishtest spsa params to exporting tuned .nnue are in:
https://github.com/linrock/nnue-tools/tree/master/spsa
Reduce more when improving and ttvalue is lower than alpha
More reduction if the position is improving but the value from the TT doesn't
exceed alpha, but the tt move is excluded.
This idea is based on following LMR condition tuning
https://tests.stockfishchess.org/tests/view/66423a1bf9f4e8fc783cba37
by using only three of the four largest terms P[3], P[18] and P[12].
Michael Chaly [Tue, 14 May 2024 17:10:01 +0000 (20:10 +0300)]
Add extra bonus to pawn history for a move that caused a fail low
Basically the same idea as it is for continuation/main history, but it
has some tweaks.
1) it has * 2 multiplier for bonus instead of full/half bonus - for
whatever reason this seems to work better;
2) attempts with this type of big bonus scaled somewhat poorly (or
were unlucky at longer time controls), but after measuring that the
average value of pawn history in LMR after adding these bonuses
increased by a substantial amount (for multiplier 1.5 it increased by
roughly 400, against a cap of 8192), attempts were made to make the
default pawn history negative to compensate - and the version with
multiplier 2 and initial fill value -900 passed.
xoto10 [Mon, 13 May 2024 06:19:18 +0000 (07:19 +0100)]
Use 5% less time on first move
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.
Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.
Linmiao Xu [Thu, 9 May 2024 18:03:35 +0000 (14:03 -0400)]
Re-evaluate some small net positions for more accurate evals
Use main net evals when small net evals hint that higher eval
accuracy may be worth the slower eval speeds. With Finny caches,
re-evals with the main net are less expensive than before.
Original idea by mstembera who I've added as co-author to this PR.
Shawn Xu [Wed, 8 May 2024 21:26:01 +0000 (14:26 -0700)]
Simplify Away Negative Extension
This patch simplifies away the negative extension applied when the value returned by the transposition table is assumed to fail low over the value of reduced search.
Michael Chaly [Mon, 6 May 2024 17:18:12 +0000 (20:18 +0300)]
Simplify away conthist 3 from statscore
Following a previous Elo gainer that made conthist 3 less important in pruning, this patch simplifies this history away from the calculation of statscore.
MinetaS [Tue, 7 May 2024 18:26:09 +0000 (03:26 +0900)]
Fix nodestime
1. The current time management system utilizes limits.inc and
limits.time, which can represent either milliseconds or node count,
depending on whether the nodestime option is active. There have been
several modifications which brought Elo gain for typical uses (i.e.
real-time matches), however some of these changes overlooked such
distinction. This patch adjusts constants and multiplication/division to
more accurately simulate real TC conditions when nodestime is used.
2. The advance_nodes_time function has a bug that can extend the time
limit when availableNodes reaches exact zero. This patch fixes the bug
by initializing the variable to -1 and making sure it does not go below
zero.
3. elapsed_time function is newly introduced to print PV in the UCI
output based on real time. This makes PV output more consistent with the
behavior of trivial use cases.
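A minimal sketch of the sentinel idea from item 2, under the assumption that the bug came from zero doubling as the "not initialised" marker. `NodeBudget`, `init` and `advance` are hypothetical names, not the engine's actual interface.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the node budget kept by the time manager.
struct NodeBudget {
    // -1 means "not initialised yet"; 0 is a legitimate "budget exhausted" state.
    std::int64_t availableNodes = -1;

    // Sets the budget only once; later calls leave an existing budget alone.
    void init(std::int64_t nodes) {
        if (availableNodes == -1)
            availableNodes = nodes;
    }

    // Subtracts the nodes just searched, clamped at zero so the value
    // can never collide with the -1 sentinel.
    void advance(std::int64_t searched) {
        availableNodes = std::max<std::int64_t>(0, availableNodes - searched);
    }
};
```

With this layout, a budget that reaches exactly zero stays at zero on the next `init` call instead of being refilled.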
rn5f107s2 [Wed, 8 May 2024 20:08:56 +0000 (22:08 +0200)]
IIR on cutnodes if there is a ttMove but the ttBound is upper
If there is an upper bound stored in the transposition table, but we still have a ttMove, the upperbound indicates that the last time the ttMove was tried, it failed low. This fail low indicates that the ttMove may not be good, so this patch introduces a depth reduction of one for cutnodes with such ttMoves.
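The condition can be sketched roughly as follows. Only the reduction of one for an upper-bound ttMove comes from the description above; the depth threshold and the larger reduction when no ttMove exists are assumptions for illustration, as are the function and parameter names.

```cpp
enum Bound { BOUND_NONE, BOUND_UPPER, BOUND_LOWER, BOUND_EXACT };

// Returns the depth to search after internal iterative reduction at a cut node.
// hasTtMove: whether the transposition table supplied a move.
// ttBound:   the bound type stored with that entry.
int iir_cutnode_depth(int depth, bool cutNode, bool hasTtMove, Bound ttBound) {
    constexpr int MinDepth = 8;  // assumed threshold, for illustration only
    if (cutNode && depth >= MinDepth) {
        if (!hasTtMove)
            depth -= 2;  // assumed pre-existing reduction when no ttMove exists
        else if (ttBound == BOUND_UPPER)
            depth -= 1;  // the ttMove failed low last time: reduce by one
    }
    return depth;
}
```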
Michael Chaly [Wed, 8 May 2024 18:59:03 +0000 (21:59 +0300)]
Refactor quiet moves pruning in qsearch
Make its formula more in line with what we use in search. The current formula is more or less the one we used years ago for search, which has since been reworked; this patch remakes the qsearch formula to be almost exactly the one we use in search, with the sum of conthist 0, 1 and pawn structure history.
FauziAkram [Tue, 7 May 2024 12:03:58 +0000 (15:03 +0300)]
Tweak reduction formula based on depth
The idea came to me by checking for trends from the megafauzi tunes, since the values of the divisor for this specific formula were as follows:
stc: 15990
mtc: 16117
ltc: 14805
vltc: 12719
new vltc passed by Muzhen: 12076
This shows a clear trend related to time control: the longer it is, the lower the optimum value for the divisor seems to be.
So I tried a simple formula, using educated guesses based on some calculations. Tests show it works well, and it can still be tuned further at VLTC in the future to scale even better.
Muzhen Gaming [Sat, 4 May 2024 23:36:48 +0000 (07:36 +0800)]
VVLTC search tune
This patch is the result of two tuning stages:
1. ~32k games at 60+0.6 th8:
https://tests.stockfishchess.org/tests/view/662d9dea6115ff6764c7f817
2. ~193k games at 80+0.8 th6, based on PR #5211:
https://tests.stockfishchess.org/tests/view/663587e273559a8aa857ca00.
Based on extensive VVLTC tuning and testing both before and after
#5211, it is observed that introduction of new extensions positively
affected the search tune results.
cj5716 [Wed, 1 May 2024 10:31:38 +0000 (18:31 +0800)]
Some history fixes and tidy-up
This adds the functions `update_refutations` and `update_quiet_histories` to better distinguish the two. `update_quiet_stats` now just calls both of these functions.
The functional side of this patch is two-fold:
1. Stop refutations being updated when we carry out multicut
2. Update pawn history every time we update other quiet histories
Viren6 [Sat, 4 May 2024 16:29:23 +0000 (17:29 +0100)]
Introduce Quadruple Extensions
This patch introduces quadruple extensions, with the new condition of not ttPv. It also generalises all margins, so that extensions can still occur if conditions are only partially fulfilled, but with a stricter margin.
Michael Chaly [Sat, 4 May 2024 07:33:26 +0000 (10:33 +0300)]
Add extra bonuses to some moves that forced a fail low
The previous patch on this idea was giving bonuses to these moves if the best value of search is far below the current static evaluation.
This patch does a similar thing but adds an extra bonus when the best value of search is far below the static evaluation before the previous move.
1) Fixes a bug introduced in
https://github.com/official-stockfish/Stockfish/pull/5194. Only one
psqtOnly flag was used for two perspectives which was causing
wrong entries to be cleared and marked.
2) The finny caches should be cleared like histories and not at the
start of every search.
Saves a (currently) 800 KB allocation and deallocation when running
`eval`, not particularly significant and zero impact on play but not
necessary either.
Use capture history to better judge which sacrifices to explore
This idea has been bouncing around a while. @Vizvezdenec tried it a
couple years ago in Stockfish without results, but its recent arrival in
Ethereal inspired him and thence me to try it afresh in Stockfish.
(Also factor out the now-common code with futpruning for captures.)
More reduction at cut nodes which are not a former PV node
But the tt move and first killer are excluded.
This idea is based on following LMR condition tuning
https://tests.stockfishchess.org/tests/view/66228bed3fe04ce4cefc0c71 by
using only the two largest terms P[0] and P[1].
For each thread persist an accumulator cache for the network, where each
cache contains multiple entries for each of the possible king squares.
When the accumulator needs to be refreshed, the cached entry is used to more
efficiently update the accumulator, instead of rebuilding it from scratch.
This idea was first described by Luecx (author of Koivisto) and
is commonly referred to as "Finny Tables".
When the accumulator needs to be refreshed, instead of filling it with
biases and adding every piece from scratch, we...
1. Take the `AccumulatorRefreshEntry` associated with the new king bucket
2. Calculate the features to activate and deactivate (from differences
between bitboards in the entry and bitboards of the actual position)
3. Apply the updates on the refresh entry
4. Copy the content of the refresh entry accumulator to the accumulator
we were refreshing
5. Copy the bitboards from the position to the refresh entry, to match
the newly updated accumulator
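The refresh steps listed above can be sketched as follows. This is a toy model, not the engine's implementation: the accumulator is four integers wide, the feature index ignores the king bucket, `weight` is a made-up stand-in for the trained weight matrix, and `refresh` and `apply` are hypothetical helper names. Only `AccumulatorRefreshEntry` is taken from the description.

```cpp
#include <array>
#include <cstdint>

constexpr int PieceKinds = 12;  // 6 piece types x 2 colours
constexpr int Dim        = 4;   // toy accumulator width

using Bitboard    = std::uint64_t;
using Accumulator = std::array<int, Dim>;
using Position    = std::array<Bitboard, PieceKinds>;  // one bitboard per piece kind

// Hypothetical weight lookup; a real net would read a trained weight matrix.
inline int weight(int piece, int square, int i) { return (piece + 1) * (square + 1) + i; }

// Cache entry for one king bucket: the accumulator plus the placement it encodes.
struct AccumulatorRefreshEntry {
    Accumulator acc{};     // would start from the biases in a real net
    Position    boards{};  // empty board matches the initial accumulator
};

// Adds (sign = +1) or removes (sign = -1) every feature in `squares`.
inline void apply(Accumulator& acc, int piece, Bitboard squares, int sign) {
    for (int sq = 0; sq < 64; ++sq)
        if (squares & (Bitboard(1) << sq))
            for (int i = 0; i < Dim; ++i)
                acc[i] += sign * weight(piece, sq, i);
}

// Steps 2-5; step 1 (picking the entry for the new king bucket)
// is left to the caller.
inline Accumulator refresh(AccumulatorRefreshEntry& entry, const Position& pos) {
    for (int p = 0; p < PieceKinds; ++p) {
        const Bitboard added   = pos[p] & ~entry.boards[p];
        const Bitboard removed = entry.boards[p] & ~pos[p];
        apply(entry.acc, p, added, +1);    // activate new features
        apply(entry.acc, p, removed, -1);  // deactivate stale features
    }
    entry.boards = pos;  // the entry now mirrors the position
    return entry.acc;    // copied into the accumulator being refreshed
}
```

The saving comes from the diff: when consecutive refreshes for the same king bucket see similar positions, only a few features change, so far fewer weight rows are touched than in a rebuild from scratch.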
Parameters Tune, adding also another tunable parameter (npmDiv) to be
variable for different nets (bignet, smallnet, psqtOnly smallnet). P.s:
The changed values are only the parameters where there is agreement
among the different time controls, so in other words, the tunings are
telling us that changing these specific values to this specific
direction is good in all time controls, so there shouldn't be a high
risk of regressing at longer time controls.
Previously it was possible to also get the node counter after running a bench with perft, i.e.
`./stockfish bench 1 1 5 current perft`, caused by a small regression from the uci refactoring.
We change the definition of "age" from "age of this position" to "age of this TT entry".
In this way, despite being on the same position, when we save into TT, we always prefer the new entry as compared to the old one.
It makes more sense to not (potentially) change the developer's ASLR entropy setting to make the test run through. This should be an active choice even if the test then might fail locally for them.
The recent refactoring has shown some limitations of our testing, hence we add a couple more tests, including:
* expected mate score
* expected mated score
* expected in TB win score
* expected in TB loss score
* expected info line output
* expected info line output (wdl)
Small improvement of the elapsed time usage in search, makes the code easier to read overall.
Also Search::Worker::iterative_deepening() now only checks the elapsed time once, instead of 3 times in a row.
* TB values can have a distance of 0, mainly when we are in a tb position but haven't found mate.
* Add a missing whitespace to UCIEngine::on_update_no_moves()
A side note, it is still required for the static functions,
but these should be moved to a different namespace/class
later on, since sf kinda relies on them.
The assignment (ss + 1)->excludedMove = Move::none() can be simplified away because when that line is reached, (ss + 1)->excludedMove is always already none. The only moment stack[x]->excludedMove is modified is during singular search, but it is reset to none right after the singular search is finished.
The same functionality is available by using COMPCXX and having another variable which does the same is just confusing.
There was only one mention on Stockfish Wiki about this which has been changed to COMPCXX.
* Using an earlier L1-3072 net, and with triple extension margin manually set to 0: https://tests.stockfishchess.org/tests/view/65ffdf5d0ec64f0526c544f2 (~30k games)
* Continue tuning, but with the previous master net (L1-2560). https://tests.stockfishchess.org/tests/view/660663f00ec64f0526c59c41 (~27k games)
* Starting with the parameters from step 2, use the current L1-3072 net, and allow the triple extension margin to be tuned starting from 0: https://tests.stockfishchess.org/tests/view/660c16b8216a13d9498e7536 (40k games)
Disservin [Sat, 23 Mar 2024 09:22:20 +0000 (10:22 +0100)]
Transform search output to engine callbacks
Part 2 of the Split UCI into UCIEngine and Engine refactor.
This creates function callbacks for search to use when an update should occur.
The benching in uci.cpp for example does this to extract the total nodes
searched.
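A minimal sketch of the callback approach, with a toy search loop standing in for the real one. The names `InfoFull`, `UpdateContext`, `onUpdateFull`, `run_search` and `bench_nodes` are illustrative assumptions, not a statement of the engine's actual interface.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical update payload; the real Info structs carry more fields.
struct InfoFull {
    int           depth;
    std::uint64_t nodes;
};

// Callbacks the search invokes instead of printing directly.
struct UpdateContext {
    std::function<void(const InfoFull&)> onUpdateFull;
};

// Toy "search" that reports progress through the callback after each depth.
inline void run_search(int maxDepth, const UpdateContext& ctx) {
    std::uint64_t nodes = 0;
    for (int d = 1; d <= maxDepth; ++d) {
        nodes += std::uint64_t(100) * d;  // pretend work
        if (ctx.onUpdateFull)
            ctx.onUpdateFull(InfoFull{d, nodes});
    }
}

// A bench-style caller: keep the last reported node count.
inline std::uint64_t bench_nodes(int maxDepth) {
    std::uint64_t total = 0;
    UpdateContext ctx;
    ctx.onUpdateFull = [&total](const InfoFull& info) { total = info.nodes; };
    run_search(maxDepth, ctx);
    return total;
}
```

The search no longer needs to know whether its output goes to a UCI stream, a benchmark counter or a test harness; each caller installs the callback it needs.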
Disservin [Sun, 17 Mar 2024 11:33:14 +0000 (12:33 +0100)]
Split UCI into UCIEngine and Engine
This is another refactor which aims to decouple uci from stockfish. A new engine
class manages all engine related logic and uci is a "small" wrapper around it.
In the future we should also try to remove the need for the Position object in
the uci and replace the options with an actual options struct instead of using a
map. Also convert the std::string's in the Info structs to string_view.
Update NNUE architecture to SFNNv9 and net nn-ae6a388e4a1a.nnue
Part 1: PyTorch Training, linrock
Trained with a 10-stage sequence from scratch, starting in May 2023:
https://github.com/linrock/nnue-tools/blob/master/exp-sequences/3072-10stage-SFNNv9.yml
While the training methods were similar to the L1-2560 training sequence,
the last two stages introduced min-v2 binpacks,
where bestmove capture and in-check position scores were not zeroed during minimization,
for compatibility with skipping SEE >= 0 positions and future research.
Training data can be found at:
https://robotmoon.com/nnue-training-data
This net was tested at epoch 679 of the 10th training stage:
https://tests.stockfishchess.org/tests/view/65f32e460ec64f0526c48dbc
Part 2: SPSA Training, Viren6
The net was then SPSA tuned.
This consisted of the output weights (32 * 8) and biases (8)
as well as the L3 biases (32 * 8) and L2 biases (16 * 8), totalling 648 params.
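As a quick check of the parameter count, the four groups can be summed at compile time. The constant and function names are illustrative; the factors are taken as quoted.

```cpp
// Parameter groups as quoted above.
constexpr int OutputWeights = 32 * 8;
constexpr int OutputBiases  = 8;
constexpr int L3Biases      = 32 * 8;
constexpr int L2Biases      = 16 * 8;

// Total number of parameters exposed to the SPSA tune.
constexpr int spsa_param_count() {
    return OutputWeights + OutputBiases + L3Biases + L2Biases;
}

static_assert(spsa_param_count() == 648, "256 + 8 + 256 + 128 = 648");
```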
The SPSA tune can be found here:
https://tests.stockfishchess.org/tests/view/65fc33ba0ec64f0526c512e3
With the help of Disservin, the initial weights were extracted with:
https://github.com/Viren6/Stockfish/tree/new228
The net was saved with the tuned weights using:
https://github.com/Viren6/Stockfish/tree/new241
Earlier nets of the SPSA failed STC compared to the base 3072 net of part 1:
https://tests.stockfishchess.org/tests/view/65ff356e0ec64f0526c53c98
Therefore it is suspected that the SPSA at VVLTC has
added extra scaling on top of the scaling of increasing the L1 size.
current master triggers a gcc note:
parameter passing for argument of type 'std::pair<double, double>' when C++17 is enabled changed to match C++14 in GCC 10.1
while this is inconsequential, and just informative https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111516 we can easily avoid it.
Disservin [Fri, 29 Mar 2024 09:49:53 +0000 (10:49 +0100)]
Improve prerelease creation workflow
In the last couple of months we sometimes saw duplicated prereleases uploaded to GitHub, possibly due to some racy behavior when concurrent jobs create a prerelease. This now creates an empty prerelease at the beginning of the CI and the binaries are later just attached to this one.
Michael Chaly [Thu, 28 Mar 2024 21:17:37 +0000 (00:17 +0300)]
Adjust best value after a pruned quiet move
Logic somewhat similar to how we adjust best value after pruned captures
in qsearch, but in search this patch does it after pruned quiet moves,
and adjusts not to the full futility value but to something in between
best value and futility value.
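A minimal sketch of such a blend. The 1:3 weighting is an assumption made for illustration, the function name is hypothetical, and a real search would also have to guard against mate and tablebase scores, which is omitted here.

```cpp
// Moves bestValue part of the way towards futilityValue after a quiet
// move has been pruned. Values above futilityValue are left untouched.
inline int adjust_best_value(int bestValue, int futilityValue) {
    if (bestValue <= futilityValue)
        return (bestValue + futilityValue * 3) / 4;  // assumed 1:3 weighting
    return bestValue;
}
```

For example, with a best value of 0 and a futility value of 100 this weighting yields 75, which lies between the two as the description requires.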
Values were found using two tunes with the final values taken from the ltc tune after 62k games:
stc - https://tests.stockfishchess.org/tests/view/65fb526b0ec64f0526c50694
ltc - https://tests.stockfishchess.org/tests/view/65fd36e60ec64f0526c5214b
Ideas for future work:
* tune these values with the other TM adjustments
* try narrower bands
* calculate adjustment for exact eval by interpolation
Muzhen Gaming [Sun, 17 Mar 2024 03:20:41 +0000 (11:20 +0800)]
VVLTC search tune
This set of parameters was derived from 3 tuning attempts:
https://tests.stockfishchess.org/tests/view/65d19ab61d8e83c78bfd8436 (80+0.8 x8, ~40k games)
Then tuned with one of linrock's early L1-3072 nets:
https://tests.stockfishchess.org/tests/view/65def7b04b19edc854ebdec8 (VVLTC, ~36k games)
Starting from the result of this tuning, the parameters were then tuned with the current master net:
https://tests.stockfishchess.org/tests/view/65f11c420ec64f0526c46fc4 (VVLTC, ~45k games)
Additionally, at the start of the third tuning phase, 2 parameters were manually changed:
Notably, the triple extension margin was decreased from 78 to 22. This idea was given by Vizvezdenec:
https://tests.stockfishchess.org/tests/view/65f0a2360ec64f0526c46752.
The PvNode extension margin was also adjusted from 50 to 40.
This tune also differs from previous tuning attempts by tuning the evaluation thresholds for smallnet and psqt-only.
The former was increased through the tuning, and this is hypothesized to scale better at VVLTC,
although there is not much evidence of it.
Robert Nurnberg [Sun, 17 Mar 2024 14:39:01 +0000 (15:39 +0100)]
Base WDL model on material count and normalize evals dynamically
This PR proposes to change the parameter dependence of Stockfish's
internal WDL model from full move counter to material count. In addition
it ensures that an evaluation of 100 centipawns always corresponds to a
50% win probability at fishtest LTC, whereas for master this holds only
at move number 32. See also
https://github.com/official-stockfish/Stockfish/pull/4920 and the
discussion therein.
The new model was fitted based on about 340M positions extracted from
5.6M fishtest LTC games from the last three weeks, involving SF versions
from e67cc979fd2c0e66dfc2b2f2daa0117458cfc462 (SF 16.1) to current
master.
The commands used with
[WDL_model](https://github.com/official-stockfish/WDL_model) are:
```
./updateWDL.sh --firstrev e67cc979fd2c0e66dfc2b2f2daa0117458cfc462
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
```
The anchor `58` for the material count value was chosen to be as close
as possible to the observed average material count of fishtest LTC games
at move 32 (`43`), while not changing the value of
`NormalizeToPawnValue` compared to the move-based WDL model by more than
1.
The patch only affects the displayed cp and wdl values.
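The normalization can be illustrated with a logistic win-rate model in which `a` is the internal value that gives a 50% win probability and `b` controls the slope. In the model described above both would be functions of the material count; here they are passed in directly, and any concrete values used with this sketch are placeholders, not fitted coefficients. The function names are hypothetical.

```cpp
#include <cmath>

// Win probability in [0, 1] for internal value v under a logistic model.
inline double win_rate(double v, double a, double b) {
    return 1.0 / (1.0 + std::exp((a - v) / b));
}

// Displayed centipawns: scaled so that v == a maps to exactly 100 cp,
// i.e. 100 cp always corresponds to a 50% win probability.
inline int to_cp(int v, double a) {
    return int(std::round(100.0 * v / a));
}
```

Because `a` is re-evaluated for the material on the board, the 100 cp anchor holds throughout the game rather than only at one reference point.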
Michael Chaly [Fri, 15 Mar 2024 15:55:40 +0000 (18:55 +0300)]
Clamp history bonus to stats range
Before, one always had to keep track of the bonus one assigns to a history to stop
the stats from overflowing, since an overflow would often go unnoticed during benching. This is a quality of life improvement.
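A minimal sketch of a history entry that clamps the bonus before applying it, assuming the usual gravity-style update in which the entry decays in proportion to the size of the bonus. `StatsEntry` and `update` are illustrative names, and the bound `D` is a template parameter chosen by the caller.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical history entry whose value stays within [-D, D].
template<int D>
struct StatsEntry {
    int value = 0;

    void update(int bonus) {
        // Clamp first, so callers no longer need to bound the bonus themselves.
        const int clamped = std::clamp(bonus, -D, D);
        // Gravity update: with |clamped| <= D the result cannot leave [-D, D].
        value += clamped - value * std::abs(clamped) / D;
    }
};
```

With `D = 1000`, an oversized bonus of 5000 is treated as 1000 and saturates the entry at the cap instead of overflowing it.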