Marco Costalba [Sat, 16 Feb 2013 11:42:22 +0000 (12:42 +0100)]
Account for gamePly after each move
Rename startPosPly to gamePly and increment/decrement
the variable after each do/undo move. This adds a little
overhead in do_move() but we will need to have the
game ply during the search for the next patches now
under test.
Currently we don't increment gamePly in do_null_move()
becuase it is not needed at the moment. Could change
in the future.
As a nice side effect we can now remove an hack in
startpos_ply_counter().
Marco Costalba [Fri, 15 Feb 2013 15:21:15 +0000 (07:21 -0800)]
Merge Gary's king safety tweak
Still well within error bars, so probably not a big improvement,
but may be worthwhile. I will let the test keep running. The idea
for the tweak came from the TCEC game against Houdini. Stockfish saw
it as a draw, well past when it should have seen problems. King safety
was off, since it was QN and pawns only, but in fact the king was
quite vulnerable.
PGN of game against houdini. Moves 55-59 it was seeing a draw, and
Houdini was seeing a good sized advantage for black. With the change,
Stockfish now recognizes that moving the king there is a bad idea.
Marco Costalba [Fri, 15 Feb 2013 07:56:04 +0000 (08:56 +0100)]
Further speed up bitbase generation
Another trick, along the same lines of previous
patch. This time we first check positions with
white side to move that, becuase we start with
pawn on rank 7, are easily classified as wins,
then black ones.
Number of cycles reduced to 15 !
Becuase now it is faster we can remove a lot of
code to detect theoretical draws. We will calculate
them anyhow, although a bit slower, but the speed
up trick more than compensates it.
Verified that generated bitbases match original ones.
Marco Costalba [Wed, 13 Feb 2013 11:26:08 +0000 (12:26 +0100)]
Speedup KPK bitbase of 25%
Change the way the index is coded so that
now looping from 0 to IndexMax generates
the pawns from RANK_7 down to RANK2.
Becuase positions with pawns at RANK_7
are easily classified as wins/draws, this
small trick allows to reduce the number
of needed iterations from 30 down to 26!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 10 Feb 2013 18:10:01 +0000 (19:10 +0100)]
Rename and de-templetize sort()
Rename to insertion_sort so to avoid confusion
with std::sort, also move it to movepicker.cpp
and use the bit slower std::stable_sort in
search.cpp where it is used in not performance
critical paths.
Marco Costalba [Sat, 9 Feb 2013 15:22:47 +0000 (16:22 +0100)]
Further simplify first_entry()
We can encode the ClusterSize directly in the
hashMask, this allows to skip the left shift.
There is no real change, but bench number is now
different because instead of using the lowest order
bits of the key to index the start of the cluster,
now we don't use the last two lsb bits that are
always set to zero (cluster size is 4). So for
instance, if 10 bits are used to index the cluster,
instead of bits [9..0] now we use bits [11..2].
This changes the positions that end up in the same
cluster affecting TT hits and so bench is different.
Marco Costalba [Fri, 8 Feb 2013 07:49:36 +0000 (08:49 +0100)]
Workaround value-initialization in MSVC
The syntax splitPoints() should force the compiler to
value-initialize the array and because there is no
user defined c'tor it falls back on zero-initialization.
Unfortunatly this is broken in MSVC compilers, because
value initialization for non-POD types is not supported,
so left splitPoints un-initialized and add in split()
initialization of slavesPositions, that is the only
member not already set at split time.
This fixes an assert under MSVC when running with
more than one thread.
Marco Costalba [Sun, 3 Feb 2013 10:14:21 +0000 (11:14 +0100)]
Be clear about not LMR the ttMove
Currently a ttMove is reduced with ss->reduction = DEPTH_ZERO,
so it is actually not reduced (as it should be), but the
trick works just becuase it happens that ttMove is the first
to be tried and
reduction(depth, 1)
Always returns zero. So explicitly forbid reduction of ttMove
in the LMR condition. This is much clear and self-documented.
Marco Costalba [Sun, 27 Jan 2013 17:48:27 +0000 (18:48 +0100)]
Rewrite do_castle_move()
And handle the castle directly in do/undo_move().
This allow to greatly simplify the code.
Here the beast is the nasty Chess960 that is
really tricky to get it right because could be
that 'from' and 'to' squares are the same or
that king's 'to' square is rook's 'from' square.
Anyhow should work: verified on all Chess960
starting positions.
No functional and no speed change also in Chess960.
Marco Costalba [Sat, 26 Jan 2013 13:23:37 +0000 (14:23 +0100)]
Clarify slavesMask usage
When a thread is allocated a bit is set in slavesMask.
This bit corresponds to the thread's index field that,
because it happens to be the position in the threads
array, eventually it is equal to the loop index 'i'.
But instead of relying on this 'coincidence', explicitly
use the 'idx' field so to clarify slavesMask usage.
Marco Costalba [Wed, 16 Jan 2013 08:26:10 +0000 (09:26 +0100)]
Fix race while exiting
Fix again TimerThread::idle_loop() to prevent a
theoretical race with 'exit' flag in ~Thread().
Indeed in Thread d'tor we raise 'exit' and then
call notify() that is lock protected, so we
have to check again for 'exit' before going to
sleep in idle_loop().
Also same change in Thread::idle_loop() where we
now check for 'exit' before to go to sleep.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Lucas Braesch [Tue, 1 Jan 2013 03:51:35 +0000 (11:51 +0800)]
Remove Threat Extension
Great code simplification: - instead do not futility
prune threat refutations. allows_move() is therefore removed.
4000 games at 50,000 nodes/move:
1085-989-1926 [51.2%] LOS=98.3%
4000 games in 10"+0.1"
756-751-2493 [50.1%] LOS=55.1%
EDIT: I have retested the patch of Lucas in a slightly different form
(without pruning in PvNode) and test mre or less confirms that
60 lines of code are totally unuseful:
After 6195 games at 15"+0.05"
1333 - 1325 - 3537 ELO 0
Marco Costalba [Sun, 13 Jan 2013 23:32:30 +0000 (00:32 +0100)]
Polymorphic Thread hierarchy
Subclass MainThread and TimerThread and declare
idle_loop() virtual. This allow us to cleanly
remove a good bunch of hacks, relying on C++
polymorphism to do the job.
I misunderstood here. Actually it can happen that
thread is created but still not entered idle_loop
and at the same time start_searching() is called.
Becuase 'do_sleep' is set start_searching() will
set it to false and start the search, but when,
at last, the thread enters idle_loop(), resets
the flag and goes to sleep: not what we want.
Revert the hack waiting for a better solution
in the next patches.
Marco Costalba [Sun, 13 Jan 2013 13:28:22 +0000 (14:28 +0100)]
Small change to "ponderhit" handling
Reset Limits.ponder only if search continue, but if
we are going to stop the search there is no need
(and is also confusing) to clear the 'ponder' flag.
This mimics the behaviour upon rceiving 'stop' when
pondering.
Marco Costalba [Sat, 12 Jan 2013 12:19:06 +0000 (13:19 +0100)]
Clarify SAN disambiguation in case of a pinned piece
In SAN notation when two pieces of the same type can
move to a given destination square, a disambiguation
additional info (like starting file) shall be added
to the SAN move.
If one of the two pieces is pinned, the corresponding
move _could_ be illegal and in this case disambiguation
is not needed. But to be pinned alone it is not enough
to deduce that the move is illegal, for instance in this
position:
R3rk2/2r6/8/8/8/8/8/K7 b - - 0 1
The move Rc8 is ambiguous although the rook in e8 is pinned
and the correct SAN notation should be Rcc8.
Marco Costalba [Sun, 6 Jan 2013 11:49:01 +0000 (12:49 +0100)]
Async 'stop' command
Don't wait for the search to finish after a 'stop'
command, but keep processing the GUI input if any.
Also explicitly wake up the main thread (that could be
sleeping) after a 'stop' or 'quit' command and do not
rely on wait_for_search_finished() doing it for us.
This patch cleans up the code and functions's definitions,
but it is risky and needs a good test under different
conditions to be sure it does not introduces hungs up.
Seems a regression after testing from Gary:
ELO: 7.24 +- 99%: 17.03 95%: 12.93
LOS: 97.86%
Wins: 439 Losses: 381 Draws: 1962
And mine:
After 5410 games at 15"+0.05
Wins: 936 Losses: 1141 Draws: 3333 ELO -13
Moreover we know that there is a regression in the range
of patches which include the fromNull patch.
Probably this is not the only regression since 2.3.1 and
perhaps the idea under fromNull is good, but at the moment,
while in deep regression hunting, better to be on the safe
side and revert it entirely.
My guess on why this is a regression is that using the
negated evaluation of previous ply in case of null search
fails to take in account the king safety asymmetry between
the two colors. This is of course just a guess.
Marco Costalba [Fri, 4 Jan 2013 15:29:13 +0000 (16:29 +0100)]
Retire 'mate in x' hack
Sometimes is faster, but not always and on very long mates
produces strange scores probably due to truncation of PV
artifacts.
So simply perform normal search also in case of UCI 'mate x'
command, with the only difference that when a mate in x is
found search returns immediately.
Marco Costalba [Fri, 4 Jan 2013 13:52:21 +0000 (14:52 +0100)]
Don't exit if unable to find bench file
Now that we can call 'bench' command also from
interactive terminal it makes no more sense to
exit the application if the user types a wrong
file name.
Marco Costalba [Mon, 31 Dec 2012 10:26:12 +0000 (11:26 +0100)]
Micro-optimization in evaluate_space()
Since &-ing with SpaceMask restricts the set to the home
half of the board, it is possible to use just one popcount
instead of 2 by shifting "safe" to the other half of the
board. This gives a small speedup especially on systems
where hardware popcount is not available.
Marco Costalba [Sun, 30 Dec 2012 10:40:20 +0000 (11:40 +0100)]
Handle UCI command "mate in x moves"
Following a user request I added the handling of UCI:
go mate x
Currently we just return from a PV node if x moves have been
done. Probably not the best approach. I have looked at Fruit/Toga
sources and there is even simpler: engine falls back on a fixed
depth search.
Marco Costalba [Thu, 27 Dec 2012 11:13:31 +0000 (12:13 +0100)]
Revert evaluation cache
And return on using TT as backing store for position
evaluations.
Tests (even on single thread) show eval cache was a regression.
In multi thread result should be even worst because eval cache
is a per-thread struct, while TT is shared.
After 4957 games at 15"+0.05 (single thread)
eval cache vs master 969 - 1093 - 2895 -9 ELO
So previous reported result of +18 ELO was probably due to an
issue in the testing framework (a bug in cutechess-cli) that
has been fixed in the meanwhile.
Marco Costalba [Mon, 24 Dec 2012 08:07:17 +0000 (09:07 +0100)]
Introduce Null Threat extension
In case of null search at low depths returns a fail low
due to a threat then, rather than return beta-1 (to cause
a re-search at full depth in the parent node), we set a flag
threatExtension = true (false by default) that will cause
moves that prevent the threat to be extended of one ply
in the following search.
Idea and patch is by Lucas Braesch.
Lucas also did the tests:
1500 games in 5"+0.05":
SF_threatExtension vs SF_20121222: 366 - 331 - 803 [51.2%] LOS=90.8%
3000 games in 10"+0.1":
SF_threatExtension vs SF_20121222: 610 - 559 - 1831 [50.8%] LOS=93.2%
Tests confirmed by Gary after 10570 games,
ELO: 2.79 +- 99%: 8.72 95%: 6.63
LOS: 94.08%
Wins: 1523 Losses: 1438 Draws: 7607
And finally by me at 15"+0.05, single thread, 3824 games
threatExtension vs master 768 - 692 - 2364 +7 ELO
Marco Costalba [Tue, 25 Dec 2012 10:22:55 +0000 (11:22 +0100)]
Small tweak in is_pseudo_legal()
This is difficult code becuase a bug here could lead
to very subtle crashes in case of SMP games where we
have TT move corruption due to concurrent access.
Anyhow I have fully verified te code throwing at it
random moves. It shoudl work.
Marco Costalba [Sun, 9 Dec 2012 10:08:37 +0000 (11:08 +0100)]
Store distinct upper and lower bound scores
This is more complex than what I'd like but I
was unable to split in small chunks.
Here we add 2 slots to TTEntry (valueUpper and depthUpper)
so that sizeof(TTEntry) returns to the original 16 bytes
and we can pack exactly 4 entries in a 64 bytes cache line.
Now we save an upper bound score alongside a lower (exact)
score. The idea is to increase TT cut-offs rates becuase
there is now an higher probability for a node to use TT info.
This patch is highly experimental and probably needs further
steps as is hinted by an unrealistic bench number:
Marco Costalba [Sat, 1 Dec 2012 14:12:18 +0000 (15:12 +0100)]
Don't use TT just to save a node evaluation
In search(), after we evalute the position, in case there
isn't any TT entry we create one with just the evaluation
score.
This patches removes that code. The reason becuase the patch
deserves a single commit it is becuase introduces a (very small)
functional change due to the fact that the total number of
TT stores is less now and this slightly alters the TT hits
of our benchmark.
Marco Costalba [Sat, 1 Dec 2012 13:51:49 +0000 (14:51 +0100)]
Don't read eval from TT anymore
Rely fully on eval cache. Note that we still save eval
info to TT, this is not needed at this moment and will be
removed in future patches. We keep it so to have a "non
functional change" patch.