Marco Costalba [Tue, 29 Sep 2009 09:14:09 +0000 (10:14 +0100)]
Print RootMoveList startup scoring
This satisfies a specific user request of 28/8/2009
"The only issue I have is that during multiPV analysis, the depth 1
best move score is not reported by the engine (reporting for the best
move begins at depth 2). I need it at depth 1 also. Would it be
possible to make this modification in future versions? This would be
of great help as otherwise I will have to use a lesser engine.
The goal of my project is to calculate the ELO performance in a game
and also the ELO rating of individual moves. For this I need depth 1
scores for lower rated performances. I intend to distribute the program
for free upon completion.
Thanks, Jack Welbourne"
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 28 Sep 2009 09:46:55 +0000 (10:46 +0100)]
Unroll color loops in evaluate
Use templates to manually unroll the loops so that
many values could be calculated at compile time or at
runtime but with a fast direct memory access instead of
an indirect one.
This change gives a speed up of 3.5 % on pgo build !!! :-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 21 Sep 2009 17:07:11 +0000 (18:07 +0100)]
Code style and subtle fix in move_is_legal()
A bunch of trivial code style and comment fixes.
Among them there is a real fix for a subtle case
involving promotion moves.
We currently check that a pawn push to 8/1th rank
must be a promotion, but we don't check the contary,
i.e. that a pawn push on a different rank must NOT be
a promotion. Note that, funny enough, we perform this
control for all the other pieces, but not for the pawns!
This patch fixes this really corner case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 21 Sep 2009 10:54:25 +0000 (11:54 +0100)]
Simplify move legality check for uncommon cases
Remove a bunch of difficult and tricky code to test
legality of castle and ep moves and instead use a slower
but simpler check against the list of generated legal moves.
Because these moves are very rare the performance impact
is small but code semplification is ver big: almost 100 lines
of difficult code removed !
No functionality change. No performance change (strangely enough
there is no even minimal performance regression in pgo builds but
instead a slightly and unexpected increase).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 21 Sep 2009 09:00:33 +0000 (10:00 +0100)]
When generating checks add possibly under-promotions
In qsearch at depth 0 we generate only captures and checks.
Queen promotion moves are generated among the captures, but
under-promotion moves (both captures and non-captures) are
never generated even if they could give a discovery check.
This patch fixes this limitation extending generate_pawn_noncaptures()
to generate also check moves when required.
Apart for adding the (rare) case of an under-promotion that gives
discovery check, the patch is also a good cleanup because removes
generate_pawn_checks() altoghter.
This patch does the code clean-up but not enables the functional
change so to allow an easier debug.
No functional change and no performance change (actually a very
very small speed increase).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 20 Sep 2009 09:47:59 +0000 (10:47 +0100)]
Retire attackers_to(Square s, Color c)
Use the definition in the few places where is needed.
As a nice side effect there is also an optimization in
generate_evasions() where the bitboard of enemy pieces
is computed only once and out of a tight loop.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 20 Sep 2009 09:26:54 +0000 (10:26 +0100)]
Rename piece_attacks() in piece_attacks_from()
It is a bit longer but much easier to understand especially
for people new to the sources. I remember it was not trivial
for me to understand the returned attack bitboard refers to
attacks launched from the given square and not attacking the
given square.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 20 Sep 2009 07:43:25 +0000 (08:43 +0100)]
Clean up API for attack information
Remove undefined functions sliding_attacks() and ray_attacks()
and retire square_is_attacked(), use the corresponding definition
instead. It is more clear that we are computing full attack
info for the given square.
Alos fix some obsolete comments in move generation functions.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 13 Sep 2009 08:02:20 +0000 (09:02 +0100)]
Indirectly prefetch board[from]
One of the most time critical functions is move_is_check()
and in particular the call to type_of_piece_on(from) in the
switch statement.
This call lookups in board[] array and can be slow if board[from]
is not already cached. Few instructions before in the execution stream,
we check the move for legality with pl_move_is_legal().
This patch changes pl_move_is_legal() to use type_of_piece_on(from)
for checking for a king move so that board[from] is automatically
cached in L1 and ready to be used by the near follower move_is_check()
Another advantage is that the call to king_square(us) in pl_move_is_legal()
is avoided most of the times.
Speed up of this nice and tricky patch is 0.7% !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Wed, 2 Sep 2009 09:57:38 +0000 (11:57 +0200)]
Second take at unifying bitboard representation access
This patch is built on Tord idea to use functions instead of
templates to access position's bitboards. This has the added advantage
that we don't need fallback functions for cases where the piece
type or the color is a variable and not a constant.
Also added Joona suggestion to workaround request for two types
of pieces like bishop_and_queens() and rook_and_queens().
No functionality or performance change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Bug fix for discovered checks in connected_moves().
Because of a hard-to-spot single-character bug in connected_moves(),
the discovered check code had no effect whatsoever. The condition
in the if (...) statement at the beginning of the code would always
return false.
Thanks to Edsel Apostol for pointing out this bug!
Marco Costalba [Mon, 31 Aug 2009 14:09:52 +0000 (16:09 +0200)]
Retire pieces_of_color_and_type()
It is used mainly in a bunch of inline oneliners
just below its definition. So substitute it with
the explicit definition and avoid information hiding.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 31 Aug 2009 08:59:33 +0000 (10:59 +0200)]
Document index[] and pieceList[] are not invariants
Array index[] and pieceList[] are not guaranteed to be
invariant to a do_move() + undo_move() sequence when a
capture move is involved.
The reason is that the captured piece is removed form
the list and substituted with the last one in do_move()
while in undo_move() is added again but at the end of
the list.
Because index[] and pieceList[] are used in move generation
to scan the pieces it means that moves will be generated
in a different order before and after a do_move() + undo_move()
sequence as, for instance, the one in Position::has_mate_threat()
After latest patches, move generation could now be invoked
also by MovePicker c'tor and this explains why order of
picked moves is different if MovePicker object is istantiated
before or after a Position::has_mate_threat() call.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 30 Aug 2009 18:12:08 +0000 (19:12 +0100)]
Workaround a bug in Position::has_mate_threat()
It seems that pos.has_mate_threat() changes the position !
So that calling MovePicker c'tor before or after the
has_mate_threat() call changes the things !
Bug was unhidden by previous patch that makes MovePicker c'tor
to generate, score and sort good captures under some circumstances.
Because scoring the captures is position dependent it seems that
the moves returned by MovePicker are different when c'tor is
called before has_mate_threat()
Of course this is only a workaround because the real bug is still
hidden :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 30 Aug 2009 08:42:55 +0000 (09:42 +0100)]
Movepicker: take move's loop out of switch statement
This not only cleans up the code but gives another
speed boost of 1.8%
From revision 595a90dfd0 we have increased pgo compiled binary
speed of a whopping +5.2% without any functional change !!
This is really awsome considering that we have also
cut line count by 25 lines.
Sometime we spend days for getting an extra 1% from move
generation while instead the biggest optimizations come
from anonymous and apparently dull parts of the code.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Thu, 27 Aug 2009 19:22:20 +0000 (20:22 +0100)]
Try null move before captures
Always after TT move but before captures.
This seems a better setup against version before this
patch.
After 999 games at 1+0
Mod - Orig +252 =527 -220 +11 ELO
Unfortunatly it does not seems to improve on the standard
version, with null move outside of movepicker (595a90df) with
the latest speed-up patches added in.
After 999 games at 1+0
Mod - Standard +244 =506 -249 -2 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Thu, 27 Aug 2009 07:12:51 +0000 (09:12 +0200)]
Change the flow in wich moves are generated and picked
In MovePicker we get the next move with pick_move_from_list(),
then check if the return value is equal to MOVE_NONE and
in this case we update the state to the new phase.
This patch reorders the flow so that now from pick_move_from_list()
renamed get_next_move() we directly call go_next_phase() to
generate and sort the next bunch of moves when there are no more
move to try. This avoids to always check for pick_move_from_list()
returned value and the flow is more linear and natural.
Also use a local variable instead of a pointer dereferencing in a
time critical switch statement in get_next_move()
With this patch alone we have an incredible speed up of 3.2% !!!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Mon, 24 Aug 2009 16:41:24 +0000 (17:41 +0100)]
Micro-optimze extension()
Explicitly write the conditions for pawn to 7th
and passed pawn instead of wrapping in redundant
helpers.
Also retire the now unused move_is_pawn_push_to_7th()
and the never used move_was_passed_pawn_push() and
move_is_deep_pawn_push()
Function extension() is so time critical that this
simple patch speeds up the pgo compile of 0.5% and
it is also more clear what actually happens there.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 23 Aug 2009 16:20:02 +0000 (17:20 +0100)]
Remove a local variable from pop_1st_bit()
Remove the 'b' uint32_t local variable.
Optimized assembly is more or less the same
(one 'mov' instruction less), but now it is
written in a way more similar to the final assembly
flow so it should be easier for compiler to optimize.
Also guarantee that BitTable[] is always aligned.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Tord Romstad [Fri, 21 Aug 2009 08:50:34 +0000 (10:50 +0200)]
Added a few new targets to the Makefile for OS X with icpc.
The following new targets were added:
* osx-icc32: 32-bit x86 compiled with icpc.
* osx-icc64: 64-bit x86 compiled with icpc.
* osx-icc32-profile: 32-bit x86 compiled with icpc and pgo.
* osx-icc64-profile: 64-bit x86 compiled with icpc and pgo.
Marco Costalba [Thu, 20 Aug 2009 15:30:34 +0000 (16:30 +0100)]
Fix some asserts raised by is_ok()
There were two asserts.
The first was raised because is_ok() was called at the
beginning of do_castle_move() and this is wrong after
the last code reformatting because at that point the state
is already modified by the caller do_move().
The second, raised by debugIncrementalEval, was due to a
rounding error in compute_value() that occurs because
TempoValueEndgame was updated in an odd number by patch
"Merge Joona Kiiski evaluation tweaks" (3ed603cd) of 13/3/2009
Marco Costalba [Sat, 15 Aug 2009 14:18:17 +0000 (15:18 +0100)]
L1/L2 friendly PhaseTable[]
In Movepicker c'tor we access during initialization one of
MainSearchPhaseIndex..QsearchWithoutChecksPhaseIndex globals.
Postpone definition of PhaseTable[] just after them so that
when PhaseTable[] will be accessed later in get_next_move()
it will be already present in L1/L2.
It works like an implicit prefetching of PhaseTable[].
Also shrink PhaseTable[] to fit an L1 cache line of 16 bytes
using uint8_t instead of int.
This apparentely innocuous patch gives an astonish speed
up of 1.6% under MSVC 2010 beta, pgo optimized !
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 9 Aug 2009 23:20:54 +0000 (01:20 +0200)]
Enable prefetch also for gcc
This fix a compile error under Linux with gcc when
there aren't the intel dev libraries.
Also simplify the previous patch moving TT definition
from search.cpp to tt.cpp so to avoid using passing a
pointer to TT to the current position.
Finally simplify do_move(), now we miss a prefetch in the
rare case of setting an en-passant square but code is
much cleaner and performance penalty is almost zero.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 9 Aug 2009 14:53:51 +0000 (15:53 +0100)]
Try to prefetch as soon as position key is ready
Move prefetching code inside do_move() so to allow a
very early prefetching and to put as many instructions
as possible between prefetching and following retrieve().
With this patch retrieve() times are cutted of another 25%
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sun, 9 Aug 2009 03:19:32 +0000 (04:19 +0100)]
Use 32 bit key in TT
Shrink key to 32 bits instead of 64. To still avoid
collisions use the high 32 bits of position key as the
TT key and the low 32 bits to retrieve the correct
cluster index in the table.
With this patch size og TTentry shrinks to 96 bits instead
of 128 and the cluster of 4 TTEntry sums to 48 bytes instead
of 64.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Marco Costalba [Sat, 8 Aug 2009 16:04:01 +0000 (17:04 +0100)]
Let LMR at root be independent of MultiPV value
Current formula enable LMR when
i + MultiPV >= LMRPVMoves
It means that, for instance, if MultiPV == 1 then LMR
will be started to be considered at move i = LMRPVMoves - 1,
while if MultiPV == 3 then it will start before,
at move i = LMRPVMoves - 3.
With this patch the formula becomes
i >= MultiPV + LMRPVMoves - 2
So that LMR will always start after LMRPVMoves - 1 moves
from the last PV move.
No functional change when MultiPV == 1
Signed-off-by: Marco Costalba <mcostalba@gmail.com>