M. Campbell et al. / Artificial Intelligence 134 (2002) 57–83
The extended book was introduced into Deep Thought 2 in 1991, and was used with
good success through the matches with Kasparov. In [6], an example is given of how the
extended book worked in game 2 of the 1997 Kasparov–Deep Blue match.
8.3. Endgame databases
The endgame databases in Deep Blue include all chess positions with five or fewer
pieces on the board (we use the term "piece" to also include pawns), as well as selected
positions with six pieces that included a pair of blocked pawns. The primary sources for
these databases were the Ken Thompson CD-ROMs [27] and additional databases supplied
by Lewis Stiller.
Endgame databases were used both off-line and on-line. The off-line usage was during
the design of the chess chips. Each chip had a ROM which stored patterns to help evaluate
certain frequently occurring chess endgames. The databases were used to verify and
evaluate these patterns. See Section 5 for more details.
The software search used the databases in on-line mode. Each of the 30 general-purpose
processors in the Deep Blue system replicated the 4-piece and the important 5-piece
databases on its local disk. The remaining databases, including those with 6 pieces, were duplicated
on two 20-GB RAID disk arrays, and were available to all the general purpose processors
through the SP switch.
Endgames were stored in the databases with one bit per position (indicating lose or does-
not-lose). If a position with a known value is reached during the search, it receives
a score composed of two parts: a high-order game-theoretic value and a low-order
tie-breaker value. The tie-breaker value is simply the value produced by the evaluation
function on the position in question. If this score is sufficient to cause a cutoff, the search
immediately backs up this score.
For example, suppose Deep Blue had to choose between various possible continuations
that resulted in it playing the weak side of a rook and pawn versus rook endgame. Deep
Blue would, of course, prefer drawn positions over lost ones. In addition, given the
choice between different drawn positions, it would choose the one with the best evaluation
function value.
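The two-part score can be made concrete with a minimal Python sketch. This is purely illustrative: the bound `EVAL_RANGE`, the function name, and the exact packing are assumptions, since the paper does not give the actual encoding; it only states that the game-theoretic value occupies the high-order part and the evaluation function breaks ties.

```python
# Hypothetical sketch of the two-part endgame-database score described above.
# The game-theoretic value dominates, so any drawn position outranks any lost
# one; among positions with equal game-theoretic value, the normal evaluation
# function breaks the tie.

EVAL_RANGE = 1 << 16          # assumed bound on |evaluation function| output

def database_score(game_theoretic_value, eval_value):
    """Combine a game-theoretic value (-1 = loss, 0 = does-not-lose)
    with the evaluation function's tie-breaker value."""
    assert -EVAL_RANGE < eval_value < EVAL_RANGE
    return game_theoretic_value * (2 * EVAL_RANGE) + eval_value

# A drawn position with a poor evaluation still beats any lost position:
drawn_bad = database_score(0, -5000)
lost_good = database_score(-1, +5000)
assert drawn_bad > lost_good

# Among drawn positions, the better evaluation wins:
assert database_score(0, 120) > database_score(0, 30)
```

This reproduces the rook-and-pawn-versus-rook behavior described above: the program first avoids lost positions, then chooses the drawn position with the best evaluation.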
The endgame databases did not play a critical role in the matches against Kasparov. In
the 1997 match, only game 4 approached an endgame that required access to the databases,
but the ROMs on the chess chips had sufficient knowledge to recognize the rook and pawn
versus rook draws that could have arisen.
8.4. Time control
Chess games typically have a set of requirements on the speed of play, termed the “time
control”. For example, the Deep Blue–Kasparov games initially required 40 moves to be
played in two hours. Failure to make the specified number of moves leads to forfeiting the
game.
The time control mechanism in Deep Blue is relatively straightforward. Before each
search, two time targets are set. The first is the normal time target, set to be approximately
the time remaining to the next time control divided by the moves remaining. In practice,
a considerable time buffer is reserved, which allows for sufficient time to handle technical
difficulties, as well as saving time for a possible “sudden-death” phase. The second time
target is the panic time target, which is roughly one third of the remaining time.
If, at the normal time target, the situation is “normal”, a time-out occurs and the current
best move is played. There are a few conditions under which Deep Blue will go beyond its
normal target into “panic time”.
• The current best move has dropped 15 points or more from the score of the previous
iteration. In this case, the search continues until either a new move is found within the
15 point margin, the iteration is completed, or the panic time target is reached.
• The best move from the previous iteration is in a potential “fail-low” state. Continue
until this state is resolved, or the panic time target is reached. If this move ends up
dropping 15 points or more from its prior value, continue as in the previous case.
• A new move is in a potential “fail-high” state. Continue until this state is resolved, or
the panic time target is reached.
These conditions are triggered frequently during a game, but it is quite rare to actually
go all the way to the panic time target. In the 1997 match with Kasparov, this happened
only once.
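The target computation described above can be sketched as follows. The buffer size and the exact arithmetic are illustrative assumptions, not values from the paper; the paper specifies only that the normal target is roughly the remaining time divided by the remaining moves (less a buffer) and that the panic target is roughly one third of the remaining time.

```python
# Hypothetical sketch of the two time targets set before each search.
# The buffer value is an assumption; the paper says only that "a considerable
# time buffer is reserved" for technical difficulties and a sudden-death phase.

def time_targets(time_remaining, moves_remaining, buffer=5.0):
    """Return (normal_target, panic_target), in the same units as time_remaining."""
    usable = max(time_remaining - buffer, 0.0)   # reserve a safety buffer
    normal = usable / max(moves_remaining, 1)    # spread time over remaining moves
    panic = time_remaining / 3.0                 # roughly one third of what is left
    return normal, panic

# 40 moves in two hours (7200 s): roughly three minutes per move at the start.
normal, panic = time_targets(7200.0, 40)
assert normal < panic
```

At the normal target the current best move is played unless one of the fail-low/fail-high conditions above holds, in which case the search may continue up to the panic target.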
9. Conclusion
The success of Deep Blue in the 1997 match was not the result of any one factor. The
large searching capability, non-uniform search, and complex evaluation function were all
critical. However, other factors also played a role, e.g., endgame databases, the extended
book, and evaluation function tuning.
It is clear, however, that there were many areas where improvements could have been
made. With additional effort, the parallel search efficiency could have been increased.
The hardware search and evaluation could have been made more efficient and flexible
with the addition of an external FPGA. Current research (e.g., [11]) suggests that the
addition of pruning mechanisms to Deep Blue might have significantly improved the
search. Evaluation function tuning, both automatic and manual, was far from complete.
In the course of the development of Deep Blue, there were many design decisions that
had to be made. We made particular choices, but there were many alternatives that were
left unexplored. We hope this paper encourages further exploration of this space.
Acknowledgements
We would like to gratefully acknowledge the help of the following people: C.J. Tan, Jerry
Brody, Joel Benjamin, John Fedorowicz, Nick De Firmian, Miguel Illescas, Lewis Stiller,
Don Maddox, Gerald Tesauro, Joefon Jann, Thomas Anantharaman, Andreas Nowatzyk,
Peter Jansen, and Mike Browne. We would also like to thank the editor and the anonymous
referees for their helpful comments.
The following are trademarks of International Business Machines Corporation: IBM,
Deep Blue, RS/6000, SP, AIX, Micro Channel.
Appendix A. Evaluation tables and registers
• Rooks on seventh: There are 12 8-bit registers, for the various combinations of
{White, Black} × {single, doubled} × {regular, absolute-king-trapped, absolute-king-not-trapped}.
• Bias: This register was designed for use with external (off-chip) hardware.
• Rook behind passed: A bonus is awarded for a rook behind a passed pawn of the same
color.
• Mpin and hung: A bonus is awarded when the opponent has a piece that is hung and
pinned against a piece equal in value to the pinner.
• Pinned and simple hung: A bonus is awarded when the opponent has a piece that is
pinned and hung.
• Hung: A bonus is awarded if the opponent has a piece that is hung.
• Xraying: A bonus is awarded for having an “xray attack”, i.e., an attack masked by
one’s own piece.
• Pinned and hung: A bonus is awarded when the opponent has a piece that is both
pinned and hung. “Hung” is distinguished from “simple hung” by a more detailed
analysis of the capture sequence.
• Permanent pin and simple hung: A bonus is awarded for a permanent pin of a piece.
• Knight trap: These 6 registers (3 for each side) provide penalties for some frequently
occurring situations where knights can get trapped.
• Rook trap: These 8 registers (4 for each side) provide penalties for some frequently
occurring situations where rooks can get trapped.
• Queen trap: These 2 registers (1 for each side) provide penalties for when there is no
safe queen mobility.
• Wasted pawns: A penalty is assessed for pawn groups that have “wasted” pawns.
A pawn group has a wasted pawn if it is unable to create a passed pawn against a
smaller group of enemy pawns. There are two values, one each for White and Black,
and the penalties are dynamically scaled by the amount of material on the board at the
time of evaluation.
• Bishop pair: A bonus may be awarded for having a pair of bishops. There are two
values, one each for White and Black.
• Separated passed: Widely separated passed pawns are advantageous if trying to win
in a bishops of opposite color endgame.
• Missing wing: Provides a penalty for the side that is ahead if it allows itself to be left
with pawns on only one side.
• BOC: A set of reductions on the evaluation for pure bishops of opposite color, and
various combinations of major pieces and bishops of opposite color.
• Side to move: A bonus may be given for being on move.
• Multiple pawns: There are two 80-entry tables, for the various combinations of
{White, Black} × {backward, not backward} × {opposed, unopposed} × {doubled,
tripled+} × {ah, bg, cf, de} × {isolated, pseudo-isolated, supported, duo}. Some
combinations are not legal. There is an 8-bit penalty value associated with each
realizable combination.
• Minor on weak: This pair of tables gives bonuses for minor pieces on weak squares,
taking into account such factors as advancement, centrality, pawn support, minor
piece type, challengability, and screened status.
• Self block: This table assesses how each side's bishops are restrained by their own
pawns.
• Opp block: This table assesses how each side's bishops are restrained by the
opponent's pawns.
• Back block: This table assesses how doubled pawns can restrain a bishop’s mobility.
• Pinned: Pins are awarded bonuses depending on the strength of the pin and the amount
of pressure on the pinned piece.
• Mobility: The mobility to each square on the board is summed to produce an overall
mobility score. Each square has 16 possible levels of mobility, from maximum Black
control through maximum White control. The control level and the square location
determine each square’s contribution.
• Pawn structure: This table assesses various features of pawn structure not handled
elsewhere.
• Passed pawns: This table assesses the worth of passed pawns.
• Joint signature: This table allows adjustments to be made for particular piece
matchups.
• Rooks on files: There are two 192-entry tables, for the various combinations of
{White, Black} × {opposed, unopposed} × {4 blockage types} × {semi-open, not
semi-open} × {ah, bg, cf, de} × {0, 1, or 2 rooks}. There is a 6-bit value associated
with each combination, and two 2-bit flags indicating how the value is to be used in
the king-safety calculation.
• Bishops: Bishops are awarded bonuses for the value of the diagonals they control. The
diagonals are individually assessed, and include factors such as transparency (ability
to open at will), king safety, and target of attack.
• Pawn storm: This table helps to assess pawns being used to attack the opponent’s
king.
• Pawn shelter: This table evaluates the pawn shelter that protects or may protect one’s
own king.
• Development: This table measures the differences in development between the two
sides, factors in the king situation, and gives a bonus to the side that is ahead in
development/king safety.
• Trapped bishop: This table is used to detect situations where bishops can become
trapped.
• Signature: These tables, one for each side, allow adjustments to be made for particular
piece combinations that work well or poorly together. For example, queen and knight
are thought to cooperate better than queen and bishop.
• Contempt: The table gives the adjustment to the draw score. It is used to either prefer
or avoid draws, depending on the opponent and the match situation. The adjustment
is material dependent.
• Piece placement: This table, in common use in most chess programs, has one entry
for each piece type on each square of the board.
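Several of the tables above are indexed by a cross product of features and hold small packed values. The sketch below is purely illustrative: the factor ordering, bit layout, and all names are assumptions, since the paper specifies only the factor sets and the field widths. It shows a mixed-radix index over the "multiple pawns" factors and a packed entry with a 6-bit value plus two 2-bit king-safety flags, as in the "rooks on files" tables.

```python
# Hypothetical sketch of addressing and packing for cross-product-indexed
# evaluation tables. Factor order and bit layout are assumptions.

DOUBLING = ["doubled", "tripled+"]
FILE_PAIRS = ["ah", "bg", "cf", "de"]
SUPPORT = ["isolated", "pseudo-isolated", "supported", "duo"]

def multiple_pawns_index(backward, opposed, doubling, file_pair, support):
    """Mixed-radix index over the 'multiple pawns' factors for one side.
    The full cross product has 2*2*2*4*4 = 128 slots; illegal combinations
    go unused, which is why the real tables hold only 80 entries each."""
    idx = int(backward)
    idx = idx * 2 + int(opposed)
    idx = idx * len(DOUBLING) + DOUBLING.index(doubling)
    idx = idx * len(FILE_PAIRS) + FILE_PAIRS.index(file_pair)
    idx = idx * len(SUPPORT) + SUPPORT.index(support)
    return idx

def pack_rook_file_entry(value, flag_a, flag_b):
    """Pack a 6-bit evaluation value and two 2-bit king-safety flags
    into one 10-bit table entry (the actual storage layout is not given)."""
    assert 0 <= value < 64 and 0 <= flag_a < 4 and 0 <= flag_b < 4
    return value | (flag_a << 6) | (flag_b << 8)

def unpack_rook_file_entry(entry):
    return entry & 0x3F, (entry >> 6) & 0x3, (entry >> 8) & 0x3

# Doubled, unopposed, isolated pawns on the b/g file pair map to one slot:
slot = multiple_pawns_index(False, False, "doubled", "bg", "isolated")
assert 0 <= slot < 128

# Round-trip a packed rooks-on-files entry:
assert unpack_rook_file_entry(pack_rook_file_entry(37, 2, 1)) == (37, 2, 1)
```

The same mixed-radix scheme applies, with different radices, to the "rooks on seventh" registers and the "rooks on files" tables.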
Appendix B. Summary of Deep Blue and predecessors

Table B.1 gives some characteristics of the various systems in the Deep Blue lineage.
There are a number of qualifications on the values in this table.

Table B.1
Summary of systems

System name         First played   Processors   Nodes/second   Feature groups
ChipTest            1986           1            50K            1
ChipTest-m          1987           1            400K           1
Deep Thought        1988           2–6          700K–2M        4
Deep Thought 2      1991           14–24        4M–7M          4
Deep Blue I         1996           216          50M–100M       32
Deep Blue II        1997           480          100M–200M      38
Deep Blue Jr.       1997           24           20M–30M        38
Deep Blue Jr. demo  1997           1            2M             38
• The ChipTest and Deep Thought systems used software adjustments to account for
features not recognized by the evaluation hardware.
• The Deep Blue systems allowed the evaluation features to interact in ways that were
not possible in earlier systems.
• The demo version of Deep Blue Jr. was restricted to one second of computation
time on one processor, did not detect repetition with game history, and had a fixed
evaluation function that did not vary with game stage.
References
[1] T.S. Anantharaman, Extension heuristics, ICCA J. 14 (2) (1991) 135–143.
[2] T.S. Anantharaman, M.S. Campbell, F.-h. Hsu, Singular extensions: Adding selectivity to brute-force
searching, Artificial Intelligence 43 (1) (1990) 99–110. Also published in: ICCA J. 11 (4) (1988) 135–143.
[3] D.F. Beal, Experiments with the null move, in: D.F. Beal (Ed.), Advances in Computer Chess 5, Elsevier
Science, Amsterdam, 1989, pp. 65–79.
[4] H. Berliner, Chess as problem solving: The development of a tactics analyzer, Ph.D. Thesis, Carnegie
Mellon University, Pittsburgh, PA, 1974.
[5] M.G. Brockington, A taxonomy of parallel game-tree search algorithms, ICCA J. 19 (3) (1996) 162–174.
[6] M. Campbell, Knowledge discovery in Deep Blue, Comm. ACM 42 (11) (November 1999) 65–67.
[7] J.H. Condon, K. Thompson, Belle chess hardware, in: Advances in Computer Chess 3, Pergamon Press,
Oxford, 1982, pp. 45–54.
[8] R. Feldmann, Spielbaumsuche mit massiv parallelen Systemen, Ph.D. Thesis, Universität-Gesamt-
hochschule Paderborn, Germany, 1993.
[9] G. Goetsch, M.S. Campbell, Experiments with the null-move heuristic, in: T.A. Marsland, J. Schaeffer
(Eds.), Computers, Chess, and Cognition, Springer, Berlin, 1990, pp. 159–168.
[10] W. Gropp, E. Lusk, A. Skjellum, Using MPI, 2nd Edition, MIT Press, Cambridge, MA, 1999.
[11] E.A. Heinz, Scalable Search in Computer Chess, Friedrich Vieweg & Sohn, 2000.
[12] F.-h. Hsu, A two-million moves/s CMOS single-chip chess move generator, IEEE J. Solid State
Circuits 22 (5) (1987) 841–846.
[13] F.-h. Hsu, Large-scale parallelization of alpha-beta search: An algorithmic and architectural study, Ph.D.
Thesis, Carnegie Mellon, Pittsburgh, PA, 1990.
[14] F.-h. Hsu, IBM’s Deep Blue chess grandmaster chips, IEEE Micro (March–April 1999) 70–81.
[15] F.-h. Hsu, Behind Deep Blue, Princeton University Press, Princeton, NJ, 2002.
[16] F.-h. Hsu, T. Anantharaman, M. Campbell, A. Nowatzyk, A Grandmaster chess machine, Scientific American
(October 1990) 44–50.
[17] F.-h. Hsu, T.S. Anantharaman, M.S. Campbell, A. Nowatzyk, Deep Thought, in: T.A. Marsland, J. Schaeffer
(Eds.), Computers, Chess, and Cognition, Springer, Berlin, 1990, pp. 55–78.
[18] A. Junghanns, Are there practical alternatives to alpha-beta in computer chess?, ICCA J. 21 (1) (1998)
14–32.
[19] D.E. Knuth, R.W. Moore, An analysis of alpha-beta pruning, Artificial Intelligence 6 (4) (1975) 293–326.
[20] D.N.L. Levy, D. Broughton, M. Taylor, The SEX algorithm in computer chess, ICCA J. 12 (1) (1989) 10–21.
[21] D.A. McAllester, D. Yuret, Alpha-beta-conspiracy search, 1993. http://www.research.att.com/~dmac/abc.ps.
[22] M. Newborn, Kasparov Versus Deep Blue: Computer Chess Comes of Age, Springer, Berlin, 1997.
[23] A. Reinefeld, An improvement to the scout tree-search algorithm, ICCA J. 6 (4) (1983) 4–14.
[24] D.J. Slate, L.R. Atkin, Chess 4.5—The Northwestern University chess program, in: P.W. Frey (Ed.), Chess
Skill in Man and Machine, Springer, New York, 1977, pp. 82–118.
[25] G. Tesauro, Connectionist learning of expert preferences by comparison training, in: D. Touretzky (Ed.),
Advances in Neural Information Processing Systems 1 (NIPS-88), Morgan Kaufmann, San Mateo, CA,
1989, pp. 99–106.
[26] G. Tesauro, Comparison training of chess evaluation functions, in: J. Fürnkranz, M. Kubat (Eds.),
Machines that Learn to Play Games, Nova Science Publishers, 2001, pp. 117–130.
[27] K. Thompson, Retrograde analysis of certain endgames, ICCA J. 9 (3) (1986) 594–597.