M. Campbell et al. / Artificial Intelligence 134 (2002) 57–83
Larry Christiansen and Michael Rohde, were both won by Deep Blue Jr. by a score
of 1.5–0.5.
2. System overview
Deep Blue is a massively parallel system designed for carrying out chess game tree
searches. The system is composed of a 30-node (30-processor) IBM RS/6000 SP computer
and 480 single-chip chess search engines, with 16 chess chips per SP processor. The SP
system consists of 28 nodes with 120 MHz P2SC processors, and 2 nodes with 135 MHz
P2SC processors. The nodes communicate with each other via a high speed switch. All
nodes have 1 GB of RAM, and 4 GB of disk. During the 1997 match with Kasparov, the
system ran the AIX® 4.2 operating system. The chess chips in Deep Blue are each capable
of searching 2 to 2.5 million chess positions per second, and communicate with their host
node via a microchannel bus. The chess chips are described in Section 3.
Deep Blue is organized in three layers. One of the SP processors is designated as the
master, and the remainder as workers. The master searches the top levels of the chess
game tree, and then distributes “leaf” positions to the workers for further examination. The
workers carry out a few levels of additional search, and then distribute their leaf positions
to the chess chips, which search the last few levels of the tree.
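This three-layer division of labour can be sketched as a simple routing rule. The sketch below is illustrative only: the 4/4/4 ply split and the routing function are assumptions for the example, not the actual Deep Blue implementation, in which the work assigned to each layer varied with the position.

```python
# Illustrative sketch of Deep Blue's three-layer search hierarchy.
# The ply split (4/4/4) and the routing rule are hypothetical.
MASTER_PLIES = 4   # top of the tree, searched on the master SP node
WORKER_PLIES = 4   # middle plies, searched on the worker SP nodes
CHIP_PLIES = 4     # last few plies, searched by the chess chips

def layer_for(remaining_depth):
    """Pick which layer of the system searches a node, given the
    number of plies still to be searched below it."""
    if remaining_depth > WORKER_PLIES + CHIP_PLIES:
        return "master"
    elif remaining_depth > CHIP_PLIES:
        return "worker"
    else:
        return "chess chip"
```

Under this toy split, a node with 12 plies of search remaining is handled by the master, one with 6 plies by a worker, and one with 4 or fewer by a chess chip.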
Overall system speed varied widely, depending on the specific characteristics of the
positions being searched. For tactical positions, where long forcing move sequences exist,
Deep Blue would average about 100 million positions per second. For quieter positions,
speeds close to 200 million positions per second were typical. In the course of the 1997
match with Kasparov, the overall average system speed observed in searches longer than
one minute was 126 million positions per second. The maximum sustained speed observed
in this match was 330 million positions per second.
Deep Blue relies on many of the ideas developed in earlier chess programs, including
quiescence search, iterative deepening, transposition tables (all described in [24]), and
NegaScout [23]. These ideas and others formed a very sound basis for designing and
building a chess-playing system. Nonetheless, in creating a system as large and complex
as Deep Blue, one naturally runs into relatively unexplored territory. Before describing the
components of Deep Blue in detail (in Sections 3 through 8), it is worthwhile to discuss
those characteristics of Deep Blue that gave rise to new or unusual challenges.
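As one example of these standard building blocks, NegaScout [23] can be sketched in a few lines. This is a generic textbook version over an abstract game tree, not the Deep Blue implementation; the `Node` class is invented here for illustration.

```python
class Node:
    """Minimal game-tree node for illustration: terminal nodes carry
    a static value from the perspective of the side to move."""
    def __init__(self, value=None, children=()):
        self.value = value
        self.kids = list(children)

def negascout(node, depth, alpha, beta):
    """Textbook NegaScout: search the first move with a full window,
    the rest with a null window, re-searching only on a fail-high."""
    if depth == 0 or not node.kids:
        return node.value
    b = beta                      # full window for the first move
    for i, child in enumerate(node.kids):
        score = -negascout(child, depth - 1, -b, -alpha)
        if i > 0 and alpha < score < beta:
            # null-window search failed high: re-search, full window
            score = -negascout(child, depth - 1, -beta, -score)
        alpha = max(alpha, score)
        if alpha >= beta:
            return alpha          # beta cutoff
        b = alpha + 1             # null window for the remaining moves
    return alpha

# Tiny worked example: the best line is worth 3 for the side to move.
root = Node(children=[
    Node(children=[Node(3), Node(5)]),
    Node(children=[Node(6), Node(2)]),
])
best = negascout(root, 2, float("-inf"), float("inf"))  # → 3
```

The null-window searches confirm cheaply that the remaining moves are no better than the current best; only when one fails high is the more expensive full-window re-search needed.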
1. Large searching capacity. Previous research in game tree search typically dealt with
systems that searched orders of magnitude fewer positions than Deep Blue. The best
way to take advantage of this additional searching power is not clear. Our work on
the Deep Blue search was guided by two main principles:
(a) The search should be highly non-uniform. It is well known that strong human
players are able to calculate well beyond the depth reachable by a uniform
searcher of any conceivable speed. Our preference for a highly selective search
arose from the loss of Deep Thought to Mike Valvo in a correspondence
match [22], where it was clear to us that Valvo searched significantly deeper than
Deep Thought. The fact that we were hoping to play Garry Kasparov, a chess
player known for his complex attacking style, also figured into this choice.
(b) The search should provide “insurance”² against simple errors. We wanted to be
sure that all move sequences were explored to some reasonable minimum depth.
Early research into pruning algorithms (e.g., null move pruning [3,9]) did not
provide us enough evidence to warrant implementation in the hardware search of
Deep Thought 2 or Deep Blue. Even without pruning, and using highly selective
search,³ we felt that Deep Blue had sufficient searching power to satisfy our
insurance needs. A three-minute search on Deep Blue would reach a full-width
depth of 12.2 on average.⁴
Sections 4 and 5 describe the Deep Blue search algorithm.
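The depth estimate in footnote 4 amounts to fitting a line to (iteration, log time) pairs and reading off the iteration reached within a given time budget. A minimal sketch of that fit follows; the sample timings are invented for illustration, not the actual match data.

```python
import math

# Hypothetical (depth, seconds) timings; the real fit used the
# (iteration, log(time)) points from the 1997 match with Kasparov.
samples = [(8, 2.1), (9, 7.4), (10, 26.0), (11, 91.0), (12, 320.0)]

xs = [d for d, _ in samples]
ys = [math.log(t) for _, t in samples]          # log makes growth linear
n = len(samples)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# Invert the fit: predicted full-width depth for a 180-second search.
depth_at_180s = (math.log(180.0) - intercept) / slope
```

With these invented timings the effective branching factor is about e^1.26 ≈ 3.5 per iteration, and a three-minute budget lands between iterations 11 and 12, which is how an average such as 12.2 arises from real match data.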
2. Hardware evaluation. The Deep Blue evaluation function is implemented in
hardware. In a way, this simplifies the task of programming Deep Blue. In a software
chess program, one must carefully consider adding new features, always keeping in
mind that a “better” evaluation function may take too long to execute, slowing down
the program to the extent that it plays more weakly. In Deep Blue, one does not need
to constantly re-weigh the worth of a particular evaluation function feature versus
its execution time: time to execute the evaluation function is a fixed constant.⁵ On
the other hand, it is not possible to add new features to the hardware evaluation,⁶
and software patches are painful and problematic, as noted above about Deep
Thought 2. For the most part, one must learn to either get by without a desired new
feature, or manufacture some surrogate out of the features that are already available.
Additionally, the extra complexity that is possible in the hardware evaluation function
creates an “embarrassment of riches”. There are so many features (8000) that tuning
the relative values of the features becomes a difficult task. The evaluation function is
described in Section 7.
3. Hybrid software/hardware search. The Deep Blue search combines a software search,
implemented in compiled C code on a general purpose CPU, with a hardware search,
encoded in silicon on the chess chip. The software search is extremely flexible, and
can be changed as needed. The hardware search is parameterized, but the general
form of the search is fixed. Thus it suffers to some degree from difficulties similar
to those associated with the hardware evaluation function: new search behaviors
cannot be introduced, and the existing parameters require careful tuning. There is
also an additional difficulty, namely choosing the best strategy for switching from
the software to the hardware search. The very fact that the two searches are different
(see Table 1) can lead to horizon effects [4].
4. Massively parallel search. Deep Blue is a massively parallel system, with over 500
processors available to participate in the game tree search. Although there has been
² This is the terminology used in [18].
³ Our experiments showed that Deep Blue typically sacrificed two ply of full-width search in order to execute the selective search algorithms.
⁴ This estimate is based on a linear least squares fit on all the (iteration, log(time)) data points from the 1997 match with Kasparov.
⁵ Actually there is a distinction between slow and fast evaluation, and feature values can have an impact here. See Section 3.2.
⁶ There were limits on design time, chip area, and manufacturing cost/time which made it difficult to build new chips.