sharpness and
jaggedness of the boundaries, proximity to other regions, and
information about the background in the vicinity of the region. Finally,
standard learning techniques are applied to the resulting attribute vectors.
Several interesting problems were encountered. One is the scarcity of
training data. Oil slicks are (fortunately) very rare, and manual classification is
extremely costly. Another is the unbalanced nature of the problem: of the many
dark regions in the training data, only a very small fraction are actual oil slicks.
A third is that the examples group naturally into batches, with regions drawn
from each image forming a single batch, and background characteristics vary
from one batch to another. Finally, the performance task is to serve as a filter,
and the user must be provided with a convenient means of varying the
false-alarm rate.
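One simple way to give the user control over the false-alarm rate can be sketched as follows. This is an illustration only, not the method used in the fielded system: it assumes the learned classifier emits a numeric score per dark region (higher meaning more slick-like), and the names `threshold_for_false_alarm_rate` and `flag_regions` are hypothetical.

```python
def threshold_for_false_alarm_rate(negative_scores, target_rate):
    """Pick a score threshold from regions known NOT to be oil slicks.

    Assumption: higher scores mean "more likely a slick". At most a
    fraction `target_rate` of the known non-slick regions will score
    strictly above the returned threshold.
    """
    if not 0.0 <= target_rate <= 1.0:
        raise ValueError("target_rate must be between 0 and 1")
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(target_rate * len(ranked))  # false alarms we tolerate
    if allowed >= len(ranked):
        return float("-inf")  # every region may be flagged
    return ranked[allowed]


def flag_regions(scores, threshold):
    """Flag the regions whose score lies strictly above the threshold."""
    return [score > threshold for score in scores]
```

Lowering `target_rate` raises the threshold, so fewer regions are passed to the operator; raising it does the opposite. A rate measured on training regions is only an estimate of the rate on new images, particularly since background characteristics vary from batch to batch.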
Load forecasting
In the electricity supply industry, it is important to determine future demand
for power as far in advance as possible. If accurate estimates can be made for
the maximum and minimum load for each hour, day, month, season, and year,
utility companies can make significant economies in areas such as setting the
operating reserve, maintenance scheduling, and fuel inventory management.
An automated load forecasting assistant has been operating at a major utility
supplier over the past decade to generate hourly forecasts 2 days in advance. The
first step was to use data collected over the previous 15 years to create a
sophisticated load model manually. This model had three components: base load for
the year, load periodicity over the year, and the effect of holidays. To normalize
for the base load, the data for each previous year was standardized by
subtracting the average load for that year from each hourly reading and dividing by the
standard deviation over the year. Electric load shows periodicity at three
fundamental frequencies: diurnal, where usage has an early morning minimum and
midday and afternoon maxima; weekly, where demand is lower at weekends;
and seasonal, where increased demand during winter and summer for heating
and cooling, respectively, creates a yearly cycle. Major holidays such as
Thanksgiving, Christmas, and New Year’s Day show significant variation from the
normal load and are each modeled separately by averaging hourly loads for that
day over the past 15 years. Minor official holidays, such as Columbus Day, are
lumped together as school holidays and treated as an offset to the normal
diurnal pattern. All of these effects are incorporated by reconstructing a year’s
load as a sequence of typical days, fitting the holidays in their correct position,
and denormalizing the load to account for overall growth.
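The per-year standardization and the later denormalization described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the text does not say whether the population or the sample standard deviation was used (the population form is assumed here), and the function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize_year(hourly_loads):
    """Standardize one year's hourly readings.

    Subtracts the year's average load from each reading and divides by
    the year's standard deviation (population form, an assumption).
    Returns the standardized readings with the mean and deviation used.
    """
    mu = mean(hourly_loads)
    sigma = pstdev(hourly_loads)
    if sigma == 0:
        raise ValueError("load is constant; cannot standardize")
    return [(x - mu) / sigma for x in hourly_loads], mu, sigma


def denormalize(standardized, mu, sigma):
    """Map standardized values back to load units for a target year."""
    return [z * sigma + mu for z in standardized]
```

To account for overall growth, `denormalize` would be called with the mean and deviation expected for the year being forecast rather than those of a historical year.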
Thus far, the load model is a static one, constructed manually from
historical data, and implicitly assumes “normal” climatic conditions over the year. The
final step was to take weather conditions into account using a technique that
locates the previous day most similar to the current circumstances and uses the
historical information from that day as a predictor. In this case the prediction
is treated as an additive correction to the static load model. To guard against
outliers, the eight most similar days are located and their additive corrections
averaged. A database was constructed of temperature, humidity, wind speed,
and cloud cover at three local weather centers for each hour of the 15-year
historical record, along with the difference between the actual load and that
predicted by the static model. A linear regression analysis was performed to
determine the relative effects of these parameters on load, and the coefficients
were used to weight the distance function used to locate the most similar days.
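The weather adjustment amounts to a weighted nearest-neighbor lookup, which can be sketched as below. The exact distance function is not given in the text; a weighted Euclidean distance with non-negative weights (for example, magnitudes of the regression coefficients) is assumed here, and all names are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two weather vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weather_correction(today, history, weights, k=8):
    """Average the additive corrections of the k most similar past days.

    `history` is a list of (weather_vector, residual) pairs, where the
    residual is the actual load minus the static model's prediction.
    """
    if not history:
        raise ValueError("history must not be empty")
    nearest = sorted(
        history,
        key=lambda record: weighted_distance(today, record[0], weights),
    )[:k]
    return sum(residual for _, residual in nearest) / len(nearest)


def forecast(static_prediction, today, history, weights, k=8):
    """Static model output plus the averaged weather correction."""
    return static_prediction + weather_correction(today, history, weights, k)
```

Averaging over several neighbors (eight in the system described) damps the influence of any single unusual day on the correction.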
The resulting system yielded the same performance as trained human
forecasters but was far quicker, taking seconds rather than hours to generate a daily
forecast. Human operators can analyze the forecast’s sensitivity to simulated
changes in weather and bring up for examination the “most similar” days that
the system used for weather adjustment.
Diagnosis
Diagnosis is one of the principal application areas of expert systems. Although
the handcrafted rules used in expert systems often perform well, machine
learning can be useful in situations in which producing rules manually is too labor
intensive.
Preventative maintenance of electromechanical devices such as motors and
generators can forestall failures that disrupt industrial processes. Technicians
regularly inspect each device, measuring vibrations at various points to
determine whether the device needs servicing. Typical faults include shaft
misalignment, mechanical loosening, faulty bearings, and unbalanced pumps. A
particular chemical plant uses more than 1000 different devices, ranging from
small pumps to very large turbo-alternators, which until recently were
diagnosed by a human expert with 20 years of experience. Faults are identified by
measuring vibrations at different places on the device’s mounting and using
Fourier analysis to check the energy present in three different directions at each
harmonic of the basic rotation speed. This information, which is very noisy
because of limitations in the measurement and recording procedure, is studied
by the expert to arrive at a diagnosis. Although handcrafted expert system rules
had been developed for some situations, the elicitation process would have to
be repeated several times for different types of machinery; so a learning
approach was investigated.
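The Fourier step, checking the energy at each harmonic of the basic rotation speed, can be illustrated with a direct evaluation of the discrete Fourier sum at the harmonic frequencies. This is a sketch under assumptions, not the plant's actual procedure: it takes a uniformly sampled vibration signal for one measurement direction, uses squared amplitude as a stand-in for energy, and the function name and parameters are hypothetical.

```python
import cmath
import math


def harmonic_energies(signal, sample_rate, rotation_hz, n_harmonics):
    """Squared amplitude of `signal` at each harmonic of the rotation speed.

    `signal` holds uniformly spaced samples taken at `sample_rate` Hz.
    Harmonic h lies at h * rotation_hz. The estimate is exact only when
    the window spans a whole number of cycles of that harmonic.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    energies = []
    for h in range(1, n_harmonics + 1):
        freq = h * rotation_hz
        # Correlate the signal with a complex sinusoid at this frequency.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * t / sample_rate)
            for t, x in enumerate(signal)
        )
        amplitude = 2.0 * abs(coeff) / n
        energies.append(amplitude ** 2)
    return energies
```

Calling this once per measurement direction and measurement point would give one attribute per (point, direction, harmonic) combination, which is the kind of noisy attribute vector a learning scheme could then be trained on.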
Six hundred faults, each comprising a set of measurements along with the
expert’s diagnosis, were available, representing 20 years of experience in the
field. About half were unsatisfactory for various reasons and had to be discarded;
the remainder were used as training examples. The goal was not to determine