Deja vu: Fingerprinting Network Problems
Bhavish Aggarwal§, Ranjita Bhagwan∗, Lorenzo De Carli†, Venkat Padmanabhan∗, Krishna Puttaswamy‡
∗Microsoft Research India
†University of California, Santa Barbara
‡University of Wisconsin, Madison
§Olacabs.com
ABSTRACT
We ask the question: can network problems experienced
by applications be identified based on symptoms contained
in a network packet trace? An answer in the affirmative
would open the doors to many opportunities, including non-
intrusive monitoring of such problems on the network and
matching a problem with past instances of the same prob-
lem.
To this end, we present Deja vu, a tool to condense the
manifestation of a network problem into a compact signa-
ture, which could then be used to match multiple instances
of the same problem. Deja vu uses as input a network-level
packet trace of an application’s communication and extracts
from it a set of features. During the training phase, each
application run is manually labeled as GOOD or BAD, de-
pending on whether the run was successful or not. Deja vu
then employs a novel learning technique to build a signature
tree that not only distinguishes between GOOD and BAD
runs but also sub-classifies the BAD runs, revealing the
different classes of failures. The novelty lies in performing the
sub-classification without requiring any failure class-specific
labels.
We evaluate Deja vu in the context of multiple web
browsers in a corporate environment and an email application
in a university environment, with promising results.
The signature generated by Deja vu based on the limited
GOOD/BAD labels is as effective as one generated using
full-blown classification with knowledge of the actual prob-
lem types.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
ACM CoNEXT 2011, December 6–9 2011, Tokyo, Japan.
Copyright 2011 ACM 978-1-4503-1041-3/11/0012 ...$10.00.
1. INTRODUCTION
Network communication is an integral part of many
applications. Therefore, network problems often impact
application behavior. The impact on network communication
depends on the nature of the problem. If the
local name server is down, DNS requests will be sent
but no responses will be received. On the other hand, if
the firewall at the edge of a corporate network is blocking
the https port, then SYN packets would be seen
but not any SYNACKs.
We ask the question: can network problems experi-
enced by applications be identified based on symptoms
contained in the application’s network packet trace?
There are several advantages to looking for symptoms
of network problems in a network packet trace. First,
it is non-intrusive, unlike tracing on an end system itself
(e.g., system call tracing), so we could monitor the
health of applications running on several hosts without
requiring access to the hosts themselves. Second, network
communication represents the “narrow waist” of
network applications. Many versions of an application
(e.g., browser) and even OSes running on the end sys-
tems could exhibit consistent behavior at the level of
network protocol messages, thereby leading to similar
symptoms of problems at the network layer.
To answer the above question, we develop Deja vu, a
tool to condense the manifestation of a network prob-
lem into a compact signature. Each signature encapsu-
lates the symptoms corresponding to a particular prob-
lem. For instance, for a browser application that might
encounter the problems noted above, there would be
one signature corresponding to the local name server
problem and a different one corresponding to the fire-
wall problem. Although it might be tempting, based
on these simple examples, to employ a rule-based ap-
proach to constructing signatures, such an approach
suffers from the limitation of not being general enough
to accommodate new applications or even existing ap-
plications whose behavior is not fully understood or
documented.
Therefore, Deja vu uses a learning-based approach to
constructing signatures. We extract a set of features
from packet traces, using our domain knowledge to in-
form this. The features extracted correspond to proto-
cols such as DNS, IP, TCP, HTTP, etc. For instance,
there are features corresponding to the presence of a
DNS request, DNS reply, HTTP error code, etc.
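As a rough illustration of such presence/absence features, the sketch below reduces a list of already-parsed packet summaries to a binary feature vector. The `Packet` record, its field names, and the five feature names are assumptions made for this example; they are not Deja vu's actual feature set or parser.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Packet:
    """Hypothetical, already-parsed packet summary (not a real capture API)."""
    proto: str                         # "DNS", "TCP" or "HTTP"
    is_response: bool = False          # True for a DNS reply
    tcp_flags: str = ""                # "S" for SYN, "SA" for SYN-ACK
    http_status: Optional[int] = None  # status code of an HTTP response


def extract_features(trace: List[Packet]) -> Dict[str, int]:
    """Map a trace to binary features: 1 if the packet type occurs, else 0."""
    features = {"dns_request": 0, "dns_reply": 0, "tcp_syn": 0,
                "tcp_synack": 0, "http_error": 0}
    for pkt in trace:
        if pkt.proto == "DNS":
            features["dns_reply" if pkt.is_response else "dns_request"] = 1
        elif pkt.proto == "TCP":
            if pkt.tcp_flags == "S":
                features["tcp_syn"] = 1
            elif pkt.tcp_flags == "SA":
                features["tcp_synack"] = 1
        elif pkt.proto == "HTTP" and pkt.http_status is not None:
            # Treat 4xx and 5xx responses as HTTP errors.
            if pkt.http_status >= 400:
                features["http_error"] = 1
    return features


# Toy traces for the two failures described in the introduction.
name_server_down = [Packet("DNS")]
https_blocked = [Packet("DNS"), Packet("DNS", is_response=True),
                 Packet("TCP", tcp_flags="S")]
```

With these toy traces, the name server failure shows a DNS request without a reply, while the blocked port shows a SYN without a SYN-ACK, so the two problems yield different feature vectors.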
Once these features have been extracted, designing
an algorithm to learn signatures is a key challenge. A
standard classification approach, such as decision trees,
would require labeled training data. Generating a train-
ing set with problem type-specific labels is onerous and
could even be infeasible when the failure cause for a
training run is unknown (e.g., a failure could occur in a
remote network component). At the same time, an un-
supervised learning approach, such as clustering, would
be vulnerable to noisy data. For instance, features ex-
tracted from unrelated background traffic might still get
picked for clustering.
To address this challenge, Deja vu employs a novel
approach. For training, we only assume coarse-grained
labels: GOOD when the training run of an application
was successful and BAD otherwise. These labels can be
determined based on the exhibited behavior of an appli-
cation, without the need to know, in the case of BAD,
the problem category. Then, by iteratively applying a
decision-tree learning algorithm, Deja vu automatically
learns different problem signatures for different cate-
gories of problems.
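To make the idea of sub-classifying BAD runs from coarse labels concrete, here is a deliberately simplified stand-in. It is not the iterative decision-tree procedure described in Section 4: instead of learning a tree, it groups BAD runs by the feature values that never occur in any GOOD run. The function names and the toy feature vectors are assumptions for this sketch.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Features = Dict[str, int]
Symptoms = FrozenSet[Tuple[str, int]]


def symptoms(run: Features, good_runs: List[Features]) -> Symptoms:
    """Feature values of `run` that never occur in any GOOD run."""
    return frozenset(
        (name, value) for name, value in run.items()
        if all(g.get(name) != value for g in good_runs)
    )


def group_bad_runs(good_runs: List[Features],
                   bad_runs: List[Features]) -> Dict[Symptoms, List[int]]:
    """Group BAD runs (by index) that share the same set of symptoms."""
    groups: Dict[Symptoms, List[int]] = defaultdict(list)
    for idx, run in enumerate(bad_runs):
        groups[symptoms(run, good_runs)].append(idx)
    return dict(groups)


good = [{"dns_reply": 1, "tcp_syn": 1, "tcp_synack": 1}]
bad = [
    {"dns_reply": 0, "tcp_syn": 0, "tcp_synack": 0},  # name server down
    {"dns_reply": 1, "tcp_syn": 1, "tcp_synack": 0},  # https port blocked
    {"dns_reply": 0, "tcp_syn": 0, "tcp_synack": 0},  # name server down
]
groups = group_bad_runs(good, bad)
```

Only GOOD/BAD labels are used, yet the two failure types land in separate groups. Unlike a tree learner, this stand-in does no feature selection, so a single noisy feature would fragment the groups; that weakness is one reason to prefer a learning algorithm that picks only discriminating features.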
We evaluate the effectiveness of Deja vu in generat-
ing problem signatures for two classes of applications:
multiple web browsers and an email client. For each ap-
plication, we generate a training set by creating various
error conditions. Similarly we generate a test set. We
find that the problem signatures constructed by Deja vu
based on the training set are able to classify the traces
in the test set with 95% accuracy. In fact, the accuracy of the
classification performed by Deja vu using just the GOOD and
BAD labels is within 4.5% of that achieved by a decision
tree classifier operating with the benefit of problem
category labels attached to traces. We also show how
Deja vu learns new non-trivial problem signatures on-
the-fly, which a rule-based approach would have missed.
Finally we show the effectiveness of Deja vu’s signatures
in helping a human administrator match network packet
traces to problems.
2. DESIGN OVERVIEW AND SCOPE
The input to Deja vu is a set of network packet traces,
each coarsely labeled as GOOD or BAD. A GOOD trace
corresponds to a working application run while a BAD
trace corresponds to a non-working run. We believe
that not assuming more fine-grained labeling is the right
choice because we have found that applications often fail
with the same error message for different networking
problems, which prevents a user from correctly
differentiating between different bad runs. In our work, the
GOOD/BAD labeling is performed by us in the lab, but
we touch on alternative strategies in Section 7.
The coarsely-labeled traces are fed to Deja vu’s fea-
ture extractor, which uses domain knowledge to extract
a set of features, as discussed in Section 3. These feature
sets, together with the GOOD/BAD labels, are then fed
to Deja vu’s signature construction algorithm discussed
in Section 4. The novelty of this algorithm is that, al-
though it is just given the coarse GOOD/BAD labels
as input, it infers a sub-categorization of BAD corre-
sponding to the different categories of problems that an
application encounters.
Once Deja vu has learnt and associated signatures
with problems, these could be used in a range of appli-
cations, helping to match the problems in a test trace to
ones that have previously been seen and assigned signa-
tures. We discuss two simple applications in Section 6.
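One simple way to picture this matching step is to treat each signature as a set of required feature values and report every signature that a test trace satisfies. The sketch below assumes that representation; the signature names, feature names, and the `match_signatures` helper are hypothetical and do not reflect Deja vu's actual signature format.

```python
from typing import Dict, List

Features = Dict[str, int]

# Hypothetical signatures: each maps feature names to the values
# that characterise one previously seen problem.
SIGNATURES: Dict[str, Features] = {
    "name server unreachable": {"dns_request": 1, "dns_reply": 0},
    "https port blocked": {"dns_reply": 1, "tcp_syn": 1, "tcp_synack": 0},
}


def match_signatures(features: Features,
                     signatures: Dict[str, Features]) -> List[str]:
    """Return the names of all signatures whose conditions hold."""
    return [
        name for name, conditions in signatures.items()
        if all(features.get(f) == v for f, v in conditions.items())
    ]


test_trace = {"dns_request": 1, "dns_reply": 0, "tcp_syn": 0, "tcp_synack": 0}
matches = match_signatures(test_trace, SIGNATURES)
```

A trace that matches no signature returns an empty list, which could be taken as a hint that the run is healthy or that the problem has not been seen before.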
Note that the extracted signatures can only be as
good as the data input to the algorithm. The quality
of the signatures therefore depends significantly on the
choice of features and on the accuracy of the feature
values. Also, the scope of Deja vu is limited to problems
that manifest themselves in network traces. There
are several problems that applications experience which
may not show as abnormalities in network traces. Deja
vu does not address these problems. Consequently, the
input features to our algorithm are extracted only from
network traces, as we discuss in the next section.
3. FEATURES
In this section we describe what information we ex-
tract from the raw network traces and input to the Deja
vu algorithm. As with any machine learning algorithm,
Deja vu requires as input a set of features. The fea-
ture set extractor reduces a network packet trace to
a compact set of features that summarizes the essential
characteristics of the trace. This process also makes the
input to Deja vu less noisy (e.g., features correspond-
ing to unrelated background traffic are excluded) and
strips it of privacy-sensitive information. For example,
the actual packet payload is discarded except for some
specific header fields in protocols such as HTTP and
SMB.
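The payload-stripping step can be pictured as a whitelist over header fields: anything not explicitly listed for a protocol is dropped before feature extraction. The sketch below assumes packets arrive as plain dictionaries; the `KEPT_FIELDS` table and its field names are illustrative guesses, not the fields Deja vu actually retains.

```python
from typing import Any, Dict, Set

# Hypothetical whitelist of header fields to keep for each protocol.
KEPT_FIELDS: Dict[str, Set[str]] = {
    "DNS": {"is_response", "rcode"},
    "TCP": {"flags"},
    "HTTP": {"status"},
}


def strip_packet(packet: Dict[str, Any]) -> Dict[str, Any]:
    """Keep the protocol name and whitelisted header fields; drop the rest."""
    proto = packet.get("proto")
    kept = KEPT_FIELDS.get(proto, set())
    summary: Dict[str, Any] = {"proto": proto}
    summary.update({k: v for k, v in packet.items() if k in kept})
    return summary


raw = {"proto": "HTTP", "status": 404,
       "cookie": "session=abc", "payload": b"<html>...</html>"}
clean = strip_packet(raw)
```

Because the default for an unknown protocol is an empty whitelist, unrecognised traffic contributes nothing beyond its protocol name.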
The choice of features is key. Features that are too
detailed often suffer from a lack of generality. To de-
termine what kind of features to extract, we manually
scrutinized and debugged traces for several networking
problems. Using our domain knowledge and experience,
we settled on the following broad categories of features
to extract:
1. Packet types: Often, problems manifest themselves
as the presence or absence of packets of a certain type.
To capture this, we use binary features to record the
presence or absence of certain packet types, where type
is determined based on the packet header fields. By
examining the headers of the packets contained in a
trace, we set the corresponding binary features