[Figure: Root causes and corresponding Deja vu signatures and classifier signatures for email traces.]
… port 993 (IMAP over SSL) [TCPSYN993-TCPSYNACK993 = 0], whereas the classifier signature does not give us this information.
4. Sometimes, Deja vu has multiple signatures corresponding to the same root cause, whereas the classifier does not. Since Deja vu does not have access to fine-grained labels, it sometimes creates noisy splits in the signature, which the classifier avoids. Such noisy splits can be avoided to some extent by techniques noted in …
Signature Stability and Adaptability
vu’s signatures are, and how eﬀective the algorithm is
at learning new signatures on-the-ﬂy. Would the signa-
tures learned from a training set still apply to a test set
gathered at a later time? Does the algorithm learn new
signatures when required?
To answer these questions, we collected a test dataset in the corporate network for two browsers, IE and Firefox, approximately 2 months after we had collected the training dataset. We collected training data on Windows 7, Mac OS X, and Ubuntu systems, and the test dataset on a Windows XP machine. The test dataset included 10 BAD traces (5 each for IE and Firefox) for each of 6 root causes, giving us a total of 60 BAD traces. We could not collect data for the "Misconfigured outgoing firewall" root cause because Windows XP does not allow the configuration of outgoing firewall rules.
For 5 of the 6 root causes, an overwhelming majority (95%) of the traces in the test dataset matched the signatures of the same root cause that had been learned earlier from the training dataset. This demonstrates the stability of the Deja vu signatures for these 5 root causes. However, Deja vu misclassified all 10 traces for the "Wrong proxy" root cause, marking them either as "internal site authentication error" or "name resolution error".
To investigate this, we relearned the Deja vu signatures by adding these 10 BAD traces to the initial training set of 878 traces. We found that Deja vu learned an additional, new signature for the "Wrong proxy" root cause, which all 10 new traces contributed to:
    [HTTPR200 = 0] AND [NHTTPQ = 0] AND [HTTPR502 = 0] AND
    [HTTPR500 = 0] AND [HTTPR504 = 1]
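To make the matching concrete, here is a minimal sketch of how such a signature could be represented and evaluated as a conjunction of feature tests. The feature names come from the signature above; the dict encoding and the match_signature helper are our illustrative assumptions, not Deja vu's actual implementation.

    # Sketch: a Deja vu-style signature as a conjunction of equality tests
    # over features extracted from a trace (encoding is an assumption).
    WRONG_PROXY_SIGNATURE = {
        "HTTPR200": 0,
        "NHTTPQ": 0,
        "HTTPR502": 0,
        "HTTPR500": 0,
        "HTTPR504": 1,
    }

    def match_signature(features: dict, signature: dict) -> bool:
        """Return True iff every feature test in the conjunction holds."""
        return all(features.get(name, 0) == value
                   for name, value in signature.items())

    # Example: a (hypothetical) feature vector from one failed page load.
    trace_features = {"HTTPR200": 0, "NHTTPQ": 0, "HTTPR502": 0,
                      "HTTPR500": 0, "HTTPR504": 1, "TCPSYN80": 1}
    assert match_signature(trace_features, WRONG_PROXY_SIGNATURE)

Features not mentioned in the signature (such as the extra TCPSYN80 above) are simply ignored by the conjunction.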
To create the "Wrong proxy" root cause, we always set the proxy to a non-existent IP address that we were confident would not respond (e.g., 188.8.131.52). However, the new signature noted above indicates that not only did requests to this IP address complete a successful TCP handshake, but the other end even responded with an HTTP gateway error! We communicated this to the relevant network administrators, who investigated the matter and then informed us that this strange behavior was the result of recent routing configuration changes on the corporate network that directed traffic destined for some non-existent IP addresses to a set of misconfigured servers, which responded to the requests with a gateway error.
[Figure: Root causes and corresponding Deja vu signatures and classifier signatures for browser traces.]
This interesting anecdote shows that Deja vu signatures are not just useful for failure classification, but can also be a component of a network problem diagnosis tool. Whenever Deja vu learns a new problem signature, the tool can alert the administrators, who can then investigate whether the signature reveals anything of interest.

Deja vu associates network problems with a compact fingerprint. Such a fingerprint has several applications. A fingerprint could be used to search through a large dataset to find instances of a particular problem. It could also be used to recall and match against previously seen instances of a problem. We briefly describe applications in each of these contexts.
Packet tracing tools such as tcpdump and netmon provide a way to apply filters to find specific packet types of interest, either in a live capture or in a recorded trace. However, what if we are interested in searching for problem events rather than for specific packets? For instance, we might wish to find instances in a trace where a secure webpage access failed because the firewall blocked port 443 traffic. To provide this capability, we have built a simple search tool using Deja vu. The target trace is sliced into windows, either sliding windows or jumping windows. Features extracted from each slice are then fed into the signature tree constructed by Deja vu for the problem of interest.
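A minimal sketch of this search loop follows, under assumed types: the Packet record, extract_features, and matches are placeholders standing in for the real feature extractor and signature-tree lookup.

    # Sketch: slice a timestamped trace into windows and test each slice
    # against a problem signature (types and helpers are assumptions).
    from dataclasses import dataclass
    from typing import Callable, Iterator

    @dataclass
    class Packet:
        timestamp: float  # seconds from the start of the capture
        # ... protocol fields elided ...

    def slices(packets: list[Packet], width: float,
               step: float) -> Iterator[list[Packet]]:
        """Yield windows of `width` seconds. step == width gives jumping
        windows; step < width gives sliding (overlapping) windows."""
        if not packets:
            return
        t, end = packets[0].timestamp, packets[-1].timestamp
        while t <= end:
            window = [p for p in packets if t <= p.timestamp < t + width]
            if window:
                yield window
            t += step

    def search(packets: list[Packet],
               extract_features: Callable[[list[Packet]], dict],
               matches: Callable[[dict], bool],
               width: float = 30.0) -> list[float]:
        """Return start times of windows whose features match the signature."""
        return [w[0].timestamp
                for w in slices(packets, width=width, step=width)  # jumping
                if matches(extract_features(w))]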
One question is how wide a slice should be. Ideally, the slice should be wide enough to accommodate the problem event of interest but no wider. For instance, consider a problem signature that comprises a successful DNS request-response exchange, followed by a successful TCP SYN handshake, followed in turn by an HTTP request that fails to elicit a response. To capture the full signature, the slice must be wide enough to span all of the above packet exchanges. However, if the slice were too wide, it would risk being polluted by noise in the form of features from packets belonging to an unrelated transaction.
In our implementation of the search tool, we use a slice size of 30 seconds. We tested the tool on a 4MB network trace collected over a period of 40 minutes, in which we recreated 5 different problems from the set of root causes shown in Figure 3. The search tool succeeded in finding 3 of these problems. It missed one problem because the window size was too large, and another because the window size was too small to capture an important feature later in the trace. This indicates that such a search tool should ideally use windows of varying sizes to catch all problems in the trace.
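As a quick illustration, the earlier search sketch could simply be run at several widths and the hits merged; the width values here are illustrative, and the names come from the sketch above.

    # Run the (assumed) search sketch at several window widths and merge hits.
    hits = sorted({t for w in (10.0, 30.0, 60.0)
                     for t in search(packets, extract_features, matches, width=w)})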
A second application of Deja vu is in the context of a help desk tool. Our help desk application uses the problem signatures generated by Deja vu to automatically match the problem being experienced by the user against a database of known issues, i.e., ones for which there is a known fix. Whenever a failure is encountered (e.g., a browser error), the Deja vu component on the client machine extracts features from the packet trace in the recent past (tracing is an ongoing background activity) and sends these to the Deja vu server. At the server, these features are fed into the application's signature tree and thereby matched against a known category of failures. The problem notes associated with this category would then guide the diagnosis and resolution of the problem.
A more sophisticated version of the help desk application could use Deja vu signatures to index WikiDo tasks instead of just indexing manually crafted notes.
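The following sketch illustrates the server-side matching step under assumed names. The KnownIssue record and the example signatures are hypothetical, and the real tool walks the application's signature tree rather than doing a linear scan.

    # Sketch: match client-reported features against known issues
    # (record layout, feature names, and lookup are assumptions).
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class KnownIssue:
        signature: dict  # conjunction of feature tests
        notes: str       # problem notes guiding diagnosis and resolution

    KNOWN_ISSUES = [
        KnownIssue({"HTTPR504": 1, "HTTPR200": 0},
                   "Check the browser's proxy setting."),
        KnownIssue({"TCPSYN443": 1, "TCPSYNACK443": 0},
                   "Firewall may be blocking port 443."),
    ]

    def diagnose(features: dict) -> Optional[str]:
        """Return the notes of the first matching known issue, if any."""
        for issue in KNOWN_ISSUES:
            if all(features.get(k, 0) == v
                   for k, v in issue.signature.items()):
                return issue.notes
        return None  # no match: a candidate for learning a new signature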
We now discuss the impact of noisy traces on Deja vu's signatures. Noise refers to packets that are extraneous to the application of interest. Such noise could arise from the network communication of other applications or even other hosts, depending on where the packet trace is captured. Deja vu's feature extractor would then extract features from such background traffic and include them alongside the (correct) features corresponding to the traffic of interest.
Such noisy features could be problematic in two ways: (a) they could lead Deja vu to learn incorrect signatures for problems, and (b) they could cause an incorrect match when an attempt is made to match the noisy features against the signatures generated by Deja vu.

Deja vu's use of GOOD/BAD labels helps mitigate problem (a): the noisy features are likely to be uncorrelated with the success (GOOD) or failure (BAD) of the application of interest, and hence are likely to be disregarded by Deja vu's signature construction algorithm. However, a noisy feature extracted from background traffic (e.g., a successful DNS request-response exchange) could still cause problems, as explained in (b) above.
To alleviate the above problem, we could leverage prior work on application traffic fingerprinting (e.g., [11, 7, 15]) to separate out just the subset of traffic in a packet trace that corresponds to the application of interest. Performing such separation thoroughly would require the tracing to be performed on the end hosts, so that traffic could be unambiguously tied to specific applications.
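The claim about problem (a) is easy to illustrate with an off-the-shelf decision tree. Here scikit-learn's CART implementation stands in for C4.5 (an assumption, since the two algorithms differ in detail), and the two features are synthetic.

    # Sketch: a feature uncorrelated with the GOOD/BAD label is ignored
    # by decision-tree learning (synthetic data; CART stands in for C4.5).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    n = 1000
    informative = rng.integers(0, 2, n)  # e.g., "HTTP 200 seen" for the app
    noise = rng.integers(0, 2, n)        # e.g., a background DNS exchange
    labels = informative                 # GOOD exactly when informative fires

    X = np.column_stack([informative, noise])
    tree = DecisionTreeClassifier(random_state=0).fit(X, labels)
    print(tree.feature_importances_)     # ~[1.0, 0.0]: noise is never split on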
Another source of inaccuracy in the traces is mislabeling of GOOD and BAD traces. Previous work has shown that the C4.5 decision tree algorithm is robust to a certain degree of mislabeling in the context of network diagnostics. However, no learning algorithm can withstand large amounts of mislabeling. Applications that use Deja vu have to be designed so that the chance of mislabeling stays low. A discussion of such application-level techniques is out of the scope of this paper.
In our experiments, the Deja vu algorithm took less than one second to process all the traces. For the applications we have discussed, we expect practitioners to run Deja vu approximately once a day, and we believe the current performance is suitable for this design point. It is, however, possible that as the problem traces become more diverse, Deja vu may learn a considerable number of problem signatures in a single run. In such cases, signatures can be prioritized based on the confidence that the C4.5 algorithm assigns to them. Signatures that are seen more often can be bubbled to the top of the priority list, thereby allowing an administrator or support engineer to look at the more predominant problems first.
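A minimal sketch of this prioritization follows, under an assumed Signature record; the ordering key simply combines match frequency and learner confidence.

    # Sketch: order learned signatures by match frequency, then confidence
    # (record layout is an assumption for illustration).
    from dataclasses import dataclass

    @dataclass
    class Signature:
        name: str
        confidence: float  # confidence assigned by the C4.5 algorithm
        match_count: int   # how many traces matched this signature

    def prioritize(signatures: list[Signature]) -> list[Signature]:
        """Most frequently matched, highest-confidence signatures first."""
        return sorted(signatures,
                      key=lambda s: (s.match_count, s.confidence),
                      reverse=True)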
Network Traffic Analysis
Network traffic analysis has been used to fingerprint applications and infer the behavior of protocols [15, 11]. While such analysis has used supervised learning on coarse features such as packet size and flow length to distinguish between applications, Deja vu operates on more fine-grained features (e.g., features specific to DNS, TCP, HTTP, etc.) but with coarse-grained GOOD vs. BAD labels.
Such analysis has also been used to discover the session-level structure of applications, e.g., to discover that in an FTP session, a control connection is often followed by one or more data connections. However, to our knowledge, such session structure has not been used for constructing signatures for network problems. Furthermore, discovering session structure is only semi-automated, requiring the involvement of a human expert to actually reconstruct the session structure. Human involvement in Deja vu is limited to labeling training runs as GOOD or BAD, a much less onerous task.
Finally, such analysis has also been used to perform network anomaly detection. The typical approach has been to construct a model of normal behavior based on past traffic history and then look for significant short-term deviations from that model. While such work has focused on aggregate behavior, Deja vu focuses on the network behavior of an individual application run.
DebugAdvisor is a tool to search through source control systems and bug databases to aid debugging. Unlike Deja vu, it uses a standard text search tool over call stack information and bug reports. Deja vu is closer in spirit to work on automating the diagnosis of system problems, which involves extracting signatures from information such as system call traces. The approach is to employ supervised learning (e.g., SVM) on a fully labeled database of known problems. In a similar vein, Clarify is a system that improves error reporting by classifying application behavior. Clarify generates a behavior profile, i.e., a summary of the program's execution history, which is then labeled by a human expert to enable learning-based classification.
In comparison with the above approaches, which require a human expert to perform full labeling, Deja vu operates only with coarse-grained labels. Also, since Deja vu focuses on network problems, it incorporates a number of domain-specific choices, including in feature selection.
STRIDER and PeerPressure analyze state information in the Windows registry to identify features (e.g., registry key settings) that are indicative of a problem. Unlike Deja vu, the goal of this body of work is not to develop problem-specific signatures based on the behavior of the system; rather, it is to detect anomalous state by performing state differencing between a healthy machine and a sick machine. Also, the features (e.g., registry key settings) are treated as opaque entities, whereas Deja vu uses networking domain-specific knowledge to define features.
Similarly, NetPrints analyzes network configuration information to diagnose home network problems. While largely state-based, NetPrints also made limited use of network problem signatures to address the issue of hidden configurations that are not available to the state-based analysis.
Compared to the above, Deja vu is not intrusive, since it operates on network traffic and hence does not require any tracing to be performed on the end system itself.
While Deja vu seeks to extract network problem signatures from existing application traffic, there is a large body of work on characterizing network problems through active probing [10, 17, 4, 8]. Probing with a carefully crafted set of tests enables detailed characterization of a range of problems, often enabling diagnosis. In contrast, Deja vu strives to produce a problem fingerprint based on the traffic that the application generates anyway. These fingerprints may not contain enough detail to directly enable diagnosis. Nevertheless, they provide a generic way to match a problem instance with a previously seen instance, thereby enabling diagnosis, as noted in Section 6.2.
In summary, Deja vu associates a compact signature with each category of network problem experienced by an application. It uses a novel algorithm to learn the signatures from coarse-grained GOOD/BAD labels. Our experimental evaluation, including a comparison with a standard classifier (which has the benefit of knowing fine-grained labels) and a user study, has demonstrated the effectiveness of Deja vu signatures.
References

B. Aggarwal, R. Bhagwan, T. Das, S. Eswaran, V. Padmanabhan, and G. Voelker. NetPrints: Diagnosing Home Network Misconfigurations Using Shared Knowledge. In NSDI, 2009.

B. Ashok, J. Joy, H. Liang, S. Rajamani, G. Srinivasa, and V. Vangala. DebugAdvisor: A Recommender System for Debugging. In FSE, 2009.

M. Dischinger, M. Marcon, S. Guha, K. P. Gummadi, R. Mahajan, and S. Saroiu. Glasnost: Enabling End Users to Detect Traffic Differentiation. In NSDI, 2010.

J. Ha, C. J. Rossbach, J. V. Davis, I. Roy, H. E. Ramadan, D. E. Porter, D. L. Chen, and E. Witchel. Improved Error Reporting for Software that Uses Black-box Components. In PLDI, 2007.

S. Kandula, R. Mahajan, P. Verkaik, S. Agarwal, and J. Padhye. Detailed Diagnosis in Computer Networks. In SIGCOMM, 2010.

J. Kannan, J. Jung, V. Paxson, and C. E. Koksal. Semi-Automated Discovery of Application Session Structure. In IMC, 2006.

C. Kreibich, N. Weaver, B. Nechaev, and V. Paxson. Netalyzr: Illuminating the Edge Network. In IMC, 2010.

N. Kushman, M. Brodsky, S. Branavan, D. Katabi, R. Barzilay, and M. Rinard. WikiDo. In HotNets, 2009.

R. Mahajan, N. Spring, D. Wetherall, and T. Anderson. User-level Internet Path Diagnosis. In SOSP, 2003.

A. Moore and K. Papagiannaki. Toward the Accurate Identification of Network Applications. In PAM, 2005.

J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.

H. Wang, J. Platt, Y. Chen, R. Zhang, and Y. Wang. Automatic Misconfiguration Troubleshooting with PeerPressure. In OSDI, 2004.

Y.-M. Wang, C. Verbowski, J. Dunagan, Y. Chen, H. J. Wang, C. Yuan, and Z. Zhang. STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support. In LISA, 2003.

C. V. Wright, F. Monrose, and G. M. Masson. On Inferring Application Protocol Behaviors in Encrypted Network Traffic. Journal of Machine Learning Research, December 2006.

C. Yuan, N. Lao, J.-R. Wen, J. Li, Z. Zhang, Y.-M. Wang, and W.-Y. Ma. Automated Known Problem Diagnosis with Event Traces. In EuroSys, 2006.

Y. Zhang, Z. M. Mao, and M. Zhang. Effective Diagnosis of Routing Disruptions from End Systems. In NSDI, 2008.

Y. Zhang, S. Singh, S. Sen, N. Duffield, and C. Lund. Sketch-based Change Detection: Methods, Evaluation, and Applications. In IMC, 2004.