A Systematic Characterization of Application Sensitivity to Network Performance

costs. In the next sections, we will see that although SPINE is successful at reducing overhead, it
does not improve the latency or bandwidth. The importance of reducing overhead without altering
the other parameters depends on the application context. For a busy server, overhead reduction might
be important, but in other contexts absolute latency or bandwidth may be more critical.
In the early 1980s, many commercial network adapter designs incorporated the entire protocol
stack into the adapter. There were two reasons for this approach, which, because of its complexity,
required an I/O processor. First, many host operating systems did not support the range of
protocols that existed at the time (e.g., TCP/IP, telnet, and rlogin) [86]. Writing these protocols once
for a re-programmable network adapter was an effective way to incorporate them quickly
into a variety of operating systems. Second, the host processors of the time were not powerful enough
to run both multi-tasking jobs and network protocol stacks efficiently.
By the late 1980s, however, the tide had turned: adapters included support for only the most
common protocol operations. The migration of common protocols into commodity operating
systems and the exponential growth of processor speed eliminated the original motivations for
re-programmable network adapters. There has nevertheless been a great deal of work on
offloading pieces of network protocols, for example Internet checksum calculations [33],
link layer processing, and packet filtering.
In terms of this thesis, there are clearly tradeoffs between reducing o and increasing g and L.
For example, both the Meiko CS-2 and Paragon machines used I/O processors. Adding I/O
processors added g to the system, and in the Meiko they added a very high L as well [64]. Given
the results of this thesis, however, reducing o at the expense of g is the correct tradeoff.
Inflating L is less clearly beneficial, because it reduces the effectiveness of latency-tolerating
techniques.
Although the LogGP model is quite useful, its parameters are too abstract to capture some
of the performance enhancements of a system that reduces overhead. We need a method to
characterize more concretely the effect of adding or removing functional units. Although we can cast such
performance improvements in terms of the LogGP model, as in the above example, a better class of
models is the pipeline models introduced in the next section. The problem with LogGP is that it
lumps too much of the off-CPU processing into just two parameters: L and g. In addition, these
parameters include a myriad of system components. Pipeline models allow us to isolate the effect of
each functional unit, yet still reconstruct the entire communication path.

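[Figure 6.2 plot. x-axis: Time (µsec), ticks from 0 to 192. y-axis: Stage, 0 to 4. Stage labels:
Send CPU, Send LANai, Wire, Receive LANai, Receive CPU. Legend: Occupancy, Gap. Annotations:
1st packet, 2nd packet, Bubbles, and time marks at 155 and 204.]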
Figure 6.2: Generic GAM Pipeline
This figure plots the movement of two packets, each 2 KB in size, through the abstract GAM pipeline.
Time is represented on the x-axis and the stage number on the y-axis. For each stage, the fixed
occupancy is shown in light grey and the variable per-byte cost (Gap) in dark grey. Bubbles (idle
time) can result when packets move between stages of different speeds.
6.1 Pipeline Framework
Pipeline models [32, 106] are a less common viewpoint than either queuing-theoretic models or
parallel program models. In this family of models, the network is modeled as a series of
store-and-forward stages. The time to move data through each stage is modeled as a fixed occupancy
plus a variable per-byte cost, also called the Gap. Different versions of the model arise from the
restrictions placed on the stages. For example, the stages may allow only fixed-size packets, as
opposed to the more general variable-size packets.
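The stage model described above can be written compactly. The notation here is introduced only for illustration and is an assumption, not necessarily the thesis's own: O_s and G_s denote the occupancy and Gap of stage s, n_p the size of packet p, and F_{p,s} the time at which packet p leaves stage s. The recurrence also assumes that packets stay in order and that buffering between stages is unlimited.

```latex
% Time for a packet of size n to pass through stage s
% (fixed occupancy plus per-byte Gap):
T_s(n) = O_s + G_s \, n

% Store-and-forward recurrence for packets p = 1..P and stages s = 1..S:
% a packet enters stage s only after it has fully left stage s-1
% and after the previous packet has left stage s.
F_{p,s} = \max\bigl(F_{p,s-1},\; F_{p-1,s}\bigr) + T_s(n_p),
\qquad F_{p,0} = 0, \quad F_{0,s} = 0
```

Under these assumptions, F_{P,S} is the delay of the whole transfer, and any interval in which F_{p,s-1} exceeds F_{p-1,s} is idle time in stage s.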
Although pipeline models are superficially similar to network queuing models, their analysis
techniques are quite different. The differences arise because the questions asked about pipelines
concern how to discretize the packets to obtain minimum delay or maximum bandwidth
through the pipeline, not steady-state behavior under a random process model. Some of the
analysis techniques are the same as the min-max techniques used in the operations research community.
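To make the discretization question concrete, the following is a minimal sketch that splits a message into equal-size fragments and searches for the fragment count with the smallest end-to-end delay. It is an illustration under stated assumptions, not the analysis used in the thesis: the stage values are copied from Table 6.1, the function names are hypothetical, fragments may have fractional sizes, fragmentation adds no header cost, and buffering between stages is unlimited. With identical fragments, the store-and-forward delay reduces to the sum of the stage times plus one bottleneck-stage time per additional fragment.

```python
# Stage parameters copied from Table 6.1: (occupancy in usec, Gap in usec/KB).
# Order: Send CPU, Send LANai, Wire, Recv. LANai, Recv. CPU.
STAGES = [(6.7, 7.2), (5.3, 24.5), (0.2, 6.4), (5.2, 18.5), (9.6, 7.2)]


def message_delay_usec(message_kb, fragments, stages=STAGES):
    """Delay of a message split into `fragments` equal-size packets.

    Assumes in-order store-and-forward stages with unlimited buffering,
    so identical packets finish one bottleneck-stage time apart.
    """
    size_kb = message_kb / fragments
    times = [occ + gap * size_kb for occ, gap in stages]
    return sum(times) + (fragments - 1) * max(times)


def best_fragment_count(message_kb, max_fragments=64, stages=STAGES):
    """Exhaustively search 1..max_fragments for the minimum-delay split."""
    return min(
        range(1, max_fragments + 1),
        key=lambda k: message_delay_usec(message_kb, k, stages),
    )


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, round(message_delay_usec(4.0, k), 1))
    print("best fragment count:", best_fragment_count(4.0))
```

The search exposes the tension the text describes: more fragments overlap better across stages, but each fragment pays every stage's fixed occupancy again.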
Figure 6.2 shows an abstract pipeline framework in which to reason about networking
performance. Time is represented on the x-axis and stage number on the y-axis. Two 2 KB packets are
shown making their way through the network pipeline. The occupancy portion of the time is
shown in light grey, and the Gap portion in dark grey. Table 6.1 shows the actual values as measured
in [106]. Note that in the real GAM system, stages 2 and 3 are collapsed into a single stage to simplify
the LANai firmware. However, performance is not affected because the sum of stages 2 and 3 nearly
equals stage 1.
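The movement of the two 2 KB packets can be traced with a short simulation. This is a sketch under stated assumptions rather than a reproduction of the figure: it uses the Table 6.1 values, treats each stage as a strict store-and-forward server with unlimited buffering, and assumes both packets are ready at time 0. The function name `trace` and the bubble bookkeeping are hypothetical, and the real GAM system may overlap work in ways this model does not capture.

```python
# Stage parameters copied from Table 6.1:
# (name, occupancy in usec, Gap in usec/KB).
STAGES = [
    ("Send CPU", 6.7, 7.2),
    ("Send LANai", 5.3, 24.5),
    ("Wire", 0.2, 6.4),
    ("Recv. LANai", 5.2, 18.5),
    ("Recv. CPU", 9.6, 7.2),
]


def trace(packet_sizes_kb, stages=STAGES):
    """Return (finish, bubbles) for packets sent back to back.

    finish[p][s]  : time (usec) at which packet p leaves stage s.
    bubbles[p][s] : time (usec) stage s sat idle waiting for packet p
                    after finishing packet p-1.
    """
    finish, bubbles = [], []
    for p, size_kb in enumerate(packet_sizes_kb):
        row, idle = [], []
        for s, (_name, occupancy, gap) in enumerate(stages):
            arrived = row[s - 1] if s > 0 else 0.0          # left previous stage
            stage_free = finish[p - 1][s] if p > 0 else 0.0  # previous packet done
            start = max(arrived, stage_free)
            idle.append(max(0.0, arrived - stage_free) if p > 0 else 0.0)
            row.append(start + occupancy + gap * size_kb)
        finish.append(row)
        bubbles.append(idle)
    return finish, bubbles


if __name__ == "__main__":
    finish, bubbles = trace([2.0, 2.0])
    for s, (name, _occ, _gap) in enumerate(STAGES):
        print(f"{name:12s} p1 done {finish[0][s]:6.1f}  "
              f"p2 done {finish[1][s]:6.1f}  bubble {bubbles[1][s]:5.1f}")
```

In this model, bubbles appear only in stages that follow a slower stage: the second packet waits for the Send LANai stage, and every later stage then idles until that packet arrives.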

Stage          Occupancy (µsec)   Gap (µsec/KB)
Send CPU       6.7                7.2
Send LANai     5.3                24.5
Wire           0.2                6.4
Recv. LANai    5.2                18.5
Recv. CPU      9.6                7.2
Table 6.1: GAM Pipeline Parameters
This table shows the abstract pipeline parameters for the GAM system. Each stage is abstracted by
a fixed cost, called the occupancy, and a per-byte cost, which corresponds to a Gap parameter for
that stage.
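As a worked use of the table, the sketch below evaluates each stage's cost for a 2 KB packet, identifies the slowest stage, and compares the combined cost of the Wire and Recv. LANai stages against the Send LANai stage. The names `TABLE_6_1` and `stage_time_usec` are hypothetical, and the calculation assumes the Gap scales linearly with packet size in KB.

```python
# Table 6.1 values: stage name -> (occupancy in usec, Gap in usec/KB).
TABLE_6_1 = {
    "Send CPU": (6.7, 7.2),
    "Send LANai": (5.3, 24.5),
    "Wire": (0.2, 6.4),
    "Recv. LANai": (5.2, 18.5),
    "Recv. CPU": (9.6, 7.2),
}


def stage_time_usec(name, size_kb):
    """Fixed occupancy plus per-KB Gap for one packet in one stage."""
    occupancy, gap = TABLE_6_1[name]
    return occupancy + gap * size_kb


size_kb = 2.0
times = {name: stage_time_usec(name, size_kb) for name in TABLE_6_1}

# The slowest stage bounds the steady-state packet rate of the pipeline.
bottleneck = max(times, key=times.get)

# Cost of a single stage that does the work of both Wire and Recv. LANai.
collapsed = times["Wire"] + times["Recv. LANai"]

if __name__ == "__main__":
    for name, t in times.items():
        print(f"{name:12s} {t:5.1f} usec")
    print("bottleneck:", bottleneck)
    print(f"Wire + Recv. LANai: {collapsed:.1f} usec "
          f"vs Send LANai: {times['Send LANai']:.1f} usec")
```

For 2 KB packets the combined stage costs about 55.2 µsec against 54.3 µsec for the Send LANai stage, so merging the two stages barely changes the pipeline's bottleneck.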
6.2 Example: SPINE IP Router
In this section we explore the overhead reduction techniques used in the Safe Programmable
Integrated Networking Environment (SPINE). SPINE allows fragments of application code to run
on the network interface. An explicit goal of the SPINE system is to improve performance by
reducing data and control transfers between the host and I/O device. In the context of this thesis, such
reductions can reduce o. As we shall see in the next sections, this often comes at the expense of
g and L. The next sections show that by allowing application code to execute on the network interface,
we can obtain substantial efficiencies in data movement and control transfers.
Loading application-specific code, as opposed to vendor-supplied firmware, onto a
programmable adapter raises many questions. How and when does it execute? How does one protect
against bugs? How does this code communicate with other modules located on the same adapter,
peer adapters, remote devices, or host-based applications spread across a network?
We address these questions in the design of SPINE using extensible operating system technology
derived from the SPIN operating system [14] and communication technology from the NOW project [5].
SPINE extends the fundamental idea of SPIN, that is, type-safe code downloaded into
a trusted execution environment. In the SPIN system, application code was downloaded into the
operating system kernel. In the SPINE environment, code is downloaded into the network adapter.
Extensibility is important, as we cannot predict the types of applications that may want to run directly
on the adapter.
The next sections document an application we have constructed on the SPINE system:
an Internet Protocol router. We have also constructed a video client application. However, the IP
