A systematic Characterization of Application Sensitivity to Network Performance

results are close to the model predictions. The slope of the SFS curve is fairly insensitive to o until
throughput is near the saturation point. A queuing model gives us nearly the same result; the slope of
the SFS curve will not change drastically with respect to o. The most dramatic effect of o, however,
is on the saturation point.
Figure 5.7(b) shows what happens to the saturation point in the RAID when system capacity
is exceeded; both response time and throughput degrade slightly as offered load exceeds the
saturation point. A less dramatic version of this effect was observable as we scaled L as well. An
interesting open question is how well the system responds to these extreme conditions, i.e., how much
performance is obtainable when the offered load is 150% of peak? Queuing theoretic models tell us
that response time should increase to infinity as offered load nears 100% of capacity. Figure 5.7(b)
shows that in a real system (which is a closed system) the response time hovers around an
overhead-dependent maximum while the delivered throughput slowly decreases. Feedback loops built
into the RPC layer, based on algorithms in [56], keep the system out of the realm of very high
response times, instead forcing the entire system towards lower throughputs. The algorithms are quite
effective; rather than a complete system breakdown we observe small degradations in throughput and
response time. A full investigation of these effects, however, is beyond the scope of this work.
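
The contrast between the queuing-theoretic prediction and the closed-system behavior can be illustrated with a minimal sketch. It assumes the textbook open M/M/1 formula R = S / (1 - rho) and a simple bound for a closed system in which at most N requests are ever outstanding; it does not reproduce the RPC-layer feedback algorithms of [56]. The service time and the number of outstanding requests are illustrative assumptions, not measured values.

```python
def open_queue_response_time(service_time_s, utilization):
    """Mean response time of an open M/M/1 queue: R = S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("an open M/M/1 queue is unstable at utilization >= 1")
    return service_time_s / (1.0 - utilization)


def closed_system_response_bound(outstanding_requests, service_time_s):
    """Rough upper bound when at most N requests can ever be queued: N * S."""
    return outstanding_requests * service_time_s


SERVICE_TIME_S = 0.001  # assumed: 1 ms of server time per operation
MAX_OUTSTANDING = 32    # assumed: total requests the clients keep in flight

# Open system: response time grows without bound as utilization nears 1.
for rho in (0.5, 0.9, 0.99, 0.999):
    r_ms = open_queue_response_time(SERVICE_TIME_S, rho) * 1e3
    print(f"open queue, utilization {rho}: {r_ms:.1f} ms")

# Closed system: the queue cannot grow past the number of outstanding requests.
bound_ms = closed_system_response_bound(MAX_OUTSTANDING, SERVICE_TIME_S) * 1e3
print(f"closed system, {MAX_OUTSTANDING} outstanding: at most {bound_ms:.1f} ms")
```

Because the bound scales with the per-operation service time, added overhead raises the ceiling, which is one way to read the "overhead-dependent maximum" described above.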
Because we are scaling a processor resource, the lower saturation point must be due to the
higher service time of the CPU. In the next sections we will explore the nature of the saturation point.
We first derive the sensitivity curve and then examine the components of the service time for both
the SCSI and RAID systems.
Sensitivity to Overhead
Figure 5.8 shows the relationship between overhead and throughput. The modeled line
shows where the CPU reaches 100% utilization in the model presented in Section 5.3, while the
measured line is derived from the results in Figure 5.7. The most interesting aspect of both systems is
that the peak performance drops immediately as we add overhead; unlike the response to latency,
there is no insensitive region. Therefore, we can easily conclude that the CPU is the bottleneck in
the baseline system. Also for both curves, the response to overhead is non-linear, i.e., for each µs of
added overhead, the peak drops off quickly and then tapers out.
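
The non-linear shape follows naturally if the peak is the reciprocal of the CPU time per operation. The sketch below is an assumed simplification, not the model of Section 5.3 (which is not reproduced here): each operation costs a baseline CPU time plus a fixed number of added-overhead charges. The names `base_service_us` and `charges_per_op` and their values are hypothetical.

```python
def peak_ops_per_sec(base_service_us, added_overhead_us, charges_per_op=2.0):
    """Peak throughput when the CPU is saturated: 1 / (CPU time per operation).

    charges_per_op is a hypothetical count of how many times the added
    overhead is paid per operation (e.g. one receive and one send).
    """
    service_us = base_service_us + charges_per_op * added_overhead_us
    return 1e6 / service_us


BASE_SERVICE_US = 1000.0  # illustrative: gives 1000 ops/sec with no added overhead

previous = None
for added in (0, 100, 200, 300, 400, 500):
    peak = peak_ops_per_sec(BASE_SERVICE_US, added)
    drop = "" if previous is None else f" (drop of {previous - peak:.1f})"
    print(f"added overhead {added:3d} us: peak {peak:7.1f} ops/sec{drop}")
    previous = peak
```

Under this assumption the curve is a hyperbola: there is no flat region, and each further step of added overhead removes fewer operations per second than the one before it.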
To determine the accuracy of the model, we performed a curvilinear regression against the
overhead model in Section 5.3.2. The r² values of .99 for the SCSI and .93 for the RAID show that
our model is fairly accurate. The sensitivity to o agrees well with the model.
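
As an illustration of how such a fit can be scored, the sketch below fits a reciprocal curve, peak = 1 / (a + b * overhead), by ordinary least squares on 1/peak and then computes r² on the original scale. The regression procedure actually used in this work is not specified here, so this is one common approach under stated assumptions. The data points are synthetic and purely illustrative; they are not the measurements behind Figure 5.8.

```python
def fit_reciprocal_model(overheads_us, peaks):
    """Least-squares fit of 1/peak = a + b * overhead; returns (a, b)."""
    ys = [1.0 / p for p in peaks]
    n = len(ys)
    mean_x = sum(overheads_us) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in overheads_us)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(overheads_us, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic, made-up points (overhead in usec, peak in ops/sec).
overheads = [0, 100, 200, 300, 400, 500]
peaks = [1000, 840, 710, 630, 550, 505]

a, b = fit_reciprocal_model(overheads, peaks)
fitted = [1.0 / (a + b * x) for x in overheads]
print(f"a = {a:.6f}, b = {b:.8f}, r^2 = {r_squared(peaks, fitted):.3f}")
```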

[Figure 5.8, two panels: (a) SCSI and (b) RAID. Each plots Peak Ops/sec against Overhead (usec),
with a Modeled and a Measured curve.]
Figure 5.8: Peak Throughput vs. Overhead
This figure plots the saturation point as a function of overhead in microseconds. Measurements for
the graph on the left were taken on the SCSI system, while measurements for the graph on the right
were taken on the RAID system.
Examining Overhead
The SCSI and RAID systems both use the same CPU, operating system, network, and nearly
the same number of disks, yet the RAID's saturation point is much higher. The obvious question is
why the SCSI system performs worse. Figure 5.9 reports the percentage breakdown
of the different components of CPU time near the saturation point for both the SCSI and RAID.
Because the monitor itself (kgmon) uses some CPU, it is not possible to actually reach the saturation
point while monitoring the system. The relative areas of the charts show the difference in average
time per operation, including CPU idle time, of 1 msec for the SCSI and 714 µs for the RAID.
The most obvious difference is the time spent in the device drivers; the FAS SCSI drivers
spend an average of 150 µs per NFS operation while the RAID drivers spend an average of only
36 µs. The networking stacks and filesystem code comprise the two largest components of the CPU
time. However, an interesting feature of both systems is that a significant amount of the service time
(20% and 26%) is spent in general kernel procedures which do not fall into any specific category.
There are a myriad of these small routines in the kernel code. Getting an order of magnitude
reduction in the service time would require reducing the time of many sub-systems. Much as was found
in [26, 60], there is no single system accounting for an overwhelming fraction of the service time.

[Figure 5.9, two pie charts of % time in sub-system: (a) Kernel Time (SCSI), (b) Kernel Time (RAID).]

Sub-system       (a) SCSI    (b) RAID
Other             19.9%       26.1%
Unaccounted        5.2%        4.0%
Sync               5.2%        6.5%
NFS                6.1%        3.3%
Network           11.9%       17.1%
Bcopy/Bcmp        10.1%       13%
MemMgt             4.7%        5.7%
Waiting           10.8%        4%
UFS               10.7%       14%
Device driver     15.4%        5.1%
                  (SCSI-driver) (RAID-driver)

Figure 5.9: Time Breakdown Near Peak Op/sec
These charts show the percentage of time spent in each sub-system when operating near the
saturation point. Measurements for the graph on the left were taken on the SCSI system at 1000 ops/sec,
while measurements for the graph on the right were taken on the RAID system at 1400 ops/sec. The
area of each chart shows the relative time per operation, including idle time (i.e., waiting on I/O),
of 1 msec for the SCSI and 714 µs for the RAID.
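
The per-operation times quoted for the device drivers can be cross-checked against the Figure 5.9 percentages with a short calculation. The sketch below only uses numbers stated in the figure and caption (1000 and 1400 ops/sec, 15.4% and 5.1% driver shares); the function names are illustrative.

```python
def time_per_op_us(ops_per_sec):
    """Average wall-clock time per operation, in microseconds."""
    return 1e6 / ops_per_sec


def subsystem_time_us(percent_of_time, ops_per_sec):
    """Time per operation attributed to one sub-system, in microseconds."""
    return percent_of_time / 100.0 * time_per_op_us(ops_per_sec)


# SCSI system: measured at 1000 ops/sec, driver share 15.4%.
print(f"SCSI time per op: {time_per_op_us(1000):.0f} us")
print(f"SCSI driver:      {subsystem_time_us(15.4, 1000):.0f} us per op")

# RAID system: measured at 1400 ops/sec, driver share 5.1%.
print(f"RAID time per op: {time_per_op_us(1400):.0f} us")
print(f"RAID driver:      {subsystem_time_us(5.1, 1400):.0f} us per op")
```

The results (about 154 µs and 36 µs) are consistent with the roughly 150 µs and 36 µs per NFS operation reported for the SCSI and RAID drivers.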
5.5.4 Bulk Gap
We choose to examine sensitivity to bulk Gap, G, as opposed to the per-message rate g, for two
reasons. First, networking vendors often tout per-byte bandwidth as the most important metric in
comparing networks. Using our apparatus we can quantify sensitivity to it (and thus its importance).
Second, for the SPECsfs benchmark, g is quite low (in the thousands of messages per second) and is
easily handled by most networks.
Unlike overhead, which is incurred on every message, a Gap penalty is incurred only
if the data rate exceeds the rate that the Gap permits. Only if the processor sends data in an interval
smaller than that specified by G will it stall. The clients and server could potentially ignore G
entirely. Recall the burst vs. uniform models for gap presented in Section 3.2.2. At one extreme, if
all data is sent at a uniform rate that is below the rate permitted by G, we will not observe any
sensitivity to Gap. At the other extreme, if all data is sent in bursts, then we would observe maximum
sensitivity to Gap.
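
The two extremes can be made concrete with a small simulation. This is a simplified sketch, not the models of Section 3.2.2 (which are not reproduced here): it assumes a single sender whose network interface is busy for G microseconds per byte of each message, and that the sender stalls whenever a message is ready before the interface is free. Message sizes, spacing, and the conversion 1 MB = 10^6 bytes are illustrative assumptions.

```python
def total_stall_us(messages, gap_us_per_byte):
    """Sum the time the sender waits for the network interface.

    messages: list of (ready_time_us, size_bytes), sorted by ready time.
    """
    link_free_at = 0.0
    stall = 0.0
    for ready_time, size in messages:
        start = max(ready_time, link_free_at)  # wait if the link is still busy
        stall += start - ready_time
        link_free_at = start + gap_us_per_byte * size
    return stall


def gap_from_bandwidth(mb_per_sec):
    """Convert bandwidth (1/G) in MB/s to G in usec per byte."""
    return 1.0 / mb_per_sec


# Same total data, two arrival patterns (hypothetical 8 KB messages).
uniform = [(i * 10_000.0, 8192) for i in range(10)]  # one every 10 ms
burst = [(0.0, 8192) for _ in range(10)]             # all ready at once

for mb_per_sec in (26.0, 2.5):
    g = gap_from_bandwidth(mb_per_sec)
    print(f"{mb_per_sec} MB/s (G = {g:.3f} us/byte): "
          f"uniform stall {total_stall_us(uniform, g):.0f} us, "
          f"burst stall {total_stall_us(burst, g):.0f} us")
```

With these illustrative parameters the uniformly spaced sender never stalls, even at the lower bandwidth, while the bursty sender stalls at both settings and stalls far longer as G grows.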
Because SPECsfs sends messages at a controlled rate, we would expect that message
intervals are not bursty and the benchmark should be quite insensitive to Gap. Figure 5.10 shows that
this is indeed the case. Only when the bandwidth (1/G) falls from a baseline of 26 MB/s to a mere
2.5 MB/s do we observe any sensitivity to G. We are thus assured that the SFS benchmark is not bursty.
Measured production environments, however, are quite bursty [49, 66]. Our measured sen-
