CERIAS Tech Report 2015-01: The Weakness of WinRAR Encrypted Archives to Compression Side-channel Attacks



4. RESULTS

This section discusses the results of the experiments described in the previous section.

4.1 Compression ratios

The first input for the experiment allows each file type to be considered a separate treatment. The compression and password options are then considered blocks, of which there are four in total. The files are divided into four groups: Text, Executable, Graphics and Other. These are encoded as treatments 1, 2, 3 and 4, respectively. To balance the experiment, four files of each type are randomly selected and each file is tested in every block. An ANOVA test is run to compare the means of the four treatments at a significance level of α = .05. A box plot and basic descriptive statistics of the data follow in Figure 4.1 and Table 4.1.

Fig. 4.1. Box plot of the distributions of different file types.

Table 4.1. Descriptive statistics for compression ratio data

Treatment   N Obs   Mean        Std Dev     Minimum     Maximum
1           16      0.3244542   0.0637309   0.2470714   0.4163709
2           16      0.3561775   0.0856055   0.2768610   0.4914785
3           16      0.8181292   0.3235247   0.2754676   1.0009069
4           16      0.3092772   0.3171904   0.0360954   0.8311475

Notice in Figure 4.1 that the plots for the treatments overlap, which suggests that the treatment means are not necessarily distinct. To determine whether there is a significant difference between file types, the hypothesis $H_0$: the treatment means are equal is tested using Analysis of Variance. SAS provides the ANOVA table in Table 4.2. The P-value of < 0.0001 is less than the stated significance level of α = .05. Therefore, there is statistical evidence to reject $H_0$, and the conclusion is that compression ratios differ between file types.

Table 4.2. ANOVA table for comparing compression ratios of different file types

Source   DF   Type III SS   Mean Square   F-Value   P-value
trt      3    2.87792404    0.95930801    16.82     <.0001
blk      3    0.00000060    0.00000020    0.00      1.0000
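The F-value for treatments can be cross-checked from the summary statistics alone. The following is a minimal sketch, not the SAS procedure used in the report. It assumes a balanced randomized-block model without interaction and 16 observations per treatment (the N reported in Table 4.3). All variable names are illustrative.

```python
# Summary statistics from Table 4.1 (treatments 1-4: text, executable,
# graphics, other).
means = [0.3244542, 0.3561775, 0.8181292, 0.3092772]
std_devs = [0.0637309, 0.0856055, 0.3235247, 0.3171904]
n_per_treatment = 16   # 4 files x 4 blocks; matches N in Table 4.3
n_blocks = 4
ss_block = 0.00000060  # block sum of squares from Table 4.2

k = len(means)
grand_mean = sum(means) / k  # valid because the design is balanced

# Treatment sum of squares and mean square.
ss_trt = n_per_treatment * sum((m - grand_mean) ** 2 for m in means)
df_trt = k - 1
ms_trt = ss_trt / df_trt

# ASSUMPTION: additive block model, so the error sum of squares is the
# within-treatment variation minus the block sum of squares.
ss_within = (n_per_treatment - 1) * sum(s ** 2 for s in std_devs)
ss_error = ss_within - ss_block
df_error = k * n_per_treatment - 1 - df_trt - (n_blocks - 1)
ms_error = ss_error / df_error

f_value = ms_trt / ms_error
print(f"SS_trt = {ss_trt:.6f}, MS_trt = {ms_trt:.6f}, F = {f_value:.2f}")
```

Under these assumptions the computed treatment sum of squares and F-value should agree with the trt row of Table 4.2 up to rounding.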

To formally test the difference between means, Tukey's comparison for treatment means is implemented. All possible pairs of treatment means are tested, which makes Tukey's comparison the most appropriate. Means with the same letter are not considered significantly different. As illustrated in Table 4.3, treatments 1, 2 and 4 are not significantly different from one another. These treatments correspond to text, executable and other data files, respectively. Graphics are noted to have a mean significantly higher than the other file types.

Table 4.3. Tukey's comparison of treatment means

Tukey Grouping   Mean      N    trt
A                0.81813   16   3
B                0.35618   16   2
B                0.32445   16   1
B                0.30928   16   4
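The grouping can be illustrated with a hand-rolled version of Tukey's honestly significant difference (HSD). This is a sketch, not the SAS output. The error mean square is recovered from Table 4.2 as MS_trt / F. The critical value `Q_CRIT` is an assumption: an approximate tabulated studentized-range value for α = .05, four groups and 60 error degrees of freedom, used here in place of the exact value for this design.

```python
import math
from itertools import combinations

# Treatment means from Table 4.3, keyed by treatment number.
means = {1: 0.32445, 2: 0.35618, 3: 0.81813, 4: 0.30928}
n_per_treatment = 16

# Error mean square recovered from Table 4.2 as MS_trt / F.
ms_error = 0.95930801 / 16.82

# ASSUMPTION: approximate studentized-range critical value
# (alpha = .05, 4 groups, 60 error df) taken from a standard table.
Q_CRIT = 3.74

# Two means differ significantly if their gap exceeds the HSD.
hsd = Q_CRIT * math.sqrt(ms_error / n_per_treatment)

significant = {
    (a, b)
    for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > hsd
}

print(f"HSD = {hsd:.4f}")
for a, b in sorted(significant):
    print(f"treatments {a} and {b} differ significantly")
```

With these inputs only the pairs involving treatment 3 (graphics) exceed the threshold. The gaps are wide enough that the result does not depend on the exact critical value.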

Finally, Table 4.4 provides 95% confidence intervals for the mean compression ratio of each file type. Intervals constructed this way contain the true population mean in 95% of repeated samples. Investigators with a known compression ratio falling within one of these intervals can assume that the files contained in the archive are of the indicated file type.

Table 4.4. 95% Confidence Intervals for different file type compression ratios

File Type    Mean      95% Confidence Interval
Text         0.32445   (0.29049, 0.35841)
Executable   0.35618   (0.31056, 0.40179)
Graphic      0.81813   (0.64574, 0.99052)
Other        0.30928   (0.14026, 0.47830)
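The intervals in Table 4.4 are consistent with a standard t-based interval, mean ± t · s / √n, computed per file type from the statistics in Table 4.1. The sketch below assumes that construction, with n = 16 and the two-sided 95% critical value for 15 degrees of freedom hard-coded. The helper `matching_types` is hypothetical and only illustrates how an investigator might look up an observed ratio.

```python
import math

# Two-sided 95% critical value of Student's t with 15 degrees of freedom.
T_CRIT_975_DF15 = 2.13145

N_OBS = 16  # observations per file type

# (mean, standard deviation) per file type, from Table 4.1.
stats = {
    "Text": (0.3244542, 0.0637309),
    "Executable": (0.3561775, 0.0856055),
    "Graphic": (0.8181292, 0.3235247),
    "Other": (0.3092772, 0.3171904),
}


def confidence_interval(mean, std_dev, n, t_crit):
    """Return the (lower, upper) t-based confidence interval for a mean."""
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


intervals = {
    name: confidence_interval(mean, sd, N_OBS, T_CRIT_975_DF15)
    for name, (mean, sd) in stats.items()
}


def matching_types(ratio):
    """Hypothetical helper: file types whose interval contains the ratio."""
    return [name for name, (lo, hi) in intervals.items() if lo <= ratio <= hi]


for name, (lo, hi) in intervals.items():
    print(f"{name:<11} ({lo:.5f}, {hi:.5f})")
```

Because the intervals for Text, Executable and Other overlap, a single observed ratio may match more than one file type. Calling `matching_types` with a ratio in the overlapping region returns all of them.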

4.2 File detection

Two experiments are run in this section. The first tests whether the appearance of substrings in the known part of an archive correlates with the compressed length of the archive. The second experiment tests whether the compression ratio of the archive is correlated with the compression ratio of a file in question.

4.2.1 Appearance of substrings

The archives are constructed as described in Section 3.2. The goal is to identify archives that contain FP.log through the appearance of substrings of a string S in a known file. Archives containing FP.log are separated from the rest of the collection. Appearances of substrings are counted for each archive. Linear regression is then applied to determine the correlation between the number of appearances and the compressed size of the archive.

Table 4.5. SAS output of correlation between size and appearance of substrings where the file is present

Root MSE         495032     R-Square   0.2520
Dependent Mean   1293068    Adj R-Sq   0.1273
Coeff Var        38.28347

Table 4.6. SAS output of correlation between size and appearance of substrings where the file is not present

Root MSE         317309     R-Square   0.1396
Dependent Mean   109798     Adj R-Sq   0.0614
Coeff Var        288.99243
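The derived quantities in Tables 4.5 and 4.6 follow standard definitions. The coefficient of variation is 100 · Root MSE / Dependent Mean, and the adjusted R-square penalizes R-square for the number of predictors. The sketch below applies both definitions. The number of observations and predictors is not stated in this section, so the values passed to `adjusted_r2` are hypothetical and serve only to show the calculation.

```python
def coeff_var(root_mse, dependent_mean):
    """Coefficient of variation, as a percentage of the dependent mean."""
    return 100.0 * root_mse / dependent_mean


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-square for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values from Table 4.5 (file present) and Table 4.6 (file not present).
cv_present = coeff_var(495032, 1293068)
cv_not_present = coeff_var(317309, 109798)
print(f"Coeff Var (present):     {cv_present:.3f}")
print(f"Coeff Var (not present): {cv_not_present:.3f}")

# HYPOTHETICAL sample size and predictor count, for illustration only;
# the report does not give them in this section.
example = adjusted_r2(0.2520, n_obs=15, n_predictors=2)
print(f"Adjusted R-square (hypothetical n=15, p=2): {example:.4f}")
```

The coefficients of variation computed this way should match the tables up to the rounding of Root MSE and Dependent Mean.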

Tables 4.5 and 4.6 show the SAS output for the correlation values. The model uses multiple linear regression, so the Adj R-Sq is the most appropriate statistic. Notice that $R^2_{\text{present}} = 0.1273$ and $R^2_{\text{notpresent}} = 0.0614$. This implies that the correlation is
