stronger for archives that do contain the file in question. This supports the hypothesis
that an attacker can determine whether a string S appears frequently within an
archive.
4.2.2 Difference of ratios
The same collection of archives used in Section 4.2.1 is examined again. For
this experiment, the compression ratios of FP.log and the full archive are compared
to determine whether their averages differ. It is not
necessary to have a known file from the archive. The two-tailed t-test shown in
Equation 3.2 is implemented. All necessary statistical computations were performed
using SAS statistical software.
For archives containing FP.log, the following values are found:
\[ \bar{x} = 0.05629, \quad s = 0.01879, \quad n = 36, \quad |t_{.025,35}| = 2.03011 \]
For archives that do not contain this file, the values are as follows:
\[ \bar{x} = 0.28075, \quad s = 0.05383, \quad n = 84, \quad |t_{.025,83}| = 1.98896 \]
The t-values for each list can then be computed using $\mu_0 = 0.04334$.
\[ t_{\text{present}} = \frac{0.05629 - 0.04334}{0.01879/\sqrt{36}} = 4.135 \tag{4.1} \]
\[ t_{\text{notpresent}} = \frac{0.28075 - 0.04334}{0.05383/\sqrt{84}} = 40.422 \tag{4.2} \]
Notice that the null hypothesis $H_0\colon \mu = 0.04334$ would be rejected in both cases
because both calculated t-values exceed their respective critical values.
This may be due to a poor choice of significance level. Other levels of $\alpha$ are shown
in Table 4.7. By increasing the confidence of the test to $\alpha = .0001$, it is possible to
differentiate between archives that contain the file under investigation and those that
do not.
Table 4.7.
Hypothesis testing results for different levels of $\alpha$

$\alpha$   $t_{\alpha/2,35}$   $t_{\alpha/2,83}$   File present conclusion   File not present conclusion
.01        2.7238              2.6364              reject $H_0$              reject $H_0$
.001       3.5912              3.4116              reject $H_0$              reject $H_0$
.0001      4.3888              4.08569             fail to reject $H_0$      reject $H_0$
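The arithmetic behind Equations 4.1 and 4.2 and the conclusions in Table 4.7 can be checked with a short script. This is only a sketch: the original computations were performed in SAS, the function names `one_sample_t` and `decision` are hypothetical, and the critical values are copied from the text rather than computed.

```python
import math

def one_sample_t(xbar, s, n, mu0):
    """t statistic for H0: mu = mu0, given sample mean, std. dev. and size."""
    return (xbar - mu0) / (s / math.sqrt(n))

MU0 = 0.04334
t_present = one_sample_t(0.05629, 0.01879, 36, MU0)      # Equation 4.1
t_not_present = one_sample_t(0.28075, 0.05383, 84, MU0)  # Equation 4.2

# Two-tailed critical values copied from the text and Table 4.7:
# alpha -> (critical value for df = 35, critical value for df = 83)
CRITICAL = {
    0.05: (2.03011, 1.98896),
    0.01: (2.7238, 2.6364),
    0.001: (3.5912, 3.4116),
    0.0001: (4.3888, 4.08569),
}

def decision(t, critical):
    """Reject H0 when |t| exceeds the two-tailed critical value."""
    return "reject H0" if abs(t) > critical else "fail to reject H0"

for alpha, (c35, c83) in CRITICAL.items():
    print(alpha, decision(t_present, c35), decision(t_not_present, c83))
```

Only at the strictest level listed does the decision for the archives containing FP.log change, which is the separation the table reports.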
4.2.3 Man-in-the-Middle
The attack described in Section 3.3 is tested using WinRAR v3.42 and v5.10. The
file alice29.txt is used for testing in all cases. The attacks on RAR and RAR5 formats
are discussed separately below.
RAR file format
The first step of the attack requires changing the compression method and total
file size in the file header. As discussed in Section 3.3, setting the compression method
to no compression, denoted by 0x30, provides the best results. Additionally, the total
file size is altered to equal the packed file size in the header. This step is illustrated
in Figures 4.2 and 4.3 below.
Fig. 4.2.
The original alice29.rar archive with the compression
method circled and the total file size inside the rectangle
Fig. 4.3.
The modified alice29-prime.rar with the compression
method circled and the total file size inside the rectangle
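The header edit shown in the figures was made by hand in a hex editor. As a rough illustration only, the same change could be scripted as below. The field offsets are an assumption taken from the publicly documented RAR 3.x file header layout (PACK_SIZE at byte 7, UNP_SIZE at byte 11, METHOD at byte 25 of the file header block); the function name `force_store_method` and the sample header are hypothetical.

```python
import struct

# Offsets inside a RAR 3.x file header block (assumed layout, see lead-in)
PACK_SIZE_OFF = 7   # 4 bytes, little-endian: packed size
UNP_SIZE_OFF = 11   # 4 bytes, little-endian: total (unpacked) file size
METHOD_OFF = 25     # 1 byte: compression method, 0x30 = no compression

def force_store_method(header: bytes) -> bytes:
    """Return a copy of the header with method 0x30 and UNP_SIZE = PACK_SIZE."""
    patched = bytearray(header)
    (pack_size,) = struct.unpack_from("<I", patched, PACK_SIZE_OFF)
    struct.pack_into("<I", patched, UNP_SIZE_OFF, pack_size)
    patched[METHOD_OFF] = 0x30
    return bytes(patched)

# Synthetic 32-byte header used only to exercise the function
sample = bytearray(32)
struct.pack_into("<I", sample, PACK_SIZE_OFF, 1000)
struct.pack_into("<I", sample, UNP_SIZE_OFF, 5000)
sample[METHOD_OFF] = 0x33  # a compressed method value before patching
patched_sample = force_store_method(bytes(sample))
```

The sketch deliberately leaves every other header field, including the header CRC, untouched.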
The decompression of alice29-prime.rar results in what looks like garbage text.
In reality, it is the unencrypted compressed version of the original file. The remaining
challenge is to reconstruct the original file given the corrupted text. The final step
of the attack outlined in [9] and [10] is to re-compress alice29-corrupted.txt using
compression method 0x30, restore the compression method and total file size to their
original values and decompress the archive.
The attack as outlined fails during the final step. Comparison of the file contents
with an unencrypted compressed version of the original file verifies that the
encryption has been removed. However, neither WinRAR v3.42 nor v5.10 is capable of
decompressing the resulting archive correctly.
In addition to the steps outlined in the original attack, the author made several
modifications to the final archive. The modifications were based on the following
observations.
1. There is now padding at the end of the file.
2. The encrypted packed archive size is larger than the unencrypted packed archive
size.
3. The RAR version needed to extract the file has changed.
To address these issues, supplementary modifications can be made. None of the
new changes require any additional knowledge on the part of the attacker. First, the
zero padding at the end of the file is deleted. The number of bytes in the padding
is then subtracted from the packed archive size in the file header. Next, the UNP_VER
field is adjusted to reflect the same version as indicated in the original alice29.rar
archive. To reduce errors, the file and header CRC32 values are recalculated using the
built-in HxD function. The corresponding fields are also updated. With these changes, it
is possible to fully recover the original file.
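A minimal sketch of the padding and header adjustments follows. It rests on several assumptions: the RAR 3.x header offsets (HEAD_CRC at byte 0, PACK_SIZE at byte 7, UNP_VER at byte 24), the convention that HEAD_CRC holds the low 16 bits of a CRC32 taken over the header from byte 2 onward, and that every trailing zero byte is padding. The function name `strip_padding_and_fix_header` is hypothetical, and the file CRC field is not handled here.

```python
import struct
import zlib

# Assumed RAR 3.x file header offsets (see lead-in)
HEAD_CRC_OFF = 0    # 2 bytes, little-endian
PACK_SIZE_OFF = 7   # 4 bytes, little-endian
UNP_VER_OFF = 24    # 1 byte

def strip_padding_and_fix_header(header: bytes, packed: bytes, original_unp_ver: int):
    """Drop trailing zero padding, shrink PACK_SIZE, restore UNP_VER, redo HEAD_CRC."""
    # Assumption: all trailing zero bytes are padding, not genuine data
    trimmed = packed.rstrip(b"\x00")
    pad_len = len(packed) - len(trimmed)

    patched = bytearray(header)
    (pack_size,) = struct.unpack_from("<I", patched, PACK_SIZE_OFF)
    struct.pack_into("<I", patched, PACK_SIZE_OFF, pack_size - pad_len)
    patched[UNP_VER_OFF] = original_unp_ver

    # Assumed convention: low 16 bits of CRC32 over everything after HEAD_CRC
    head_crc = zlib.crc32(bytes(patched[2:])) & 0xFFFF
    struct.pack_into("<H", patched, HEAD_CRC_OFF, head_crc)
    return bytes(patched), trimmed

# Synthetic example: 11 data bytes followed by 5 bytes of zero padding
demo_header = bytearray(32)
struct.pack_into("<I", demo_header, PACK_SIZE_OFF, 16)
new_header, new_data = strip_padding_and_fix_header(
    bytes(demo_header), b"\x01" * 11 + b"\x00" * 5, original_unp_ver=29
)
```

If the packed stream can legitimately end in zero bytes, the padding length would have to be determined some other way before trimming.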
RAR5 file format
The RAR5 format is tested in WinRAR v5.10 using the same test file and attack
outline. When implementing the attack, the new format requires further calculations
to modify the required fields. The variable length quantities for the packed archive size
and the compression method are calculated as described in Section 3.3.1. This section
illustrates the steps necessary to modify the compression method as an example to
readers.
In the encrypted compressed archive alice29.rar, the compression method is
represented by the hexadecimal value 0x800B. In binary notation this is
10000000 00001011. The final three digits of the binary string represent the
compression method used in the archive. Currently compression method 3, normal
compression, is selected. To move forward with the attack, compression method 0
is applied by changing the digits to obtain the string 10000000 00000000. The
fourth digit from the right indicates the dictionary size required to extract data.
Since there is no compression in the archive, this bit is not necessary and is also
cleared. Finally, converting back to hexadecimal produces a final value of 0x8000.
This is used to replace the initial compression method.
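The bit manipulation described above can be reproduced in a few lines. This sketch follows the text's own description of the two-byte field (last three bits for the method, fourth bit from the right for the dictionary size) and does not attempt to decode the full variable length quantity; the names `method_of` and `force_no_compression` are hypothetical.

```python
METHOD_MASK = 0b0000_0111    # last three bits: compression method
DICT_BIT_MASK = 0b0000_1000  # fourth bit from the right: dictionary size bit

def method_of(field: bytes) -> int:
    """Read the compression method from the last byte of the field."""
    return field[-1] & METHOD_MASK

def force_no_compression(field: bytes) -> bytes:
    """Clear the method bits and the dictionary bit, keeping the field length."""
    patched = bytearray(field)
    patched[-1] &= ~(METHOD_MASK | DICT_BIT_MASK) & 0xFF
    return bytes(patched)

original = bytes.fromhex("800B")
patched = force_no_compression(original)
print(method_of(original), patched.hex())  # prints: 3 8000
```

Keeping the field at two bytes means no other header offsets shift when the value is written back.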
In contrast to the earlier file formats, the attack fails at this point. Despite changes
to the file header, the RAR5 format is capable of extracting the original contents with
no issue. The extraction does not result in CRC checksum errors as the previous
versions do. Without the extraction of corrupted contents, the attack is unable to