CERIAS Tech Report 2015-01: The Weakness of WinRAR Encrypted Archives to Compression Side-Channel Attacks


proceed. Subsequent attempts to sabotage the file header to force an error failed. The fields altered included CRC32 fields, file flags, and attributes.

5. SUMMARY

The findings presented in Section 4 include several novel results. These will be discussed in detail, followed by a brief suggestion of countermeasures to prevent information leakage.

5.0.4 Discussion

In Section 4.1, statistical methods show that it is possible to distinguish different file types based on an archive's compression ratio. Therefore, Hypothesis 1 holds. Note that, as illustrated in Table 4.3, the compression ratios for Text, Executable, and Other files are not distinct from one another. Graphic files, however, consistently compress at a considerably higher ratio than the other file types. This is likely because many image formats already implement some form of compression [33]. If the data in an archive has already been compressed, WinRAR's algorithms can do little to reduce the archive's size further, so the packed file size ends up very close to the total file size.

This attack is most effective when an investigator uses compression ratios to help identify whether an archive contains images. For example, in child pornography cases a forensic investigator may need to identify archives holding large numbers of images. Compression ratio inspection provides a simple method of identifying archives with this type of content. The information required is minimal and can be found in any archive, which makes the attack easy to implement in a variety of situations. Table 4.4 provides intervals that can be used to identify file types. Generally, an archive with a compression ratio greater than 0.64 can reasonably be assumed to contain images. The ability to identify file types within an archive saves valuable time and effort that could otherwise be lost attempting to crack archives with irrelevant contents.
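The ratio check described above can be sketched in a few lines of Python. This is a minimal illustration, not the report's tooling: it assumes the ratio is packed size divided by total (unpacked) size, which matches the remark that already-compressed data yields a packed size close to the total size. The function names and the sample sizes are hypothetical; in practice the two sizes would be read from the archive's plaintext file header.

```python
IMAGE_RATIO_THRESHOLD = 0.64  # cut-off quoted in the text


def compression_ratio(packed_size: int, total_size: int) -> float:
    """Return packed size divided by unpacked size (assumed definition)."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return packed_size / total_size


def likely_contains_images(packed_size: int, total_size: int) -> bool:
    """Flag an archive whose ratio exceeds the threshold."""
    return compression_ratio(packed_size, total_size) > IMAGE_RATIO_THRESHOLD


if __name__ == "__main__":
    # Hypothetical header values in bytes, not measurements from the report.
    print(likely_contains_images(packed_size=980_000, total_size=1_000_000))
    print(likely_contains_images(packed_size=310_000, total_size=1_000_000))
```

A single threshold is the coarsest possible classifier; the intervals in Table 4.4 would give a finer decision if they were encoded in place of the constant.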

The appearance of substrings experiment in Section 4.2.1 supported the hypothesis that substrings in the known part of an archive correlate with the compressed size of the archive. This correlation is likely due to the general compression scheme used by WinRAR. If a file is present in an archive, the appearance of its substrings allows both the LZSS and PPMII compression schemes to work more efficiently, which in turn results in a lower packed size for the archive. This experiment provided the most surprising results, as the conclusions were not immediately obvious from the raw data.
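The effect of repeated substrings on packed size can be illustrated with a small Python sketch. It uses zlib (DEFLATE) purely as a stand-in, since WinRAR's LZSS and PPMII codecs are not available in the standard library, so the absolute numbers do not transfer to RAR archives. The sample records and helper names are hypothetical.

```python
import random
import zlib


def packed_size(data: bytes) -> int:
    """Compressed length under zlib (a stand-in, not WinRAR's codec)."""
    return len(zlib.compress(data, 9))


def size_increase(known: bytes, candidate: bytes) -> int:
    """Extra compressed bytes needed when candidate is appended to known."""
    return packed_size(known + candidate) - packed_size(known)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
known = b"account=12345;balance=100;" * 50
repeated = b"account=12345;balance=100;" * 5  # shares substrings with known
unrelated = bytes(rng.getrandbits(8) for _ in range(len(repeated)))

if __name__ == "__main__":
    # Appending data that repeats known substrings costs far fewer bytes
    # than appending unrelated data of the same length.
    print(size_increase(known, repeated), size_increase(known, unrelated))
```

The same comparison underlies the attack: content that shares substrings with the known part of an archive adds little to the packed size, while unrelated content adds roughly its own length.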

This attack does have several drawbacks. First, the file selected for examination has an extremely high number of repeated strings, as stated in Section 3.2. Unless an investigator is looking for similarly structured files, such as log files, it is unlikely that typical text will include similar levels of repetition. This would result in a weaker effect on the overall compressed size, which may cause the correlation to become too weak to detect. However, the appearance of substrings attack is well suited to identifying files containing profile or bank account information that include many repeated fields.

Secondly, a relatively large collection of archives was examined, which strengthened the power of the statistical process. A collection of this size may not be available for study. Finally, the archives containing the file were known ahead of time. While this experimental design is sufficient to show correlation, it is not practical to execute on completely unknown data. For future testing, a Monte Carlo experiment may provide more accurate results for modeling the relationship between substrings and archive size.

Section 4.2.2 showed that, at suitable significance levels α, it is possible to distinguish archives that contain a file from those that do not. Adequate evidence is given to show that Hypothesis 2b is valid. It should be noted that in this experiment the ratios of archives containing the file have a significantly different average from those that do not. In the event that the averages are closer in value, the author suggests that lower values of α will be capable of distinguishing between them. This attack is ideal for files that are highly compressible, as such a file's compression ratio will have a significant effect on that of the archive. The selection of files most suited to this attack suffers from the same issues outlined for the appearance of substrings attack.
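One way to test whether two groups of compression ratios differ at a chosen α is an exact permutation test on the difference of means, sketched below in Python. This is a swapped-in technique chosen because it needs only the standard library; the passage above does not say which statistical test the report used. The sample ratios are hypothetical, and the exhaustive enumeration is only practical for small samples.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(with_file, without_file):
    """Two-sided exact permutation test on the difference of mean ratios."""
    pooled = list(with_file) + list(without_file)
    n = len(with_file)
    m = len(pooled) - n
    observed = abs(mean(with_file) - mean(without_file))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling n of the pooled values as "with file".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / m)
        if diff >= observed - 1e-12:  # tolerance for floating-point noise
            hits += 1
        count += 1
    return hits / count


if __name__ == "__main__":
    # Hypothetical ratios, not data from the report.
    with_file = [0.21, 0.24, 0.22, 0.25, 0.23]
    without_file = [0.41, 0.44, 0.39, 0.45, 0.42]
    alpha = 0.05
    p = permutation_p_value(with_file, without_file)
    print(p, p < alpha)
```

With two fully separated groups of five, only 2 of the 252 relabelings are as extreme as the observed split, so the difference is significant at α = 0.05.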

Both the appearance of substrings and difference of ratios attacks can also be useful in exfiltration detection. For example, if numerous archives are detected leaving an organization's system, sensitive information such as client data can be checked against the archives as outlined. This can provide a reasonable picture of what information has been compromised.

Finally, experiments with the Man-in-the-Middle attack in Section 4.2.3 provided suggestions for improvement. Despite claims that the original attack is capable of obtaining the plaintext of a file in an archive, it does not perform as suggested. Following the attack as outlined in the literature results in the removal of encryption from an archive. However, the compressed file is still unintelligible.

To remedy this, the author suggests some variations in the final step of the attack. First, the files tend to accumulate extra padding at the end. This is simple to identify, as it consists of a string of 0x00 bytes. The padding may be generated by the loss of the password and salt after the encryption is removed. To avoid a conflict with the file size, the packed file size needs to be adjusted according to the amount of padding removed. Secondly, WinRAR uses standard CRC32 checksums, which can be computed with off-the-shelf software and written into the relevant fields. Finally, the unpacking version field should be set to the value in the original archive's field to avoid compatibility issues. All of the extra information needed can be obtained from the archives to which an adversary has access. These steps ensure that the contents of an encrypted compressed archive can be revealed. The attack has been verified on RAR archives using both WinRAR v3.42 and v5.10.
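The first two fix-up steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's tool: it works on a raw byte string instead of parsing a RAR header, the helper names are hypothetical, and which bytes each CRC32 field covers is defined by the RAR format and is not restated here. Note also that blindly stripping trailing zeros would remove any genuine data bytes that happen to be 0x00.

```python
import zlib
from typing import Tuple


def strip_zero_padding(packed: bytes) -> Tuple[bytes, int]:
    """Remove trailing 0x00 bytes; return the trimmed data and bytes removed."""
    trimmed = packed.rstrip(b"\x00")
    return trimmed, len(packed) - len(trimmed)


def crc32_of(data: bytes) -> int:
    """Standard CRC32 as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def adjusted_packed_size(packed: bytes, old_packed_size: int) -> int:
    """Packed size after subtracting the padding removed (hypothetical helper)."""
    _, removed = strip_zero_padding(packed)
    return old_packed_size - removed


if __name__ == "__main__":
    # Hypothetical payload with three bytes of trailing padding.
    payload = b"\x12\x34\x56\x00\x00\x00"
    trimmed, removed = strip_zero_padding(payload)
    print(removed, adjusted_packed_size(payload, len(payload)))
    print(hex(crc32_of(trimmed)))
```

The third step, restoring the unpacking version field, is a plain copy of a header value from the original archive and is omitted from the sketch.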

Despite the success of the revised implementation, RAR5-formatted archives remain robust against the attack. This is possibly due to the enhanced archive recovery capabilities in v5.x, which make the software more capable of detecting and mitigating changes to file information. Another potential pitfall in attacking the newest file format is its new checksum algorithms: the CRC32 and BLAKE2 checksums are now password dependent. Without knowing the password, it is not feasible to calculate the values needed in the final step of the attack. However, the older file format is still the default in the newest version and remains very widely used, so the attack introduced in this paper is relevant to current information security needs.

5.0.5 Countermeasures

In response to the information discovered through the experiments, there are some suggestions for defeating several of the attacks. Aside from the appearance of substrings attack, all of the attacks rely on the assumption that the adversary is able to at least view file header information. The default setting in WinRAR encrypts only a file's contents, and the header information remains in plaintext. In this situation, the assumption holds. However, users can select an option to encrypt file header information along with the file contents. This masks information such as the total and packed file sizes, the compression method, and any additional file attributes.

For further security, the author suggests using the RAR5 file format when possible. It has the same weakness against the first three attacks as the older file versions, but it is resilient against the Man-in-the-Middle attack. Thus, it provides slightly improved security over previous versions.
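As a rough sketch of how both countermeasures might be applied from a script, the Python helper below builds a command line for the `rar` console tool. It relies on the console switches `-hp` (encrypt file data and headers, prompting for the password when none is given) and `-ma5` (request the RAR 5.0 format); check these against the manual of the installed version before relying on them. The helper name and file names are hypothetical, and the command is only constructed here, not executed.

```python
from typing import List


def rar_command(archive: str, files: List[str], rar5: bool = True) -> List[str]:
    """Build an argv list for the rar console tool (not executed here)."""
    # "a" adds files to an archive; "-hp" encrypts file data and headers.
    cmd = ["rar", "a", "-hp"]
    if rar5:
        cmd.append("-ma5")  # request the RAR 5.0 archive format
    cmd.append(archive)
    cmd.extend(files)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run to execute it.
    print(" ".join(rar_command("clients.rar", ["clients.csv"])))
```

Leaving the password off the command line keeps it out of shell history and process listings.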

5.0.6 Conclusion and open questions

This paper shows that information about the contents of an encrypted archive can be leaked through the study of compression properties. These attacks require less time and fewer computing resources than traditional attacks against the encryption of an archive. Issues in an existing attack are addressed to create a successful method for recovering archived files. This has been verified with two different versions of WinRAR, but its effectiveness with other compression software remains to be evaluated. There is also a possibility of using this attack against an archive containing multiple files. All of the presented methods are efficient for investigators to implement as a first line of inquiry into an unknown archive. These methods also highlight an area in which the security of the WinRAR software is lacking. It remains a future challenge to provide a good compression scheme with effectively implemented encryption.

Some open questions remain in relation to the string detection attacks. The effect of string frequency in the appearance of substrings attack has not been fully characterized. As discussed in Section 5.0.4, the attack is effective on highly compressible files such as logs or databases. However, many text files do not have a high number of repetitive strings. The length of the string may also influence the correlation with an archive's size. The effects of both repetition and string length warrant further investigation.

The experimental design used here emphasizes the use of statistics to establish the validity of a hypothesis. During the literature review, very few papers were found that implemented rigorous statistical methods to reach their conclusions. The meaning of data can be counter-intuitive, and it is possible to reach incorrect conclusions without proper analysis. The author encourages future researchers to use experimental methods to provide strong validity for information security research.

REFERENCES

[1] Symantec. Trojan.Dropper Technical Details. [Online]. Available: http://www.symantec.com/security_response/writeup.jsp?docid=2002-082718-3007-99&tabid=2

[2] WinZip. What can I do if I forget the encryption password for my zip file? [Online]. Available: http://kb.winzip.com/kb/entry/79/

[3] J. Chen, J. Zhou, K. Pan, S. Lin, C. Zhao, and X. Li, “The security of key derivation functions in WINRAR,” Journal of Computers, vol. 8, no. 9, pp. 2262–2268, 2013.

[4] A. Biryukov, O. Dunkelman, N. Keller, D. Khovratovich, and A. Shamir, “Key recovery attacks of practical complexity on AES variants with up to 10 rounds,” IACR eprint server, vol. 374, 2009.

[5] A. Biryukov and D. Khovratovich, “Related-key cryptanalysis of the full AES-192 and AES-256,” in Advances in Cryptology–ASIACRYPT 2009. Springer, 2009, pp. 1–18.

[6] H. Demirci and A. A. Selçuk, “A meet-in-the-middle attack on 8-round AES,” in Fast Software Encryption. Springer, 2008, pp. 116–126.

[7] J. Lu, O. Dunkelman, N. Keller, and J. Kim, “New impossible differential attacks on AES,” in Progress in Cryptology-INDOCRYPT 2008. Springer, 2008, pp. 279–293.

[8] A. Biryukov, D. Khovratovich, and I. Nikolić, “Distinguisher and related-key attack on the full AES-256,” in Advances in Cryptology-CRYPTO 2009. Springer, 2009, pp. 231–249.

[9] G. S.-W. Yeo and R. C.-W. Phan, “On the security of the WinRAR encryption feature,” International Journal of Information Security, vol. 5, no. 2, pp. 115–123, 2006.

[10] T. Kohno, “Attacking and repairing the WinZip encryption scheme,” in Proceedings of the 11th ACM conference on Computer and communications security. ACM, 2004, pp. 72–81.

[11] D. Polimirova-Nickolova and E. Nickolov, “Examination of archived objects’ size influence on the information security when compression methods are applied,” in Third International Conference Information Research, Applications and Education, 2005, p. 130.

[12] J. Kelsey, “Compression and information leakage of plaintext,” in Fast Software Encryption. Springer, 2002, pp. 263–276.

[13] L. Ji-Zhong, J. Lie-Hui, Y. Qing, and X. Yao-Bin, “Hybrid method to analyze cryptography in software,” in Multimedia Information Networking and Security (MINES), 2012 Fourth International Conference on. IEEE, 2012, pp. 930–933.

[14] C. Maartmann-Moe, S. E. Thorkildsen, and A. Årnes, “The persistence of memory: Forensic identification and extraction of cryptographic keys,” Digital Investigation, vol. 6, pp. S132–S140, 2009.

[15] G. Fellows, “WinRAR temporary folder artefacts,” Digital Investigation, vol. 7, no. 1, pp. 9–13, 2010.

[16] D. Gupta and B. M. Mehtre, “Recent trends in collection of software forensics artifacts: Issues and challenges,” in Security in Computing and Communications. Springer, 2013, pp. 303–312.

[17] RarLab. WinRAR at a glance. [Online]. Available: http://www.win-rar.com/website/index.php?id=features

[18] RarLab. WinRAR - what’s new in the latest version. [Online]. Available: http://www.rarlab.com/rarnew.htm

[19] WinRAR, “User’s manual: Rar 5.10 console version,” 2014.

[20] J. S. Plank, K. M. Greenan, and E. L. Miller, “Screaming fast Galois field arithmetic using Intel SIMD instructions,” in FAST, 2013, pp. 299–306.

[21] WinRAR, “What’s new in the latest version - version 3.00,” 2002.

[22] N. Standard, “Announcing the advanced encryption standard (AES),” Federal Information Processing Standards Publication, vol. 197, 2001.

[23] M. S. Turan, E. B. Barker, W. E. Burr, and L. Chen, “SP 800-132. Recommendation for password-based key derivation: Part 1: Storage applications,” National Institute of Standards & Technology, Gaithersburg, MD, United States, Tech. Rep., 2010.

[24] J. A. Storer and T. G. Szymanski, “Data compression via textual substitution,” Journal of the ACM (JACM), vol. 29, no. 4, pp. 928–951, 1982.

[25] D. Shkarin, “Improving the efficiency of the PPM algorithm,” Problems of Information Transmission, vol. 37, no. 3, pp. 226–235, 2001.

[26] D. Shkarin, “PPM: One step to practicality,” in Data Compression Conference, 2002. Proceedings. DCC 2002, 2002, pp. 202–211.

[27] R. Intel, “Intel 64 and IA-32 architectures optimization reference manual,” Intel Corporation, May, 2012.

[28] S. W. Smith et al., “The scientist and engineer’s guide to digital signal processing,” 1997.

[29] M. Powell. The Canterbury corpus. [Online]. Available: http://corpus.canterbury.ac.nz/

[30] MaximumCompression. Lossless data compression software benchmarks/comparisons. [Online]. Available: http://www.maximumcompression.com/

[31] M. Hörz, “HxD–HexEditor,” http://mh-nexus.de/en/hxd/, 2002–2009.

[32] RarLab. Rar 5.0