Cracking .zip and .rar Archives with Passwords

With John the Ripper and Hashcat

I stored sensitive information (2FA recovery codes, etc.) in some Roshal archives (RAR), then forgot about them and the exact password. These are my notes and findings from recovering the archived and encrypted files. There is a selection of tools for this: Rook, John-the-Ripper, Hashcat, etc.

John the Ripper

First we need to extract the hash from the RAR archive. Download the latest John the Ripper release and extract it (on Kali Linux it should already be installed, so just update the packages).

Extraction

The John the Ripper suite contains all sorts of hash extraction utilities. In this case rar2john.exe is the relevant one:

cd run
.\rar2john.exe .\some.rar > hashes.txt

The hashes will look like the one below. The number and length of the entries depend on the archive (its format and the files in it), not on the length of the password.

gc hashes.txt
some.rar:$rar5$16$4cde559c998a25026ece3dbd11b5898e$15$35e82175b4e2c08f434fe67ec86567f1$8$b52097118435e103
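To make the layout of that line easier to read, here is a minimal Python sketch that splits a rar2john line into its $-separated fields. The field names (salt, log2_iterations, iv, check_value) are my reading of the layout and should be treated as assumptions; the only cross-check in this post is that 2^15 = 32768 matches the iteration range hashcat prints later.

```python
from typing import NamedTuple


class Rar5Hash(NamedTuple):
    label: str            # text before the hash, e.g. the archive name (may be empty)
    salt: bytes
    log2_iterations: int  # assumed meaning of the "15" field
    iv: bytes
    check_value: bytes


def parse_rar5_line(line: str) -> Rar5Hash:
    """Split one '$rar5$...' line into its fields."""
    line = line.strip()
    start = line.find("$rar5$")
    if start == -1:
        raise ValueError("no $rar5$ marker found")
    label = line[:start].rstrip(":")
    parts = line[start:].split("$")
    # parts[0] is the empty string in front of the leading '$'
    if len(parts) != 8:
        raise ValueError("unexpected number of fields")
    salt = bytes.fromhex(parts[3])
    iv = bytes.fromhex(parts[5])
    check = bytes.fromhex(parts[7])
    # The two length fields should agree with the hex data that follows them.
    if len(salt) != int(parts[2]) or len(check) != int(parts[6]):
        raise ValueError("length fields do not match the hex data")
    return Rar5Hash(label, salt, int(parts[4]), iv, check)


example = ("some.rar:$rar5$16$4cde559c998a25026ece3dbd11b5898e$15"
           "$35e82175b4e2c08f434fe67ec86567f1$8$b52097118435e103")
parsed = parse_rar5_line(example)
print(parsed.label, len(parsed.salt), 2 ** parsed.log2_iterations)
```

Nothing here helps with cracking; it is only a sanity check that the file contains a well-formed hash line before handing it to john or hashcat.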

On Windows, the PowerShell redirect (>) writes the file with Unicode (UTF-16) encoding, and john.exe doesn't like that. The default expected encoding is UTF-8, but you could change the expected input file encoding with --encoding=NAME.

.\john.exe hashes.txt
Error: UTF-16 BOM seen in input file.

I changed the file encoding instead of using the --encoding flag, since that's one less argument to pass:

Get-Content hashes.txt | Set-Content -Encoding utf8 hashes_utf8.txt
Move-Item -Force hashes_utf8.txt hashes.txt

or on Linux:

iconv -f UTF-16 -t UTF-8 -o hashes_utf8.txt hashes.txt
mv hashes_utf8.txt hashes.txt

Ultimately, the extraction and the encoding can be handled in one step:

.\rar2john.exe some.rar | sc -Encoding utf8 hashes.txt
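If you'd rather not remember which shell writes which encoding, the conversion can also be scripted. The following is a rough Python sketch, not part of either tool: it looks for a byte order mark at the start of the file and rewrites the file as UTF-8 without one. The function name to_plain_utf8 is made up for this example.

```python
import codecs
from pathlib import Path

# Longest BOMs first, so a UTF-32 LE file is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def to_plain_utf8(path: str) -> str:
    """Rewrite `path` in place as UTF-8 without a BOM.

    Returns the codec that was used to decode the original bytes.
    Files without a BOM are assumed to be UTF-8 already.
    """
    raw = Path(path).read_bytes()
    encoding = "utf-8"
    for bom, name in _BOMS:
        if raw.startswith(bom):
            encoding = name  # these codecs consume the BOM while decoding
            break
    text = raw.decode(encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return encoding


# Usage (hypothetical file name):
#   to_plain_utf8("hashes.txt")
```

Run it once on hashes.txt after the redirect; running it again on an already clean file leaves the file unchanged.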

Cracking

John the Ripper (john.exe) supports password rules (single), wordlists, and incremental (brute-force) cracking. See the examples below. Note that when cracking multiple hash files, it's best to pass them all in the same command, which saves time compared with running john once per file.

Rules

The default configuration is in john.conf, which defines the rules along with the files under the run\rules directory:

.\john.exe hashes.txt --single=<RULE>

Wordlist

If no wordlist is given, the default password.lst is used, but you can specify something like rockyou.txt:

.\john.exe hashes.txt --wordlist=<your_wordlist>

Brute-Force

This could take a while depending on the password complexity.

.\john.exe hashes.txt --incremental

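You can also combine masks with --incremental (hybrid mask mode), where ?w in the mask stands for the candidate produced by incremental mode. Mask attacks are a subset of brute-force attacks:

.\john.exe hashes.txt --incremental --mask=?w?d?d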

Show more options:

.\john.exe --list=hidden-options

You can also pipe in passwords generated by another utility like crunch:

crunch 8 12 0123456789abcdef | .\john.exe hashes.txt --stdin --session=s1
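Any program that prints one candidate per line can take the place of crunch here. As an illustration, this is a small Python sketch that prints every spelling of a base word under a substitution table. The SUBSTITUTIONS table and the script name variants.py are hypothetical; adjust the table to your own habits.

```python
import itertools
import sys

# Hypothetical substitution table: each key lists the characters that may
# appear in its place (including the original character itself).
SUBSTITUTIONS = {"a": "a@4", "o": "o0", "e": "e3", "s": "s$5"}


def variants(word):
    """Yield every spelling of `word` allowed by SUBSTITUTIONS."""
    # Characters without an entry only have one option: themselves.
    pools = [SUBSTITUTIONS.get(ch, ch) for ch in word]
    for combo in itertools.product(*pools):
        yield "".join(combo)


if __name__ == "__main__":
    # Each command-line argument is treated as a base word.
    for base in sys.argv[1:]:
        for candidate in variants(base):
            print(candidate)
```

Saved as variants.py, it would be used the same way as crunch: python variants.py password | .\john.exe hashes.txt --stdin. The number of candidates is the product of the option counts per character, so long words with many substitutable letters grow quickly.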

Other modes to look into include markov, subset, and others. The regex mode is currently slower than mask mode (which is GPU supported). Mask mode uses the following placeholders:

?l = abcdefghijklmnopqrstuvwxyz
?u = ABCDEFGHIJKLMNOPQRSTUVWXYZ
?d = 0123456789
?h = 0123456789abcdef
?H = 0123456789ABCDEF
?s = «space»!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
?a = ?l?u?d?s
?b = 0x00 - 0xff
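To see what a mask actually produces, here is a minimal Python sketch that expands a mask into its candidates using the placeholder table above. It is a toy model of the idea, not how john or hashcat generate candidates internally, and it leaves out ?b. The names CHARSETS, mask_pools, and expand are made up for this example.

```python
import itertools
import string

# Placeholder table from above; ?s is a space followed by ASCII punctuation.
CHARSETS = {
    "l": string.ascii_lowercase,
    "u": string.ascii_uppercase,
    "d": string.digits,
    "h": "0123456789abcdef",
    "H": "0123456789ABCDEF",
    "s": " " + string.punctuation,
}
CHARSETS["a"] = CHARSETS["l"] + CHARSETS["u"] + CHARSETS["d"] + CHARSETS["s"]


def mask_pools(mask):
    """Turn a mask into one pool of characters per password position."""
    pools, i = [], 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            if key not in CHARSETS:
                raise ValueError(f"unsupported placeholder ?{key}")
            pools.append(CHARSETS[key])
            i += 2
        else:
            pools.append(mask[i])  # literal character
            i += 1
    return pools


def expand(mask):
    """Yield every candidate the mask describes, rightmost position fastest."""
    for combo in itertools.product(*mask_pools(mask)):
        yield "".join(combo)


print(list(expand("pin?d"))[:3])  # first three candidates of a tiny mask
```

The order in which real tools walk the keyspace differs from this sketch, so use it only to check that a mask covers the candidates you expect.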

So using my previous knowledge of what characters the password contained, I tried the following masks:

Qu?sntumPh?snt?dm?d
Qu?sntumPh?snt?dm?s
Qu?sntumPh?snt?dm?s?s?s
Qu?sntumPh?snt?dm?d?d?d

.\john.exe .\hashes.txt --mask=Qu?sntumPh?snt?dm?s

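Once the password has been cracked, running john again against the same hash reports that nothing is left to crack, because the result is already in the pot file:

.\john.exe .\hashes.txt --mask=Qu@ntumPh@nt0m?s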
Warning: detected hash type "RAR5", but the string is also recognized as "RAR5-opencl"
Use the "--format=RAR5-opencl" option to force loading these as that type instead
Using default input encoding: UTF-8
Loaded 1 password hash (RAR5 [PBKDF2-SHA256 256/256 AVX2 8x])
No password hashes left to crack (see FAQ)

Check john.log for details. The matching password is stored in john.pot (the location is configurable):

$rar5$16$4cde559c998a25026ece3dbd11b5898e$15$35e82175b4e2c08f434fe67ec86567f1$8$b52097118435e103:Qu@ntumPh@nt0m!

Nice, and I won't be using this password anymore :)

Hashcat

Hashcat is another password-cracking utility. It leverages the GPU and is pretty much the go-to utility these days. To use an NVIDIA GPU, you want to install the CUDA SDK (the CUDA Toolkit). We pick up from the extracted hashes.txt:

.\hashcat.exe -h
...
 Options Short / Long           | Type | Description                                          | Example
================================+======+======================================================+=======================
 -m, --hash-type                | Num  | Hash-type, references below (otherwise autodetect)   | -m 1000
 -a, --attack-mode              | Num  | Attack-mode, see references below                    | -a 3
...

The hash that this RAR file uses is RAR5, as indicated by the $rar5$ prefix in the hash file entries. The supported hash modes are listed in the --help output or on the hashcat site; RAR5 is mode 13000. We also need to specify the attack mode:

- [ Attack Modes ] -

  # | Mode
 ===+======
  0 | Straight
  1 | Combination
  3 | Brute-force
  6 | Hybrid Wordlist + Mask
  7 | Hybrid Mask + Wordlist
  9 | Association

We'll do the same mask attack:

.\hashcat -m 13000 -a 3 .\hashes.txt Qu?sntumPh?snt?dm?s

This gave me the following error:

hashcat (v6.2.6) starting

..\john-1.9.0-jumbo-1-win64\run\hashes.txt: Byte Order Mark (BOM) was detected
Successfully initialized the NVIDIA main driver CUDA runtime library.
=======================================================================
* Device #1: NVIDIA GeForce RTX 3080, 9472/10239 MB (2559 MB allocatable), 68MCU

Minimum password length supported by kernel: 0
Maximum password length supported by kernel: 256

Hashfile '..\john-1.9.0-jumbo-1-win64\run\hashes.txt' on line 1 (..\Inf...fe67ec86567f1$8$b52097118435e103): Signature unmatched
No hashes loaded.
Started: Thu Dec 29 18:36:39 2022
Stopped: Thu Dec 29 18:36:43 2022

I decided to try the example hash $rar5$16$74575567518807622265582327032280$15$f8b4064de34ac02ecabfe9abdf93ed6a$8$9843834ed0f7c754 in a file, which gave me the same error. I checked the encoding, and it was UTF-16.

With the file saved as ANSI or UTF-8 and the .\some.rar: prefix removed to match the example, the hash is processed successfully:

hashcat (v6.2.6) starting

Successfully initialized the NVIDIA main driver CUDA runtime library.
OpenCL API (OpenCL 3.0 CUDA 12.0.89) - Platform #1 [NVIDIA Corporation]
=======================================================================
* Device #1: NVIDIA GeForce RTX 3080, 9472/10239 MB (2559 MB allocatable), 68MCU

Minimum password length supported by kernel: 0
Maximum password length supported by kernel: 256

Hashes: 1 digests; 1 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates

Optimizers applied:
* Zero-Byte
* Single-Hash
* Single-Salt
* Brute-Force
* Slow-Hash-SIMD-LOOP

Watchdog: Temperature abort trigger set to 90c

Host memory required for this attack: 1474 MB

The wordlist or mask that you are using is too small.
This means that hashcat cannot use the full parallel power of your device(s).
Unless you supply more work, your cracking speed will drop.
For tips on supplying more work, see: https://hashcat.net/faq/morework

Approaching final keyspace - workload adjusted.

$rar5$16$4cde559c998a25026ece3dbd11b5898e$15$35e82175b4e2c08f434fe67ec86567f1$8$b52097118435e103:Qu@ntumPh@nt0m!

Session..........: hashcat
Status...........: Cracked
Hash.Mode........: 13000 (RAR5)
Hash.Target......: $rar5$16$4cde559c998a25026ece3dbd11b5898e$15$35e821...35e103
Time.Started.....: Thu Dec 29 18:57:31 2022 (1 sec)
Time.Estimated...: Thu Dec 29 18:57:32 2022 (0 secs)
Kernel.Feature...: Pure Kernel
Guess.Mask.......: Qu?sntumPh?snt0m?s [15]
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:    15265 H/s (0.91ms) @ Accel:64 Loops:256 Thr:32 Vec:1
Recovered........: 1/1 (100.00%) Digests (total), 1/1 (100.00%) Digests (new)
Progress.........: 6528/35937 (18.17%)
Rejected.........: 0/6528 (0.00%)
Restore.Point....: 4352/35937 (12.11%)
Restore.Sub.#1...: Salt:0 Amplifier:0-1 Iteration:32768-32799
Candidate.Engine.: Device Generator
Candidates.#1....: Qu|ntumPh}nt0m* -> Qu`ntumPh}nt0m-
Hardware.Mon.#1..: Temp: 63c Fan:  0% Util: 97% Core:1935MHz Mem:9501MHz Bus:16

Started: Thu Dec 29 18:57:28 2022
Stopped: Thu Dec 29 18:57:33 2022
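The 35937 in the Progress line is the keyspace of the mask reported in Guess.Mask: three ?s positions with 33 characters each, so 33^3. Note that the reported mask has a literal 0 where the command shown earlier has ?d, which would make the keyspace ten times larger. A short Python sketch to check such numbers before starting a run (the SIZES table and the keyspace function are names made up for this example):

```python
# Number of characters behind each placeholder, per the table earlier.
SIZES = {"l": 26, "u": 26, "d": 10, "h": 16, "H": 16, "s": 33, "a": 95, "b": 256}


def keyspace(mask):
    """Return how many candidates a mask describes."""
    total, i = 1, 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            total *= SIZES[mask[i + 1]]  # KeyError on unknown placeholders
            i += 2
        else:
            i += 1  # a literal character does not add candidates
    return total


print(keyspace("Qu?sntumPh?snt0m?s"))   # mask from Guess.Mask
print(keyspace("Qu?sntumPh?snt?dm?s"))  # mask from the command line above
```

Dividing the keyspace by the H/s figure in the Speed line gives a rough upper bound for the run time of a mask.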

Check hashcat.log for details. The cracked hash and password are stored in hashcat.potfile, and running the command above with --show prints them:

$rar5$16$4cde559c998a25026ece3dbd11b5898e$15$35e82175b4e2c08f434fe67ec86567f1$8$b52097118435e103:Qu@ntumPh@nt0m!
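The manual clean-up from this section (no BOM, no archive-name prefix in front of the hash) can be scripted too. This is a minimal Python sketch under the assumption that the input is already UTF-8 and every useful line contains a $rar5$ hash; the function names and file names are hypothetical. Hashcat also has a --username option meant for user:hash files, which may make the prefix removal unnecessary, but I have not tested it against rar2john output.

```python
MARKER = "$rar5$"


def strip_labels(text):
    """Return the bare hashes from rar2john-style 'name:$rar5$...' lines."""
    hashes = []
    for line in text.splitlines():
        start = line.find(MARKER)
        if start != -1:  # lines without a hash are skipped
            hashes.append(line[start:].strip())
    return hashes


def write_hashcat_file(src, dst):
    """Copy the hashes from `src` to `dst` as UTF-8 without BOM or labels."""
    # "utf-8-sig" silently drops a UTF-8 BOM if one is present.
    with open(src, encoding="utf-8-sig") as handle:
        hashes = strip_labels(handle.read())
    with open(dst, "w", encoding="utf-8", newline="\n") as handle:
        for item in hashes:
            handle.write(item + "\n")
    return len(hashes)


# Usage (hypothetical file names):
#   write_hashcat_file("hashes.txt", "hashes_hashcat.txt")
```

Keeping a separate output file means the john-style file with the archive names stays around for reference.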

Summary

The funny thing is that the other .rar file I wanted to recover had much longer hashes, which reminded me that I'd had to do this once before with a different, RAR-specific tool, and that the actual password was stored in a password manager.

Password safety: it's best to use complex passwords of 12 characters or more consisting of upper-case (26), lower-case (26), digits (10), and special characters (33, including space). Each added character multiplies the search space by 26+26+10+33 = 95, so an 8-character password has a search space of 95^8. The time it takes to search that space gets shorter with each new generation of hardware, and mask attacks or wordlists from breaches can reduce it dramatically.
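To put numbers on that growth, here is a small Python sketch that prints the search space for a few lengths and how long an exhaustive search would take at a fixed guess rate. The rate of 15265 guesses per second is simply the RAR5 speed from the hashcat run above; other hash types and hardware will differ by orders of magnitude, so treat the output as an illustration only.

```python
ALPHABET = 26 + 26 + 10 + 33  # upper, lower, digits, specials = 95
SECONDS_PER_YEAR = 365 * 24 * 3600


def search_space(length, alphabet=ALPHABET):
    """Number of passwords of exactly `length` characters."""
    return alphabet ** length


def years_to_exhaust(length, guesses_per_second, alphabet=ALPHABET):
    """Worst-case time to try every password of the given length."""
    seconds = search_space(length, alphabet) / guesses_per_second
    return seconds / SECONDS_PER_YEAR


for n in (6, 8, 10, 12):
    print(f"{n:2d} chars: {search_space(n):.3e} candidates, "
          f"{years_to_exhaust(n, 15265):.3g} years at 15265 H/s")
```

On average a brute-force search finds the password after about half the search space, so halve the figures for an expected time rather than a worst case.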

Note the difference between encryption (reversible) and hashing (non-reversible). Most sites will likely store only a salted hash of your password, although I've also seen homegrown encryption used instead. The more complicated the password, the longer it will take to crack with brute force. However, if you use the same password in multiple places, then you're at risk of having those hashes leaked and used in a rainbow table/wordlist attack on other sites/apps. See Troy Hunt's haveibeenpwned.com.

Password managers encrypt your passwords rather than hash them, since they need to be able to decrypt them for app/website autofill as well as for your own use. Without the key, you won't be able to decrypt those passwords (with current hardware; quantum computers may change that) as long as the manager uses a cryptographically safe algorithm such as AES, with the key derived from the master password by something like PBKDF2-HMAC-SHA256. OWASP recommended 310,000 iterations or more for PBKDF2-HMAC-SHA256 at the time of writing and has since raised that figure to 600,000.

PBKDF2 and Argon2 slow down brute-force cracking of passwords, but nothing beats having a high-entropy password (50 bits and up). Increasing the iteration count increases the time to crack a password, but only linearly. Increasing the entropy of the password (additional length and random characters) increases the time to crack exponentially, since each added bit doubles the search space.
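The linear versus exponential point can be made concrete with a few lines of Python. This is a sketch using hashlib.pbkdf2_hmac from the standard library; the entropy formula assumes every character is chosen uniformly at random from the 95-character alphabet, which is rarely true for human-made passwords. The function names are made up for this example.

```python
import hashlib
import math
import os


def entropy_bits(length, alphabet=95):
    """Entropy of a uniformly random password, in bits."""
    return length * math.log2(alphabet)


def derive_key(password, salt, iterations):
    """PBKDF2-HMAC-SHA256, returning a 32-byte key."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)
key = derive_key("Qu@ntumPh@nt0m!", salt, 310_000)
print(len(key), "byte key")

# Doubling the iteration count doubles the attacker's cost: worth 1 bit.
print("2x iterations:", math.log2(2), "bit")
# One extra random character multiplies the cost by 95: worth ~6.57 bits.
print("+1 character :", round(entropy_bits(1), 2), "bits")
print("8 characters :", round(entropy_bits(8), 2), "bits")
```

In other words, going from 310,000 to 620,000 iterations buys about as much as one extra coin flip in the password, while one extra random character buys more than six of them.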

TODO:

  • Benchmarks with other modes and simple/complex passwords.

  • Tailor wordlists

  • Clusters with GPUs and hashcat benchmarks

References