This was a really interesting challenge where the goal was clear immediately. The only problem was finding out how to create a solution. I had just started to learn Rust a few weeks ago and this was the perfect challenge to use its speed in brute-forcing two hashes. In the end, I got a program that finds a solution in 30 seconds, without any multithreading, also grabbing the first blood of the challenge.
The Challenge
The challenge is simply explained in the description:
Let's celebrate TU Delft's birthday. Find two messages that start with 'TU Delft' and 'is the best' such that the first 12 hex characters of their SHA-3 hashes are the same.
So we'll need to make some sort of program that calculates lots of hashes until we find two starting with the same 12 hex characters. It tells us SHA-3 is used, but there are a few variants for the number of output bits for SHA-3: 224
, 256
, 384
, or 512
. If we're going to be brute-forcing something, we better brute-force the correct algorithm. So first we need to find out what variant is used.
When we connect to the nc
port given in the description, we can try inputting a wrong answer:
$ nc [IP] 6001
Find two messages that start with 'TU Delft' and 'is the best' such that the first 12 hex characters of their SHA-3 hashes are the same.
Message 1 (starting with 'TU Delft'): TU Delft-abc
Message 2 (starting with 'is the best'): is the best-xyz
The first parts of your hashes don't seem to match: 8d86c6eb267b 4717d3a8f617
We get back the hashes that don't match! This means we can now just try the four algorithms until we see one that fits. I used a quick online site to do the hashes. There I inputted the same "TU Delft-abc" for all four algorithms:
Server: 8d86c6eb267b
4717d3a8f617
SHA3-224: c09caa4f469c876490457576831e49b1d8f6589d8936c92e318071fe
SHA3-256: 5f88fcd8d44b0f21d20ba8d2c5bbcc976d760492784025cd8f58e1c5dd035a5d
SHA3-384: 4fadbf56103a9441839ff33b98297c53ca29f5bbfdf7bb973fabb5905194f418e804cb7c69fbe355404bf64dbe17f468
SHA3-512: 8d86c6eb267b006194155383bbacb3167dac18596e11034590d621702514fe7558f16e0a6319125ebca557b6c32be78c3eaad8e3ca1454689a398e3247b9f708
When we compare the results, we find that the 8d86c6eb267b
part matches with SHA3-512! Now we can be sure what algorithm is used, and we can start with brute-forcing hashes.
Naive Approach
One simple idea could be starting with a hash of "TU Delft-abc" (8d86c6eb267b
) and then brute-forcing hashes for TU Delft-[ANYTHING]
until it matches with the 8d86c6eb267b
hash. Before doing so, we could calculate how many hashes that would take on average.
Hex characters can be one of 16 possible characters, and here we have 12 of them. So the total would become 16**12 = 281.474.976.710.656
hashes. Computers are fast, but not "281 trillion SHA3-512 hashes in minutes"-fast. You can try to see how fast your machine could calculate hashes with openssl
:
Mine does about 2.000.000 hashes per second, quite fast but doing it with this naive approach would take over 4 years to complete. We need to work smarter, not harder.
The Birthday Paradox
The Birthday Paradox is a relatively well-known paradox that says "the probability of a shared birthday exceeds 50% in a group of only 23 people". This sounds unintuitive, but that's why it's a paradox. When you think about it, adding one person does not increase the chances by 1/365, but by everyone else also in the group. There are 23 more chances of that person sharing a birthday with any of the other people.
With our naive approach, we're basically taking one random person, and then comparing every new person only with that one random person. In our case, adding a new person is generating a new hash, which is the most expensive part. So we're only getting one chance for the new hash to be a match.
If we instead use the idea of the Birthday Paradox to keep every hash we have so far, in a group, our chances of finding a match keep increasing! We would only need to compute about sqrt(281.474.976.710.656) = 16.777.216
hashes, which is very doable with 2 million per second. The only catch is that we need to store all hashes in memory, but it's way faster than the naive approach.
The Program
So let's get cracking. I decided to try and use the Rust programming language, because of its speed. I had just started learning it a few weeks ago and thought this would be the perfect use for it. There is the sha3
crate that can calculate SHA3-512 hashes for us fast. Then we need to also store all the seen hashes somewhere to compare against, but while keeping the original plaintext because we'll need it in the end to submit. I used a HashMap
for this because I knew it is optimized for quickly finding if a key exists already in the map with the .contains_key()
method.
Our goal is to find two strings, that when hashed, have the same 12 hex characters at the start. We can translate this into the hashes having the same 6 bytes at the start, do not need a conversion to hex. We will keep generating "TU Delft"-hashes, but we don't want to compare these to itself. We need to compare it to a bunch of "is the best"-hashes to find a solution. So we'll also need to generate a bunch of those. While we do, it would be wasteful not to also check if those "is the best"-hashes are in the list of "TU Delft"-hashes. So we'll do both at the same time!
Time for some code. I used the bruteforce
crate that can simply generate a bunch of sequential strings from a character set. Automatically increasing the length of the string at the end. This way we never run out of values to try, and it makes our lives easier.
use std::collections::HashMap;
use bruteforce::{BruteForce, charset::Charset};
use sha3::{Digest, Sha3_512};
const PREFIX1: &str = "TU Delft-";
const PREFIX2: &str = "is the best-";
const CHARSET: &'static str = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; // Characters that will be appended to prefixes
/// Calculate SHA3-512 hash of `s`, returning only the first 6 bytes
fn hash(s: String) -> Vec<u8> {
let mut hasher = Sha3_512::new();
hasher.update(s);
let hash = &hasher.finalize();
hash[..6].to_vec()
}
fn main() {
// Maps where the key is the hash, and the value is the original string
let mut hashes1: HashMap<Vec<u8>, String> = HashMap::new();
let mut hashes2: HashMap<Vec<u8>, String> = HashMap::new();
let brute_forcer = BruteForce::new(Charset::from(CHARSET));
for s1 in brute_forcer {
let s2 = s1.clone(); // Copy for later
// Get starting 6 bytes of the hash
let digest1 = hash(PREFIX1.to_owned() + &s1);
if hashes2.contains_key(&digest1) { // Match with other list
println!("{:?} & {:?} => {}",
PREFIX1.to_owned() + &s1,
PREFIX2.to_owned() + hashes2.get(&digest1).unwrap(),
hex::encode(digest1));
break; // Stop when found
}
// Insert into the list for other to check
hashes1.insert(digest1, s1);
// Do the same for s2, with everything flipped around
let digest2 = hash(PREFIX2.to_owned() + &s2);
if hashes1.contains_key(&digest2) {
println!("{:?} & {:?} => {}",
PREFIX1.to_owned() + hashes1.get(&digest2).unwrap(),
PREFIX2.to_owned() + &s2,
hex::encode(digest2));
break;
}
hashes2.insert(digest2, s2);
}
}
As I'm still new to Rust, this program took about an hour to make. But when it runs, it runs fast. When you try it for yourself, make sure to use the --release
flag to make the program as optimized as possible. On my laptop it takes around 30 seconds to generate a valid solution:
$ cargo new sha3_bruteforce
# # <put code in sha3_bruteforce/src/main.rs>
$ cargo run --release
"is the best-JeZY" & "TU Delft-sG0W" => 7e1be05c17ca
It finds is the best-JeZY
and TU Delft-sG0W
, which both should produce a hash that starts with 7e1be05c17ca
. You can verify it yourself with the site from earlier, but let's try submitting it to the flag server:
$ nc $IP 6001
Find two messages that start with 'TU Delft' and 'is the best' such that the first 12 hex characters of their SHA-3 hashes are the same.
Message 1 (starting with 'TU Delft'): TU Delft-sG0W
Message 2 (starting with 'is the best'): is the best-JeZY
You did it! TUDCTF{b1r7hd4y_4774ck5_4r3_b3773r_7h4n_pl41n_bru73_f0rc3}
We got the flag! In the end, this program was enough to find a solution relatively quickly. This could be even more optimized by using multithreading, essentially running multiple copies of the program at the same time. But this was enough for the challenge.