SuperAES | Jorian Woltjer

This interesting cryptography challenge combining LCGs and AES was solvable not only using math but also some experimentation to find edge cases. We learn that not all parameters of a Linear Congruence Generator are secure.

The Challenge

For this challenge, we get the Python source code to the following encryption algorithm:

m = 288493873028852398739253829029106548736
a = int(time.time())
b = a % 16
s = random.randint(1, m-1)

class LCG:
    def __init__(self, a, b, m, seed):
        self.a = a  # multiplier
        self.b = b  # increment
        self.m = m  # modulus
        self.state = seed

    def next_state(self):
        self.state = (self.a * self.state + self.b) % self.m
        return self.state


states = []

class SuperAES:
    def __init__(self, key, lcg):
        self.aes = AES.new(key, AES.MODE_ECB)
        self.lcg = lcg

    def encrypt(self, plaintext):
        ciphertext = b""
        for i in range(0, len(plaintext), 16):
            pt = plaintext[i:i+16]
            ct = self.encrypt_block(pt)
            ciphertext += ct

        return ciphertext

    def encrypt_block(self, block):
        state = self.lcg.next_state()
        states.append(state)
        keystream = self.aes.encrypt(int(state).to_bytes(16, "big"))
        return bytes([k ^ b for k, b in zip(keystream, block)])


assert len(flag) == 33
assert flag.startswith(b"GCC{")

key = os.urandom(32)
cipher = SuperAES(key, LCG(a, b, m, s))

times = int(input("how many times do you want the flag ?"))
assert times < 50

print(cipher.encrypt(flag*times).hex())

Following the code, an LCG is created with some random parameters and a SuperAES instance is created with a random key. Then the flag is encrypted using this custom cipher, which interestingly allows us to repeat the flag up to 49 times before encrypting.

The encryption takes a random LCG state and encrypts it with the random key, completely unpredictable so far. This then generates 16 bytes as the keystream which is used to XOR with the plaintext and form the ciphertext.

If you're familiar, this kind of logic is very similar to the Counter (CTR) mode of operation for block ciphers. There the state is simply a number counting upward instead of a random one, and it does so securely meaning that the randomness does not actually matter for this algorithm.

Experimenting with the Generator

One problem common with CTR modes is when the counter value is re-used because then the same data is encrypted with the same key, generating the same 16 bytes of keystream.

The outputs of the LCG are random, but we are not sure yet how random. We'll add some debugging information to the generator output to see if anything weird is apparent:

    def encrypt_block(self, block):
        state = self.lcg.next_state()
        states.append(state)
        print(f"{state=}")  # log the random state
        keystream = self.aes.encrypt(
            int(state).to_bytes(16, "big"))
        print(f"{keystream=}")  # and the keystream generated from it
        return bytes([k ^ b for k, b in zip(keystream, block)])

...
if len(states) != len(set(states)):
    print("DUPLICATE STATES DETECTED!")
    print(states)

This should show us all the states and allow us to potentially visually see patterns, and at the end, we also detect in the code if duplicates were detected which is interesting in CTR mode. Running this code a few times with a high number of repeats (49), everything seems to be secure. But every so often, we get a really weird result:

...
state=10489648876493579546755332483835008294
keystream=b'\x81\xb2\xb1\xc9~\xf0R?|e,\xb2\x0e\xe1MC'
state=262921787776739428443602432884303238438
keystream=b'\x18-\x13\x1da]\xfd\xeeS\x08\xefJ\xe9\x9aY\xe9'
state=190798319519526328758788975627026601254
keystream=b'Y\xca\xbf\x8f\xd9\x16\xa0\xc4\xa7\xd5\x9a\xae\xa76\xed\xe2'
state=46551383005100129389162061112473326886
keystream=b'\xbcH}\xeei\xe5 i\xdd\\\x0e\x0c\xb2\x87R\x98'
state=46551383005100129389162061112473326886
keystream=b'\xbcH}\xeei\xe5 i\xdd\\\x0e\x0c\xb2\x87R\x98'
state=46551383005100129389162061112473326886
keystream=b'\xbcH}\xeei\xe5 i\xdd\\\x0e\x0c\xb2\x87R\x98'
state=46551383005100129389162061112473326886
keystream=b'\xbcH}\xeei\xe5 i\xdd\\\x0e\x0c\xb2\x87R\x98'
...

After a few normal iterations, suddenly, the state enters a loop of itself repeating the same state over and over again. That also means the keystream will be the same for every 16 bytes after this point! But is it exploitable?

When two parts of the keystream are predictably the same, a known plaintext attack can easily undo the XOR of the plaintext and ciphertext to get the keystream bytes. And because they are re-used, these can then be used in all other blocks to simply decrypt the ciphertext back into plaintext that you did not know before.

From the challenge source code, we can expect the first bytes of the flag to be GCC{, and assume the last byte to be }. This is some known plaintext that we might be able to exploit.

Solution

We can add a bit more debugging to understand exactly what parts of the plaintext are encrypted at what time:

    def encrypt(self, plaintext):
        ciphertext = b""
        for i in range(0, len(plaintext), 16):
            pt = plaintext[i:i+16]
            print(f"{i=}, {pt=}")  # log the index and part of the plaintext we will encrypt
            ct = self.encrypt_block(pt)
            print(f"{ct=}")  # log the resulting ciphertext
            ciphertext += ct

        return ciphertext

Running this with a case where states are repeated, we find something like this:

i=1520, pt=b'C{ABCDEFGHIJKLMN'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x9dWW1\x1b\xae\x19\xaex\x97\xaeE3\x195\x9f'
i=1536, pt=b'OPQRSTUVWXYZ12}G'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x91|G!\x0b\xbe\t\xbeh\x87\xbeUIg\x05\x96'
i=1552, pt=b'CC{ABCDEFGHIJKLM'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x9dom2\x1a\xa9\x18\xady\x98\xafF2\x1e4\x9c'
i=1568, pt=b'NOPQRSTUVWXYZ12}'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x90cF"\n\xb9\x08\xbdi\x88\xbfV"dJ\xac'
i=1584, pt=b'GCC{ABCDEFGHIJKL'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x99oU\x08\x19\xa8\x1f\xacz\x99\xa0G1\x1f3\x9d'
i=1600, pt=b'MNOPQRSTUVWXYZ12'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\x93bY#\t\xb8\x0f\xbcj\x89\xb0W!\x0fI\xe3'
i=1616, pt=b'}'
state=132870865390101690479455352299616110342
keystream=b'\xde,\x16sX\xea\\\xe8?\xdf\xe7\x0fxUx\xd1'
ct=b'\xa3'

Note that any part of GCC{ and } is attackable to get the keystream at that index, and it looks like these pieces of known plaintext are shifted around the place all over the plaintext chunks. This way we should be able to recover the whole keystream.

Even easier however, we can look at the last byte of this which is always }, and is XORed with the 1st byte of the keystream, resulting in 0xa3 in this case.

The XOR works on a byte-level, where the 1st byte of plaintext is always XORed with the 1st byte of the keystream, the 2nd byte of plaintext is XORed with the 2nd byte of the keystream, etc. Because this repeats every 16 bytes, we can re-use the 1st byte of the keystream to decrypt the 1st byte of any other ciphertext that used the same keystream. As we saw in the debugging we can leak all different positions in the flag like this because all characters are shifted into the 1st spot at some point!

To prove our idea, we'll find the 1st keystream byte by XORing the output of 0xa3 with the known plaintext of }, which results in 0xde as the 1st keystream byte.
Now take the 1st byte of the 2nd to last ciphertext: 0x93. We can find its plaintext byte by XORing it with the keystream we just recovered, and get M!

Let's build a full proof-of-concept that decrypts the last byte of the full plaintext which we know to be }, and recover the 1st byte of the keystream as we showed before. Then we can iterate over every other block to find their 1st plaintext character:

from pwn import *

def get(n):
    r = process("./chall.py")
    r.sendlineafter(b"how many times do you want the flag ?", str(n).encode())
    return bytes.fromhex(r.recvall().decode())

# Repeating behaviour doesn't always happen, so just try until it does
while True:
    ct = get(49)
    keystream = ct[-1] ^ ord("}")

    leak = []
    # Iterate in steps of 16 to leak the 1st byte of each block
    for i in range(0, len(ct), 16):
        pt = ct[i] ^ keystream
        leak.append(pt)

    print(bytes(leak))

After some waiting, the output prints a string like this:

b'\x02\xe2\xab\xb4\x95\x9d\x14\xfbZ\xd4\xe2O(\xa7\xeee,\x81\xa8z@3\x1e\xc7\xe1\xc2\xf3\x04/?\x96\xaeNGM}L2K1JZIYHXGWFVEUDTCSBRAQ{PCOCNGM}L2K1JZIYHXGWFVEUDTCSBRAQ{PCOCNGM}'

We successfully leaked all the characters in the flag! Now all that's left is putting them in the right spot. The flag is repeated with a length of 33 as asserted in the source code, so we can do some modular arithmetic to find the positions of the flag we are leaking, they will be the same every run anyway.

from pwn import *

FLAG_LENGTH = 33

def get(n):
    # r = process("./chall.py")
    r = remote("challenges1.gcc-ctf.com", 4001)
    r.sendlineafter(b"how many times do you want the flag ?", str(n).encode())
    return bytes.fromhex(r.recvall().decode())

while True:
    ct = get(49)
    keystream = ct[-1] ^ ord("}")

    # Spots to fill the flag in
    flag = [0]*FLAG_LENGTH

    for i in range(0, len(ct), 16):
        pt = ct[i] ^ keystream
        # flag repeats every 33 bytes, so the index is the remainder after dividing
        flag_index = i % FLAG_LENGTH
        flag[flag_index] = pt

    # print the final flag attempt
    print(bytes(flag))

At this point, we can even run it on the real remote instance, and after a few seconds, we leak the full flag!
GCC{pretend_its_a_good_flag_2515}