Intercept | Jorian Woltjer

This was a very interesting challenge, and not many people solved it during the event. The challenge gave very little information, but it was enough for me to solve it.

The Challenge

We get two files. The first is a .pcap file that contains a few TCP packets. There are only 6 packets in the whole capture.

When you see TCP packets in a capture like this, a good first step is to check whether the payloads together form some readable data. In Wireshark, you can select a packet, and then click Analyze -> Follow -> TCP Stream. This will show the data going back and forth between the two devices.

Text

[/......b#...F8.R...z$....
..mL...w;...J..^.. wNY..?6&..f!...D...R'K...*

This did not show any interesting data, but it might be encrypted with some algorithm. We can view the data as a hex stream by showing it as 'Raw'. This shows the raw bytes that were sent, with each packet separated by a newline. We'll save this data for later.

Text

5b2fedd48019
d4fe6223f1d1984638a9
521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a
cc075e8ce920774e
59eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a

We also get another file, intercept.asm. This file contains some assembly code, including a reference to some sort of encryption function:

x86asm

	.text

	.globl	state

	.bss

	.type	state, @object

	.size	state, 1

state:

	.zero	1

	.text

	.globl	do_encrypt

	.type	do_encrypt, @function

do_encrypt:
	push	rbp
	mov	rbp, rsp
	mov	eax, edi
	mov	BYTE PTR [rbp-4], al
	movzx	eax, BYTE PTR state[rip]
	add	eax, 19
	xor	BYTE PTR [rbp-4], al
	movzx	eax, BYTE PTR state[rip]
	add	eax, 55
	mov	BYTE PTR state[rip], al
	movzx	eax, BYTE PTR [rbp-4]
	pop	rbp
	ret

It looks like these assembly instructions are not the whole code, because it starts off already indented. But we do get the whole do_encrypt function, so the data we got in Wireshark was probably encrypted with this function. Now we just need to reverse it to get the original data back.

Decompiling

I'm not that familiar with assembly, and I don't understand most of these instructions. Luckily, I don't have to! My first idea was to decompile this code somehow, but for that we would need a full program that can be run. However, since we have the full code of do_encrypt, we can write some C code ourselves and see whether it compiles to something that looks like the assembly we got in the challenge. That way we can translate the assembly instructions into more understandable C code that we can then reverse.

To do this manually, I used a site called Compiler Explorer, which takes C code and shows the assembly instructions it compiles to.
Since this encrypt function probably takes a character, and then eventually gives back a character, we can start off the function as something like this:

char do_encrypt(char c) {
    return c;
}

This gives us the following assembly code:

x86asm

do_encrypt(char):
    push    rbp
    mov     rbp, rsp
    mov     eax, edi
    mov     BYTE PTR [rbp-4], al
    ; Missing instructions here
    movzx   eax, BYTE PTR [rbp-4]
    pop     rbp
    ret

Which already looks pretty good compared to the original assembly. The first 4 instructions are the same, as are the last 3. There are just some instructions missing in the middle, which will probably be the encryption algorithm.
Looking at the original assembly, we can see it does an add eax, 19 instruction, which looks interesting. Right before that, it moves the value at BYTE PTR state[rip] into eax, meaning the copy of state in eax gets 19 added to it. After trying a bunch of things, I found that if you create a global variable called state, you get a result similar to the original assembly, with state: and .zero 1, which signals that it starts off at zero.

If we also add 19 to it like the assembly instructions did, we get another very similar result:

char state = 0;

char do_encrypt(char c) {
    state = state + 19;
    return c;
}

x86asm

state:

    .zero   1
do_encrypt(char):
    push    rbp
    mov     rbp, rsp
    mov     eax, edi
    mov     BYTE PTR [rbp-4], al
    movzx   eax, BYTE PTR state[rip]
    add     eax, 19
    ; Missing instructions here
    mov     BYTE PTR state[rip], al
    movzx   eax, BYTE PTR [rbp-4]
    pop     rbp
    ret

After looking a bit online at what the registers mean, I found that the al register is just the lowest 8 bits of the eax register, so these are essentially the same when working with character values from 0 to 255. (Wiki)
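As a small sketch of what "lowest 8 bits" means in practice, here is the relationship modelled with plain Python integers. The helper name low_byte and the state value 250 are made up for illustration, they do not come from the challenge.

```python
def low_byte(eax: int) -> int:
    """Return what al would hold: the lowest 8 bits of eax."""
    return eax & 0xFF

state = 250         # hypothetical value of the 1-byte state variable
eax = state + 19    # movzx eax, state ; add eax, 19  -> eax holds 269
al = low_byte(eax)  # the xor and the mov to state only ever see these 8 bits

print(eax, al)      # 269 13
```

So as long as the result fits in a byte, eax and al agree, and once it does not, al simply wraps around.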

Now there are only a few instructions left to translate. First there is an xor instruction, which XORs BYTE PTR [rbp-4] (which is the c variable) with the al register. As we now know, al is just the low byte of the eax register we added 19 to. So this means we XOR c with state+19. Putting this into Compiler Explorer confirms the idea:

char state = 0;

char do_encrypt(char c) {
    c ^= state+19;
    return c;
}

x86asm

state:

    .zero   1
do_encrypt(char):
    push    rbp
    mov     rbp, rsp
    mov     eax, edi
    mov     BYTE PTR [rbp-4], al
    movzx   eax, BYTE PTR state[rip]
    add     eax, 19
    xor     BYTE PTR [rbp-4], al
    ; Missing instructions here
    movzx   eax, BYTE PTR [rbp-4]
    pop     rbp
    ret

We're almost there, but we're just missing the movzx eax, BYTE PTR state[rip] and add eax, 55 instructions. We've already seen similar instructions, so this just means we move the global state variable into the eax register, and add 55 to it. Putting this into the C code as well gets us the full assembly code decompiled as C code:

char state = 0;

char do_encrypt(char c) {
    c ^= state+19;
    state += 55;
    return c;
}

x86asm

state:

        .zero   1
do_encrypt(char):
        push    rbp
        mov     rbp, rsp
        mov     eax, edi
        mov     BYTE PTR [rbp-4], al
        movzx   eax, BYTE PTR state[rip]
        add     eax, 19
        xor     BYTE PTR [rbp-4], al
        movzx   eax, BYTE PTR state[rip]
        add     eax, 55
        mov     BYTE PTR state[rip], al
        movzx   eax, BYTE PTR [rbp-4]
        pop     rbp
        ret

Now that we have very simple and readable C code, we can actually reverse it to crack the encryption.

Reversing

Looking at the C code, we can see it has a global state variable that gets incremented by 55 every time do_encrypt() is called. The character is XORed with this state variable plus 19. Since XOR is its own inverse (applying the same key byte twice gives back the original byte), and we know the state starts at 0, we can just do the encryption again to flip all the bits back to their original value and get the original text.

Since I'm more familiar with Python, I'll translate the C code into Python code like this:

Python

state = 0

def do_encrypt(byte):
    """
    char state = 0;

    char do_encrypt(char byte) {
        byte ^= state+19;
        state += 55;
        return byte;
    }
    """
    global state
    
    byte ^= (state + 19) % 256  # XOR with state in char range
    state += 55  # increment state
    return byte

Note that I used % 256 here because Python integers don't have a maximum range. To simulate a char in C, I have to make sure the value wraps around once it goes above 255.
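To convince myself the wraparound is handled correctly, here is a quick sketch that compares two ways of producing the key bytes: wrapping the state on every step like a 1-byte variable would, or letting it grow and only reducing it when it is used (which is what the function above does). The function names are my own, not from the challenge.

```python
def keystream(n: int) -> list[int]:
    """First n key bytes, keeping state in 0..255 like the 1-byte state in the assembly."""
    keys, state = [], 0
    for _ in range(n):
        keys.append((state + 19) & 0xFF)
        state = (state + 55) & 0xFF  # wrap the state itself on every step
    return keys

def keystream_unbounded(n: int) -> list[int]:
    """Same key bytes, but the state grows freely and is reduced only on use."""
    return [(55 * i + 19) % 256 for i in range(n)]

print(keystream(6))  # [19, 74, 129, 184, 239, 38]
assert keystream(1000) == keystream_unbounded(1000)
```

Both give the same bytes, so it does not matter that the Python state keeps growing.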
Then we can encrypt anything like this:

Python

data = b"Test"

encrypted = b''
for c in data:
    encrypted += bytes([do_encrypt(c)])

print(encrypted)
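One thing to watch out for: because state is a global, running a test like this moves the state forward, so it has to be reset to 0 before decrypting the real data. A minimal sketch that avoids this by passing the state in explicitly (xor_stream is a name I made up) and checks that applying it twice gives back the input:

```python
def xor_stream(data: bytes, state: int = 0) -> bytes:
    """Apply the do_encrypt keystream to data, starting from the given state."""
    out = bytearray()
    for b in data:
        out.append(b ^ ((state + 19) % 256))
        state = (state + 55) % 256
    return bytes(out)

ciphertext = xor_stream(b"Hello?")
print(ciphertext.hex())        # 5b2fedd48019
print(xor_stream(ciphertext))  # b'Hello?'
```

The hex output is exactly the first line of the Raw stream from Wireshark, which is a nice sign that the translated function is right.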

And since XOR is a symmetric operation, we can just encrypt our ciphertext again to get back the original data. Putting in the messages we got in Wireshark, we can see some of the plaintext:

Python

packets = [
    "5b2fedd48019",
    "d4fe6223f1d1984638a9",
    "521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a",
    "cc075e8ce920774e",
    "59eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a"
]

packets = [bytes.fromhex(data) for data in packets]  # Decode from hex

for data in packets:
    encrypted = b''
    for c in data:
        encrypted += bytes([do_encrypt(c)])  # Encrypt and convert the int back into a single byte

    print(encrypted)

Text

b'Hello?'
b'\x89j\xa9!\xc8\xa1?\x98-\xe5'
b'\xd1\xa04\xaf%\xb2#\xd5\xa0~\xa3\t\xe6#\xc9\xbe?\xac\x16\xa35\xae\xaf>'
b'g\xe5G\xdcn\x9e\x82b'
b':pm7\tPs0}sZp=j\x7f`6Xf3y<\x02~'

The first message, b'Hello?', looks like it decrypted correctly. But for some reason the rest of the messages did not.

I'm still not sure why, but the data we copied from Follow TCP Stream in Wireshark was not the actual TCP data in the packets. I would guess this is where most people got stuck.
But after looking more at the packets in Wireshark, I found that I could copy the actual data by clicking on a packet, then opening Transmission Control Protocol in the bottom panel, where the TCP payload is all the way at the bottom. By right-clicking it and going to Copy -> ...as a Hex Stream, we can copy the TCP payload as hex to paste into our Python script. After doing this for all the messages, we also get one more message than before, showing this really is something different. So the new packets are the following:

Python

packets = [
    "5b2fedd48019", 
    "14e7eb765119d4fe6223f1d1984638a9", 
    "816b5419dac07b27eed9d35e09fdef65521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a", 
    "2ae9a12a2f1dd7923d39eea78d5909f9f57b2a16ddc87d33ada58f1208d4f737755283da1168a3e6cc075e8ce920774e", 
    "f88d483fb1bb8a440884af7d69e2c5874b3bb3be695d4fd5a97b27e7d7d0572cf0bf665405dbfe", 
    "4225e19b824813e4b96a4e178a95776fe1d8800b0bf7f0705719c0c37834a8f7a26f1febbe3d7119dad66427d5f58b4259eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a"
]
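Comparing the two lists side by side shows how they differ. A small sketch (match_suffixes is my own helper name) that checks, for every line from the stream view, whether it is the tail end of one of the real payloads:

```python
stream_view = [  # lines copied earlier from Follow -> TCP Stream, shown as 'Raw'
    "5b2fedd48019",
    "d4fe6223f1d1984638a9",
    "521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a",
    "cc075e8ce920774e",
    "59eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a",
]
payloads = [  # TCP payloads copied per packet
    "5b2fedd48019",
    "14e7eb765119d4fe6223f1d1984638a9",
    "816b5419dac07b27eed9d35e09fdef65521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a",
    "2ae9a12a2f1dd7923d39eea78d5909f9f57b2a16ddc87d33ada58f1208d4f737755283da1168a3e6cc075e8ce920774e",
    "f88d483fb1bb8a440884af7d69e2c5874b3bb3be695d4fd5a97b27e7d7d0572cf0bf665405dbfe",
    "4225e19b824813e4b96a4e178a95776fe1d8800b0bf7f0705719c0c37834a8f7a26f1febbe3d7119dad66427d5f58b4259eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a",
]

def match_suffixes(lines, payloads):
    """For each line, find the index of the next payload that ends with it (or None)."""
    matches, start = [], 0
    for line in lines:
        found = None
        for i in range(start, len(payloads)):
            if payloads[i].endswith(line):
                found, start = i, i + 1
                break
        matches.append(found)
    return matches

matches = match_suffixes(stream_view, payloads)
dropped = [(len(payloads[i]) - len(line)) // 2 for line, i in zip(stream_view, matches)]
print(matches)  # [0, 1, 2, 3, 5]
print(dropped)  # [0, 6, 16, 40, 48]
```

So every stream-view line is the end of a real payload with some leading bytes missing, and the fifth payload does not show up at all. That also explains why only 'Hello?' decrypted: the missing bytes shift the keystream for everything after it. Why Wireshark drops those bytes I can only guess at; this check does not prove any particular cause.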

Running the script with this data, we get all the messages decrypted correctly, and the last one shows us the flag!
HTB{pl41nt3xt_4sm?wh4t_n3xt_s0urc3_c0d3?}
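For completeness, here is the whole solve as one self-contained sketch. It is the same logic as above, only with the state kept local (decrypt_stream is a name I picked) so the script can be rerun without resetting anything. The important detail is that the state carries over from one packet to the next.

```python
PACKETS_HEX = [
    "5b2fedd48019",
    "14e7eb765119d4fe6223f1d1984638a9",
    "816b5419dac07b27eed9d35e09fdef65521ac5877a24eed19b0c0ae9f16d4c02cc86773bfaa8924a",
    "2ae9a12a2f1dd7923d39eea78d5909f9f57b2a16ddc87d33ada58f1208d4f737755283da1168a3e6cc075e8ce920774e",
    "f88d483fb1bb8a440884af7d69e2c5874b3bb3be695d4fd5a97b27e7d7d0572cf0bf665405dbfe",
    "4225e19b824813e4b96a4e178a95776fe1d8800b0bf7f0705719c0c37834a8f7a26f1febbe3d7119dad66427d5f58b4259eabc3f3626ded46621d3b0ca441afce552274bd6da1f2a",
]

def decrypt_stream(packets):
    """Decrypt packets in order, carrying the keystream state across them."""
    state = 0  # state starts at zero, as .zero 1 in the assembly suggests
    plaintexts = []
    for packet in packets:
        out = bytearray()
        for b in packet:
            out.append(b ^ ((state + 19) % 256))  # c ^= state + 19
            state = (state + 55) % 256            # state += 55, wrapped to a byte
        plaintexts.append(bytes(out))
    return plaintexts

plaintexts = decrypt_stream([bytes.fromhex(p) for p in PACKETS_HEX])
for message in plaintexts:
    print(message)  # the first line printed is b'Hello?'
```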