Reversing, Scripting, Crypto (+392 points)

Babyrev

This was a typical reversing challenge where we get a Linux binary. I had some trouble at the start because I read some values wrong, but in the end, it wasn't too bad. We just need to reverse engineer a password check to get the flag.

The Binary

In this challenge, we just get a file named babyrev that we need to reverse engineer. A good first step is to just run the file command on it, to see what it is.

Shell

$ file babyrev
babyrev: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, 
BuildID[sha1]=1a48d52c4e5d664115f6cd11651f9c688e8198db, for GNU/Linux 3.2.0, stripped

It looks like a Linux ELF binary we can run. This also tells us the binary is "stripped", which means its symbol table has been removed, so things like function names are not available for free. If we just run the binary we can see what it does:

Shell

$ ./babyrev
Welcome to baby's first rev! :>
Please enter your username: j0r1an
Please enter your password: hunter2
j0r1an? I don't know you... stranger danger...

We can input a name and password. In the last line, it says "j0r1an? I don't know you" which means we probably first need to find a valid username to get past this check.

Understanding the Logic

We can open up the binary in a program like Ghidra to decompile the binary and see what it does under the hood (for a basic tutorial of this program, see my Ghidra Introduction).

When we open the binary in the Ghidra CodeBrowser and look at the Functions, we can't see any interesting functions because the binary is stripped. To see what the program does we first have to find the main function. To do this in a stripped binary there is this nice trick:

  1. Find the entry function (every binary has this) and click on it
  2. In the Decompiled code on the right, look at the call to __libc_start_main and find its first argument, a FUN_XXX name (e.g. FUN_00101427). In a typical glibc binary like this one, that is the main function.
  3. Now you can click on it and press L to rename it to something recognizable like main.

In this case, the entry function looks like this:

C

void entry(undefined8 param_1,undefined8 param_2,undefined8 param_3)
{
  undefined8 in_stack_00000000;
  undefined auStack8 [8];

  __libc_start_main(FUN_00101427,in_stack_00000000,&stack0x00000008,FUN_00101520,FUN_00101590,
                    param_3,auStack8);
  do {
                    /* WARNING: Do nothing block with infinite loop */
  } while( true );
}

So the main function is FUN_00101427.
In this main function there are some strings we recognize from running the binary:

C


undefined8 main(void)
{
  int iVar1;
  long in_FS_OFFSET;
  char local_48 [16];
  undefined local_38 [40];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  printf("Welcome to baby\'s first rev! :>\nPlease enter your username: ");
  __isoc99_scanf(&DAT_00102045,local_48);
  printf("Please enter your password: ");
  __isoc99_scanf(&DAT_00102045,local_38);
  iVar1 = strcmp(local_48,"bossbaby");
  if (iVar1 != 0) {
    printf("%s? I don\'t know you... stranger danger...",local_48);
    exit(0);
  }
  puts("You\'re almost there!");
  iVar1 = FUN_001012b2(local_38);
  if (iVar1 == 0x26) {
    printf("You\'re boss baby!");
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return 0;
}

If we read this code we see that the answer to the first question, "Please enter your username:", is read with scanf() and stored in local_48. We can again use L to rename this to something more understandable like username. Right after that, the same happens for the password prompt, so local_38 is the password, and we can rename it to password.

Then after that, we see a strcmp() of the username and the string "bossbaby".

C

iVar1 = strcmp(username,"bossbaby");
if (iVar1 != 0) {
    printf("%s? I don\'t know you... stranger danger...",username);
    exit(0);
}

The return value of strcmp is not 0 if the strings are not equal. This means if our username is not "bossbaby" the program prints this "I don't know you" message and exits. If we run the program and provide "bossbaby" as the username we can see we indeed get further in the program to the "almost there" message.

Shell

$ ./babyrev
Welcome to baby's first rev! :>
Please enter your username: bossbaby
Please enter your password: hunter2
You're almost there!

Now we just need to get the password. Since the program only prints "You're boss baby!" on success and never outputs a flag itself, the password is probably the flag.

Looking at the rest of the code we see that the password is passed into this FUN_001012b2 function. The return value of this function is then compared to 0x26 (38 in decimal), and the success message is only printed if the two are equal. So this function is probably some password check.

C

iVar1 = FUN_001012b2(password);
if (iVar1 == 0x26) {
    printf("You\'re boss baby!");
}

We can rename this function to check_password to make it easier to understand again. If we look into this function we see that it is pretty big and looks complicated:

C

int check_password(char *param_1)
{
  long lVar1;
  char *pcVar2;
  ulong uVar3;
  size_t sVar4;
  size_t *psVar5;
  long in_FS_OFFSET;
  size_t local_68;
  undefined8 local_60;
  char *local_50;
  int local_48;
  int local_44;
  long local_40;
  undefined *local_38;
  long local_30;

  local_30 = *(long *)(in_FS_OFFSET + 0x28);
  local_50 = param_1;
  local_68 = strlen(param_1);
  local_40 = local_68 - 1;
  local_60 = 0;
  uVar3 = ((local_68 * 4 + 0xf) / 0x10) * 0x10;
  for (psVar5 = &local_68; psVar5 != (size_t *)((long)&local_68 - (uVar3 & 0xfffffffffffff000));
      psVar5 = (size_t *)((long)psVar5 + -0x1000)) {
    *(undefined8 *)((long)psVar5 + -8) = *(undefined8 *)((long)psVar5 + -8);
  }
  lVar1 = -(ulong)((uint)uVar3 & 0xfff);
  if ((uVar3 & 0xfff) != 0) {
    *(undefined8 *)((long)psVar5 + ((ulong)((uint)uVar3 & 0xfff) - 8) + lVar1) =
         *(undefined8 *)((long)psVar5 + ((ulong)((uint)uVar3 & 0xfff) - 8) + lVar1);
  }
  pcVar2 = local_50;
  local_38 = (undefined *)((long)psVar5 + lVar1);
  local_44 = 0;
  *(undefined8 *)((long)psVar5 + lVar1 + -8) = 0x1013b0;
  FUN_00101209(pcVar2,(long)psVar5 + lVar1);
  local_48 = 0;
  while( true ) {
    pcVar2 = local_50;
    uVar3 = (ulong)local_48;
    *(undefined8 *)((long)psVar5 + lVar1 + -8) = 0x1013fb;
    sVar4 = strlen(pcVar2);
    if (sVar4 <= uVar3) break;
    if (*(int *)(&DAT_00104020 + (long)local_48 * 4) == *(int *)(local_38 + (long)local_48 * 4)) {
      local_44 = local_44 + 1;
    }
    local_48 = local_48 + 1;
  }
  if (local_30 != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return local_44;
}

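But we don't need to fully understand this code. Most of the noisy pointer arithmetic in the middle looks like the compiler reserving a stack buffer of 4 bytes per password character, so we can skip over it. Instead, we can try to find patterns and see if we can figure out what is going on.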

Towards the bottom of the code, we see a DAT_00104020 variable. These are always interesting because they point to data stored in the binary. We can double-click on it to see what it contains. Then in the Listing panel in the center, we see a lot of hex values. The simplest way to export these to something like Python for later is to click on the first value (at 00104020), scroll down to the last one (at 001040b7) and shift-click on it. This selects all values in between. Then right-click, go to "Copy Special", scroll down to "Python Byte String" and press OK. Then we can paste it into a Python script like so:

Python

data = b'\x66\x00\x00\x00\xd9\x00\x00\x00\x88\x01\x00\x00\x41\x03\x00\x00\xc0\x07\x00\x00\xf9\x06\x00\x00\xa4\x18\x00\x00\x95\x00\x00\x00\x0a\x01\x00\x00\xd5\x01\x00\x00\x7c\x03\x00\x00\xa9\x03\x00\x00\xb0\x07\x00\x00\x69\x19\x00\x00\x27\x01\x00\x00\xa3\x01\x00\x00\xc4\x01\x00\x00\xb9\x02\x00\x00\x54\x07\x00\x00\x89\x08\x00\x00\x50\x0f\x00\x00\xf0\x01\x00\x00\x54\x02\x00\x00\xd9\x02\x00\x00\x58\x05\x00\x00\x71\x05\x00\x00\x24\x09\x00\x00\x19\x10\x00\x00\x42\x03\x00\x00\xad\x03\x00\x00\x08\x05\x00\x00\xe9\x06\x00\x00\x30\x0a\x00\x00\xe1\x10\x00\x00\x84\x12\x00\x00\x00\x05\x00\x00\xd2\x05\x00\x00\x4d\x07\x00\x00'

Just looking at the data, it seems to be grouped into chunks of 4 bytes. We keep seeing a value, some \x00 padding, and then the next value 4 bytes later. Now let's see what happens to this data in the code.
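Before that, a quick way to test this guess is to unpack the first few bytes as little-endian 32-bit integers. The little-endian layout is an assumption at this point, though a natural one for an x86-64 binary. A minimal sketch using only the first 12 bytes of the data:

```python
import struct

# First 12 bytes of the data copied from Ghidra (three 4-byte groups)
sample = b'\x66\x00\x00\x00\xd9\x00\x00\x00\x88\x01\x00\x00'

# '<I' = little-endian unsigned 32-bit integer (assumed layout)
values = [v[0] for v in struct.iter_unpack('<I', sample)]
print([hex(v) for v in values])  # ['0x66', '0xd9', '0x188']
```

The first value, 0x66, is the ASCII code for the letter f, which is a small hint that the data may hide something starting with "flag".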

In the while(true) loop at the bottom, we see that we have a local_48 variable. It starts at 0 and every loop gets incremented by 1 with local_48 = local_48 + 1;. So this is an iterator we can rename to i.

We then see that in the if() statement it also uses this i variable. i * 4 is added to the address of DAT_00104020 and local_38 for the comparison. This matches our theory that these values are 4 bytes each, probably 4 bytes per character.

So if this local_38 variable gets compared to the data, the data is probably the encrypted flag, and local_38 is somehow the encrypted password we give. We can rename this to encrypted_password for now. Then we need to understand how this password gets encrypted so we can reverse it.
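Stripped of the decompiler noise, the comparison loop can be modelled roughly like this in Python. This is only a sketch of my reading of the decompilation: count_matches and stored are hypothetical names, and both arguments are plain lists of integers instead of raw memory.

```python
def count_matches(stored, encrypted_password):
    """Rough model of the loop at the bottom of check_password."""
    matches = 0  # local_44 in the decompilation
    for i in range(len(encrypted_password)):  # i is local_48
        # Each comparison looks at one 4-byte value on both sides
        if stored[i] == encrypted_password[i]:
            matches += 1
    return matches


# main() only prints the success message when the count equals 0x26
REQUIRED_MATCHES = 0x26
print(REQUIRED_MATCHES)                      # 38
print(count_matches([1, 2, 3], [1, 9, 3]))   # 2
```

The data we copied is 152 bytes long, which is exactly 38 values of 4 bytes. So a return value of 0x26 means every single position has to match.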

In some code a bit earlier the encrypted_password gets set to an address:

C

encrypted_password = (undefined *)((long)psVar5 + lVar1);

This (long)psVar5 + lVar1 is also used in the function call right after:

C

FUN_00101209(pcVar2,(long)psVar5 + lVar1);

It is the second argument to the function. The first argument is pcVar2 which is another variable. It comes from the following lines:

C

local_50 = param_1;
...
pcVar2 = local_50;

And the param_1 here is just the first parameter of the initial function call. This was the password that was given to the check_password function so pcVar2 is just our given password.

So FUN_00101209 is called with the first argument being our given password, and the second argument being the address of the encrypted_password buffer that gets compared to the data. This second argument is probably the destination of whatever the function does to encrypt the password. So we can rename FUN_00101209 to encrypt_password.

If we look at this function we see some more calculations:

C

long encrypt_password(char *param_1,long param_2)
{
  size_t sVar1;
  int local_1c;

  local_1c = 0;
  while( true ) {
    sVar1 = strlen(param_1);
    if (sVar1 <= (ulong)(long)local_1c) break;
    *(int *)(param_2 + (long)local_1c * 4) =
         local_1c * local_1c +
         ((int)param_1[local_1c] << ((char)local_1c + (char)(local_1c / 7) * -7 & 0x1fU));
    local_1c = local_1c + 1;
  }
  return param_2;
}

It's another while(true) loop that goes until the length of the password is reached. It also has a local_1c variable which is the iterator, so we can rename it to i.

This line:

C

*(int *)(param_2 + (long)i * 4) =
    i * i + ((int)param_1[i] << ((char)i + (char)(i / 7) * -7 & 0x1fU));

...is what really does the encrypting. It writes a value to param_2 at an offset of i * 4. This is what we expect because the data was read in chunks of 4 bytes before as well. The value is a big calculation using param_1[i], the current character in the loop. We don't need to understand the whole calculation to reverse it. We can just try all possible characters and see which one produces the resulting value.

Decrypting the data

Now we need to write a script that brute forces these characters to see when they match with the encrypted data since the encryption happens character by character. We already have the data in Python so we just need to translate the encryption function to Python.

Python

i * i + (param_1[i] << (i + (i // 7) * -7 & 0x1f))
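As a sanity check of this translation, we can wrap it in a function and encrypt a guessed prefix. This is a sketch and not the original source: it assumes the password is plain ASCII and that the flag starts with "flag{", which the final result confirms. In both C and Python, + binds tighter than &, so the mask applies to the whole sum.

```python
def encrypt_password(password):
    """Python transcription of the decompiled loop; password is a bytes object."""
    out = []
    for i, c in enumerate(password):
        shift = (i + (i // 7) * -7) & 0x1f  # works out to i % 7
        out.append(i * i + (c << shift))
    return out


# The shift expression is just a roundabout way of writing i % 7
assert all(((i + (i // 7) * -7) & 0x1f) == i % 7 for i in range(100))

print([hex(v) for v in encrypt_password(b"flag{")])
# ['0x66', '0xd9', '0x188', '0x341', '0x7c0']
```

These are the first five values in the data from Ghidra, so the translation seems to behave like the binary.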

Then we can replace the param_1[i] with all possible characters to see if they match the encrypted data. All of this together in a script looks something like this:

Python

import struct

# Data from Ghidra
data = b'\x66\x00\x00\x00\xd9\x00\x00\x00\x88\x01\x00\x00\x41\x03\x00\x00\xc0\x07\x00\x00\xf9\x06\x00\x00\xa4\x18\x00\x00\x95\x00\x00\x00\x0a\x01\x00\x00\xd5\x01\x00\x00\x7c\x03\x00\x00\xa9\x03\x00\x00\xb0\x07\x00\x00\x69\x19\x00\x00\x27\x01\x00\x00\xa3\x01\x00\x00\xc4\x01\x00\x00\xb9\x02\x00\x00\x54\x07\x00\x00\x89\x08\x00\x00\x50\x0f\x00\x00\xf0\x01\x00\x00\x54\x02\x00\x00\xd9\x02\x00\x00\x58\x05\x00\x00\x71\x05\x00\x00\x24\x09\x00\x00\x19\x10\x00\x00\x42\x03\x00\x00\xad\x03\x00\x00\x08\x05\x00\x00\xe9\x06\x00\x00\x30\x0a\x00\x00\xe1\x10\x00\x00\x84\x12\x00\x00\x00\x05\x00\x00\xd2\x05\x00\x00\x4d\x07\x00\x00'

# Unpack data into integers (4 bytes)
encrypted_values = [unpacked[0] for unpacked in struct.iter_unpack('<I', data)]

flag = b""
for i in range(len(encrypted_values)):  # For every character
    for c in range(256):  # For every possible character
        # *(int *)(param_2 + (long)i * 4) = i * i + ((int)param_1[i] << ((char)i + (char)(i / 7) * -7 & 0x1fU));
        value = i * i + (c << (i + (i // 7) * -7 & 0x1f))
        if value == encrypted_values[i]:  # If the value matches the encrypted value
            flag += bytes([c])
            break

print(flag)

This script first unpacks the data into integers, then loops through all possible characters and checks if the value matches the encrypted value. If it does, it adds the character to the flag. And at the end, it indeed prints the flag!
flag{7bdeac39cca13a97782c04522aece87a}
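Because each value depends only on its index and one character, the brute force is not strictly necessary. Since the shift amount works out to i % 7, the calculation can also be inverted directly by subtracting i * i and shifting back. A minimal sketch using the first eight values of the data (decode_values is a hypothetical helper name):

```python
def decode_values(encrypted_values):
    """Invert value = i*i + (c << (i % 7)) for every index i."""
    chars = []
    for i, value in enumerate(encrypted_values):
        chars.append((value - i * i) >> (i % 7))
    return bytes(chars)


# First eight 32-bit values from the data above
prefix = [0x66, 0xd9, 0x188, 0x341, 0x7c0, 0x6f9, 0x18a4, 0x95]
print(decode_values(prefix))  # b'flag{7bd'
```

The brute-force version has the advantage that it works without understanding the calculation at all, which is why it is a good default for checks like this one.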