UMass CTF 2025

Like last year, I bring rev and misc to UMassCTF25. The ideas are not new, but require some good observations. I hope all live players enjoyed the experience or learned something new. For others, be sure to try the challenges first before reading the write-ups.

Source code and build can be found on my GitHub: dungwinux/umassctf25-src. With that, let’s dive in.

[rev] the void

Description

Final points: 486
Solves: 20

It’s just you and me.

SHA256: 45bb5f7188f9c3723b5d32b9a45ad48eae97966e350857d5a2162da25122b36c

Small note: Be aware that some antivirus programs might flag this binary. If you plan to run it, make sure you have Visual C++ Redistributable installed :)

Visual C++ latest (aka.ms)

Files:

void.zip

Comment

I learned about the Heaven’s Gate 2 years ago, but I haven’t seen a challenge doing it in an inverse way. Honestly, there are no advantages in doing this, other than confusing decompiler and debugger.

The core logic is written in x86-32 assembly with AVX2. For the idea, I remember a removed feture on x86: Binary-coded Decimal (BCD). It is not useful for the majority of modern applications, but it is certainly something new to those who aren’t familiar with retro computing. To spice up the reversing a bit, I also added my own implementation of string as a diversion. And that’s all the ingredients. Testing wasn’t so easy, as some debuggers exhibit their weakness:

x64dbg failed to parse instruction and cannot continue after switching CPU mode
windbg executed correctly, but failed to render disassembly and AVX registers after the switch (WinAPI issue?)
lldb and gdb also ran program correctly, also failed to render YMM registers.
Only an unofficial build of gdb was able to print YMMs correctly

Fascinatingly, an unexpected issue in player’s solve was found out during the few final hours of the CTF. Due to my selection of flag, there is an overflow in BCD addition at the third-to-last character, making some brute force attempts failed. To see why, read the solution below.

Solution

Diving through code, we can see our main program reads input, runs it through function at 0x004017a0, and compares the output with a specific string. To move on, we need to decipher that function.

The simplified anatomy of the function is:

Check if CPU supports AVX2
Switch to compatibility mode (64-bit -> 32-bit)
While the length >= 32, use AVX2 implementation, else use Intel BCD opcodes
Switch to a smaller encoding: Subtract by 0x30, and limit at 75 (any larger number becomes 75)
Convert each byte of input into BCD representation (19 (0x13) -> 0x19)
Perform BCD addition with immediate based on byte index (0x19 + 0x02 -> 0x21)
Write-back to destination
Go back to step 3 if not done
Return to long mode (32-bit -> 64-bit)

The switch of CPU mode could be recognized via the use of retf instruction. As x86-64 inherits most of the instructions from x86-32, the majority of the disassembly are still readable. We should see the step 4 as:

AVX2	x86-32
`vmovdqu ymm5, ymmword [rdx + rsi]` `vpsubb ymm5, ymm5, ymmword [rsp + 0x58]` `vpaddusb ymm5, ymm5, ymm1` `vpsubb ymm5, ymm5, ymm1`	`movzx ebx, byte ptr [esi + edx]` `sub bl, 0x30` `mov ax, 75` `cmp al, bl` `cmova eax, ebx`

and step 5 as:

AVX2	x86-32
`vpunpcklbw ymm6, ymm5, ymm0` `vpmullw ymm6, ymm6, ymm2` `vpsrlw ymm6, ymm6, 0xb` `vpunpckhbw ymm7, ymm5, ymm0` `vpmullw ymm7, ymm7, ymm2` `vpsrlw ymm7, ymm7, 0xb` `vpackuswb ymm7, ymm6, ymm7` `vpaddb ymm7, ymm7, ymm7` `vpsllw ymm6, ymm7, 3` `vpsubb ymm5, ymm5, ymm7` `vpaddb ymm7, ymm7, ymm7` `vpaddb ymm7, ymm7, ymm7` `vpsubb ymm5, ymm5, ymm7` `vpor ymm7, ymm5, ymm6`	`mov cx, 0xa04 ; 2564` … `div ch` `shl al, cl` `or al, ah`

However, to recognize the intention of BCD addition is not straightforward, especially with AVX2 instructions seen in the disassembly. However, even if you failed to fully convince your tool to read them as x86-32, there is a tiny hint:

Cutter disassembly

The line with invalid starts with 0x27, which is the opcode for DAA Decimal Adjust AL After Addition. This opcode is used in conjunction with ADD to perform BCD addition. You can click the link to see how it is implemented. The AVX2 implementation can be seen roughly following the same idea, except that the CF=1 case is not accounted since it is impossible to overflow before adjustment with our encoding.

The first addend is our input converted into our encoding. The second addend can be reconstructed and represented through the following Python code (which I named in my solve script, confusingly, offset):

lo_offset = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
hi_offset = [0, 1, 2, 3, 4, 5, 6, 7]
offset    = [((hi << 4) | lo) for hi in hi_offset for lo in lo_offset]

With those understandings, we can reverse the work by writing a BCD subtraction. This version of mine also includes underflow handling.

def bcd_sub(a: int, b: int):
    """BCD Subtraction. For 8-bit only"""

    a_ = 100 + int(f'{a:x}')
    b_ = int(f'{b:x}')
    ret = int(f'{a_ - b_}'[-2:], 10)
    # Verbose
    print(f'0x{a:x} ({a_ % 100}) - 0x{b:x} ({b_}) -> 0x{ret}')
    return ret

Running bcd_sub through the expected output, we can derive what the input (aka the flag) should be:

Expected	Offset	Derived input
0x0 (0)	0x0 (0)	0x0 `0`
0x21 (21)	0x1 (1)	0x20 `D`
0x22 (22)	0x2 (2)	0x20 `D`
0x4 (4)	0x3 (3)	0x1 `1`
0x77 (77)	0x4 (4)	0x73 `y`
0x52 (52)	0x5 (5)	0x47 `_`
0x30 (30)	0x6 (6)	0x24 `H`
0x11 (11)	0x7 (7)	0x4 `4`
0x70 (70)	0x8 (8)	0x62 `n`
0x29 (29)	0x9 (9)	0x20 `D`
0x47 (47)	0x0 (0)	0x47 `_`
0x1 (1)	0x1 (1)	0x0 `0`
0x66 (66)	0x2 (2)	0x64 `p`
0x71 (71)	0x3 (3)	0x68 `t`
0x61 (61)	0x4 (4)	0x57 `i`
0x34 (34)	0x5 (5)	0x29 `M`
0x67 (67)	0x10 (10)	0x57 `i`
0x13 (13)	0x11 (11)	0x2 `2`
0x65 (65)	0x12 (12)	0x53 `e`
0x33 (33)	0x13 (13)	0x20 `D`
0x61 (61)	0x14 (14)	0x47 `_`
0x55 (55)	0x15 (15)	0x40 `X`
0x24 (24)	0x16 (16)	0x8 `8`
0x23 (23)	0x17 (17)	0x6 `6`
0x24 (24)	0x18 (18)	0x6 `6`
0x23 (23)	0x19 (19)	0x4 `4`
0x57 (57)	0x10 (10)	0x47 `_`
0x75 (75)	0x11 (11)	0x64 `p`
0x78 (78)	0x12 (12)	0x66 `r`
0x13 (13)	0x13 (13)	0x0 `0`
0x69 (69)	0x14 (14)	0x55 `g`
0x81 (81)	0x15 (15)	0x66 `r`
0x24 (24)	0x20 (20)	0x4 `4`
0x50 (50)	0x21 (21)	0x29 `M`
0x69 (69)	0x22 (22)	0x47 `_`
0x73 (73)	0x23 (23)	0x50 `b`
0x97 (97)	0x24 (24)	0x73 `y`
0x72 (72)	0x25 (25)	0x47 `_`
0x87 (87)	0x26 (26)	0x61 `m`
0x30 (30)	0x27 (27)	0x3 `3`
0x75 (75)	0x28 (28)	0x47 `_`
0x40 (40)	0x29 (29)	0x11 `;`
0x34 (34)	0x20 (20)	0x14 `>`
0x68 (68)	0x21 (21)	0x47 `_`
0x57 (57)	0x22 (22)	0x35 `S`
0x86 (86)	0x23 (23)	0x63 `o`
0x97 (97)	0x24 (24)	0x73 `y`
0x29 (29)	0x25 (25)	0x4 `4`
0x88 (88)	0x30 (30)	0x58 `j`
0x54 (54)	0x31 (31)	0x23 `G`
0x64 (64)	0x32 (32)	0x32 `P`
0x57 (57)	0x33 (33)	0x24 `H`
0x0 (0)	0x34 (34)	0x66 `r`
0x38 (38)	0x35 (35)	0x3 `3`
0x91 (91)	0x36 (36)	0x55 `g`

Thus, the flag is: UMASS{0DD1y_H4nD_0ptiMi2eD_X8664_pr0gr4M_by_m3_;>_Soy4jGPHr3g}

[misc] pandora

Description

Final points: 498
Solves: 9

Surprise, surprise!

SHA256: 4FE8721994E47EF16546380A01EBE66624D57B00DAE478C90DAB1F88278F6072

Note: The file can be considered malicious. Handle with care.

Files:

pandora.zip

Comment

This is certainly what I expected when making the challenge:

Comment on CTF discord server

While brainstorming ideas, I made three observations of the zip file, which later become the three pillars of the challenge:

You can have entries with duplicated names. Archive tool will try to unzip both and ask for confirmation from the second time due to the possibilty of overwrite.
Each entry is compressed individually in Zip archive (antonym is solid). In other words, you can seek and decompress file seperately.
DEFLATE has a maximum compression ratio of 1032:1. Meanwhile, BZip2 works well with exceptionally high repetition of characters.

Originally, I planned to block unzipping by putting huge file right before the flag in the archive. However, 7-Zip sees compressing everything is inefficient and decides to store instead of compress the flag. Instead, I decided to create a sparse file, with the data stored at the beginning. But it would be boring to just put flag in one file, right? So I hid the flag among what compression algorithm can do best: plain text - which leads us to Project Gutenberg.

But the longest step in the making process is compressing eveything, which took me ~25 hours doing naively through 7-Zip. All build commands can be found in my GitHub. Since the ideas are laid out, it should be obvious what to do.

Solution

In case you are new, this is called a zipbomb. Unless you have enough resources (perhaps in the near future), it would hurt your memory and disk to extract everything, so you are expected to seek and partially extract each file in the zip. There are many solutions beyond mine, as it rather depends on how your selected tool parses zip.

Using libzip

libzip comes with useful ziptool that you can use to seek to each entry then stream the decompression. Since the rest are \0 and would take long to decompress, you can exit program base on timeout. After checking each entry, you should find the 40th one containing the flag. You can find Bash script and PowerShell solution below.

seq 41 | xargs -i timeout 0.1 ziptool archive.zip cat "{}" | grep UMASS

0..41 | % -Parallel { busybox timeout 0.5 ziptool archive.zip cat $_ | sls UMASS }

7-Zip + Python

While you can also specify entry and print out its content using Python, there is an issue in zipfile library implementation which will decompress all entries in your memory before you’d even get there. A workaround is to delete entry after reading it, which can be done manually through 7-Zip by selecting the entry and press Delete key on your keyboard. Eventually, you should also find the flag in the 40th entry.

And the flag should be: UMASS{That script, it's got life in it. Working, it is. No doubt.}

ToC

[rev] the void

Description

Comment

Solution

[misc] pandora

Description

Comment

Solution

Using libzip

7-Zip + Python