Created
August 27, 2023 13:45
-
-
Save HanEmile/9202b25c21568fcf4ab3dc67f821c273 to your computer and use it in GitHub Desktop.
The not working, but insanely fun writeup for the "run of the mill" challenge from the dragon sector ctf quals 2022(?).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#! /usr/bin/env nix-shell | |
#! nix-shell -i python -p python39Packages.termcolor | |
import sys | |
from termcolor import colored, cprint | |
""" | |
DragonSector CTF 2021 - Run of the Mill | |
This was an awesome challenge, although this script is what is left over of a | |
not-so-successful by nevertheless amazing eavening in which we tried | |
something that didn't work. | |
Let's start with the challenge: We can input some stuff, (64 bytes in total). | |
The input is stored in memory and is then manipulated in a stage we called | |
the "blackbox" stage. During this stage, there are 6 operations done on the | |
input: | |
- add, sub | |
- rol (roll left), ror (roll right) | |
- xor | |
- mov | |
At the end of the blackbox phase, the mutated input is checked against a | |
stored output, if they are the same, "well done" or so is printed, otherwise | |
we are informed that we didn't enter the right thing. | |
Now this alone would hint that we could just throw angr or so at the problem | |
(which we did and that's how we solved the challenge in the end), but we | |
decided that while angr was running (which took some time, as we didn't get | |
it to run as we wanted directly), we could use the time do try something fun: | |
actually reversing. | |
Now reversing by hand is quite simple and works well on small binaries, but | |
this blackbox phase consists of about 7000 instructions that look like | |
complete garbage; nothing you'd like to reverse by hand. So we came to the | |
conclusion that we could probably automate this. | |
Now you might ask yourself: how? Well, you can't just execute code | |
"backwards", or can you? | |
Well, we started by executing the code using qiling hooking to the memory of | |
our input, so that everytime the memory get's accessed, we can view it, in | |
the end, we got a listing of the memory after every step of the blackbox | |
phase. Using this, we had some kind of reference that we could follow, as we | |
now knew how the input should look after each step... ...or before: | |
We started writing a (at that time) small python script that would emulate | |
the binary blackbox backwards. Starting at the bottom, we used the output we | |
knew we'd get with a given input and tried building an execution enging that | |
would just implenet the instruction that might appear one by one. | |
Now this might work fine for simple stuff, but this ain't so simple... As | |
there are some parts where we'd for example xor with registers where we don't | |
know what's in them, we had to implement some weird telescoping stuff so that | |
we could look ahead into what value might be inside some registers. This all | |
worked quite well, it wasn't nice, but kind of functional. As we continued, | |
we got a pretty messy python script which you can admire below, but as we got | |
to the 12th instruction, a teammate came around and distributed a welcome | |
message; he had solved the challenge using angr... | |
At the same time, we were struggling with a `rol` instruction which rolls | |
some data bitwise. This was quite frustrating, as the roll instruction is | |
kind of simple and shouldn'y be all to much trouble. We had solved some | |
larger problems such as the telescoping and has thought we'd be over the | |
harder parts, but this seemd (and still seems) unlogical. | |
I'm writing this at 03:10 in the morning after being kind of done with the | |
challenge, I'm really motivated to build a simmilar challenge for the ALLES! | |
CTF 2022. | |
Whatever, here some ugly CTF that grew quite organically over a weekend (I | |
added a lot of comments afterwards, so you might get a better understanding | |
of how this is supposed to work): | |
""" | |
# color definitions for having a nicer debugging experience | |
blue = "blue" | |
red = "red" | |
green = "green" | |
cyan = "cyan" | |
yellow = "yellow" | |
# The offset of our input in memory | |
base_offset = 0x412000 | |
# Read the blackbox instructions, inverted | |
with open("blackbox.raw_instr") as data: | |
content = data.readlines()[::-1] | |
# Read the goals we want to reach (these are used as assertions for checking if | |
# we are doing the right thing) | |
with open("lifegoals.txt") as data: | |
lifegoals = data.readlines()[::-1] | |
# Convert our lifegoals to a bytearray, so that we can manipulate it | |
goal = bytearray(bytes.fromhex(lifegoals[0])) | |
# Strip the beginning of the content, as there is some pre/post code that isn't | |
# part of the blackbox | |
content = content[22:] | |
# Recursive look ahead for getting a value for the variable `var` at the | |
# instruction with the index `i`. | |
def telescope(i, var): | |
print("\n========== telescoping further ==========") | |
""" | |
The following block contains some instructions the we don't know the | |
value of (mm0). In (1), the dst value of mm0 is not known. In order to | |
get it, we need to look ahead in order to find it's value like this: | |
mm0 in (dst 1) | |
→ [0x412090] in (src 2) | |
→ r9 in (src 3) | |
→ 0xcb0c99d5b6d7daaf in (src 4) | |
----------------------------------------------------------- | |
1 | pxor mm0, qword [segment.LOAD2] | |
| instr dst size src | |
----------------------------------------------------------- | |
2 | movq mm0, qword [0x412090] | |
| instr dst size src | |
----------------------------------------------------------- | |
3 | mov qword [0x412090], r9 | |
| instr size dst src | |
----------------------------------------------------------- | |
4 | movabs r9, 0xcb0c99d5b6d7daaf | |
| instr dst src | |
----------------------------------------------------------- | |
""" | |
src = "" # the value we want to find | |
x = 0 # the amount of instructions we're looking ahead | |
while True: | |
arr = content[i+x].strip().split(" ") # the components of the instruction | |
print(arr) | |
# local variables used for the instruction that is currently being looked at | |
l_instr = arr[0] | |
l_dst = "" | |
l_src = "" | |
l_size = "" | |
################################ | |
# parse the components | |
if arr[1] not in ["qword", "dword"]: | |
l_dst = arr[1] | |
else: | |
l_size = arr[1] | |
if arr[2] not in ["qword", "dword"]: | |
if len(arr) == 4: | |
l_dst = arr[2] | |
else: | |
l_src = arr[2] | |
else: | |
size = arr[2] | |
if len(arr) == 4: | |
l_src = arr[3] | |
################################ | |
# If the local source contains some hex, set the source to the local | |
# source, as we've found a constant value and break the loop, as we | |
# don't need to recurse deeper or even look further. | |
if l_src[:2] == "0x": | |
src = l_src | |
break | |
# If the destination of the current operation is equal to the variable | |
# we're searching a value for, recurse deeper using the source for | |
# that. | |
# For example: if we've got: | |
# | |
# mov a b | |
# | |
# and we're searching for a and b isn't a constant, we'd need to | |
# telescope further for finding a concrete value for b. | |
if l_dst == var: | |
return telescope(i+x, l_src) | |
print(colored(f"{i=} | {x=} | {l_instr=} {l_dst=} {l_src=} {l_size=} | {len(arr)}\n", blue)) | |
# increase x, which is the amount of instructions we're looking ahead | |
x += 1 | |
print(f"========== Telescoping result: {src} ==========\n") | |
return src | |
# helper function for rotating a given string | |
def rot_left(l, n): | |
return l[-n % len(l):] + l[:-n % len(l)] | |
# helper function for converting a bitstring to bytes (duh) | |
def bitstring_to_bytes(s): | |
v = int(s, 2) | |
b = bytearray() | |
while v: | |
b.append(v & 0xff) | |
v >>= 8 | |
return bytes(b[::-1]) | |
# Iterate over each line in the blackbox, starting at the bottom, trying to | |
# execute the instruction | |
i = 0 | |
while i < len(content): | |
# Print the goal (this is what we extracted using qiling). This is sort of | |
# a reference to work on, so after each step, we want to reach this goal. | |
print(colored(f"\n---> {i} {goal.hex()}", green, attrs=['reverse', 'bold'])) | |
# Get the "line" (instruction), clean it up and split it up into the | |
# individual components | |
line = content[i] | |
line = line.strip() | |
arr = line.split(" ") | |
print(colored(arr, green)) | |
# implementations of the individual instructions | |
if arr[0] == "pxor": | |
if arr[1][1] != "[": | |
print(colored(f"{i} | skipped due to reg in dst", yellow)) | |
elif arr[0] == "xor": | |
if arr[1] == "byte": | |
# convert the address provided (for example [0x00410304]) to an offset | |
# in our array (e.g. 4) | |
addr = int(arr[2][3:-1], 16) - base_offset | |
# extract the value that should be xored | |
val = int(arr[3][2:], 16) | |
print(colored(f"{i} | {arr[0]} [{addr}] {val} | {arr}", blue)) | |
# execute the instruction using the parsed values | |
print(addr) | |
goal[addr] = goal[addr] ^ val | |
elif arr[1] == "word": | |
# convert the address provided (for example [0x00410304]) to an offset | |
# in our array (e.g. 4) | |
addr = int(arr[2][3:-1], 16) - base_offset | |
# extract the value that should be xored | |
val = int(arr[3][2:], 16) | |
print(colored(f"{i} | {arr[0]} [{addr}] {val} | {arr}", blue)) | |
# this xors the word (2 bytes), by first xoring the first byte and | |
# then the second one | |
a = goal[addr] ^ (val % (0xff+1)) | |
b = goal[addr+1] ^ (val >> 8) | |
goal[addr] = a | |
goal[addr+1] = b | |
elif arr[2] == "qword": | |
if arr[1][1] != "[": | |
#i += 1 | |
print(colored(f"{i} | skipped due to reg in dst", yellow)) | |
## convert the address provided (for example [0x00410304]) to an offset | |
## in our array (e.g. 4) | |
#addr = int(arr[3][3:-1], 16) - base_offset | |
## extract the value that should be xored | |
#val = telescope(i, arr[1][2:]) | |
#print(colored(f"{i} | {arr[0]} [{addr}] {val} | {arr}", blue)) | |
#print(f"{val=}") | |
#print(f"{addr=}") | |
#print(f"{goal.hex()=}") | |
#a = goal[addr:addr+8] | |
#b = bytearray(bytes.fromhex(val[2:])) | |
#c = bytearray([x^y for x, y in list(zip(a, b))]) | |
#print(f"{a.hex()=}") | |
#print(f"{b.hex()=}") | |
#print(f"{c.hex()=}") | |
## execute the instruction using the parsed values | |
##goal[addr:addr+8] = c | |
# skip rol/ror that don't do anything (rotate by 0) | |
elif arr[0] == "rol" and arr[3] == "0": | |
print(colored(f"{i} | {arr}", blue)) | |
# rotate left implementation (we're skipping this for now) | |
elif arr[0] == "rol": | |
if arr[1] == "dword": | |
print(colored(f"{i} | {arr}", blue)) | |
#addr = int(arr[2][3:-1], 16) - base_offset | |
#amount = -int(arr[3][2:], 16) | |
#a = goal[addr:addr+4].hex() | |
#print(f"{a=}") | |
#b = bin(int(a, 16))[2:] | |
#print(f"{b=}") | |
#c = rot_left(b, amount) | |
#print(f"{c=}") | |
#e = bitstring_to_bytes(c) | |
#print(f"{e.hex()=}") | |
##e = bytes.fromhex(d) | |
##goal[addr:addr+4] = e | |
#print(colored(f"{i} | rol {addr} {amount} | {arr}", blue)) | |
pass | |
# skip the rotate right stuff | |
elif arr[0] == "ror" and arr[3] == "0": | |
print(colored(f"{i} | {arr}", blue)) | |
# if we've got a mov that uses qwords and the address is out of our range | |
# (idk why this happend, but it does, so we need to handle this...) | |
elif arr[0] == "mov": | |
if arr[1] == "qword": | |
addr = int(arr[2][3:-1], 16) - base_offset | |
if addr > 64: | |
print(colored(f"{i} | skipped due addr out of scope (>64)", yellow)) | |
# Handle movq + pxor, they are kind of always present together, so we'll | |
# handle them together | |
# In the end, all this does is something like `a = a ^ b` | |
elif arr[0] == "movq": | |
# values from the movq instruction | |
# → movq qword [segment.LOAD2], mm0 | |
instr = arr[0] | |
size = arr[1] | |
dst = arr[2] | |
src = arr[3] | |
print(colored(f"{i} | {instr} {size} {dst} | {src}", blue)) | |
# get the next instruction | |
nxt = content[i+1].strip().split(" ") | |
if nxt[0] == "pxor": | |
# pxor mm0, qword [segment.LOAD2] | |
instr = nxt[0] | |
left = nxt[1] | |
size = nxt[2] | |
right = nxt[3] | |
# we're on instruction `i` and have no value for `"left"`, | |
# TELESCOPING! | |
val = telescope(i, left) | |
addr = int(right[3:-1], 16) - base_offset | |
print(f"{val=}") | |
print(f"{addr=}") | |
print(f"{goal.hex()=}") | |
# calculate the xor value needed by zipping the two lists | |
a = goal[addr:addr+8] | |
b = bytearray(bytes.fromhex(val[2:])[::-1]) | |
c = bytearray([x^y for x, y in list(zip(a, b))]) | |
print(f"{a.hex()=}") | |
print(f"{b.hex()=}") | |
print(f"{c.hex()=}") | |
# update the goal using the value we've calculated above | |
goal[addr:addr+8] = c | |
# just skip movabs | |
elif arr[0] == "movabs": | |
if arr[1][1] != "[": | |
print(colored(f"{i} | skipped due to reg in dst", yellow)) | |
#elif arr[0] == "movq": | |
# if arr[3] == "mm0": | |
# print("\nLOOK AHEAD:") | |
# print(f" ✓ {content[i-2]}", end="") | |
# print(f" ✓ {content[i-1]}", end="") | |
# print(f" → {content[i+0]}", end="") | |
# for j in range(1, 20): | |
# print(f" {content[i+j]}", end="") | |
# handle "not implemented" cases | |
else: | |
print(colored(f"not implemented: {arr}", red)) | |
print(colored(goal.hex(), red)) | |
print(colored(lifegoals[i+1], red)) | |
break | |
# ok, this is kind of awesome: we're comparing at the goal we've calculated | |
# until now with the output we're supposed to get... | |
a = lifegoals[i+1].strip() | |
b = goal.hex() | |
if a != b: | |
print(colored("\n==========] ASSERTION ERROR [==========", red)) | |
print(colored(f"got : {goal.hex()}", red)) | |
print(colored(f"wanted: {lifegoals[i+1]}", red), end="") | |
print(" ", end="") | |
# ... WITH NICE COMPILER LIKE HIGHLIGHTING OF DIFFERENCES! | |
for k in range(0, len(goal.hex())): | |
if str(goal.hex())[k] == str(lifegoals[i+1])[k]: | |
print(" ", end="") | |
else: | |
print("^", end="") | |
print("") | |
# and exit if there is an error | |
sys.exit(-1) | |
# increment i (this just loads the next instruction in the next iteration | |
# of the loop) | |
i += 1 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment