iead | Solving VM based checker with Klee

Published:	2023-12-16
Tags:	ctf rev python llvm se

To solve the “just the check please” task from PPP-CTF in 2023 I went the unusual path of lifting the virtual machine code used as obfuscation to LLVM-IR. That way I was able to utilize the advanced symbolic execution engine “Klee” to solve for the winning input.

But let’s not get ahead of ourselfs and first briefly examine virtualization base obfuscation.

The anatomy of virtualized programs

Generally programs employing virtualization can be separated into three distinct parts:

The virtual machine code
The virtual machine emulator
The enclosing program

The virtual machine code contains the program parts the author wants to protect from analysis. It can be anything between a loose bundle of utility functions to large procedures that implement important parts of program logic. The emulator on the other hand offers no functionality to the program other than enabling a seamless transition into, and sometimes even out of the virtual machine. One important property of the emulator is if and how real and virtual code can interact with the memory of its counterpart. While the memory could be fully separated and buffer may be passed “by value” a shared memory model is possible as well.

just the check please

As with all reverse engineering challenges we start of with an analysis and are able to reveal the following information about the challenge:

The flag has 16 characters
The VM uses 64 bit words
The VM code es “encrypted”
The VM code consists of 75305 words
The instruction set

The “encryption” is a simple xor against a PRNG that can be undone with a bit of code:

def xorshift(a):
    b = (a ^ (a << 13)) & 0xffffffffffffffff
    c = (b ^ (b >>  7)) & 0xffffffffffffffff
    d = (c ^ (c << 17)) & 0xffffffffffffffff
    return d

rng = 0xb4968177a89b7ac6
for i in range(len(vmcode)):
    rng = xorshift(rng)
    vmcode[i] ^= rng

Further analysis of the emulator yields the ISA of the virtual machine. It is RISC like architecture with an infinite amount of registers. There is only one instruction to read from memory and none to write to memory. The first sixteen byte of memory contain the flag input the rest is left uninitialized.

nems = {
    0: ("assert", 2), 
    2: ("shl", 3), 
    4: ("shr", 3), 
    5: ("mod", 3), 
    6: ("sub", 3), 
    7: ("and", 3), 
    8: ("imm", 2), 
    9: ("load", 2), 
    10: ("not", 2), 
    11: ("or", 3), 
    12: ("xor", 3), 
    13: ("div", 3), 
    14: ("add", 3), 
    16: ("mul", 3), 
    23: ("ign", 0),
    33: ("win", 0)
}

Two more instructions are noteworthy:

assert: halts if left and right operand differ
win: notifies the emulator that the program successfully finished and the check has been passed

Solving the challenge using Klee

Klee is an advanced symbolic execution framework with one fatal flaw: For it to work you need to add intrinsics into the source code and then use their modified clang compiler to generate bitcode. This is necessary as the symbolic execution is performed on the LLVM IR bitcode, which is unstable. From my work on LLVM optimization passes i luckily know a second, a bit hidden way to get to LLVM IR bitcode. One not advertised property of clang is its capability to also compile human readable LLVM IR “source code” like they were C modules.

When we lift the virtual machine code to human readable LLVM IR and add the necessary intrinisc calls we can effectively use the Klee framework to do all the heavy lifting. To give an example here is how i lift three instructions to LLVM IR. Luckily the machine does not know about jumps and branches so we don’t need to peace together the memory values using phi expressions.

    def lift_asrt(self, idx, args):
        arg0, arg1 = args
        print(f"\t%ins.{idx} = icmp eq i64 {self.mem[arg0]}, {self.mem[arg1]}")
        print(f"\t%conv.{idx} = zext i1 %ins.{idx} to i32")
        print(f"\tcall i32 (i32, ...) bitcast (i32 (...) *@klee_assume to i32(i32, ...)*)(i32 %conv.{idx})")

    def lift_load(self, idx, args):
        arg0, arg1 = args
        print(f"\t%ins.{idx}.ptr = getelementptr inbounds [16 x i8], [16 x i8]* %flag, i64 0, i64 {arg0}")
        print(f"\t%ins.{idx}.tmp = load i8, i8* %ins.{idx}.ptr")
        print(f"\t%ins.{idx} = zext i8 %ins.{idx}.tmp to i64")
        self.mem[arg1] = f"%ins.{idx}"

    def lift_add(self, idx, args):
        arg0, arg1, arg2 = args
        print(f"\t%ins.{idx} = add i64 {self.mem[arg0]}, {self.mem[arg1]}")
        self.mem[arg2] = f"%ins.{idx}"

The module and function setup are the last missing pieces on how this checker can be run with Klee.

@win = private unnamed_addr constant [4 x i8] c"win\00", align 1
@fail = private unnamed_addr constant [5 x i8] c"fail\00", align 1
@flagsym = private unnamed_addr constant [5 x i8] c"flag\00", align 1

declare void @puts(i8*)
declare dso_local i32 @klee_make_symbolic(...)
declare dso_local i32 @klee_assume(...)

define i32 @main(i32 %argc, i8** %argv) {
    %flag = alloca [16 x i8], align 1
    %flagarr = getelementptr inbounds [16 x i8], [16 x i8]* %flag, i64 0, i64 0
    call i32 (i8*, i64, i8*, ...) bitcast (i32 (...)* @klee_make_symbolic to i32(i8*, i64, i8*, ...)*)
    		(i8* %flagarr, i64 16, i8* getelementptr inbounds ([5 x i8], [5 x i8]* @flagsym, i64 0, i64 0))

Finally I use the Klee docker container for evaluation

klee@7f272021900b:~$ clang -emit-llvm -g -c code_klee.ll
warning: overriding the module target triple with x86_64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
klee@7f272021900b:~$ klee code_klee.bc
KLEE: output directory is "/home/klee/klee-out-0"
KLEE: Using STP solver backend
KLEE: SAT solver: MiniSat
KLEE: WARNING: undefined reference to function: puts
KLEE: WARNING ONCE: calling external: puts(93860490601264) at [no debug info]
win

KLEE: done: total instructions = 42497
KLEE: done: completed paths = 1
KLEE: done: partially completed paths = 0
KLEE: done: generated tests = 1
klee@7f272021900b:~$ ktest-tool klee-last/test000001.ktest
ktest file : 'klee-last/test000001.ktest'
args       : ['code_klee.bc']
num objects: 1
object 0: name: 'flag'
object 0: size: 16
object 0: data: b'wat3rT1ghT-blAz3'
object 0: hex : 0x776174337254316768542d626c417a33
object 0: text: wat3rT1ghT-blAz3

using llvm-link makes it possible to join multiple .bc files so the entire project does not need to fit into a single .ll file and it is also possible to mix these lifted .ll files with a C/C++ harness.