QuarkLabs Python Security Challenge - Solution


Overview

What follows is a walkthrough of the Python security challenge posed on QuarkLabs and the steps I took to complete it. There are likely many other routes to get to the same goal, what follows is just what occured to me as I drank 21st Amendment Hell or High Watermelon


image

The Lambda Complex

At the best of times lambdas in Python can be a headfuck, but the start to the challenge was designed to be a deliberate mess of lambda pain that looked like this:

(lambda g, c, d: (lambda _: (_.__setitem__('$', ''.join([(_['chr'] if ('chr'
in _) else chr)((_['_'] if ('_' in _) else _)) for _['_'] in (_['s'] if ('s'  
in _) else s)[::(-1)]])), _)[-1])( (lambda _: (lambda f, _: f(f, _))((lambda  
__,_: ((lambda _: __(__, _))((lambda _: (_.__setitem__('i', ((_['i'] if ('i'  
in _) else i) + 1)),_)[(-1)])((lambda _: (_.__setitem__('s',((_['s'] if ('s'  
in _) else s) + [((_['l'] if ('l' in _) else l)[(_['i'] if ('i' in _) else i  
)] ^ (_['c'] if ('c' in _) else c))])), _)[-1])(_))) if (((_['g'] if ('g' in
_) else g) % 4) and ((_['i'] if ('i' in _) else i)< (_['len'] if ('len' in _  
) else len)((_['l'] if ('l' in _) else l)))) else _)), _) ) ( (lambda _: (_.
__setitem__('!', []), _.__setitem__('s', _['!']), _)[(-1)] ) ((lambda _: (_.  
__setitem__('!', ((_['d'] if ('d' in _) else d) ^ (_['d'] if ('d' in _) else  
d))), _.__setitem__('i', _['!']), _)[(-1)])((lambda _: (_.__setitem__('!', [  
(_['j'] if ('j' in _) else j) for  _[ 'i'] in (_['zip'] if ('zip' in _) else
zip)((_['l0'] if ('l0' in _) else l0), (_['l1'] if ('l1' in _) else l1)) for  
_['j'] in (_['i'] if ('i' in _) else i)]), _.__setitem__('l', _['!']), _)[-1  
])((lambda _: (_.__setitem__('!', [1373, 1281, 1288, 1373, 1290, 1294, 1375,
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
]), _.__setitem__('l1',_['!']), _)[-1])((lambda _: (_.__setitem__('!',[1375,
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371]), _.__setitem__('l0', _['!']), _)[(-1)])  
            ({ 'g': g, 'c': c, 'd': d, '$': None})))))))['$'])

(fig 1)

So on quick inspection of the mess we can see that the outermost lambda takes in three values g, c and d and returns the '$' element of a dictionary. Inbetween that a lot of nested function calls take place.
In an effort to understand WTF is going on in the mess let's break out the lambdas and allow ourselves to work through them one at a time. Given that each lambda is taking the result of the next lambda it makes sense to start at the inner most lambda and work our way backwards (out) from there.

The starting point is the establishment of dictionary object using the values passed into the outmost object (whatever they end up being), all very straight forward:

{ 'g': g, 'c': c, 'd': d, '$': None}

(fig 2)

Next come a couple fairly straight forward lambdas that add in extra values to the starting dictionary:

lambda _: (_.__setitem__('!',[1375,  
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371]), _.__setitem__('l0', _['!']), _)[(-1)]  

(fig 3)

For those not super fluent in lambdas (yet!) this takes in a variable named _ as it's argument and then proceeds to call the __setitem__ method on it, showing us that _ is a dictionry object and that initially the key '!' is being set to a list of integers, which is then itself finally reassigned to the key 'l0'. At the end of this our dictionary looks like:

{ 'g': g,
'c': c,  
'd': d,  
'$': None,  
'!': [1375,  
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371],  
'l0': [1375,  
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371] }  

(fig 4)

The next lambda looks very similar and does much the same:

lambda _: (_.__setitem__('!', [1373, 1281, 1288, 1373, 1290, 1294, 1375,  
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
]), _.__setitem__('l1',_['!']), _)[-1]

(fig 5)

The only difference this time is the lambda takes the dictioary output from the previous lambda (fig4) as it's input and then overwrites the '!' element with a new list of integers which then gets reassigned to the 'l1' element. The dictionary now looks like:

{ 'g': g,
'c': c,  
'd': d,  
'$': None,  
'!': [1373, 1281, 1288, 1373, 1290, 1294, 1375,  
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
],
'l0': [1375,  
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371],  
'l1' : [1373, 1281, 1288, 1373, 1290, 1294, 1375,  
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
] }

(fig 6)

The third lambda is a little more complex than the previous two but introduces some conventions seen in the rest of the lambdas that are worth understanding. It looks like this (whitespace added for clarity):

lambda _: (_.__setitem__('!', [ (_['j'] if ('j' in _) else j)  
    for _[ 'i'] in (_['zip'] if ('zip' in _) else zip)
((_['l0'] if ('l0' in _) else l0), (_['l1'] if ('l1' in _) else l1))
for _['j'] in (_['i']  
               if ('i' in _) else i)]

), _.__setitem__('l', _['!']), _)[-1]

(fig 7)

We again see elements of the input dictionary _ being set but now there are a bunch of if, else and for keywords present. While logically these act as you would expect their syntax differs slightly from that which you see such functions used outside of a lambda.

The first construct you will see repeated in the rest of the lambdas is:

(_['j'] if ('j' in _) else j

(fig 8)

All this is really doing is looking for the presence of a key in a dictionary and returning one of two results based on whetehr the key is there or not, it's somewhat equivilent to the get method on a dictionary with the optional default argument. We then hit a for loop:

for _[ 'i'] in (_['zip'] if ('zip' in _) else zip)  

(fig 9)

This uses the above check for key in dictionary if logic to set the zip function (there is no key named 'zip' so the zip function is returned). Once we follow this and the next for loops through we end up setting yet another key in the dictionary, this time named 'l' that has the value of the lists from 'l0' and 'l1' combined into a single list. The dictionary now looks like:

{ 'g': g,
'c': c,  
'd': d,  
'$': None,  
'!': [1375, 1373, 1368, 1281, 1294, 1288, 1293, 1373, 1373, 1290, 1295,  
1294, 1290, 1375, 1373, 1371, 1290, 1289, 1293, 1281, 1280, 1280, 1368,  
1293, 1368, 1289, 1294, 1280, 1293, 1373, 1368, 1294, 1372, 1289, 1292,  
1280, 1290, 1372, 1291, 1288, 1371, 1375, 1375, 1375, 1280, 1289, 1372,  
1373, 1281, 1290, 1293, 1281, 1373, 1294, 1371, 1302, 1354, 1372, 1370,  
1355, 1356, 1366, 1354, 1372, 1355, 1302, 1370, 1360, 1357, 1368, 1357,  
1354, 1302, 1364, 1366, 1370, 1303, 1371, 1368, 1365, 1354, 1362, 1355,  
1368, 1356, 1352, 1303, 1374, 1366, 1365, 1371, 1302],  
'l0': [1375,  
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371],  
'l1' : [1373, 1281, 1288, 1373, 1290, 1294, 1375,  
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
],
i     (1371, 1302)  
j     1302  
l     [1375, 1373, 1368, 1281, 1294, 1288, 1293, 1373, 1373, 1290, 1295,  
1294, 1290, 1375, 1373, 1371, 1290, 1289, 1293, 1281, 1280, 1280, 1368,  
1293, 1368, 1289, 1294, 1280, 1293, 1373, 1368, 1294, 1372, 1289, 1292,  
1280, 1290, 1372, 1291, 1288, 1371, 1375, 1375, 1375, 1280, 1289, 1372,  
1373, 1281, 1290, 1293, 1281, 1373, 1294, 1371, 1302, 1354, 1372, 1370,  
1355, 1356, 1366, 1354, 1372, 1355, 1302, 1370, 1360, 1357, 1368, 1357,  
1354, 1302, 1364, 1366, 1370, 1303, 1371, 1368, 1365, 1354, 1362, 1355,  
1368, 1356, 1352, 1303, 1374, 1366, 1365, 1371, 1302] }  

(fig 10)

The follwoing lambda is the first time we see any of the variables (g, c, d) that are passed into the outer most lambda being referenced. This lambda XOR's (^) the value of d with itself. Obvioulsy any number XOR'd with itself just returns 0. This means that the value of the d variable can just be any integer and the 'i' key in the dictionary will get set to 0.

lambda _: (_.__setitem__('!',  
                             ((_['d'] if ('d' in _) else d)
                              ^ (_['d'] if ('d' in _) else d))
), _.__setitem__('i', _['!']), _)[(-1)]

(fig 11)

The next lambda is the simplest yet, just adding a new element to the dictionary 's' and assigning it a value of an empty list (via the '!' element like all previous lambdas).

lambda _: (_.__setitem__('!', []), _.__setitem__('s', _['!']), _)[(-1)]  

(fig 12)

The next lambda is a nested set of other lambdas and is a bit more involved, it is shown below and I've tried to shift the formatting a little to break it into more discrete chunks:

lambda _: (lambda f, _: f(f, _))  
((lambda ZZ,_: ((lambda _: ZZ(ZZ, _))
((lambda _: (_.__setitem__('i', ((_['i'] if ('i' in _) else i) +
1)),_)[(-1)])

((lambda _: (_.__setitem__('s',
            ((_['s'] if ('s' in _) else s) + [((_['l'] if ('l' in _)
else l)[(_['i'] if ('i' in _) else i)] ^ (_['c'] if ('c' in _) else  
c))])), _)[-1])(_)))

 if (((_['g'] if ('g' in
_) else g) % 4) and ((_['i'] if ('i' in _) else i)< (_['len'] if ('len' in _  
) else len)((_['l'] if ('l' in _) else l)))) else _)), _)

(fig 13)

Walking through this we again see alot of the same conditional checks for keys in the dictionary, the 'i' value is then used as an iterator to walk through the values in the list 'l' from the dictionary and perform a mathematical operation on them.

First each item in the 'l' list is XOR'd (^) with the value held in the c variable and puts the result in a list in the 's' key.

Next the g variable is referenced from the dictionary and is subject to a modulo 4 operation:

if (((_['g'] if ('g' in _) else g) % 4)  

(fig 14)

This condition means that for the iteration loop to continue the expression in fig 13 needs to return a non-zero value, which means the value of g can be any value as long as it is not a multiple of 4.

So at this point we know d can be any integer and g can be any integer that is not a multiple of 4. The task is to find out what the correct value of c should be so that when it's XOR'd against the values in the list l it should produce the correct result.

The final lambda walks through the list produced in the previous lambda and stored in the 's' key and takes the chr() representation of the integers and joins them into a string and stores this in the '$' key. With the right value of c this should be our magic string.

lambda _: (_.__setitem__('$', ''.join([(_['chr']  
                        if ('chr' in _)else chr)
 ((_['_'] if ('_' in _) else _)) for _['_'] in (_['s'] if ('s'
in _) else s)[::(-1)]])), _)[-1]

(fig 15)

Finding The Exit

OK we made our way through the lambda complex, we understand the operations that take place on the lists of integers contained within the first 2 lambdas and we know the values needed for 2 of the input variables (g, d). The singular unknown is the value needed for c that when XOR'd with the values in the list produces a meaningful string.

Being a man of limited mathematical genius I opted to just try lots of values of c and see what they produced and if anything looked like the secret URL. We just need to add a little extra logic around the lambdas to walk through guess for the value of c and see what strings are returned. This is shown below:

quark = (lambda g, c, d: (lambda _: (_.__setitem__('$',  
''.join([(_['chr'] if ('chr'  
in _) else chr)((_['_'] if ('_' in _) else _)) for _['_'] in (_['s'] if ('s'  
in _) else s)[::(-1)]])), _)[-1])( (lambda _: (lambda f, _: f(f, _))((lambda  
__,_: ((lambda _: __(__, _))((lambda _: (_.__setitem__('i', ((_['i'] if ('i'  
in _) else i) + 1)),_)[(-1)])((lambda _: (_.__setitem__('s',((_['s'] if ('s'  
in _) else s) + [((_['l'] if ('l' in _) else l)[(_['i'] if ('i' in _) else i  
)] ^ (_['c'] if ('c' in _) else c))])), _)[-1])(_))) if (((_['g'] if ('g' in
_) else g) % 4) and ((_['i'] if ('i' in _) else i)< (_['len'] if ('len' in _  
) else len)((_['l'] if ('l' in _) else l)))) else _)), _) ) ( (lambda _: (_.
__setitem__('!', []), _.__setitem__('s', _['!']), _)[(-1)] ) ((lambda _: (_.  
__setitem__('!', ((_['d'] if ('d' in _) else d) ^ (_['d'] if ('d' in _) else  
d))), _.__setitem__('i', _['!']), _)[(-1)])((lambda _: (_.__setitem__('!', [  
(_['j'] if ('j' in _) else j) for  _[ 'i'] in (_['zip'] if ('zip' in _) else
zip)((_['l0'] if ('l0' in _) else l0), (_['l1'] if ('l1' in _) else l1)) for  
_['j'] in (_['i'] if ('i' in _) else i)]), _.__setitem__('l', _['!']), _)[-1  
])((lambda _: (_.__setitem__('!', [1373, 1281, 1288, 1373, 1290, 1294, 1375,
1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,  
1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302,  
1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302  
]), _.__setitem__('l1',_['!']), _)[-1])((lambda _: (_.__setitem__('!',[1375,
1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294,  
1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373,  
1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368,  
1354, 1355, 1356, 1303, 1366, 1371]), _.__setitem__('l0', _['!']), _)[(-1)])  
            ({ 'g': g, 'c': c, 'd': d, '$': None})))))))['$'])

g=1 #Not a multiple of 4  
d=1 #Any integer  
c=0 #What we need to find

for guess in range(c, 10000):  
    try:
        print "%d: %s"%(guess, quark(g, guess, d))
    except:
        ##For certain values of c chr() throws an exception
        pass

(fig 16)

Running this code will produce a mess of strings scrolling up the screen, looking at the output you should see one string with a URL in - the magic value for this 1337 - cute

1335:!lbai{o|e}bolmac!}zozgm!|k}a{|mk}!l9j6:=6jk>7hhhl?<k=7;>k9oj:79>o:o776:>=ljh=98=jj:?96ojh  
1336:.cmnf/pt`sjrm`c/bnl.ru`uhb.sdrntsbdr.c6e9529ed18gggc03d2841d6`e5861`5`889512ceg2672ee5069`eg  
1337:/blog.quarkslab.com/static/resources/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf  
1338:,aold-rvbqhpoba-`ln,pwbwj`,qfplvq`fp,a4g;70;gf3:eeea21f0:63f4bg7:43b7b::;730age0450gg724;bge  

(fig 17)

Going to blog.quarkslab.com/static/resources/b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf results in a file downloading.

A pure Python equivilent of the transformations done by the lambda's follows to aid any understanding that is needed:

l = [1375, 1373, 1368, 1281, 1294, 1288, 1293, 1373, 1373, 1290, 1295,  
1294, 1290, 1375, 1373, 1371, 1290, 1289, 1293, 1281, 1280, 1280, 1368,  
1293, 1368, 1289, 1294, 1280, 1293, 1373, 1368, 1294, 1372, 1289, 1292,  
1280, 1290, 1372, 1291, 1288, 1371, 1375, 1375, 1375, 1280, 1289, 1372,  
1373, 1281, 1290, 1293, 1281, 1373, 1294, 1371, 1302, 1354, 1372, 1370,  
1355, 1356, 1366, 1354, 1372, 1355, 1302, 1370, 1360, 1357, 1368, 1357,  
1354, 1302, 1364, 1366, 1370, 1303, 1371, 1368, 1365, 1354, 1362, 1355,  
1368, 1356, 1352, 1303, 1374, 1366, 1365, 1371, 1302]  
l.reverse()  
c = 1337  
out = []  
for x in l:  
    out.append(chr(x^c))

print ''.join(out)  

(fig 18)

Customise y0 Pythons

Once downloaded a quick inspection of the data shows it to be a 64-bit ELF binary, time to spin up a fresh Linux VM.

$ file b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf: ELF 64-bit LSB  
executable, x86-64, version 1 (SYSV), dynamically linked (uses shared  
libs), for GNU/Linux 2.6.26, not stripped  

(fig 19)

Running strings on the binary returns a wealth of strings that look very much like function names and modules from a C Python runtime e.g.

....
PySymtable_Build  
PyUnicodeUCS2_DecodeUTF16Stateful  
PyComplex_FromCComplex  
fwrite@@GLIBC_2.2.5  
_PyGILState_Init  
PyStructSequence_InitType  
PyString_Concat  
PyAST_Check  
....

(fig 20)

Running the binary however returns an error:

$ ./b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Could not find platform independent libraries <prefix>  
Could not find platform dependent libraries <exec_prefix>  
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]  
ImportError: No module named site  

(fig 21)

Yup! Looks like an unhappy Python runtime to me, lets set the PYTHONHOME environment variable and strace the execution to see what the binary is looking for:

$ export PYTHONHOME="/tmp/quark"
$ strace ./b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf

<snipped for brevity>  
stat("/tmp/quark/lib/python2.7", {st_mode=S_IFDIR|0755, st_size=4096,  
...}) = 0
stat("/tmp/quark/lib/python2.7/lib-dynload", 0x7fff150b0eb0) = -1 ENOENT  
(No such file or directory)
write(2, "ImportError", 11ImportError)             = 11  
write(2, ": ", 2: )                       = 2  
write(2, "No module named site", 20No module named site)    = 20  
write(2, "\n", 1  
)                       = 1
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fd97079f340},  
{0x4fde10, [], SA_RESTORER, 0x7fd97079f340}, 8) = 0
exit_group(1)                           = ?  
+++ exited with 1 +++

(fig 22)

OK as expected with a Python runtime initialising it is looking for the site module, specifically it is looking for it in the lib/python2.7 subdirectory of where PYTHONHOME was set to be. Let's pop a empty site.py at that location and see if the mystery binary complains less:

$ ./b7d8438de09fffb12e3950e7ad4970a4a998403bdf3763dd4178adf
Python 2.7.8+ (nvcs/newopcodes:a9bd62e4d5f2+, Sep  1 2014, 11:41:46)  
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.  

(fig 23)

OK awesome, we are dropped to an interactive shell that has a strange banner but on the surface behaves like what we would expect. Now to begin to look for the secret song name .....

Jump Into The C

As the binary doesn't come with any site-packages it's functionality is very limited. The first question I had was whether the binary was using the standard opcode table mappings, a quick test for this is to see if the runtime can import a .pyc module created by a standard Python 2.7.x runtime. I tested this and it imports standard Python bytecode fine, so opcodes seem normal. This is good as if we want to add in functionality from exiting Python modules such as inspect or pdb we can do so.

Let's first see what the builtins are in this runtime:

>>> import sys
>>> sys.builtin_module_names
('__builtin__', '__main__', '_ast', '_codecs', '_sre', '_symtable',
'_warnings', '_weakref', 'do_not_run_me', 'errno', 'exceptions', 'gc',  
'imp', 'marshal', 'posix', 'pwd', 'signal', 'sys', 'tb', 'tb_div',  
'tb_mod', 'tb_mult', 'tb_sub', 'thread', 'xxsubtype', 'zipimport')  

(fig 24)

Probably a safe guess that do_not_run_me is a non-standard builtin, everything else looks normal though. Let's import it and see what we get:

>>> import do_not_run_me
>>> dir(do_not_run_me)
['__doc__', '__name__', '__package__', 'run_me']
>>> type(do_not_run_me.run_me)
<type 'builtin_function_or_method'>  

(fig 25)

>>> do_not_run_me.run_me.__doc__
'There are two kinds of people in the world: those who say there is no  
such thing as infinite recursion, and those who say ``There are two  
kinds of people in the world: those who say there is no such thing as  
infinite recursion, and those who say ...'  

(fig 26)

OK docs not that helpful, let's just run do_not_run_me.run_me function ..

>>> do_not_run_me.run_me()
Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>
TypeError: function takes exactly 1 argument (0 given)  

(fig 27)

Pass in a guessed args:

>>> do_not_run_me.run_me(1337)
Traceback (most recent call last):  
  File "<stdin>", line 1, in <module>
TypeError: must be string or read-only buffer, not int  

(fig 28)

OK now a string arg:

>>> do_not_run_me.run_me("test")
Segmentation fault  

(fig 29)

Hmmm, a segfault is far more indicative of something screwy in the C Python runtime than anything at the python language layer. This however didn't stop me from dicking around for 30 mins or so getting the dis and pdb modules (and the million dependency modules) copied over to /tmp/quark/lib/python2.7/site-packages. Interestingly the itertools module was not available or builtin to the binary so to satisfy some dependencies pure Python versions of some of the functions had to be
patched into the depending modules. Once done I played around to see what dis could tell me about the do_not_run_me module. In short, jack shit.

Right fuckit, let's jump into the binary itself - fire up IDA Pro or if you don't have that then Hopper has a free trial.

Symbols hadn't been stripped so finding the code relating to the donotrun_me module was pretty straightforward, a psuedo code example of them is below:

function initdo_not_run_me {  
    rax = Py_InitModule4_64("do_not_run_me", HelloMethods, 0x0, 0x0, 0x3f5);
    return rax;
}

(fig 30)

function run_me {  
    rsp = rsp - 0x28;
    rax = PyArg_ParseTuple(rsi, 0x56c70b, rsp, rsp + 0x10, r8, r9, rbx);
    LODWORD(rdx) = 0x0;
    if (LODWORD(rax) != 0x0) {
            rbx = *(*(*_PyThreadState_Current + 0x10) + 0x30);
            PyObject_Call(PyFunction_New(PyMarshal_ReadObjectFromString(incr.9351,
0x91), rbx), PyTuple_New(0x0), 0x0);  
            rbp = PyObject_Call(PyFunction_New(PyMarshal_ReadObjectFromString(*rsp, *(rsp + 0x10)), rbx), PyTuple_New(0x0), 0x0);
            PyObject_Call(PyFunction_New(PyMarshal_ReadObjectFromString(decr.9357,
0x217), rbx), PyTuple_New(0x0), 0x0);  
            rdx = rbp;
    }
    rax = rdx;
    return rax;
}

(fig 31)

The init code of do_not_run_me looks absolutely standard. The code associated with run_me shows the use of Python marshaling via the PyMarshal_ReadObjectFromString() function (docs here https://docs.python.org/2/c-api/marshal.html) of two strings referenced as incr.9351 and decr.9357.

Looking in the binary the two blobs of data that are referenced as incr.9351 and decr.9357 can be easily found and carved out, they are shown below in their Python hex string form:

\x63\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x43\x00\x00\x00\x73
\x14\x00\x00\x00\x64\x01\x00\x87\x00\x00\x7c\x00\x00\x64\x01\x00\x3c\x61
\x00\x00\x7c\x00\x00\x1b\x28\x02\x00\x00\x00\x4e\x69\x01\x00\x00\x00\x28
\x01\x00\x00\x00\x74\x04\x00\x00\x00\x54\x72\x75\x65\x28\x01\x00\x00\x00
\x74\x0e\x00\x00\x00\x52\x6f\x62\x65\x72\x74\x5f\x46\x6f\x72\x73\x79\x74
\x68\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x73\x10\x00\x00\x00\x6f\x62
\x66\x75\x73\x63\x61\x74\x65\x2f\x67\x65\x6e\x2e\x70\x79\x74\x03\x00\x00
\x00\x66\x6f\x6f\x05\x00\x00\x00\x73\x06\x00\x00\x00\x00\x01\x06\x02\x0a
\x01\x00\x00\x00\x00\x00\x00\x00\x54\x68\x65\x72\x65\x20\x61\x72\x65\x20
\x74\x77\x6f\x20\x6b\x69\x6e\x64\x73\x20\x6f\x66\x20\x70\x65\x6f\x70\x6c
\x65\x20\x69\x6e\x20\x74\x68\x65\x20\x77\x6f\x72\x6c\x64\x3a\x20\x74\x68
\x6f\x73\x65\x20\x77\x68\x6f\x20\x73\x61\x79\x20\x74\x68\x65\x72\x65\x20
\x69\x73\x20\x6e\x6f\x20\x73\x75\x63\x68\x20\x74\x68\x69\x6e\x67\x20\x61
\x73\x20\x69\x6e\x66\x69\x6e\x69\x74\x65\x20\x72\x65\x63\x75\x72\x73\x69
\x6f\x6e\x2c\x20\x61\x6e\x64\x20\x74\x68\x6f\x73\x65\x20\x77\x68\x6f\x20
\x73\x61\x79\x20\x60\x60\x54\x68\x65\x72\x65\x20\x61\x72\x65\x20\x74\x77
\x6f\x20\x6b\x69\x6e\x64\x73\x20\x6f\x66\x20\x70\x65\x6f\x70\x6c\x65\x20
\x69\x6e\x20\x74\x68\x65\x20\x77\x6f\x72\x6c\x64\x3a\x20\x74\x68\x6f\x73
\x65\x20\x77\x68\x6f\x20\x73\x61\x79\x20\x74\x68\x65\x72\x65\x20\x69\x73
\x20\x6e\x6f\x20\x73\x75\x63\x68\x20\x74\x68\x69\x6e\x67\x20\x61\x73\x20
\x69\x6e\x66\x69\x6e\x69\x74\x65\x20\x72\x65\x63\x75\x72\x73\x69\x6f\x6e
\x2c\x20\x61\x6e\x64\x20\x74\x68\x6f\x73\x65\x20\x77\x68\x6f\x20\x73\x61
\x79\x20\x2e\x2e\x2e\x00\x61\x39\x62\x64\x36\x32\x65\x34\x64\x35\x66\x32
\x2b\x00\x6e\x76\x63\x73\x2f\x6e\x65\x77\x6f\x70\x63\x6f\x64\x65\x73\x00
\x25\x73\x25\x73\x25\x73\x2c\x20\x25\x2e\x32\x30\x73\x2c\x20\x25\x2e\x39
\x73\x00\x31\x31\x3a\x34\x31\x3a\x34\x36\x00\x53\x65\x70\x20\x20\x31\x20
\x32\x30\x31\x34\x00\x00\x00\x00\x00\x00\x00\x00\x55\x6e\x76\x65\x72\x73
\x69\x6f\x6e\x65\x64\x20\x64\x69\x72\x65\x63\x74\x6f\x72\x79\x00

(fig 32 - incr.9351)

\x63\x00\x00\x00\x00\x00\x00\x00\x00\x1d\x00\x00\x00\x43\x00\x00\x00\x73
\x8b\x00\x00\x00\x7c\x00\x00\x64\x01\x00\x6b\x03\x00\x72\x19\x00\x7c\x00
\x00\x64\x02\x00\x55\x61\x00\x00\x6e\x6e\x00\x7c\x01\x00\x6a\x02\x00\x64
\x03\x00\x6a\x03\x00\x64\x04\x00\x77\x00\x00\xa0\x05\x00\xc8\x06\x00\xa0
\x07\x00\xb2\x08\x00\xa0\x09\x00\xea\x0a\x00\xa0\x0b\x00\x91\x08\x00\xa0
\x0c\x00\x9e\x0b\x00\xa0\x0d\x00\xd4\x08\x00\xa0\x0e\x00\xd5\x0f\x00\xa0
\x10\x00\xdd\x11\x00\xa0\x07\x00\xcc\x08\x00\xa0\x12\x00\x78\x0b\x00\xa0
\x13\x00\x87\x0f\x00\xa0\x14\x00\x5b\x15\x00\xa0\x16\x00\x97\x17\x00\x67
\x1a\x00\x53\x86\x01\x00\x86\x01\x00\x86\x01\x00\x54\x64\x00\x00\x1b\x28
\x18\x00\x00\x00\x4e\x69\x03\x00\x00\x00\x69\x01\x00\x00\x00\x74\x00\x00
\x00\x00\x63\x01\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00\x73\x00\x00
\x00\x73\x1f\x00\x00\x00\x8f\x00\x00\x5d\x15\x00\x87\x01\x00\x7c\x00\x00
\x8f\x01\x00\x64\x00\x00\x4e\x86\x01\x00\x59\x54\x71\x03\x00\x64\x01\x00
\x1b\x28\x02\x00\x00\x00\x69\x0d\x00\x00\x00\x4e\x28\x01\x00\x00\x00\x74
\x03\x00\x00\x00\x63\x68\x72\x28\x02\x00\x00\x00\x74\x02\x00\x00\x00\x2e
\x30\x74\x01\x00\x00\x00\x5f\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x73
\x10\x00\x00\x00\x6f\x62\x66\x75\x73\x63\x61\x74\x65\x2f\x67\x65\x6e\x2e
\x70\x79\x73\x09\x00\x00\x00\x3c\x67\x65\x6e\x65\x78\x70\x72\x3e\x16\x00
\x00\x00\x73\x02\x00\x00\x00\x06\x00\x69\x4b\x00\x00\x00\x69\x62\x00\x00
\x00\x69\x7f\x00\x00\x00\x69\x2d\x00\x00\x00\x69\x59\x00\x00\x00\x69\x65
\x00\x00\x00\x69\x68\x00\x00\x00\x69\x43\x00\x00\x00\x69\x7a\x00\x00\x00
\x69\x41\x00\x00\x00\x69\x78\x00\x00\x00\x69\x63\x00\x00\x00\x69\x6c\x00
\x00\x00\x69\x5f\x00\x00\x00\x69\x7d\x00\x00\x00\x69\x6f\x00\x00\x00\x69
\x61\x00\x00\x00\x69\x64\x00\x00\x00\x69\x6e\x00\x00\x00\x28\x04\x00\x00
\x00\x74\x04\x00\x00\x00\x54\x72\x75\x65\x74\x09\x00\x00\x00\x71\x75\x61
\x72\x6b\x73\x6c\x61\x62\x74\x06\x00\x00\x00\x61\x70\x70\x65\x6e\x64\x74
\x04\x00\x00\x00\x6a\x6f\x69\x6e\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00
\x28\x00\x00\x00\x00\x73\x10\x00\x00\x00\x6f\x62\x66\x75\x73\x63\x61\x74
\x65\x2f\x67\x65\x6e\x2e\x70\x79\x74\x03\x00\x00\x00\x66\x6f\x6f\x11\x00
\x00\x00\x73\x06\x00\x00\x00\x00\x02\x0c\x01\x0d\x02\x00\x00\x00\x00\x00
\x00\x00\x00\x00

(fig 33 - decr.9357)

As we know they are inputs to marshal let's try to unmarshal them, the binary doesn't have the marshal module builtin so use a standard Python runtime:

import marshal  
import dis  
x =  
"\x63\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x43\x00\x00\x00\x73  
\x14\x00\x00\x00\x64\x01\x00\x87\x00\x00\x7c\x00\x00\x64\x01\x00\x3c\x61
\x00\x00\x7c\x00\x00\x1b\x28\x02\x00\x00\x00\x4e\x69\x01\x00\x00\x00\x28
\x01\x00\x00\x00\x74\x04\x00\x00\x00\x54\x72\x75\x65\x28\x01\x00\x00\x00
\x74\x0e\x00\x00\x00\x52\x6f\x62\x65\x72\x74\x5f\x46\x6f\x72\x73\x79\x74
\x68\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x73\x10\x00\x00\x00\x6f\x62
\x66\x75\x73\x63\x61\x74\x65\x2f\x67\x65\x6e\x2e\x70\x79\x74\x03\x00\x00
\x00\x66\x6f\x6f\x05\x00\x00\x00\x73\x06\x00\x00\x00\x00\x01\x06\x02\x0a
\x01\x00\x00\x00\x00\x00\x00\x00\x54\x68\x65\x72\x65\x20\x61\x72\x65\x20
\x74\x77\x6f\x20\x6b\x69\x6e\x64\x73\x20\x6f\x66\x20\x70\x65\x6f\x70\x6c
\x65\x20\x69\x6e\x20\x74\x68\x65\x20\x77\x6f\x72\x6c\x64\x3a\x20\x74\x68
\x6f\x73\x65\x20\x77\x68\x6f\x20\x73\x61\x79\x20\x74\x68\x65\x72\x65\x20
\x69\x73\x20\x6e\x6f\x20\x73\x75\x63\x68\x20\x74\x68\x69\x6e\x67\x20\x61
\x73\x20\x69\x6e\x66\x69\x6e\x69\x74\x65\x20\x72\x65\x63\x75\x72\x73\x69
\x6f\x6e\x2c\x20\x61\x6e\x64\x20\x74\x68\x6f\x73\x65\x20\x77\x68\x6f\x20
\x73\x61\x79\x20\x60\x60\x54\x68\x65\x72\x65\x20\x61\x72\x65\x20\x74\x77
\x6f\x20\x6b\x69\x6e\x64\x73\x20\x6f\x66\x20\x70\x65\x6f\x70\x6c\x65\x20
\x69\x6e\x20\x74\x68\x65\x20\x77\x6f\x72\x6c\x64\x3a\x20\x74\x68\x6f\x73
\x65\x20\x77\x68\x6f\x20\x73\x61\x79\x20\x74\x68\x65\x72\x65\x20\x69\x73
\x20\x6e\x6f\x20\x73\x75\x63\x68\x20\x74\x68\x69\x6e\x67\x20\x61\x73\x20
\x69\x6e\x66\x69\x6e\x69\x74\x65\x20\x72\x65\x63\x75\x72\x73\x69\x6f\x6e
\x2c\x20\x61\x6e\x64\x20\x74\x68\x6f\x73\x65\x20\x77\x68\x6f\x20\x73\x61
\x79\x20\x2e\x2e\x2e\x00\x61\x39\x62\x64\x36\x32\x65\x34\x64\x35\x66\x32
\x2b\x00\x6e\x76\x63\x73\x2f\x6e\x65\x77\x6f\x70\x63\x6f\x64\x65\x73\x00
\x25\x73\x25\x73\x25\x73\x2c\x20\x25\x2e\x32\x30\x73\x2c\x20\x25\x2e\x39
\x73\x00\x31\x31\x3a\x34\x31\x3a\x34\x36\x00\x53\x65\x70\x20\x20\x31\x20
\x32\x30\x31\x34\x00\x00\x00\x00\x00\x00\x00\x00\x55\x6e\x76\x65\x72\x73
\x69\x6f\x6e\x65\x64\x20\x64\x69\x72\x65\x63\x74\x6f\x72\x79\x00"

y = marshal.loads(x)  
print y

print dis.disco(y)  

(fig 34)

This produces:

<code object foo at 0x1048105b0, file "obfuscate/gen.py", line 5>  
  6           0 LOAD_CONST               1 (1)
              3 LOAD_CLOSURE             0
Traceback (most recent call last):  
  File "unm.py", line 8, in <module>
    print dis.disco(y)
  File
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/dis.py",  
line 107, in disassemble  
    print '(' + free[oparg] + ')',
IndexError: tuple index out of range  

(fig 35)

As can be infered from the code objects name this is not a marshaled object of standard Python bytecode and so the disassembler crashes out when it gets confused.

I assume that conversely whatever is given as the argument to do_not_run_me.run_me also needs to have an object using the same remapped crazy bytecode and so this is the source of the segfault when a passed in string does not unmarshal to a code object of the obfuscated type and gets passed to PyObject_Call. However, we can use the function to unmarshal and exec the crazy code objects we extracted from
the binary and return their output upon execution.

Let's use the marshaled strings carved from the binary as input to the do_not_run_me.run_me() function. Using the incr.9351 data as input returns 3 and does not segfault, this is a good sign \o/. Using the decr.9357 data as an input returns:

>>> decr_9357_str =
"\x63\x00\x00\x00\x00\x00\x00\x00\x00\x1d\x00\x00\x00\x43\x00\x00\x00\x73  
\x8b\x00\x00\x00\x7c\x00\x00\x64\x01\x00\x6b\x03\x00\x72\x19\x00\x7c\x00
\x00\x64\x02\x00\x55\x61\x00\x00\x6e\x6e\x00\x7c\x01\x00\x6a\x02\x00\x64
\x03\x00\x6a\x03\x00\x64\x04\x00\x77\x00\x00\xa0\x05\x00\xc8\x06\x00\xa0
\x07\x00\xb2\x08\x00\xa0\x09\x00\xea\x0a\x00\xa0\x0b\x00\x91\x08\x00\xa0
\x0c\x00\x9e\x0b\x00\xa0\x0d\x00\xd4\x08\x00\xa0\x0e\x00\xd5\x0f\x00\xa0
\x10\x00\xdd\x11\x00\xa0\x07\x00\xcc\x08\x00\xa0\x12\x00\x78\x0b\x00\xa0
\x13\x00\x87\x0f\x00\xa0\x14\x00\x5b\x15\x00\xa0\x16\x00\x97\x17\x00\x67
\x1a\x00\x53\x86\x01\x00\x86\x01\x00\x86\x01\x00\x54\x64\x00\x00\x1b\x28
\x18\x00\x00\x00\x4e\x69\x03\x00\x00\x00\x69\x01\x00\x00\x00\x74\x00\x00
\x00\x00\x63\x01\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00\x73\x00\x00
\x00\x73\x1f\x00\x00\x00\x8f\x00\x00\x5d\x15\x00\x87\x01\x00\x7c\x00\x00
\x8f\x01\x00\x64\x00\x00\x4e\x86\x01\x00\x59\x54\x71\x03\x00\x64\x01\x00
\x1b\x28\x02\x00\x00\x00\x69\x0d\x00\x00\x00\x4e\x28\x01\x00\x00\x00\x74
\x03\x00\x00\x00\x63\x68\x72\x28\x02\x00\x00\x00\x74\x02\x00\x00\x00\x2e
\x30\x74\x01\x00\x00\x00\x5f\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x73
\x10\x00\x00\x00\x6f\x62\x66\x75\x73\x63\x61\x74\x65\x2f\x67\x65\x6e\x2e
\x70\x79\x73\x09\x00\x00\x00\x3c\x67\x65\x6e\x65\x78\x70\x72\x3e\x16\x00
\x00\x00\x73\x02\x00\x00\x00\x06\x00\x69\x4b\x00\x00\x00\x69\x62\x00\x00
\x00\x69\x7f\x00\x00\x00\x69\x2d\x00\x00\x00\x69\x59\x00\x00\x00\x69\x65
\x00\x00\x00\x69\x68\x00\x00\x00\x69\x43\x00\x00\x00\x69\x7a\x00\x00\x00
\x69\x41\x00\x00\x00\x69\x78\x00\x00\x00\x69\x63\x00\x00\x00\x69\x6c\x00
\x00\x00\x69\x5f\x00\x00\x00\x69\x7d\x00\x00\x00\x69\x6f\x00\x00\x00\x69
\x61\x00\x00\x00\x69\x64\x00\x00\x00\x69\x6e\x00\x00\x00\x28\x04\x00\x00
\x00\x74\x04\x00\x00\x00\x54\x72\x75\x65\x74\x09\x00\x00\x00\x71\x75\x61
\x72\x6b\x73\x6c\x61\x62\x74\x06\x00\x00\x00\x61\x70\x70\x65\x6e\x64\x74
\x04\x00\x00\x00\x6a\x6f\x69\x6e\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00
\x28\x00\x00\x00\x00\x73\x10\x00\x00\x00\x6f\x62\x66\x75\x73\x63\x61\x74
\x65\x2f\x67\x65\x6e\x2e\x70\x79\x74\x03\x00\x00\x00\x66\x6f\x6f\x11\x00
\x00\x00\x73\x06\x00\x00\x00\x00\x02\x0c\x01\x0d\x02\x00\x00\x00\x00\x00
\x00\x00\x00\x00"
>>> do_not_run_me.run_me(decr_9357_str)
Traceback (most recent call last):  
  File "obfuscate/gen.py", line 22, in foo
NameError: global name 'quarkslab' is not defined  

(fig 36)

Let's make a quarklab list object:

>>> quarkslab = []
>>> do_not_run_me.run_me(decr_9357_str)
>>> quarkslab
['For The New Lunar Republic']

(fig 37)

That looks like a thing, let's see if it hashes to the salted hash value given on the challenge page:

>>> import hashlib
>>> hashlib.sha256("bacalhau"+'For The New Lunar Republic').hexdigest()
'61b42c223973996c797a9a366c64c3595052ff71089b4ff13d3251b66b6366e9'  

(fig 38)

Awesome, it looks like we are done.

The Prize!

Revel in it's glory:

Footnotes

The steps above are just the way I got through to the Pony song while drinking watermelon beer, there are likely other (smarter) ways that you could achieve the same results.

Big thanks to the Quarkslab folks for creating a fun Python focussed challenge.

If you find mistakes in the above let me know and I'll try and correct them.

Rich Smith
Director of Security @Etsy // Co-Founder @theSyndis // International Vagabond & Miscreant