Solving a Crack Me with Triton and Pin (a.k.a the lazy way)

I have originally written this post for blackbunny‘s blog.

Some definitions


As stated in Triton’s home page:

Triton is a dynamic binary analysis (DBA) framework. It provides internal components like a Dynamic Symbolic Execution (DSE) engine, a Taint Engine, AST representations of the x86 and the x86-64 instructions set semantics, SMT simplification passes, a SMT Solver Interface and, the last but not least, Python bindings. Based on these components, you are able to build program analysis tools, automate reverse engineering and perform software verification.

That might sound gibberish for some of you. So let’s cover these definitions first.

AST representation

An AST (Abstract Syntax Tree) is a tree representation of the structure of some code.
Example, the AST representing this instruction:

Could be the following:
AST

As explained here, all of Triton’s expressions are on the SSA form.
Meaning that each time a register/flag is modified, a new reference to the new value will be created.

In this example from Triton’s documentation, ref!41 is the new reference to rax and ref!40 is the reference to the previous one.
This allows to easily build “nested” expressions. (new expression making references to previously generated ones) and provides the ability to get the expression of a register/flag at each execution point. See here for more details.

AST can be useful in order to manipulate an instructions and also to translate them to another language such as SMTLib language in order to feed an SMT Solver.

SMT Solvers

SMT Solvers such as Z3 are used in many applications such as software verification, constraint solving, security etc…
What one needs to know, in order to understand this post, is that SMT Solvers are supposed to provide a solution verifying a set of constraints.

Let’s provide a few constraints for Z3 to solve:

Which will give the following output:

In this simple example, two constraints are provided to z3:

And as you can see Z3 was able to find a solution satisfying these constraints.

Z3 is also able to simplify expressions, which can come in handy in case of obfuscated code.
Here’s an example:

Output:

Now imagine combining all these concepts in order to analyse and exploit binaries.

This is what Triton is able to do.

The Triton/Pintool (Python) API


Triton provides different DBA mechanisms (through symbolic execution or through tracing win Pin)
In this first post I’ll also be covering the Pin tracing feature coupled with a few symbolic execution features.

First things first, let’s dissect the instruction count script:

This tiny script is quite self explanatory.

One can find a full description off all triton/pintool bindings here but here is a short  description of the ones we’ll be using:
setArchitecture:
Setup an architecture (ARCH.X86_64 or ARCH.X86)

startAnalysisFromEntry:
Request Pin to start tracing from the executable’s entrypoint.
Internaly this is just setting the startAnalysisFromEntry option (there is no c++ function associated with this function)

insertCall: this function provides means to add python callbacks as Pin’s routine, instruction and syscall callbacks (among other things).

runProgram: This function is a wrapper to PIN_StartProgram. This function basically execve’s to the program we want to analyse. It will never return.

With regards to above mentioned descriptions, one can quickly understand that the mycb callback will be called before each called instruction, and the number of instructions executed will be displayed by the fini callback at the end of the program’s execution.

Other important bindings we’ll use in this post are:
addCallback: Provides callbacks to Triton whenever it will need a concrete memory, a concrete register or when it will need to perform a symbolic simplification.

getPathConstraints:
This function holds some of the magic that will make the CrackMe’s resolution the easiest possible.
It provides a list of all path constraints that allowed the execution to reach the current execution point. Meaning the equations/symbolic expressions of each control flow instructions (je, jne, ja etc..) preceding the current execution point.

convertMemoryToSymbolicVariable: This function converts a memory address to a symbolic variable (the same kind of variables used with Z3) these variables will be the ones Z3 will provide models/possible solutions.

takeSnapshot/restoreSnapshot: these function allow us to take a whole machine context (cpu and memory) and restore it. This can help to simulate the restart of the execution (remember runProgram never returns).

The binary


The binary we are going to analyse is a crackme I wrote based on a challenge I’ve payed with recently.

With radare2 or IDA we can quickly see the mess before success (_095C_):

 

A quick look at the binary shows us that each of the checks are only depending on argv[1]:
variables used for the checks are located in rbp-0x12, rbp-0x11, rbp-0x10, rbp-0xf, rbp-0xe, rbp-0xd
and each of these variables are only depending on argv[1]. Example with this snippet:

One can also see that the binary expects an 11 bytes password as only argv[1] offsets from 0 to 0xa are used. (see main’s first basic block)

We can also spot two kinds of checks: the ones leading to the end of the program’s execution (call to exit) and those which might lead to the incrementation of a variable:

We’ll definitely want to avoid prematurely stoping the program so the first kind of checks should not lead us to the exit calls.

Regarding the ohter checks (leading to the vriable incrementation) one can see cases where je is used and cases where jne is used.

But how to know which path to take ? the true or the false ?

The answer is simple: each check is a value comparison (“equality check”), meaning that one value would lead to a path, and the rest would lead to another path.
As the binary is verifying a password we can safely assume that the correct paths are the ones verifying the comparison (Z=1).

This means that all the following basic blocks need to be avoided:
0x40070c, 0x40071f, 0x40073c, 0x400755, 0x400774, 0x400793, 0x4007af, 0x4007d3, 0x40081b, 0x400853, 0x400878, 0x4008b9, 0x400952

And all the these other blocks need to be taken:
0x4007f7, 0x4008fa

These are all the infomation we need in order to solve the CrackMe. Yes, that’s all we need. There is no need to understand all the obscure calculations done in each block.

Time for some magic !


Here is the script I wrote in order to solve this challenge:

Basically this script checks if we are executing wrong basic blocks, requests for a model in order to avoid these basic blocks, patches argv[1] and restarts the program until it finds the correct password validating all path constraints.

Let’s focus on the two most important callbacks:

before:

This function is only doing something at the entrypoint, it is symbolizing 11 bytes from argv[1]:

It is also building a list (symVarConstraints) which will hold all symbolic variables constraints.
In our case, the constraints are simple: argv[1] must only contain printable characters (from 0x20 to 0x7E)
This is done with ast.bvuge(variable(symvar), bv(0x20, 8)) and ast.bvule(variable(symvar), bv(0x7E, 8))

ast.bvuge being an unsigned comparison (Greater or Equal) of two bit vectors
and ast.bvule being an unsigned comparison (Less than or Equal) of two bit vectors

so these two expressions may be translated to

  • symvar need to be greater or equal to 0x20.
  • symvar need to be less or equal to 0x7E.

before_symproc:

This function is the one doing the magic:

first it is checking wether the program is executing the avoid addresses or if it missed the take addresses:

Then, in the case we actually are executing an address we want to avoid, or in the case we missed an address we actually wanted to take, it is building a pathConstraints list where all branches leading to avoid addresses have there path contraint negated (with ast.lnot). And all branches leading to take addresses are stored as is.

Finally, still in the wrong cases, the script builds a full constraint (full_constraint) and-ing (with superAnd) the symvar constraints (printable) and all the previous path constraints.

this full_constraints is then provided to z3 through Triton’s getModel API in order to get a possible solution. Then, we just replace argv[1] with the new model, we clear all path constraints and restore the snapshot (restart the program):

Here is what we get:

Magic isn’t it ?

All related files can be found here. Enjoy !

7 thoughts on “Solving a Crack Me with Triton and Pin (a.k.a the lazy way)

  1. Thanks for the guide. I found it very interesting and wanted to try to follow it.

    Unfortunately I’m having a lot of trouble getting pintool to work. I first tried to install pin-3.0-76991-gcc-linux.tar.gz, but Triton wouldn’t build with it (some error about libpindwarf.so not found). Then installed the older version, pin-2.14-71313-gcc.4.4.7-linux.tar.gz, after which Triton builds and installs fine.

    But when I do “triton solve.py CrackMe” it complains about “E: 4.4 is not a supported linux release”. Apparently my Linux kernel is too new. I guess I should set up an older VM to make this work.

    Which distribution and version are you using?

    I wonder if this could work with e.g. unicorn emulator instead of Intel’s proprietary tool.

    1. It is not recommended but you can have Triton working on a 4.x kernel, see here
      On another hand, I’m running Triton on a Debian 8.5 (kernel 3.16).
      Regarding using unicorn for emulation, I guess this is possible, but needs to be implemented. Triton provides an interface to plug whatever tracer you need:
      Triton architecture

  2. Thanks. I’ve reinstalled the environment on Ubuntu Trusty 14.04 (3.13.0-88-generic), 64 bit. No more complaints about kernel version. However, I get an error in solve.py now:

    $ triton solve.py ./CrackMe
    Traceback (most recent call last):
    File “solve.py”, line 106, in before
    setCurrentMemoryValue(argv1 + offset, ord(“_”))
    TypeError: tracer::pintool::context::setCurrentMemoryValue(): Page not writable.

    The Triton example (./triton ./src/examples/pin/ir.py /usr/bin/id) works, however.

    1. You need to provide at least one parameter to the binary:
      triton solve.py ./CrackMe _
      I have temporarily updated the script in order to raise an exception if the argument is missing.

Leave a Reply

Your email address will not be published. Required fields are marked *