Triton - A DBA Framework

What kind of semantic information can Triton provide?


 

 

1 - Abstract

Triton is a binary analysis framework which currently focuses mainly on the symbolic execution research area. Triton is basically a C++11 library and exposes a C++ API as well as Python bindings. In this blog post we will see how Triton (build 1189) manages and deals with instruction semantics, and what kind of information you can get back after processing an instruction through Triton.

 

2 - Working at the instruction level

Triton mainly works at the instruction level, which means that its highest-level class is Instruction. This class takes opcodes and an optional context (memory and register state) as input. If you process an instruction, you can get back expressions (as ASTs) for all its side effects (semantics), as well as information about its implicit and explicit effects (GET, PUT, LOAD and STORE).

 

2.1 - Semantics expressions

As an example, we will use the add rax, rbx x86_64 instruction. You just need to provide the opcodes and, optionally, a context. Note that if you don't provide a context, concretization is performed with zeroes.

>>> from triton import *

>>> setArchitecture(ARCH.X86_64)
>>> setAstRepresentationMode(AST_REPRESENTATION.PYTHON)

>>> inst = Instruction()
>>> inst.setOpcodes("\x48\x01\xD8") # add rax, rbx
>>> inst.setAddress(0x400000)
>>> inst.updateContext(Register(REG.RAX, 0x4444444455555555))
>>> inst.updateContext(Register(REG.RBX, 0x1111111122222222))

>>> processing(inst)

>>> print inst
400000: add rax, rbx

>>> for expression in inst.getSymbolicExpressions():
...     print expression
...
ref_0 = ((0x4444444455555555 + 0x1111111122222222) & 0xFFFFFFFFFFFFFFFF) # ADD operation
ref_1 = (0x1 if (0x10 == (0x10 & (ref_0 ^ (0x4444444455555555 ^ 0x1111111122222222)))) else 0x0) # Adjust flag
ref_2 = ((((0x4444444455555555 & 0x1111111122222222) ^ (((0x4444444455555555 ^ 0x1111111122222222) ^ ref_0) & (0x4444444455555555 ^ 0x1111111122222222))) >> 63) & 0x1) # Carry flag
ref_3 = ((((0x4444444455555555 ^ ~0x1111111122222222) & (0x4444444455555555 ^ ref_0)) >> 63) & 0x1) # Overflow flag
ref_4 = ((((((((0x1 ^ (((ref_0 & 0xFF) >> 0x0) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x1) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x2) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x3) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x4) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x5) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x6) & 0x1)) ^ (((ref_0 & 0xFF) >> 0x7) & 0x1)) # Parity flag
ref_5 = ((ref_0 >> 63) & 0x1) # Sign flag
ref_6 = (0x1 if (ref_0 == 0x0) else 0x0) # Zero flag
ref_7 = 0x400003 # Program Counter
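Before going further, we can sanity-check these expressions by evaluating them in plain Python (a standalone sketch, independent of the Triton API):

```python
# Evaluate the ADD expressions above by hand (plain Python, not Triton).
MASK = 0xFFFFFFFFFFFFFFFF
rax, rbx = 0x4444444455555555, 0x1111111122222222

ref_0 = (rax + rbx) & MASK                                        # ADD operation
af = 1 if 0x10 == (0x10 & (ref_0 ^ (rax ^ rbx))) else 0           # Adjust flag
cf = (((rax & rbx) ^ (((rax ^ rbx) ^ ref_0) & (rax ^ rbx))) >> 63) & 1
of = (((rax ^ (~rbx & MASK)) & (rax ^ ref_0)) >> 63) & 1          # Overflow flag
pf = 1 ^ (bin(ref_0 & 0xFF).count('1') & 1)                       # Parity flag
sf = (ref_0 >> 63) & 1                                            # Sign flag
zf = 1 if ref_0 == 0 else 0                                       # Zero flag

print(hex(ref_0))               # 0x5555555577777777
print(af, cf, of, pf, sf, zf)   # 0 0 0 1 0 0
```

The concrete values match what Triton reports for the written registers later in this post.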

Each ref_ is a unique SymbolicExpression object representing one output of the instruction. Together, these references form our SSA representation.

This object (SymbolicExpression) provides several methods which give information about the output assignment: is the output assigned to a register? To a memory cell? Which register, which memory cells? etc. For instance, if we want to know what ref_0 is assigned to, we can do something like this:

>>> ref_0 = inst.getSymbolicExpressions()[0]

>>> ref_0.isRegister()
True

>>> ref_0.getOriginRegister()
rax:64 bv[63..0]

>>> hex(ref_0.getOriginRegister().getConcreteValue())
0x5555555577777777L

Note that all references point to the largest enclosing register. For example, if the destination is AL, the output reference contains the RAX expression.
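To illustrate, here is how a write to AL would be merged into the full RAX register in plain Python (a sketch of the idea, not Triton code):

```python
# A write to AL only changes bits [7..0] of the parent register RAX.
rax = 0x4444444455555555
al  = 0x7F
rax = (rax & ~0xFF) | (al & 0xFF)   # merge the 8-bit result into RAX
print(hex(rax))                     # 0x444444445555557f
```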

 

2.2 - Explicit and implicit information

Triton doesn't make a distinction between implicit and explicit information; everything is processed at the same level. However, you can classify the information into six categories:

  • Output semantics expressions
  • Memory LOAD accesses
  • Memory STORE accesses
  • Input (read) registers
  • Output (write) registers
  • Input (read) immediates

Example with the previous add rax, rbx:

>>> for reg, ast in inst.getReadRegisters():
...     print reg, hex(reg.getConcreteValue())
...
rax:64 bv[63..0] 0x4444444455555555L
rbx:64 bv[63..0] 0x1111111122222222L

>>> for reg, ast in inst.getWrittenRegisters():
...     print reg, hex(reg.getConcreteValue())
...
rax:64 bv[63..0] 0x5555555577777777L
rip:64 bv[63..0] 0x400003L
af:1 bv[0..0] 0x0L
cf:1 bv[0..0] 0x0L
of:1 bv[0..0] 0x0L
pf:1 bv[0..0] 0x1L
sf:1 bv[0..0] 0x0L
zf:1 bv[0..0] 0x0L

Another example with a ret instruction:

>>> inst = Instruction()
>>> inst.setOpcodes("\xc3") # ret
>>> inst.setAddress(0x40000)
>>> inst.updateContext(Register(REG.RSP, 0xfff5000))
>>> inst.updateContext(Memory(0xfff5000, 8, 0x50000))

>>> processing(inst)

>>> print inst
40000: ret

>>> for load, ast in inst.getLoadAccess():
...     print load, hex(load.getConcreteValue())
...
[@0xfff5000]:64 bv[63..0] 0x50000L

>>> for reg, ast in inst.getWrittenRegisters():
...     print reg, hex(reg.getConcreteValue())
...
rsp:64 bv[63..0] 0xfff5008L
rip:64 bv[63..0] 0x50000L
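The ret semantics shown above boil down to loading RIP from the stack and popping 8 bytes, which we can mirror in plain Python (a model of the effect, not the Triton API):

```python
# ret: rip <- [rsp]; rsp <- rsp + 8
memory = {0xfff5000: 0x50000}           # qword at the top of the stack
rsp = 0xfff5000
rip = memory[rsp]                       # load the return address
rsp = (rsp + 8) & 0xFFFFFFFFFFFFFFFF   # pop
print(hex(rsp), hex(rip))               # 0xfff5008 0x50000
```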

 

2.3 - LOAD / STORE expressions

When an instruction performs a memory access (LOAD, STORE or both), you can get information about its base, index, scale and displacement as well as the AST of the access expression.

>>> inst = Instruction()
>>> inst.setAddress(0x400000)
>>> inst.setOpcodes("\x8A\x84\x4B\x00\x10\x00\x00") # mov al, BYTE PTR [rbx+rcx*2+0x1000]
>>> inst.updateContext(Register(REG.RAX, 0x4444444455555555))
>>> inst.updateContext(Register(REG.RBX, 0x10000))
>>> inst.updateContext(Register(REG.RCX, 0x200))

>>> processing(inst)
>>> print inst
0x400000: mov al, byte ptr [rbx + rcx*2 + 0x1000]

>>> op0 = inst.getOperands()[0]
>>> op1 = inst.getOperands()[1]

>>> print op0
al:8 bv[7..0]

>>> print op1
[@0x11400]:8 bv[7..0]

>>> print op1.getBaseRegister()
rbx:64 bv[63..0]

>>> print op1.getIndexRegister()
rcx:64 bv[63..0]

>>> print op1.getScale()
0x2:8 bv[7..0]

>>> print op1.getDisplacement()
0x1000:8 bv[7..0]

>>> print op1.getLeaAst()
((0x10000 + ((((0x200 * 0x2) & 0xFFFFFFFFFFFFFFFF) + 0x1000) & 0xFFFFFFFFFFFFFFFF)) & 0xFFFFFFFFFFFFFFFF)

>>> print hex(op1.getLeaAst().evaluate())
0x11400L

This LEA AST may contain previous expressions (SSA) and symbolic variables, which implies a symbolic memory access and allows you to determine the minimum and maximum range of the memory access.
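The LEA AST printed above can be evaluated by hand in plain Python (a standalone sketch mirroring the base + index*scale + displacement computation):

```python
# Evaluate the effective address of [rbx + rcx*2 + 0x1000] manually.
MASK = 0xFFFFFFFFFFFFFFFF
rbx, rcx, scale, disp = 0x10000, 0x200, 2, 0x1000
lea = (rbx + ((((rcx * scale) & MASK) + disp) & MASK)) & MASK
print(hex(lea))   # 0x11400
```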

 

2.4 - Segments

When you develop a binary analysis framework or an emulator, one of the main difficulties is correctly managing segments and their contents - trust me, that's not so easy if you want a design that makes sense :-). Triton provides two lazy ways to deal with segments and their contents:

  • Using emulation: you define a base address (instead of an offset into a GDT) and its content.
  • Using concrete values: you directly define, at the process level, the target address and its content (e.g. when you use a DBI).

First example with the emulation way:

>>> enableSymbolicEmulation(True)
>>> setLastRegisterValue(Register(REG.FS, 0xfff400))
>>> setLastRegisterValue(Register(REG.RSI, 0x10))
>>> setLastMemoryAreaValue(0xfff400, [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26])

>>> inst = Instruction()
>>> inst.setAddress(0x40000)
>>> inst.setOpcodes("\x64\x8A\x46\x04") # mov al, BYTE PTR fs:[rsi+0x4]

>>> processing(inst)

>>> print inst.getOperands()[1].getSegmentRegister()
fs:64 bv[63..0]

>>> print inst.getOperands()[1]
[@0xfff414]:8 bv[7..0]

>>> print inst.getOperands()[1].getConcreteValue()
21
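The effective address of this fs-relative access is simply the segment base plus rsi plus the displacement, which we can verify in plain Python (a sketch mirroring the emulated context above):

```python
# mov al, BYTE PTR fs:[rsi+0x4] with fs base = 0xfff400, rsi = 0x10.
fs_base, rsi, disp = 0xfff400, 0x10, 0x4
memory = {0xfff400 + i: v for i, v in enumerate(range(1, 27))}  # bytes 1..26
addr = fs_base + rsi + disp
print(hex(addr), memory[addr])   # 0xfff414 21
```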

The second example is the case where you use a DBI like Pin. Pin can provide the runtime memory access and the content which has been read or written. That's typically what we use in our libpintool.so tracer.

>>> inst = Instruction()
>>> inst.setAddress(0x40000)
>>> inst.setOpcodes("\x64\x8A\x46\x04") # mov al, BYTE PTR fs:[rsi+0x4]
>>> inst.updateContext(Memory(0xfff414, CPUSIZE.BYTE, 21)) # This information comes from the DBI

>>> processing(inst)

>>> print inst.getOperands()[1].getSegmentRegister()
fs:64 bv[63..0]

>>> print inst.getOperands()[1]
[@0xfff414]:8 bv[7..0]

>>> print inst.getOperands()[1].getConcreteValue()
21

 

2.5 - The Abstract Syntax Tree of expressions

Triton represents its expressions as an Abstract Syntax Tree and can display this AST in two different representations: SMT2-Lib or Python syntax. These modes can be enabled using the appropriate flags.

An abstract syntax tree (AST) is a representation of a grammar as a tree. Triton uses a custom AST for its expressions. As all expressions are built at runtime, an AST is available at each program point. For example, let's assume this set of instructions:

1. mov al, 1
2. mov cl, 10
3. mov dl, 20
4. xor cl, dl
5. add al, cl

At line 5, the AST of the AL register (simplified) is (bvadd (_ bv1 8) (bvxor (_ bv10 8) (_ bv20 8))).

This AST represents the semantics of the AL register at program point 5, starting from program point 1. Note that it has been simplified for readability; the real AST contains concat and extract nodes which represent the actual Intel behavior. Using the API, you can build and modify your own ASTs, then perform modifications and simplifications before sending them to the solver. Below is a small example of node manipulation.

>>> node = bvadd(bv(1, 8), bvxor(bv(10, 8), bv(20, 8)))
>>> print node
(bvadd (_ bv1 8) (bvxor (_ bv10 8) (_ bv20 8)))

>>> subchild = node.getChilds()[1].getChilds()[0]
>>> print subchild
(_ bv10 8)
>>> print subchild.getChilds()[0].getValue()
10
>>> print subchild.getChilds()[1].getValue()
8

>>> node = bvadd(bv(1, 8), bvxor(bv(10, 8), bv(20, 8)))
>>> print node
(bvadd (_ bv1 8) (bvxor (_ bv10 8) (_ bv20 8)))

>>> node.setChild(0, bv(123, 8))
>>> print node
(bvadd (_ bv123 8) (bvxor (_ bv10 8) (_ bv20 8)))

>>> node.evaluate()
153L
>>> node.isSigned()
True
>>> node.getBitvectorSize()
8
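The two evaluations above can be reproduced in plain Python with 8-bit bitvector arithmetic (sketch only, independent of the Triton AST API):

```python
# AL semantics: add al, cl where cl = 10 ^ 20, modulo 2^8.
before = (1   + (10 ^ 20)) & 0xFF   # original node: bvadd(bv(1,8), ...)
after  = (123 + (10 ^ 20)) & 0xFF   # after setChild(0, bv(123, 8))
print(before, after)                # 31 153
```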

 

2.5.1 - Subtrees to keep the SSA form

To manage subtrees more easily and to keep the SSA form of registers and memory cells, we added a REFERENCE node, which is a terminal node of a tree but contains a reference to another subtree. For example, a "partial" tree can be linked to two other subtrees.

If you try to walk the full AST, you will stop at the first reference node, because a reference node does not contain child nodes. The only way to jump from a reference node to its target node is to use the getFullAst() function.

>>> setAstRepresentationMode(AST_REPRESENTATION.SMT)

>>> zfId = getSymbolicRegisterId(REG.ZF)
>>> partialTree = getSymbolicExpressionFromId(zfId).getAst()
>>> print partialTree
(ite (= ref!89 (_ bv0 32)) (_ bv1 1) (_ bv0 1))

>>> fullTree = getFullAst(partialTree)
>>> print fullTree
(ite (= (bvsub ((_ extract 31 0) ((_ zero_extend 32) ((_ extract 31 0) ((_ zero_extend 32) (bvxor ((_ extract 31 0) ((_ zero_extend 32) (bvsub ((_ extract 31 0) ((_ zero_extend 32) ((_ sign_extend 24) ((_ extract 7 0) SymVar_0)))) (_ bv1 32)))) (_ bv85 32)))))) ((_ extract 31 0) ((_ zero_extend 32) ((_ sign_extend 24) ((_ extract 7 0) ((_ zero_extend 32) ((_ zero_extend 24) ((_ extract 7 0) (_ bv49 8))))))))) (_ bv0 32)) (_ bv1 1) (_ bv0 1))
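The reference-expansion mechanism can be sketched in plain Python: nodes are tuples and a ('ref', n) leaf is replaced by the expression it points to (a hypothetical representation for illustration, not Triton's internals):

```python
# Table mapping reference ids to their subtrees, e.g. ref!89.
table = {89: ('bvsub', ('var', 'x'), ('bv', 1, 32))}

def full_ast(node):
    """Recursively expand every ('ref', n) leaf into its target subtree."""
    if node[0] == 'ref':
        return full_ast(table[node[1]])
    return (node[0],) + tuple(full_ast(c) if isinstance(c, tuple) else c
                              for c in node[1:])

# (ite (= ref!89 (_ bv0 32)) (_ bv1 1) (_ bv0 1))
partial = ('ite', ('=', ('ref', 89), ('bv', 0, 32)), ('bv', 1, 1), ('bv', 0, 1))
print(full_ast(partial))
```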

 

2.5.2 - Commutativity and the equal operator

Sometimes it's useful to match patterns, but commutativity makes this tedious. Triton overloads the equality (==) operator to handle commutativity for you. Below are some examples with a few operations:

>>> a
((0x1 * 0x2) & 0xFF)
>>> b
((0x2 * 0x1) & 0xFF)
>>> a == b
True

>>> a
(((0x1 * 0x2) & 0xFF) ^ 0x3)
>>> b
(0x3 ^ ((0x2 * 0x1) & 0xFF))
>>> a == b
True

>>> a
(0x1 / 0x2)
>>> b
(0x2 / 0x1)
>>> a == b
False
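One way to implement such an equality is to normalize commutative operators by sorting their children before comparing; the sketch below illustrates the idea in plain Python (not Triton's actual implementation):

```python
# Structural equality modulo commutativity: sort children of commutative ops.
COMMUTATIVE = {'bvadd', 'bvmul', 'bvxor', 'bvand', 'bvor'}

def normalize(node):
    if not isinstance(node, tuple):
        return node
    op, kids = node[0], [normalize(k) for k in node[1:]]
    if op in COMMUTATIVE:
        kids.sort(key=repr)       # canonical child order
    return (op,) + tuple(kids)

a = ('bvmul', ('bv', 1), ('bv', 2))
b = ('bvmul', ('bv', 2), ('bv', 1))
print(normalize(a) == normalize(b))   # True, bvmul is commutative

c = ('bvudiv', ('bv', 1), ('bv', 2))
d = ('bvudiv', ('bv', 2), ('bv', 1))
print(normalize(c) == normalize(d))   # False, bvudiv is not
```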

 

3 - Sequence of instructions

Triton interprets a succession of processing() calls as a contiguous sequence of executed instructions, and all internal engines are updated automatically. In other words, calling the processing() function several times simulates an execution.

Example:

>>> from triton import *
>>> setArchitecture(ARCH.X86_64)

>>> inst = Instruction()
>>> inst.setAddress(0x40000)
>>> inst.setOpcodes("\x48\xC7\xC0\x42\x00\x00\x00") # mov rax, 0x42
>>> processing(inst)

>>> inst = Instruction()
>>> inst.setAddress(0x40007)
>>> inst.setOpcodes("\x48\xFF\xC0") # inc rax
>>> processing(inst)

>>> rax_ast = getAstFromId(getSymbolicRegisterId(REG.RAX))
>>> print rax_ast
(bvadd ((_ extract 63 0) ref!0) (_ bv1 64))

>>> print getFullAst(rax_ast)
(bvadd ((_ extract 63 0) (_ bv66 64)) (_ bv1 64))

>>> hex(rax_ast.evaluate())
'0x43L'

As you can see, the first call to the getAstFromId() function returns the current AST of the RAX register. This AST holds a reference (ref!0) to the previous expression of RAX (built by the mov instruction). To get the full AST, use the getFullAst() function, which reconstructs the AST and returns the root node from the origin of the statement.

 

3.1 - Symbolic emulation

As explained in the previous chapters, Triton maintains the state of registers and memory cells at the last program point and can provide their related expressions as ASTs. By evaluating these ASTs, you can perform symbolic emulation.

For instance, let's try to emulate this sample. First of all, we have to load the complete binary into Triton's memory space. To do that, we use the ABF tool to read the sections, but feel free to use whatever you want for this part. Loading into memory looks like this:

# Load the binary
print '[+] Load binary'
binary = Abstract('./crackme_hash').getBinary()

# Map the executable sections
print '[+] Map the exec sections'
for sec in binary.getExecSections():
    vaddr = sec['vaddr']
    count = 0
    for byte in sec['opcodes']:
        setLastMemoryValue(vaddr+count, byte)
        count += 1

# Map the data sections
print '[+] Map the data sections'
for sec in binary.getDataSections():
    vaddr = sec['vaddr']
    count = 0
    for byte in sec['data']:
        setLastMemoryValue(vaddr+count, byte)
        count += 1

Then, once the binary is mapped, you have to emulate opcodes from a point A to a point B. In this example, we focus on the main() function. As it is the main() function, we also have to define the argc and *argv[] values.

# Define our input context
# content of argv[1]
setLastMemoryValue(0x10000000, ord('e'))
setLastMemoryValue(0x10000001, ord('l'))
setLastMemoryValue(0x10000002, ord('i'))
setLastMemoryValue(0x10000003, ord('t'))
setLastMemoryValue(0x10000004, ord('e'))
# 0x20000000 = argv0
# 0x20000008 = argv1 -> 0x10000000 -> 'elite'
setLastMemoryValue(0x20000008, 0x00)
setLastMemoryValue(0x20000009, 0x00)
setLastMemoryValue(0x2000000a, 0x00)
setLastMemoryValue(0x2000000b, 0x10)
setLastMemoryValue(0x2000000c, 0x00)
setLastMemoryValue(0x2000000d, 0x00)
setLastMemoryValue(0x2000000e, 0x00)
setLastMemoryValue(0x2000000f, 0x00)

# RDI = argc
# RSI = *argv[]
setLastRegisterValue(Register(REG.RDI, 0x2))
setLastRegisterValue(Register(REG.RSI, 0x20000000))

# Setup stack
setLastRegisterValue(Register(REG.RSP, 0x7fffffff))
setLastRegisterValue(Register(REG.RBP, 0x7fffffff))

print '[+] Start emulation'
emulate(START_POINT)

That's all! Now let's emulate the code.

def emulate(pc):
    while pc:
        # Get back the good opcodes
        opcodes = array.array('B', getMemoryAreaValue(pc, 20)).tostring()

        # Create the Triton instruction
        instruction = Instruction()
        instruction.setOpcodes(opcodes)
        instruction.setAddress(pc)

        # Process
        processing(instruction)

        print instruction

        # Stop at the end point or if Triton doesn't support this instruction
        if pc == END_POINT or not len(instruction.getSymbolicExpressions()):
            break

        # Next!
        pc = buildSymbolicRegister(REG.RIP).evaluate()
    return

The full code is available here. For this example we used a simple program, but as you can imagine, if you want to emulate part of a real program, you have to manage syscalls and external calls. To do that, Triton allows you to create your own ASTs and expressions, which you can then assign to memory cells or registers.

 

4 - How do we test our semantics?

We (re)developed from scratch (and by hand, several times, until the design was good enough) the Intel x86 and x86_64 ISA behaviors, which can be found in this file. I can hear what you're thinking: "These guys probably developed schizophrenia writing this file", and you are not totally wrong... Those who have already done this kind of stuff know what we've gone through: fighting with the Intel manual because side effects are not explained or are just wrong (yes, what the Intel manual says is sometimes not the reality).

Anyway! All that to say that if you develop your own semantics, you have to test your expressions. On our side, we wrote a script which uses our pintool tracer to maintain two traces in parallel: a concrete one (given by Pin) and a symbolic one made of our AST expressions. Then, at each program point, we evaluate the ASTs of all registers and memory accesses and compare the evaluated values to the real CPU values. If there is a difference, we dump the instruction, the real CPU context and the AST, then we try to fix the issue offline after three cups of coffee and two doses of paracetamol.
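The core of that checking loop can be sketched in plain Python (hypothetical data structures, not the real pintool script):

```python
# Compare each symbolically evaluated value against the concrete CPU value.
def check(symbolic, concrete):
    status = []
    for name, sym_value in symbolic.items():
        ok = (sym_value == concrete[name])
        status.append((name, 'OK' if ok else 'KO'))
    return status

symbolic = {'rax': 0x42, 'zf': 0}   # values evaluated from the ASTs
concrete = {'rax': 0x42, 'zf': 1}   # values read from the real CPU
print(check(symbolic, concrete))    # zf mismatch would be dumped for analysis
```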

For example, you can try to run our script on a binary and see what is going on:

$ ./triton ./src/testers/check_semantics.py /usr/bin/evince
[OK] 0x41c8d0: xor ebp, ebp
[OK] 0x41c8d2: mov r9, rdx
[OK] 0x41c8d5: pop rsi
[OK] 0x41c8d6: mov rdx, rsp
[OK] 0x41c8d9: and rsp, 0xfffffffffffffff0
[OK] 0x41c8dd: push rax
[OK] 0x41c8de: push rsp
[OK] 0x41c8df: mov r8, 0x44a4d0
[OK] 0x41c8e6: mov rcx, 0x44a460
[OK] 0x41c8ed: mov rdi, 0x41c340
[OK] 0x41c8f4: call 0x41be40
[OK] 0x41be40: jmp qword ptr [rip + 0x247672]
[OK] 0x7ff9328f4530: push r14
[OK] 0x7ff9328f4532: push r13
[OK] 0x7ff9328f4534: push r12
[OK] 0x7ff9328f4536: push rbp
[OK] 0x7ff9328f4537: mov rbp, rcx
[OK] 0x7ff9328f453a: push rbx
[OK] 0x7ff9328f453b: sub rsp, 0x90
[OK] 0x7ff9328f4542: mov rax, qword ptr [rip + 0x3769bf]
[OK] 0x7ff9328f4549: mov qword ptr [rsp + 0x18], rdi
[OK] 0x7ff9328f454e: mov dword ptr [rsp + 0x14], esi
[OK] 0x7ff9328f4552: mov qword ptr [rsp + 8], rdx
[OK] 0x7ff9328f4557: test rax, rax
[OK] 0x7ff9328f4627: xor edx, edx
[OK] 0x7ff9328f4629: jmp 0x7ff9328f4569
[OK] 0x7ff9328f4569: lea rax, qword ptr [rip + 0x376b50]
[OK] 0x7ff9328f4570: test r9, r9
[OK] 0x7ff9328f4573: mov dword ptr [rax], edx
[OK] 0x7ff9328f4575: je 0x7ff9328f4583
[OK] 0x7ff9328f4577: xor edx, edx
[OK] 0x7ff9328f4579: xor esi, esi
[OK] 0x7ff9328f457b: mov rdi, r9
[OK] 0x7ff9328f457e: call 0x7ff932909f30
[OK] 0x7ff932909f30: push r12
[OK] 0x7ff932909f32: push rbp
[OK] 0x7ff932909f33: mov r12, rsi
[OK] 0x7ff932909f36: push rbx
[OK] 0x7ff932909f37: mov rbx, rdi
[OK] 0x7ff932909f3a: lea rdi, qword ptr [rip + 0x361797]
[OK] 0x7ff932909f41: mov rbp, rdx
[OK] 0x7ff932909f44: call 0x7ff932909d30
[OK] 0x7ff932909d30: push rbx
[OK] 0x7ff932909d31: mov esi, 1
[OK] 0x7ff932909d36: mov rbx, rdi
[OK] 0x7ff932909d39: xor eax, eax
[OK] 0x7ff932909d3b: cmp dword ptr [rip + 0x366efe], 0
[OK] 0x7ff932909d50: cmpxchg dword ptr [rip + 0x3634b9], esi
[OK] 0x7ff932909d73: mov rsi, qword ptr [rbx]
[OK] 0x7ff932909d76: test rsi, rsi
[OK] 0x7ff932909d79: je 0x7ff932909eab
[...]

Note that sometimes there are differences due to undefined behavior, such as some flags. However, if you find a real difference which implies wrong semantics, I suggest opening an issue, and if you fix the issue yourself, you will be my hero :-).

 

5 - How many semantics does Triton support?

Currently, Triton only supports ~220 instructions; the full list can be found here. Thanks to @wisk and his awesome Medusa project, which provides some Intel semantics.

What Triton currently supports is enough for lots of real programs. However, if you need a specific instruction, you can open an issue or add the semantics yourself. For instance, the SUB semantics looks like this:

01. void sub_s(triton::arch::Instruction& inst) {
02.   auto& dst = inst.operands[0];
03.   auto& src = inst.operands[1];

04.   /* Create symbolic operands */
05.   auto op1 = triton::api.buildSymbolicOperand(inst, dst);
06.   auto op2 = triton::api.buildSymbolicOperand(inst, src);

07.   /* Create the semantics */
08.   auto node = triton::ast::bvsub(op1, op2);

09.   /* Create symbolic expression */
10.   auto expr = triton::api.createSymbolicExpression(inst, node, dst, "SUB operation");

11.   /* Spread taint */
12.   expr->isTainted = triton::api.taintUnion(dst, src);

13.   /* Update symbolic flags */
14.   triton::arch::x86::semantics::af_s(inst, expr, dst, op1, op2);
15.   triton::arch::x86::semantics::cfSub_s(inst, expr, dst, op1, op2);
16.   triton::arch::x86::semantics::ofSub_s(inst, expr, dst, op1, op2);
17.   triton::arch::x86::semantics::pf_s(inst, expr, dst);
18.   triton::arch::x86::semantics::sf_s(inst, expr, dst);
19.   triton::arch::x86::semantics::zf_s(inst, expr, dst);

20.   /* Update the symbolic control flow */
21.   triton::arch::x86::semantics::controlFlow_s(inst);
22. }

Lines 2 to 3 get back the abstract operands. Lines 5 to 6 build the appropriate ASTs based on the previous expressions (SSA). Line 8 is the semantics of the SUB operation. Line 10 assigns the new expression to the destination. Line 12 spreads the taint. Lines 14 to 19 define the flags and line 21 updates the program counter (RIP). That's all! This template can be used for several instructions; the only thing that changes between instructions is line 8, which represents the actual operation of the instruction.
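As a complement, the effect of the SUB template can be mirrored in plain Python for a concrete 64-bit case (a sketch using the standard x86 flag definitions, not the C++ code above):

```python
# sub dst, src on 64-bit values, with a few of the flags it updates.
MASK = 0xFFFFFFFFFFFFFFFF
a, b = 0x10, 0x42
res = (a - b) & MASK         # bvsub semantics
cf  = 1 if a < b else 0      # carry flag: borrow occurred
zf  = 1 if res == 0 else 0   # zero flag
sf  = (res >> 63) & 1        # sign flag
print(hex(res), cf, zf, sf)  # 0xffffffffffffffce 1 0 1
```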

 

5.1 - A few words about floating-point instructions

As you can see, we have started to add some SSE instructions, but only the easy ones. Indeed, arithmetic floating-point instructions are not trivial to implement when working in the symbolic execution area. There are many problems with adding floating-point arithmetic (FPA), and one of them is that solvers still don't deal very well with multiple theories in the same expression.

Example: assume an arithmetic computation on two bitvectors whose result is stored in memory. This result is then loaded into a floating-point register, and another arithmetic computation is applied to this register. Here, if we slice the expression, we get two theories in the same expression and we cannot solve it.

One of our first ideas was to redevelop and decompose (separating the mantissa and the exponent) the computation of a floating-point number into bitvector operations, but this implies huge expressions which later make the solver get lost...

So, plan B is to wait for improvements in the SMT2-Lib research area, which has been expanding impressively over the last few years.

 

6 - Conclusion

In this little blog post we put the spotlight on the semantics part: we saw what kind of semantic information Triton can provide from opcodes, and we browsed the Python API very quickly. Note that there are still a lot of things we didn't explain, such as the taint engine, the SMT simplification passes, the symbolic optimizations and the solver part.

Triton can be used to analyze a binary dynamically as well as statically, but keep in mind that it's designed to favor the dynamic part: tracing real programs. We can confirm that libTriton works very well on Linux and OS X, but not yet on Windows. There are still a lot of features we want to add, but time is short. So, if this blog post interested you, we will be happy to work with you and to accept your pull requests.

We hope this post made you want to play with Triton. If you have questions, don't hesitate to contact us at triton at quarkslab com or via our #qb_triton channel on Freenode.

Cheers,