libTriton
version 0.4 build 1351

[internal] All information about the taint engine.
Taint analysis is used to know at each program point what part of memory and register are controllable by the user input. There is three kinds of taint granularity but an infinite number of ways to implement this analysis:
Triton uses an overapproximation and we will describe why.
With an overapproximation, we lose precision and we may provide false positives. Example:
If we ask to the taint engine if we can control the comparison, it will say YES
because RAX
is tagged has tainted even if it's false. Actually, RAX[63..8]
is not tainted but RAX[7..0] is.
The only advantages of an overapproximation are:
This method is destructive on a big program, and so, totally useless for an analyst. An analyst wants precisions even if this is not all possibilities. Then, why an analyst may want to know if a register is tainted?
In exploit development, what the user wants in reality is knowing if a register is controllable by himself and knowing what values can hold this register at specific program point. Taint analysis (any overapproximation you choose) cannot give you this kind of information. A lots of instructions have an influence on the value that can hold a register. (Path conditions, arithmetic operations, ...)
Applying a symbolic execution and asking a model at each program point to know if a register is controllable or not is pretty expensive. Therefore, we use an overapproximation to fix the loss of time and if a register is tainted, we ask a model for the precision.
e.g
: Imagine this 16bits register [xxxxxxxx]
where x
are bits that the user can control and 
bits that the user cannot control. This state of register is setup like that due to arithmetic operations but may be something else with a different input value. In this case, it's not useful to know what bits are controllable by the user because they will probably change with another input value. In this case, using a perfectapproximation or an underapproximation is not useful. What we want is only knowing what values can hold this register according to the input.
That's why Triton uses symbolic execution for precision and an overapproximated tainting to know if we can ask a model to the SMT solver  Asking a model means that the symbolic variables are controllable by the user input.