When implementing the Fibonacci calculator I was able to understand that the real implementation challenge is actually the amount of register movements that the binary / library posses. This comes from the fact that actually it is very likely to nd at least one of each logical operations and memory store / load operations, but not always the operation on the right registers is found. This directly implies that if we are able to move registers as much as possible, then we can do all the operations found on different registers. We start with the following assumptions:
- There is at least one gadget in the form movl %(A), B
- There is at least one gadget in the form movl A, %(B)
- There is a subset of N registers (not containing esp) such that those can be exchanged and moved, without afecting the state of every other register in the set.
- There are gadgets that do add, sub, mul, div (at least one of each).
- There exists at least one gadget lahf and eax is part of the N subset of registers.
- There exists a gadget which changes the value of esp.
Item 1 & 2 ensure capabilities to load / store memory. Item 4 ensures logical operations. Items 5 & 6 ensure conditional jumps and 6 alone ensures possibility to call existing functions inside the binary. The assumption in item 3, is the one that ensures tha we have to nd only one logical operation of each in item 4, in order to be able to perform that instruction on any of the N registers. Memory stores and loads, those instructions are a must for every binary, and are very likely to be found as gadgets. The same applies for memory moves and exchanges. What we do first is we create a graph to see how movements and exchanges occur. Imageine we are able to nd the following 3 gadgets:
1. mov A, B; ret 2. xchg B, A; ret 3. xchg B, C; ret
The gadgets above will form the following graph (dashed lines represent – xchg, solid – mov):
It’s easy to see on the graph that we can easily implement mov B, A, by the following sequence:
xchg B, A; ret | mov A, B; ret
And it is also easy to see that we can implement xcgh A, C:
xchg B, A; ret | xchg B, C; ret | xchg B, A; ret
This results in the following graph:
rule 1. mov X, Z: xchg Y, Z | mov X, Y | xcgh Y, Z rule 2. xchg X, Z: xchg X, Y | xchg Y, Z | xchg X, Y
It is very easy to see that “iterating” through this graph, we can actually obtain all possible movs (by substitution of X,Y,Z):
(rule 1) [X/A][Y/B][Z/C] => mov A, C (rule 1) [X/A][Y/C][Z/A] => mov C, A (rule 1) [X/A][Y/C][Z/A] => mov C, A (rule 1) [X/B][Y/A][Z/C] => mov B, C (rule 1) [X/A][Y/A][Z/B] => mov B, C
In such a strong connected graph, as the one above, we only need one gadget, to implement all of them. So the idea is, to find the strongest connected graph for a given set of gadgets. And then we only need to find 2 registers where we can apply all possible logical operations that we are targeting. Then we have the Turing complete set of instructions and we can build an interpreter over that.