DEFCON 18 Conference. Trolling reverse engineers with the help of mathematics

Trolling with math is what I'm going to talk about. This isn't some fashionable hacker trick; it's more of an artistic expression, a funny, clever technique for making people think you're a jerk. Let me check that my slides are showing up on the screen. Everything seems to be fine, so I can introduce myself.

My name is Frank Two; it's written frank^2, and I'm @franksquared on Twitter, because Twitter already has some spammer called "frank 2". I tried social engineering Twitter into deleting his account: technically it's spam, so I should be able to get rid of it as a clone. But apparently if you're honest with them, they don't return the favor. Despite my request to delete the spam account they did nothing about it, so I told fucking Twitter to go to hell.

Many people recognize me by my cap. I'm part of the regional DefCon groups DC949 and DC310. I also work with Rapid? but I can't talk about that here without obscenities, and my manager doesn't want me to swear. So I prepared this talk for DefCon, and I'm going to try to fit it into 15 minutes, even though it's a fairly difficult topic. In essence, this is a standard presentation about reverse engineering and the funny things related to it.

When this topic was discussed on Twitter, two camps formed. One guy said, “I have no idea what the fuck frank^2 is talking about, but it's awesome!” The second guy, from Reddit, saw my slides and was upset by the references to things that had nothing to do with the topic. He was angry that such a serious subject wasn't covered in full, and wished my presentation had “more content and less garbage”.

As you can see, there are now far more of the “medications”, but don't worry: the share of science has gone up slightly too.

Some time ago my friend Merlin, who is sitting here in the front row, wrote an amazing IRC bot in Python that fits on a single line.
Writing it that way is a terrific exercise for learning functional programming, and a lot of fun. You simply chain one function after another to get combinations of all sorts of different functions, and the result is drawn on the screen as a rainbow wave. All in all, it's one of the most stupid things you can do.

The formula f(x) is very simple in meaning and works like any ordinary function. You have an input x, you multiply x by 7, and the result is your value: f(x) = 7x. In Python you can write this as the function (lambda x: x * 7). If you want to work in Java (I'm sorry, I hope you never want to) you can do something like:
public static int multiplyBySevenAndReturn(Integer x) {
    return x * 7;
}
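
For comparison, here is a minimal Python sketch of the same f(x) and of chaining functions one after another. The helper names `compose` and `add_three` are made up for this illustration and do not come from the talk or from Merlin's bot.

```python
from functools import reduce

# f(x) = 7x, written as an anonymous function as in the talk.
multiply_by_seven = lambda x: x * 7

def compose(*funcs):
    """Chain functions left to right: compose(f, g)(x) == g(f(x))."""
    return reduce(lambda f, g: lambda x: g(f(x)), funcs)

# Hypothetical second function, used only to show chaining.
add_three = lambda x: x + 3

seven_then_three = compose(multiply_by_seven, add_three)

print(multiply_by_seven(6))   # 42
print(seven_then_three(6))    # 45
```

Chaining more functions is just a matter of passing more arguments to `compose`.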

Mathematical functions can get far more complicated than this, but that's all we need to know about them for now.

If you look at assembly code, you'll notice that the JMP and CALL instructions are not tied to absolute addresses; they work with an offset. In a debugger you'll see something like JMP 00401000, but the encoded instruction really means “jump so many bytes forward or backward from here”, not “go to this exact address”. The same applies to CALL, except that it also pushes the return address onto the stack. The exception is when you load an address into a register, that is, when you refer to a specific address. Then everything is different: once an address is in a register and you execute something like CALL EAX, control goes to the exact value held in EAX. The same goes for CALL [EAX] or JMP [EAX], which dereference EAX and go to the address stored there. With a debugger you may not be able to tell in advance which address such a CALL will reach. That can be a problem, so be aware of it.

Now take a look at JMP SHORT, the “short jump”. This is a special x86 instruction that uses a 1-byte offset instead of a 4-byte one, which makes the instruction smaller. This will matter later for all the manipulations we are going to perform on individual instructions. Keep in mind that the offset is a signed byte, so JMP SHORT can only reach from 128 bytes backward to 127 bytes forward, 256 possible targets in all. There is no such thing as CALL SHORT.

Now for some computer science magic. Halfway through making these slides I realized that you can think of assembly as having zero space, that is, technically there is zero space between instructions. Instructions execute one after another, and technically that can be interpreted as an unconditional jump from each instruction to the next one. This gives us a space between assembly instructions, provided each instruction is linked to the next by an unconditional jump.

Take this example of assembly. By the way, these are very simple things that I recommend decoding using ASCII; it's just a set of ordinary instructions.
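
To make the point about relative JMP and CALL offsets concrete, here is a small Python sketch that computes where such an instruction lands. It assumes 32-bit x86 and handles only three encodings (EB rel8, E9 rel32, E8 rel32); the function name `jump_target` is invented for this illustration.

```python
import struct

def jump_target(address, code):
    """Return the destination of a relative JMP/CALL encoded in `code`.

    `address` is where the instruction starts. Only EB rel8 (JMP SHORT),
    E9 rel32 (JMP) and E8 rel32 (CALL) are handled in this sketch.
    """
    opcode = code[0]
    if opcode == 0xEB:                    # JMP SHORT: signed 8-bit offset
        (disp,) = struct.unpack_from("<b", code, 1)
        length = 2
    elif opcode in (0xE9, 0xE8):          # JMP / CALL: signed 32-bit offset
        (disp,) = struct.unpack_from("<i", code, 1)
        length = 5
    else:
        raise ValueError("not a relative JMP/CALL")
    # The offset is counted from the end of the instruction itself.
    return (address + length + disp) & 0xFFFFFFFF

# JMP SHORT +5 located at 0x00401000
print(hex(jump_target(0x00401000, bytes([0xEB, 0x05]))))                    # 0x401007
# JMP rel32 with offset -5: jumps back to its own first byte
print(hex(jump_target(0x00401000, bytes([0xE9, 0xFB, 0xFF, 0xFF, 0xFF]))))  # 0x401000
```

The same bytes placed at a different address give a different destination, which is why moving such instructions around breaks them.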

The basic procedure is:
1. disassemble each instruction to find out what the code is;
2. allocate a region of memory much larger than the instruction set (I usually reserve 10 times the size of the code);
3. determine f(x) for each instruction;
4. place each instruction at the corresponding (x, y) memory location;
5. append an unconditional jump to each instruction;
6. mark the memory as executable and run the code.
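
The steps above can be sketched in Python as a pure simulation: nothing is executed, the code only lays bytes out in a buffer. This is not the author's tool. The function `scatter`, the formula `f`, and the three sample instructions are assumptions chosen for illustration, and the sketch assumes none of the instructions is itself a jump or call.

```python
import struct

JMP_REL32 = 0xE9   # opcode of the 5-byte relative JMP
JMP_LEN = 5

def scatter(instructions, f, buffer_size):
    """Place instruction i at offset f(i) and chain them with JMP rel32.

    `instructions` is a list of byte strings (already disassembled).
    `f` is a hypothetical placement formula: index -> buffer offset.
    The last instruction gets no trailing jump.
    """
    offsets = [f(i) for i in range(len(instructions))]
    # Each slot holds the instruction plus room for a trailing JMP.
    slots = sorted((off, off + len(ins) + JMP_LEN)
                   for off, ins in zip(offsets, instructions))
    for (_, end), (start, _) in zip(slots, slots[1:]):
        if end > start:
            raise ValueError("f(x) made two instructions overlap")
    if slots[0][0] < 0 or max(end for _, end in slots) > buffer_size:
        raise ValueError("f(x) placed an instruction outside the buffer")

    buf = bytearray(b"\xCC" * buffer_size)        # fill the gaps with INT3
    for i, ins in enumerate(instructions):
        end = offsets[i] + len(ins)
        buf[offsets[i]:end] = ins
        if i + 1 < len(instructions):
            # Displacement is relative to the end of the JMP itself.
            disp = offsets[i + 1] - (end + JMP_LEN)
            buf[end:end + JMP_LEN] = bytes([JMP_REL32]) + struct.pack("<i", disp)
    return buf, offsets

code = [b"\x40", b"\x43", b"\x90"]        # inc eax; inc ebx; nop (32-bit x86)
f = lambda i: ((i * 2) % 3) * 16          # places them at offsets 0, 32, 16
buf, offsets = scatter(code, f, 48)
print(offsets)                            # [0, 32, 16]
print(buf[:6].hex())                      # 40e91a000000
```

Execution order is preserved by the jumps even though the instructions are no longer stored in order.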

Unfortunately, a lot of problems come up here. It's like gravity that only works in theory, while in practice you see something completely different. In reality x86 screws up your JMP instructions, your CALL instructions, your self-referential code, and self-modifying code that uses iteration.

Let's start with JMP instructions. Because JMP instructions carry an offset, once they are placed at an arbitrary location they no longer point where you think they should. Short JMPs are in the same position: placed arbitrarily by your function at (x, y), they will not point where you expect. But unlike “long” JMPs, short JMPs are easier to fix, especially if you are dealing with a one-dimensional array. A short JMP is easy to convert to a regular JMP, but then you have to work out what the new offset is.

Register-based JMPs are another headache. Because they use hard-coded addresses that may be computed at run time, there is no easy way to know where they go. To resolve each register automatically you need a good deal of compiler theory. At run time there may be function pointers, class pointers, and the like. If you don't want to put in that extra work, you don't have to, but f(x) does not work as elegantly on real code as it does on paper. Doing it properly takes a lot of work.

To identify class pointers and similar things, you have to work some magic with C and C++. While disassembling, before you store anything, convert your short JMPs to regular JMPs; since you have to deal with the offset anyway, this is quite simple.

Calculating the actual displacements is a huge headache. Every instruction you find that carries an offset will be moved when the code is moved, so its offset must be recalculated. This means you need to keep track of the instructions and of where they point, their targets. I find this hard to explain on slides, but an example of how to do it is on the CD with the conference materials.

After you have placed all the instructions, replace the old offsets with the new ones. If you don't damage the offsets, everything will work. Now that you are prepared, you can actually put the idea into practice at a high level. For this you need to:
1. disassemble the instructions;
2. prepare a memory buffer;
3. initialize the constants of f(x);
4. iterate over the values of f(x) to determine the data pointers where your code will be written, while keeping track of the fucking instructions;
5. assign the instructions to the corresponding pointers;
6. fix all conditional jumps;
7. mark the new memory section as executable;
8. execute the code.
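
The offset fix-up is the fiddly part, so here is a hedged Python sketch of it: it widens a short JMP to a regular JMP and recomputes the displacement after both the jump and its target have moved. This is not the example from the conference CD. The function `relocate_jmp`, the `moved` table, and all addresses are hypothetical.

```python
import struct

def relocate_jmp(code, old_addr, new_addr, moved):
    """Re-encode a relative JMP as E9 rel32 at its new location.

    `code`     - original bytes (EB rel8 or E9 rel32)
    `old_addr` - where the jump used to live
    `new_addr` - where the jump lives after scattering
    `moved`    - hypothetical table: old instruction address -> new address
    """
    if code[0] == 0xEB:                   # JMP SHORT
        (disp,) = struct.unpack_from("<b", code, 1)
        length = 2
    elif code[0] == 0xE9:                 # regular JMP
        (disp,) = struct.unpack_from("<i", code, 1)
        length = 5
    else:
        raise ValueError("not a relative JMP")
    old_target = old_addr + length + disp
    new_target = moved[old_target]        # KeyError if the target was never placed
    # The widened jump is always 5 bytes long.
    new_disp = new_target - (new_addr + 5)
    return b"\xE9" + struct.pack("<i", new_disp)

# Made-up relocation table for three instructions.
moved = {0x1000: 0x5000, 0x1002: 0x5200, 0x1010: 0x5080}

# JMP SHORT +12 at 0x1002 originally targets 0x1010.
fixed = relocate_jmp(bytes([0xEB, 0x0C]), 0x1002, 0x5200, moved)
print(fixed.hex())                        # e97bfeffff
```

Tracking targets by their old address, as `moved` does here, is one simple way to follow "where the instructions are going"; other bookkeeping schemes would work as well.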

If you put everything in the right places, strange things happen: everything gets tangled up, instructions jump to odd places in memory, and it all looks simply enchanting.

So, I enter the “obfuscate by formula” command in the window that opens.

Here the active CALL EAX instruction is highlighted, followed by the jump instruction that will be applied. You can see a bunch of different things in the buffer, and all of this is done for each individual instruction.

Now I run the program to the end, and you can see the result. The code still looks great; a bunch of JMP instructions have piled up here, it looks confusing, and in reality it is confusing.

The next slide shows a graphical representation of what the stack looks like.

Every time this runs, I generate a random sine wave formula that takes an arbitrary shape. You see a bunch of different shapes, and that's cool. I think the code starts somewhere at the top left, but I don't remember exactly. That's how it twists everything; you can make not only sine waves but also spirals.

Only the two formulas that I included in the source code work here. You can build many creative things on top of this. In essence, this is just a DIFF from the initial buffer to the final buffer.
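
As a rough idea of what a random sine-wave placement could look like, here is a Python sketch that treats the buffer as a grid and maps each instruction index to an (x, y) cell on a randomly parameterized sine curve. The author's actual formulas are in his source code and are not reproduced here; `sine_layout`, its parameters, and the value ranges are all assumptions.

```python
import math
import random

def sine_layout(count, slot, width, height, rng):
    """Map instruction index -> linear buffer offset along a random sine wave.

    The buffer is viewed as `height` rows of `width` bytes. Instruction i
    gets column x = i * slot, so slots can never overlap, and a row y
    taken from a sine curve with random amplitude, frequency and phase.
    """
    if count * slot > width:
        raise ValueError("grid too narrow for this many instructions")
    amplitude = rng.uniform(0.2, 0.45) * height
    frequency = rng.uniform(0.05, 0.3)
    phase = rng.uniform(0, 2 * math.pi)
    offsets = []
    for i in range(count):
        x = i * slot
        y = int(height / 2 + amplitude * math.sin(frequency * i + phase))
        y = min(max(y, 0), height - 1)    # clamp to the grid
        offsets.append(y * width + x)
    return offsets

# Seeded generator so the layout is repeatable; drop the seed for a new wave each run.
offsets = sine_layout(count=8, slot=8, width=64, height=16, rng=random.Random(18))
print(offsets)
```

Each value in `offsets` could then serve as the f(x) result for one instruction; a spiral would only need a different formula for x and y.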

The problem is that this code example uses unconditional jumps, which is actually bad, because the code stays exactly the same as before: unconditional jumps lead in only one direction. So all anyone has to do is walk from the entry point to the end, throw away the jump instructions, and that's it, they have your code back! What to do? Turn the unconditional jumps into conditional ones. Conditional jumps can go in two directions, which is much better; you could say it's 50% better.

This leaves us with an interesting dilemma: we need conditional jumps, but we also need them to behave like unconditional jumps. What the fuck? What do we do? Opaque predicates will save us! For those who don't know, an opaque predicate is essentially a boolean expression that always evaluates to the same known value, regardless of anything, although that is not obvious from looking at it.

So let's return to the zero-space expansion I mentioned earlier. If you have a set of instructions with unconditional jumps between them, it follows that a single instruction may be preceded or followed by a series of assembly instructions that have no direct effect on the instructions we care about. For example, you can write very specific instructions that do not change the assembly you are trying to obfuscate; that is, you avoid clobbering registers, so that the state each original instruction expects is preserved. And this is even more amazing.

You can view each instruction to be obfuscated as a preamble, the assembly data, and a postscript. The preamble is what precedes the assembly instruction, and the postscript is what follows it. The preamble is used, or can be used, for two things:
- correcting the consequences of the opaque predicate from the previous preamble;
- anti-debugging code snippets.

But the preamble is quite limited, because you can't do too much in it. The postscript is more fun. It can be used for:
- opaque predicates and tangled jumps to the following sections of code;
- anti-debugging and obfuscation of the overall code execution;
- encrypting and decrypting various code fragments in the program itself.
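
To illustrate the opaque predicates mentioned above, here is a minimal Python sketch of one classic example: the product of two consecutive integers is always even, so the test below is always true even though it looks like a real decision. The names `opaque_true` and `obfuscated_step` are invented for this illustration; the talk does not say which predicates the author uses.

```python
def opaque_true(x):
    """Always True: x * (x + 1) is a product of consecutive integers, hence even."""
    return (x * (x + 1)) % 2 == 0

def obfuscated_step(x, real_path, decoy_path):
    # Looks like a two-way branch, but the decoy is never taken.
    if opaque_true(x):
        return real_path()
    return decoy_path()

print(all(opaque_true(x) for x in range(-1000, 1000)))            # True
print(obfuscated_step(1337, lambda: "real", lambda: "decoy"))     # real
```

In assembly the same idea becomes a conditional jump that always goes one way, while the other branch can point at junk bytes or anti-debugging code.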

Right now I'm working on encrypting and decrypting each individual instruction, so that as each instruction executes it decrypts the next section, which decrypts the next one, and so on. The next slide shows an example of this.

That's all, thank you for coming!