A Guide to x86 Assembly for Beginners

These days there is rarely a need to write pure assembly, but I definitely recommend trying it to anyone interested in programming. You will see things from a different angle, and the skills will come in handy when debugging code in other languages.
In this article, we will write a reverse Polish notation (RPN) calculator from scratch in pure x86 assembly. When finished, we can use it like this:
    $ ./calc "32+6*" # "(3 + 2) * 6" in infix notation
    30

All the code for this article is here. It is heavily commented and can serve as learning material for those who already know assembly.
We will start by writing a basic "Hello world!" program to check the environment setup. Then we will move on to system calls, the call stack, stack frames, and the x86 calling convention. After that, for practice, we will write some basic functions in x86 assembly and then start on the RPN calculator.
It also helps to keep an ASCII table at hand.

Hello, world!

To verify the environment, save the following code in the file calc.asm:
    ; The linker looks for the _start symbol and begins executing the program
    ; from there.
    global _start

    ; The .rodata section holds constants (read-only data).
    ; The order of the sections does not matter, but I like to put it first.
    section .rodata
        ; Declare a sequence of bytes labeled hello_world. NASM's db
        ; pseudo-instruction accepts single-byte values, string constants, or a
        ; combination of them, as here. 0xA = newline, 0x0 = string terminator
        hello_world: db "Hello world!", 0xA, 0x0

    ; The start of the .text section, which holds the program code
    section .text
    _start:
        mov eax, 0x04           ; put the number 4 in the eax register (0x04 = write())
        mov ebx, 0x1            ; file descriptor (1 = standard output, 2 = standard error)
        mov ecx, hello_world    ; pointer to the string to print
        mov edx, 14             ; length of the string in bytes (including the
                                ; newline and the terminating zero)
        int 0x80                ; raise interrupt 0x80, which the OS
                                ; interprets as a system call
        mov eax, 0x01           ; 0x01 = exit()
        mov ebx, 0              ; 0 = no errors
        int 0x80

The comments explain the overall structure. The list of registers and common instructions can be found in the University of Virginia "x86 Assembly Guide". It becomes even more useful once we get to system calls.
The following commands assemble the source file into an object file and then link it into an executable:
    $ nasm -f elf32 calc.asm -o calc.o
    $ ld -m elf_i386 calc.o -o calc

After running it, you should see:
    $ ./calc
    Hello world!



This part is optional, but to simplify assembling and linking later on, you can create a Makefile. Save it in the same directory as calc.asm (the command lines under each target must begin with a tab character):
    CFLAGS = -f elf32
    LFLAGS = -m elf_i386

    all: calc

    calc: calc.o
    	ld $(LFLAGS) calc.o -o calc

    calc.o: calc.asm
    	nasm $(CFLAGS) calc.asm -o calc.o

    clean:
    	rm -f calc.o calc

Then, instead of the above instructions, simply run make.

System calls

Linux system calls ask the OS to do something for us. In this article, we use only two system calls: write() to write a string to a file or stream (in our case standard output and standard error) and exit() to exit the program:
    syscall 0x01: exit(int error_code)
        error_code - use 0 to exit without errors and any other value (such as 1) for errors
    syscall 0x04: write(int fd, char *string, int length)
        fd - use 1 for standard output, 2 for standard error
        string - pointer to the first character of the string
        length - the length of the string in bytes

A system call is set up by storing the system call number in the eax register and then its arguments in ebx, ecx, and edx, in that order. Note that exit() takes only one argument, so in that case ecx and edx do not matter.
    eax                 ebx   ecx   edx
    system call number  arg1  arg2  arg3
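
For comparison, the same "number first, then arguments" layout can be tried from C. The sketch below is not part of the article's code. It assumes Linux with glibc, whose generic syscall() wrapper takes the system call number followed by its arguments. The helper name raw_write is hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue write() through the generic syscall() wrapper.
 * SYS_write is 4 on 32-bit x86 (the value the article loads into eax);
 * other architectures use other numbers, so the named constant is used. */
long raw_write(int fd, const char *string, size_t length) {
    /* On 32-bit x86: number -> eax, fd -> ebx, string -> ecx, length -> edx */
    return syscall(SYS_write, fd, string, length);
}
```

Calling raw_write(1, "Hello world!\n", 13) should print the greeting to standard output and return the number of bytes written.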


Call stack

A call stack is a data structure that stores information about each function call. Each call has its own region of the stack, called a "frame". It stores information about the current call: the local variables of the function and the return address (where the program should go after the function finishes).
One non-obvious thing to note right away: the stack grows downward in memory. When you add something to the top of the stack, it is placed at a lower memory address than the previous item. In other words, as the stack grows, the memory address of the top of the stack decreases. To avoid confusion, I will keep reminding you of this fact.
The push instruction puts something on the top of the stack, and pop takes data off it. For example, push eax allocates space at the top of the stack and places the value of the eax register there, and pop eax moves whatever is at the top of the stack into eax and frees that area of memory.
The purpose of the esp register is to point to the top of the stack. Any data beyond the top (at addresses lower than esp) is considered not to be on the stack; it is garbage. Executing push or pop moves esp. You can also manipulate esp directly, as long as you know what you are doing.
The ebp register is similar to esp, but it always points roughly to the middle of the current stack frame, just before the local variables of the current function (more on this later). However, calling another function does not move ebp automatically; you have to do it yourself each time.
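
To make the "grows downward" rule concrete, here is a toy model in C. It is only a sketch: the 64-byte size and the names ToyStack, toy_init, toy_push, and toy_pop are invented for illustration, and there is no bounds checking.

```c
#include <stdint.h>

/* A toy model of the x86 stack: 16 dword slots, addressed in bytes.
 * All names here are illustrative, not part of the article's code. */
enum { STACK_BYTES = 64 };

typedef struct {
    uint32_t memory[STACK_BYTES / 4];
    uint32_t esp; /* byte offset of the top of the stack */
} ToyStack;

void toy_init(ToyStack *s) {
    s->esp = STACK_BYTES; /* empty stack: esp sits just past the highest address */
}

void toy_push(ToyStack *s, uint32_t value) {
    s->esp -= 4;                    /* growing the stack lowers esp */
    s->memory[s->esp / 4] = value;
}

uint32_t toy_pop(ToyStack *s) {
    uint32_t value = s->memory[s->esp / 4];
    s->esp += 4;                    /* shrinking the stack raises esp */
    return value;
}
```

After toy_init, two pushes move esp from 64 to 56, and two pops bring it back to 64, returning the values in reverse order.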

Calling convention for the x86 architecture

x86 has no built-in notion of a function like the one in high-level languages. The call instruction is essentially just a jmp (goto) to another memory address that first pushes a return address onto the stack. To use subroutines like functions in other languages (which can take arguments and return data), you need to follow a calling convention. There are many conventions, but we will use CDECL, the most popular convention for x86 among C compilers and assembly programmers. It also ensures that a subroutine's registers are not clobbered when it calls another function.

Caller rules

Before calling a function, the caller must:
  1. Save the caller-saved registers on the stack. The called function is allowed to modify some registers; to avoid losing their contents, the caller must push them onto the stack before the call. These registers are eax, ecx, and edx. If you are not using one of them, you do not need to save it.
  2. Push the function arguments onto the stack in reverse order (the last argument first, the first argument last). This order ensures that the called function finds its arguments on the stack in the correct order.
  3. Call the subroutine.

The function stores its result, if it has one, in eax. Immediately after the call, the caller must:
  1. Remove the function arguments from the stack. This is usually done by simply adding the number of bytes to esp. Remember that the stack grows downward, so removing bytes from the stack means adding to esp.
  2. Restore the saved registers by popping them from the stack in reverse order. The called function will not change any other registers.

The following example shows how these rules are applied. Suppose the function _subtract takes two integer (4-byte) arguments and returns the first argument minus the second. In the subroutine _mysubroutine, we call _subtract with the arguments 10 and 2:
    _mysubroutine:
        ; some code here
        push ecx        ; save the caller-saved registers (I chose not to save eax)
        push edx
        push 2          ; second rule: push the arguments in reverse order
        push 10
        call _subtract  ; eax is now 10 - 2 = 8
        add esp, 8      ; remove 8 bytes from the stack (two 4-byte arguments)
        pop edx         ; restore the saved registers
        pop ecx
        ; more code here, where I use the wonderfully useful value in eax


Callee rules

At its start, the called subroutine must:
  1. Save the previous frame's base pointer, ebp, by pushing it onto the stack.
  2. Adjust ebp from the previous frame to the current frame (its new value is the current value of esp).
  3. Allocate space on the stack for local variables, if needed, by moving the esp pointer. Since the stack grows downward, you subtract the required amount of memory from esp.
  4. Save the callee-saved registers on the stack. These are ebx, edi, and esi. You do not need to save registers you do not plan to modify.

[Diagram: call stack after step 1]
[Diagram: call stack after step 2]
[Diagram: call stack after step 4]
In these diagrams, a return address is shown in each stack frame. The call instruction pushes it onto the stack automatically, and the ret instruction pops the address from the top of the stack and jumps to it. You do not have to manage the return address yourself; I show it only to explain why the local variables of the function are 4 bytes above ebp, while the function arguments are 8 bytes below ebp.
In the last diagram, you can also see that the local variables of a function always start 4 bytes above ebp, at the address ebp-4 (subtraction, because we are moving up the stack), and the function arguments always start 8 bytes below ebp, at the address ebp+8 (addition, because we are moving down the stack). If you follow the rules of this convention, this holds for the variables and arguments of any function.
When the function is done and you want to return, first set eax to the function's return value, if there is one. Then you need to:
  1. Restore the saved registers by popping them from the stack in reverse order.
  2. Free the space that was allocated for local variables, if any. This is done by simply setting esp to ebp.
  3. Restore the previous frame's base pointer, ebp, by popping it from the stack.
  4. Return with ret.

Now let's implement the _subtract function from our example:
    _subtract:
        push ebp            ; save the base pointer of the previous frame
        mov ebp, esp        ; set up ebp for this frame
        ; Here I would allocate stack space for local variables, but I do not need any
        ; Here I would save the callee-saved registers, but I am not going to
        ; change any of them
        ; The function body begins here
        mov eax, [ebp+8]    ; copy the first argument of the function into eax. The
                            ; brackets mean a memory access at address ebp+8
        sub eax, [ebp+12]   ; subtract the second argument, at ebp+12, from the first
                            ; argument
        ; The function body ends here; eax holds the return value
        ; Here I would restore the callee-saved registers, but none were saved
        ; Here I would free the local variables, but no memory was allocated for them
        pop ebp             ; restore the base pointer of the previous frame
        ret
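
The frame layout used by _subtract (saved ebp at ebp, return address at ebp+4, arguments from ebp+8) can be traced with a small C simulation. This is only an illustrative sketch: the 64-byte memory, the sim_ names, and the placeholder return address are all invented here.

```c
#include <stdint.h>

/* Toy 32-bit machine state; every name here is illustrative. */
enum { SIM_MEM_BYTES = 64 };
uint32_t sim_mem[SIM_MEM_BYTES / 4];
uint32_t sim_esp = SIM_MEM_BYTES, sim_ebp = 0, sim_eax = 0;

void sim_push(uint32_t v) { sim_esp -= 4; sim_mem[sim_esp / 4] = v; }
uint32_t sim_pop(void) { uint32_t v = sim_mem[sim_esp / 4]; sim_esp += 4; return v; }
uint32_t sim_load(uint32_t addr) { return sim_mem[addr / 4]; }

/* Body of _subtract, written against the simulated frame. */
void sim_subtract_body(void) {
    sim_push(sim_ebp);                 /* push ebp            */
    sim_ebp = sim_esp;                 /* mov ebp, esp        */
    sim_eax = sim_load(sim_ebp + 8);   /* mov eax, [ebp+8]    */
    sim_eax -= sim_load(sim_ebp + 12); /* sub eax, [ebp+12]   */
    sim_ebp = sim_pop();               /* pop ebp             */
}

/* Caller side: push arguments in reverse order, "call", clean up. */
uint32_t call_subtract(uint32_t a, uint32_t b) {
    sim_push(b);                       /* last argument first */
    sim_push(a);
    sim_push(0xDEADBEEF);              /* stand-in for the return address pushed by call */
    sim_subtract_body();
    (void)sim_pop();                   /* ret pops the return address */
    sim_esp += 8;                      /* add esp, 8 */
    return sim_eax;
}
```

With this model, call_subtract(10, 2) returns 8 and leaves sim_esp where it started.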


Enter and leave

In the example above, you may notice that a function always starts the same way: push ebp, mov ebp, esp, and allocating memory for local variables. The x86 instruction set has a handy instruction that does all of this: enter a, b, where a is the number of bytes to allocate for local variables and b is the "nesting level", which we will always set to 0. Likewise, a function always ends with the instructions mov esp, ebp and pop ebp (the first is needed only when memory was allocated for local variables, but it does no harm in any case). These can also be replaced by a single instruction: leave. Let's make the changes:
    _subtract:
        enter 0, 0          ; save the base pointer of the previous frame and set up ebp
        ; Here I would save the callee-saved registers, but I am not going to
        ; change any of them
        ; The function body begins here
        mov eax, [ebp+8]    ; copy the first argument of the function into eax. The
                            ; brackets mean a memory access at address ebp+8
        sub eax, [ebp+12]   ; subtract the second argument, at ebp+12, from
                            ; the first argument
        ; The function body ends here; eax holds the return value
        ; Here I would restore the callee-saved registers, but none were saved
        leave               ; restore the base pointer of the previous frame
        ret


Writing some basic functions

Now that we have the calling convention down, we can start writing some subroutines. Why not generalize the code that prints "Hello world!" so that it can print any string: a _print_msg function.
For this we need one more function, _strlen, to compute the length of a string. In C it could look like this:
    size_t strlen(char *s) {
        size_t length = 0;
        while (*s != 0) { // start of the loop
            length++;
            s++;
        } // end of the loop
        return length;
    }

In other words, starting at the beginning of the string, we add 1 to the return value for every character that is not zero. As soon as we see the zero character, we return the value accumulated in the loop. In assembly this is also quite simple; we can use the _subtract function written earlier as a base:
    _strlen:
        enter 0, 0              ; save the base pointer of the previous frame and set up ebp
        ; Here I would save the callee-saved registers, but I am not going to
        ; change any of them
        ; The function body begins here
        mov eax, 0              ; length = 0
        mov ecx, [ebp+8]        ; copy the first argument of the function (a pointer to the
                                ; first character of the string) into ecx (it is
                                ; caller-saved, so we do not need to save it)
    _strlen_loop_start:         ; this is a label we can jump to
        cmp byte [ecx], 0       ; dereference the pointer and compare with zero. The
                                ; size of a memory operand must be stated when it
                                ; cannot be inferred from a register; here we ask
                                ; to read only one byte (one character)
        je _strlen_loop_end     ; leave the loop when the zero byte appears
        inc eax                 ; we are inside the loop: add 1 to the return value
        add ecx, 1              ; move to the next character of the string
        jmp _strlen_loop_start  ; jump back to the start of the loop
    _strlen_loop_end:
        ; The function body ends here; eax holds the return value
        ; Here I would restore the callee-saved registers, but none were saved
        leave                   ; restore the base pointer of the previous frame
        ret

Not bad already, right? Writing the code in C first can help, because most of it translates directly into assembly. Now we can use this function in _print_msg, where we apply everything we have learned:
    _print_msg:
        enter 0, 0
        ; The function body begins here
        mov eax, 0x04           ; 0x04 = the write() system call
        mov ebx, 0x1            ; 0x1 = standard output
        mov ecx, [ebp+8]        ; the string to print is the first argument of this function
        ; Next, set edx to the length of the string. Time to call _strlen
        push eax                ; save the caller-saved registers (I chose not to save edx)
        push ecx
        push dword [ebp+8]      ; push the argument for _strlen: the string passed to
                                ; _print_msg. NASM needs the size here because a memory
                                ; operand alone does not say how many bytes to push.
                                ; A pointer is a dword (4 bytes, 32 bits)
        call _strlen            ; eax now holds the length of the string
        mov edx, eax            ; move the length into edx, where we need it
        add esp, 4              ; remove 4 bytes from the stack (one 4-byte char * argument)
        pop ecx                 ; restore the caller-saved registers
        pop eax
        ; we are done with _strlen and can make the system call
        int 0x80
        leave
        ret

Now let's see the fruits of our hard work by using this function in a complete "Hello, world!" program:
    _start:
        enter 0, 0
        ; save the caller-saved registers (I chose not to save any)
        push hello_world        ; push the argument for _print_msg
        call _print_msg
        mov eax, 0x01           ; 0x01 = exit()
        mov ebx, 0              ; 0 = no errors
        int 0x80

Believe it or not, we have covered all the main topics needed to write basic programs in x86 assembly! Now that the introductory material and theory are out of the way, we will concentrate on the code and apply what we have learned to write our RPN calculator. The functions will be much longer and will even use some local variables. If you want to see the finished program right away, here it is.
For those unfamiliar with reverse Polish notation (sometimes called postfix notation): expressions are evaluated using a stack. Therefore, we need to create a stack, as well as _pop and _push functions to manipulate it. We will also need a _print_answer function, which prints the string representation of the numeric result at the end of the calculation.

Creating the stack

First, we set aside memory for our stack, along with a global variable stack_size. Since we want to be able to modify these variables, they go in the .data section rather than .rodata.
    section .data
        stack_size: dd 0        ; create a dword (4-byte) variable with the value 0
        stack: times 256 dd 0   ; fill the stack with zeros

Now we can implement the _push and _pop functions:
    _push:
        enter 0, 0
        ; Save the registers that this function uses
        push eax
        push edx
        mov eax, [stack_size]
        mov edx, [ebp+8]
        mov [stack + 4*eax], edx    ; Store the argument in the stack. We scale by
                                    ; four bytes because each entry is a dword
        inc dword [stack_size]      ; Add 1 to stack_size
        ; Restore the registers we saved
        pop edx
        pop eax
        leave
        ret

    _pop:
        enter 0, 0
        ; Here I would save registers, but eax carries the return value
        dec dword [stack_size]      ; First, subtract 1 from stack_size
        mov eax, [stack_size]
        mov eax, [stack + 4*eax]    ; Load the number on top of the stack into eax
        ; Here I would restore the registers, but none were saved
        leave
        ret
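
For reference, a C counterpart of _push and _pop might look like the sketch below. The names calc_push and calc_pop are illustrative; like the assembly, the code does no bounds checking.

```c
/* C counterpart of _push and _pop; calc_push and calc_pop are
 * illustrative names. No overflow or underflow checks are made. */
int stack[256];
int stack_size = 0;

void calc_push(int value) {
    stack[stack_size] = value; /* mov [stack + 4*eax], edx */
    stack_size++;              /* inc dword [stack_size]   */
}

int calc_pop(void) {
    stack_size--;              /* dec dword [stack_size]   */
    return stack[stack_size];  /* mov eax, [stack + 4*eax] */
}
```

Pushing 3 and then 4 and popping twice returns 4 and then 3, leaving stack_size at 0.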


Printing numbers

_print_answer is much harder: we have to convert numbers to strings and use several other functions. We need a _putc function, which prints one character, a _mod function to compute the remainder of dividing one argument by the other (the modulus), and _pow_10 to compute a power of 10. You will see later why they are needed. They are pretty simple; here is the code:
    _pow_10:
        enter 0, 0
        mov ecx, [ebp+8]        ; set ecx (caller-saved) to the function's
                                ; argument
        mov eax, 1              ; the zeroth power of 10 (10**0 = 1)
    _pow_10_loop_start:         ; multiply eax by 10 while ecx is not 0
        cmp ecx, 0
        je _pow_10_loop_end
        imul eax, 10
        sub ecx, 1
        jmp _pow_10_loop_start
    _pow_10_loop_end:
        leave
        ret

    _mod:
        enter 0, 0
        push ebx
        mov edx, 0              ; explained below
        mov eax, [ebp+8]
        mov ebx, [ebp+12]
        idiv ebx                ; divides the 64-bit integer [edx:eax] by ebx. We only want
                                ; to divide the 32-bit integer in eax, so we set edx to
                                ; zero (correct as long as eax is not negative).
                                ; The quotient ends up in eax, the remainder in edx. As
                                ; usual, details on a specific instruction can be found in
                                ; the references listed at the end of the article.
        mov eax, edx            ; return the remainder (the modulus)
        pop ebx
        leave
        ret

    _putc:
        enter 0, 0
        mov eax, 0x04           ; write()
        mov ebx, 1              ; standard output
        lea ecx, [ebp+8]        ; address of the input character
        mov edx, 1              ; print only 1 character
        int 0x80
        leave
        ret
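
The idiv comment in _mod is easier to follow with a model of the edx:eax pair. The C sketch below is illustrative only: model_idiv and IdivResult are invented names, and unlike the real instruction (which faults when the quotient does not fit in 32 bits or the divisor is zero) it simply returns the wide quotient.

```c
#include <stdint.h>

/* Illustrative model of idiv: divides the signed 64-bit value edx:eax by a
 * 32-bit divisor. The divisor must not be zero. */
typedef struct {
    int64_t quotient;   /* what idiv would try to store in eax */
    int64_t remainder;  /* what idiv would store in edx */
} IdivResult;

IdivResult model_idiv(uint32_t edx, uint32_t eax, int32_t divisor) {
    /* edx supplies the high 32 bits of the dividend, eax the low 32 bits */
    int64_t dividend = (int64_t)(((uint64_t)edx << 32) | eax);
    IdivResult r = { dividend / divisor, dividend % divisor };
    return r;
}
```

With edx set to zero, model_idiv(0, 7, 2) gives quotient 3 and remainder 1, as intended. If eax held -7, zeroing edx would make the dividend 4294967289 rather than -7; filling edx with the sign bits of eax first (which is what the cdq instruction does) gives quotient -3 and remainder -1.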

So, how do we print the individual digits of a number? First, note that the last digit of a number is the remainder of dividing it by 10 (for example, 123 % 10 = 3), and the next digit is the remainder of dividing by 100, divided by 10 (for example, (123 % 100) / 10 = 2). In general, you can find a particular digit of a number (counting from right to left) by computing (number % 10**n) / 10**(n-1), where the ones digit is n = 1, the tens digit is n = 2, and so on.
Using this, you can find all the digits of a number from n = 1 up to n = 10 (the maximum number of digits in a signed 4-byte integer). But it is much easier to go from left to right: that way we can print each character as soon as we find it and skip the zeros on the left. So we iterate over the digits from n = 10 down to n = 1.
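
The digit formula can be checked in isolation with a few lines of C. This is a sketch with invented names (pow_10_wide, digit_at). It uses long long because 10**10, which the formula needs for n = 10, does not fit in a signed 32-bit integer.

```c
/* Illustrative helpers, not the article's code. long long is used so that
 * 10**10 (needed for n = 10) does not overflow a 32-bit int. */
long long pow_10_wide(int n) {
    long long result = 1;
    while (n-- > 0) {
        result *= 10;
    }
    return result;
}

/* n-th decimal digit of a non-negative number, counting from the right (n >= 1) */
int digit_at(int number, int n) {
    return (int)((number % pow_10_wide(n)) / pow_10_wide(n - 1));
}
```

For example, digit_at(123, 1) is 3, digit_at(123, 2) is 2, and digit_at(2147483647, 10) is 2.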
In C, the program would look something like this:
    #define MAX_DIGITS 10
    void print_answer(int a) {
        if (a < 0) { // if the number is negative
            putc('-'); // print the minus sign
            a = -a; // convert to a positive number
        }
        int started = 0;
        for (int i = MAX_DIGITS; i > 0; i--) {
            int digit = (a % pow_10(i)) / pow_10(i-1);
            if (digit == 0 && started == 0) continue; // do not print leading zeros
            started = 1;
            putc(digit + '0');
        }
    }
Now you see why we need these three functions. Let's implement this in assembly:
    %define MAX_DIGITS 10

    _print_answer:
        enter 4, 0              ; reserve 4 bytes for locals; the byte at [ebp-4]
                                ; holds the "started" variable from the C code
        push ebx
        push edi
        push esi
        mov eax, [ebp+8]        ; our argument "a"
        cmp eax, 0              ; if the number is not negative, skip this
                                ; conditional
        jge _print_answer_negate_end
        ; call putc for '-'
        push eax
        push 0x2d               ; the '-' character
        call _putc
        add esp, 4
        pop eax
        neg eax                 ; convert to a positive number
    _print_answer_negate_end:
        mov byte [ebp-4], 0     ; started = 0
        mov ecx, MAX_DIGITS     ; the variable i
    _print_answer_loop_start:
        cmp ecx, 0
        je _print_answer_loop_end
        ; call pow_10 for ecx. We are working toward ebx being the "digit" variable
        ; in the C code. For now we get edx = pow_10(i-1) and ebx = pow_10(i)
        push eax
        push ecx
        dec ecx                 ; i-1
        push ecx                ; the first argument for _pow_10
        call _pow_10
        mov edx, eax            ; edx = pow_10(i-1)
        add esp, 4
        pop ecx                 ; restore ecx to the value of i
        pop eax
        ; end of the pow_10 call
        mov ebx, edx            ; digit = ebx = pow_10(i-1)
        imul ebx, 10            ; digit = ebx = pow_10(i)
        ; call _mod for (a % pow_10(i)), that is (eax mod ebx)
        push eax
        push ecx
        push edx
        push ebx                ; second argument: ebx = digit = pow_10(i)
        push eax                ; first argument: eax = a
        call _mod
        mov ebx, eax            ; digit = ebx = a % pow_10(i), almost there
        add esp, 8
        pop edx
        pop ecx
        pop eax
        ; end of the mod call
        ; divide ebx (the "digit" variable) by pow_10(i-1) (edx). We have to save a
        ; couple of registers, because idiv uses both edx and eax for the dividend.
        ; Since edx is our divisor, we move it to some
        ; other register
        push esi
        mov esi, edx
        push eax
        mov eax, ebx
        mov edx, 0
        idiv esi                ; eax holds the result (the digit)
        mov ebx, eax            ; ebx = (a % pow_10(i)) / pow_10(i-1), the "digit" variable in the C code
        pop eax
        pop esi
        ; end of the division
        cmp ebx, 0              ; if digit == 0
        jne _print_answer_trailing_zeroes_check_end
        cmp byte [ebp-4], 0     ; if started == 0
        jne _print_answer_trailing_zeroes_check_end
        jmp _print_answer_loop_continue ; continue
    _print_answer_trailing_zeroes_check_end:
        mov byte [ebp-4], 1     ; started = 1
        add ebx, 0x30           ; digit + '0'
        ; call putc
        push eax
        push ecx
        push edx
        push ebx
        call _putc
        add esp, 4
        pop edx
        pop ecx
        pop eax
        ; end of the putc call
    _print_answer_loop_continue:
        sub ecx, 1
        jmp _print_answer_loop_start
    _print_answer_loop_end:
        pop esi
        pop edi
        pop ebx
        leave
        ret

That was a tough one! I hope the comments help. If you are now thinking, "Why can't we just write printf("%d")?", then you will like the end of the article, where we replace the function with exactly that!
Now that we have all the necessary functions, all that remains is to implement the main logic in _start, and that is it!

Evaluating reverse Polish notation

As we have already said, reverse Polish notation is evaluated using a stack. When a number is read, it is pushed onto the stack; when an operator is read, it is applied to the two items on top of the stack.
For example, if we want to evaluate 84/3+6* (this expression can also be written as 6384/+*), the process is as follows:
    Step  Symbol  Stack before  Stack after
    1     8       []            [8]
    2     4       [8]           [8, 4]
    3     /       [8, 4]        [2]
    4     3       [2]           [2, 3]
    5     +       [2, 3]        [5]
    6     6       [5]           [5, 6]
    7     *       [5, 6]        [30]

If the input is a valid postfix expression, exactly one element remains on the stack at the end of the calculation: the answer. In our case, that is 30.
In assembly, we need to implement something like this C code:
    int stack[256]; // 256 is probably more than our stack needs
    int stack_size = 0;

    int main(int argc, char *argv[]) {
        char *input = argv[1];
        size_t input_length = strlen(input);
        for (int i = 0; i < input_length; i++) {
            char c = input[i];
            if (c >= '0' && c <= '9') { // if the character is a digit
                push(c - '0'); // convert the character to an integer and push it
            } else {
                int b = pop();
                int a = pop();
                if (c == '+') {
                    push(a + b);
                } else if (c == '-') {
                    push(a - b);
                } else if (c == '*') {
                    push(a * b);
                } else if (c == '/') {
                    push(a / b);
                } else {
                    error("Invalid input\n");
                    exit(1);
                }
            }
        }
        if (stack_size != 1) {
            error("Invalid input\n");
            exit(1);
        }
        print_answer(stack[0]);
        exit(0);
    }

Now that we have all the functions needed to implement this, let's begin.
    _start:
        ; The arguments of _start are not passed the same way as in other functions.
        ; Instead, esp points directly to argc (the number of arguments), and
        ; esp+4 points to argv. So [esp+4] is the name of the
        ; program, [esp+8] is the first argument, and so on
        mov esi, [esp+8]        ; esi = "input" = argv[1]
        ; call _strlen to determine the length of the input
        push esi
        call _strlen
        mov ebx, eax            ; ebx = input_length
        add esp, 4
        ; end of the _strlen call
        mov ecx, 0              ; ecx = "i"
    _main_loop_start:
        cmp ecx, ebx            ; if (i >= input_length)
        jge _main_loop_end
        mov edx, 0
        mov dl, [esi + ecx]     ; load a single byte from memory into the low byte of
                                ; edx. The rest of edx is already zero.
                                ; edx = the variable c = input[i]
        cmp edx, '0'
        jl _check_operator
        cmp edx, '9'
        jg _print_error
        sub edx, '0'
        mov eax, edx            ; eax = c - '0' (a number, not a character)
        jmp _push_eax_and_continue
    _check_operator:
        ; call _pop twice to pop b into edi and a into eax
        push ecx
        push ebx
        call _pop
        mov edi, eax            ; edi = b
        call _pop               ; eax = a
        pop ebx
        pop ecx
        ; end of the _pop calls
        cmp edx, '+'
        jne _subtract
        add eax, edi            ; eax = a + b
        jmp _push_eax_and_continue
    _subtract:
        cmp edx, '-'
        jne _multiply
        sub eax, edi            ; eax = a - b
        jmp _push_eax_and_continue
    _multiply:
        cmp edx, '*'
        jne _divide
        imul eax, edi           ; eax = a * b
        jmp _push_eax_and_continue
    _divide:
        cmp edx, '/'
        jne _print_error
        push edx                ; save edx, because idiv overwrites it
        cdq                     ; sign-extend eax into edx:eax so that negative
                                ; values of a divide correctly
        idiv edi                ; eax = a / b
        pop edx
        ; Now push eax onto the calculator stack and continue
    _push_eax_and_continue:
        ; call _push
        push eax
        push ecx
        push edx
        push eax                ; the first argument
        call _push
        add esp, 4
        pop edx
        pop ecx
        pop eax
        ; end of the _push call
        inc ecx
        jmp _main_loop_start
    _main_loop_end:
        cmp dword [stack_size], 1
        ; if (stack_size != 1), print the error
        jne _print_error
        mov eax, [stack]
        push eax
        call _print_answer
        ; print a final newline
        push 0xA
        call _putc
        ; exit successfully
        mov eax, 0x01           ; 0x01 = exit()
        mov ebx, 0              ; 0 = no errors
        int 0x80                ; execution ends here
    _print_error:
        push error_msg
        call _print_msg
        mov eax, 0x01
        mov ebx, 1
        int 0x80

We also need to add the error_msg string to the .rodata section:
    section .rodata
        ; Declare some bytes labeled error_msg. NASM's db pseudo-instruction
        ; accepts single-byte values, string constants, or a
        ; combination of them. 0xA = newline, 0x0 = string terminator
        error_msg: db "Invalid input", 0xA, 0x0

And we are done! Amaze all your friends, if you have any. I hope you now appreciate high-level languages a bit more, especially when you remember that many old programs were written entirely or almost entirely in assembly, for example the original RollerCoaster Tycoon!
All the code is here. Thanks for reading! I can continue if you are interested.

Next steps

You can practice by implementing a few additional features:
  1. Print an error message instead of segfaulting when the program is given no argument.
  2. Add support for extra spaces between operands and operators in the input.
  3. Add support for multi-digit operands.
  4. Allow negative numbers.
  5. Replace _strlen with the function from the C standard library, and replace _print_answer with a call to printf.
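
As a starting point for the exercises on spaces and multi-digit operands, it may help to prototype the logic in C before translating it to assembly. The sketch below is one possible approach, not the article's solution: the name rpn_eval, the return codes, and the error handling are all assumptions, and numeric overflow is not checked.

```c
#include <ctype.h>

/* Sketch of an evaluator that accepts multi-digit operands separated by
 * spaces. Returns 0 on success and stores the answer in *result; returns -1
 * on malformed input, stack overflow or underflow, or division by zero. */
int rpn_eval(const char *input, int *result) {
    int stack[256];
    int size = 0;
    const char *p = input;
    while (*p != '\0') {
        if (*p == ' ') {                           /* skip spaces */
            p++;
            continue;
        }
        if (isdigit((unsigned char)*p)) {
            int value = 0;
            while (isdigit((unsigned char)*p)) {   /* accumulate all digits */
                value = value * 10 + (*p - '0');
                p++;
            }
            if (size == 256) return -1;
            stack[size++] = value;
            continue;
        }
        if (size < 2) return -1;                   /* an operator needs two operands */
        int b = stack[--size];
        int a = stack[--size];
        switch (*p) {
        case '+': stack[size++] = a + b; break;
        case '-': stack[size++] = a - b; break;
        case '*': stack[size++] = a * b; break;
        case '/':
            if (b == 0) return -1;
            stack[size++] = a / b;
            break;
        default:
            return -1;
        }
        p++;
    }
    if (size != 1) return -1;
    *result = stack[0];
    return 0;
}
```

Note the design consequence: once operands may have several digits, adjacent digits form one number, so the earlier example has to be written with separators, as in "3 2 + 6 *", which evaluates to 30.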

Additional materials

University of Virginia "x86 Assembly Guide" - a more detailed treatment of many of the topics covered here, including more information on all the popular x86 instructions.
"The Art of Picking Intel Registers" - although most x86 registers are general-purpose, many of them have a historical role. Following these conventions can improve the readability of the code and, as an interesting side effect, even slightly reduce the size of the binaries.
NASM: Intel x86 Instruction Reference - a complete guide to all the lesser-known x86 instructions.