Assembly Language

5 min read · Mar 12, 2021


Assembly language is the lowest-level programming language apart from machine code itself. It controls the hardware directly, so it is tightly coupled to the hardware: each processor architecture has its own assembly language. In this article, I will introduce a basic loop program in an AArch64 version and an x86_64 version. I will use GNU Assembler syntax instead of NASM syntax, because NASM only works with x86 code. If you want to learn the basics of reading assembly code, you can find helpful guidelines in this article and this one also.

Loop 30

In this article, I will write a simple loop program. It prints the following output:

Loop: 0

Loop: 1

Loop: 2

Loop: 3

Loop: 4

Loop: 5

Loop: 24

Loop: 25

Loop: 26

Loop: 27

Loop: 28

Loop: 29

Loop: 30

This program will generate 0–30, not 00–30, so if the number is less than 10, the program outputs only a single digit.
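For orientation, the same behaviour can be sketched in a high-level language before reading the assembly. This Python sketch is only an illustration, not the code this article assembles, and the name loop_lines is made up here. It assumes the layout used by the listings in this article, where the tens position holds a space for single-digit numbers.

```python
def loop_lines(start=0, stop=31):
    """Return the lines the loop prints, one per index in [start, stop)."""
    lines = []
    for index in range(start, stop):
        # Width 2, padded with a space: 5 -> " 5", 24 -> "24" (never "05").
        lines.append("Loop: " + format(index, "2d") + "\n")
    return lines


if __name__ == "__main__":
    print("".join(loop_lines()), end="")
```

The default stop value of 31 mirrors the max constant in the listings, so the last line printed is the one for 30.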

X86_64 Version

.text
.global    _start

start = 0                          /* first loop index */
max = 31                           /* the loop stops when the index reaches this value */

_start:
              movb   $start,%r15b         /* loop index */
loop:
              cmpb    $9, %r15b    /* if the loop index is greater than 9, jump to the division label */
              jg     division

              movb   $32,%r13b     /* align output vertically, insert space */
              movb   %r13b,msg+6

              movb   $0,%r14b
              addb   %r15b,%r14b
              addb   $48,%r14b     /* convert the digit to its ASCII code */
              movb   %r14b,msg+7

              mov    $(len),%rdx                       /* message length */
              mov    $msg,%rsi                       /* message location */
              jmp    print

division:
              movb   $0,%al
              addb   %r15b,%al
              movb   $10,%r14b
              movb   $0,%ah           /* clear ah: divb divides the 16-bit value in ax */
              divb   %r14b            /* quotient -> al, remainder -> ah */

              addb   $48,%al
              movb   %al,msg+6
              addb   $48,%ah
              movb   %ah,msg+7

              mov    $(len),%rdx                       /* message length */
              mov    $msg,%rsi                       /* message location */

print:        /* standard output */
              movq    $1,%rdi                         /* file descriptor stdout */
              movq    $1,%rax                         /* syscall sys_write */
              syscall

              inc     %r15b                /* increment index */
              cmp     $max,%r15b           /* see if we're done */
              jne     loop                /* loop if we're not */

              mov     $0,%rdi             /* exit status */
              mov     $60,%rax            /* syscall sys_exit */
              syscall

.data
msg:
              .ascii      "Loop:   \n"
len = . - msg

I commented the code above, but I will also explain how it works. At the beginning, .text designates the code section. .global is a directive that makes the _start symbol visible to the linker, which uses _start as the entry point where execution begins.

At the _start label, the program initializes the loop index in r15b with $start, which is 0; the ‘b’ suffix selects the 8-bit part of the register. Then the program starts the loop. In the loop, if the index is greater than 9, it jumps to the division label. Otherwise, it stores 32 (the ASCII code for a space) at msg+6.

At the bottom of the code, you can see the .data section. Data placed in .rodata is read-only, while data placed in .data can be modified, which this program needs because it rewrites the string on every iteration. The msg label holds the address of the first byte of the string, so msg+6 is the byte at offset 6, that is, the seventh character when the first one is counted as offset 0.

By inserting the space, we vertically align the lines of our output. After storing the space, the program moves 0 into r14b to initialize it and then adds the index to r14b. To represent the number as an ASCII character, we add 48 (the ASCII code for ‘0’) to r14b.

Then we move the value of r14b to msg+7, which places the digit character at offset 7 of the string.
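The single-digit path can be mimicked with a mutable byte buffer. This is a rough Python sketch, not a translation of the assembly; the names SPACE, ASCII_ZERO and store_single_digit are hypothetical, and the comments only point at the instruction each line stands in for.

```python
SPACE = 32       # ASCII code of ' '
ASCII_ZERO = 48  # ASCII code of '0'


def store_single_digit(msg, index):
    """Write a space at offset 6 and the digit character at offset 7."""
    msg[6] = SPACE               # movb %r13b, msg+6
    msg[7] = ASCII_ZERO + index  # movb %r14b, msg+7
    return msg


buffer = bytearray(b"Loop:   \n")  # the same 9 bytes as msg in the .data section
store_single_digit(buffer, 7)
print(buffer)  # bytearray(b'Loop:  7\n')
```

A bytearray is used because, like the .data section, it can be modified in place.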

*Notice that the b suffix in movb means 8 bits, so the store replaces only one character of the msg string. A wider store would overwrite the neighbouring characters as well.
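To see why the store width matters, the following Python sketch compares an 8-bit and a 16-bit little-endian store at offset 7, using the standard struct module. It is an illustration under the assumption that the wider store would be a 16-bit one (movw); the variable names are made up.

```python
import struct

msg = bytearray(b"Loop:   \n")
struct.pack_into("<B", msg, 7, 48 + 5)  # 8-bit store, like movb
byte_store = bytes(msg)                 # only offset 7 changed

msg = bytearray(b"Loop:   \n")
struct.pack_into("<H", msg, 7, 48 + 5)  # 16-bit store, like movw
word_store = bytes(msg)                 # offsets 7 and 8 changed

print(byte_store)  # b'Loop:  5\n'
print(word_store)  # b'Loop:  5\x00'
```

In the second case the high byte of the 16-bit value, which is zero, lands on offset 8 and replaces the newline.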

If the index is greater than 9, the program jumps to the division label, which splits the number into two single digits so that a two-digit number can be written into the string. For a 64-bit divide you must clear rdx first, because the dividend is taken from rdx:rax. Here we use the 8-bit divb, whose dividend is the 16-bit ax register, so the upper half, ah, is cleared instead. After the division, the quotient is in al and the remainder is in ah.
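The effect of the 8-bit divide can be modelled in a few lines of Python. This is a simplified sketch: divb and store_two_digits are hypothetical helper names, and the model only covers what the listing relies on (dividend in ax, quotient in al, remainder in ah).

```python
def divb(ax, divisor):
    """Model of x86 divb: divide the 16-bit ax by an 8-bit divisor.

    Returns (al, ah), that is (quotient, remainder).
    """
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:
        # The real instruction raises a divide error when al cannot hold the quotient.
        raise OverflowError("quotient does not fit in al")
    return quotient, remainder


def store_two_digits(msg, index):
    """Split index into tens and ones and write both as ASCII digits."""
    ah, al = 0, index        # movb $0,%ah ; al holds the index
    ax = (ah << 8) | al      # ax is ah:al
    al, ah = divb(ax, 10)    # divb %r14b
    msg[6] = 48 + al         # tens digit at msg+6
    msg[7] = 48 + ah         # ones digit at msg+7
    return msg


print(bytes(store_two_digits(bytearray(b"Loop:   \n"), 27)))  # b'Loop: 27\n'
print(divb((1 << 8) | 27, 10))  # (28, 3): a stale 1 in ah changes the result
```

The last line shows why ah is cleared first: with a leftover 1 in ah, the dividend is 283 rather than 27.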

After the division, we set the length of the message in rdx and the address of msg in rsi, so rsi holds the start address of the string.

Next, at the print label, the program moves 1 into rdi to select standard output as the file descriptor. Then, after moving 1 into rax (the number of sys_write), it executes syscall, which writes the message to standard output.
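The three arguments of the write call (file descriptor, buffer address, length) can be seen from Python as well, where os.write passes a descriptor and a bytes-like object to the operating system's write call. This is only a sketch for comparison; sys_write and STDOUT are names chosen here, and the length is implied by the buffer instead of being passed separately.

```python
import os

STDOUT = 1  # file descriptor 1, the value the assembly puts in rdi


def sys_write(fd, buf):
    """Write buf to fd and return the number of bytes written."""
    return os.write(fd, buf)


if __name__ == "__main__":
    sys_write(STDOUT, b"Loop: 10\n")
```

The return value corresponds to what the kernel leaves in rax after the syscall.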

Once the program has printed the number, it increments the index and jumps back to loop until the value in r15b equals max.
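The loop control at the end of the listing behaves like a do-while loop: the body runs first and the comparison comes afterwards. A minimal Python sketch of that control flow, with the made-up name run_loop and the 8-bit width of r15b modelled by masking, could look like this:

```python
def run_loop(start=0, stop=31):
    """Return the indices visited by an inc / cmp / jne style loop."""
    index = start
    visited = []
    while True:
        visited.append(index)       # loop body: format and print the line
        index = (index + 1) & 0xFF  # inc %r15b, an 8-bit register wraps at 256
        if index == stop:           # cmp $max,%r15b ; jne loop
            break
    return visited


print(run_loop()[-1])  # 30
```

Because the test is for equality, a start value at or beyond stop would not end the loop immediately; the index would wrap around first.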

When the index reaches the limit, the program moves 0 into rdi as the exit status. 60 is the syscall number of sys_exit, so after setting rax to 60, we execute syscall and the program ends.

Aarch64 Version

.text
.globl _start

start = 0
max = 31 // the max value for loop
zero = 48

_start:
        mov     x19,start  // x19 loop index
        mov     x25,10     // x25 holds number 10

        mov     x2, len    // set message length
        adr     x1, msg    // set message address

        mov     x23,32     // set x23 to hold 32 which represents space in the ascii format
                           // x23 is overwritten in div; that is fine here because the index
                           // only increases, but if the index could drop back below 10,
                           // this instruction should be located within the loop
loop:
/* Loop start */
        mov     x0, 1  // set fd = 1 (stdout)

        cmp     x19, 9 // if index x19 is greater than 9, go to div
        b.gt    div

        add     x24,x19,48 // add 48 to the index to get the digit in ascii format
        strb    w24, [x1,len-2] // write the digit into the string
        strb    w23, [x1,len-3] // write a space into the string to align vertically

        b       print  // jump to print
div:
        udiv    x23,x19,x25     // divide the index x19 by 10 and save the quotient in x23
        msub    x24,x23,x25,x19 // remainder = x19 - x23 * x25

        add     x23,x23,48 // set number as ascii format
        add     x24,x24,48 // set number as ascii format

        strb    w23,[x1,len-3] // write numbers
        strb    w24,[x1,len-2] // write numbers
print:
        mov     x8, 64        // set syscall, 64 = write
        svc     0             // syscall
        add     x19,x19,1     // increase the loop index
/* Loop end */
        cmp     x19, max     // if the index is less than max, jump to loop
        b.lt    loop

        mov     x0, 0        // exit status
        mov     x8, 93       // set syscall, 93 = exit
        svc     0

.section .data
msg:    .ascii          "Loop:   \n"
len=    . - msg

The AArch64 version is very similar to the x86_64 version, so here I will only explain the differences. There are three main differences between the x86_64 and AArch64 code. First, the operand order is reversed: in the GNU (AT&T) syntax used for x86_64, the source comes first and the destination last, while in AArch64 the destination comes first. Second, the registers have different names and roles; for example, the syscall number goes in rax on x86_64 but in x8 on AArch64. Finally, the instructions differ: AArch64 uses a dedicated store instruction, strb, to write a character to memory, whereas x86_64 uses mov for that, and AArch64 computes the remainder with udiv followed by msub, whereas divb on x86_64 yields the quotient and the remainder together. Therefore, instead of explaining the AArch64 code line by line, I suggest reading it and comparing the major differences between the two architectures.
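Since AArch64 has no instruction that returns quotient and remainder together, the listing derives the remainder with a multiply-subtract. The arithmetic can be checked with a small Python sketch; udiv, msub and split_digits are illustrative function names that follow the operand order of the corresponding instructions, not an existing library.

```python
def udiv(xn, xm):
    """Unsigned divide; the AArch64 instruction yields 0 when dividing by zero."""
    return 0 if xm == 0 else xn // xm


def msub(xn, xm, xa):
    """Multiply-subtract: xa - xn * xm."""
    return xa - xn * xm


def split_digits(index):
    tens = udiv(index, 10)        # udiv x23,x19,x25
    ones = msub(tens, 10, index)  # msub x24,x23,x25,x19
    return tens, ones


print(split_digits(27))  # (2, 7)
```

For 27, the quotient is 2 and the remainder is 27 - 2 * 10 = 7, the same pair that divb leaves in al and ah on x86_64.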

End…

At the beginning of this lab, I was afraid of assembly languages because they are so different from high-level programming languages. After finishing this lab, I have learned how the hardware works at a low level. Even though I’m still learning, I’m glad to know what goes on behind the scenes of programming languages and compilers.


Written by Yohan