Assembly Language

Yohan
5 min read · Mar 12, 2021

Assembly language is the lowest-level programming language apart from machine language itself, and it controls the hardware directly. Because it is tied so closely to the hardware, each hardware architecture has its own assembly language. In this article, I will introduce a basic loop program in an AArch64 version and an x86_64 version. I will use GNU Assembler syntax rather than NASM syntax, because NASM only works with x86 code. If you want to learn the basics of reading assembly code, you can find helpful guidelines in this article and this one as well.

Loop 30

In this article, I will write a simple loop program. It prints the following output:

Loop:  0
Loop:  1
Loop:  2
Loop:  3
Loop:  4
Loop:  5
...
Loop: 24
Loop: 25
Loop: 26
Loop: 27
Loop: 28
Loop: 29
Loop: 30

This program prints 0–30, not 00–30: a number below 10 is printed as a single digit, padded with a space so that the output stays vertically aligned.
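Before looking at the assembly, it may help to see the same formatting logic in C. This is only a rough sketch for orientation, not part of the lab: the function name format_line is hypothetical, and the sketch assumes a 9-byte message template that reserves two bytes for the digits at offsets 6 and 7.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "Loop: NN\n" for 0 <= i <= 99 into out
 * (which must hold at least 10 bytes) and returns the line length. */
size_t format_line(unsigned i, char *out)
{
    static const char tmpl[] = "Loop:   \n";   /* digit slots at offsets 6 and 7 */
    memcpy(out, tmpl, sizeof tmpl);            /* copies the trailing NUL as well */

    if (i > 9)
        out[6] = (char)('0' + i / 10);         /* tens digit */
    /* for i <= 9, offset 6 keeps its padding space */
    out[7] = (char)('0' + i % 10);             /* ones digit */

    return sizeof tmpl - 1;                    /* length without the NUL */
}
```

Calling format_line for every index from 0 to 30 and writing each buffer to standard output would reproduce the listing above.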

x86_64 Version

.text
.global _start
start = 0                       /* first loop index */
max = 31                        /* the loop ends when the index reaches this value */
_start:
    movb $start,%r15b           /* loop index */
loop:
    cmpb $9,%r15b               /* if the loop index is greater than 9, jump to the division label */
    jg division

    movb $32,%r13b              /* align output vertically, insert space (ASCII 32) */
    movb %r13b,msg+6
    movb $0,%r14b
    addb %r15b,%r14b
    addb $48,%r14b              /* convert the digit to ASCII ('0' is 48) */
    movb %r14b,msg+7
    mov $(len),%rdx             /* message length */
    mov $msg,%rsi               /* message location */
    jmp print
division:
    movb $0,%al
    addb %r15b,%al
    movb $10,%r14b
    movb $0,%ah                 /* clear ah: divb divides the 16-bit value in ax */
    divb %r14b                  /* quotient -> al, remainder -> ah */

    addb $48,%al                /* tens digit to ASCII */
    movb %al,msg+6
    addb $48,%ah                /* ones digit to ASCII */
    movb %ah,msg+7

    mov $(len),%rdx             /* message length */
    mov $msg,%rsi               /* message location */
print:                          /* standard output */
    movq $1,%rdi                /* file descriptor stdout */
    movq $1,%rax                /* syscall sys_write */
    syscall

    inc %r15b                   /* increment index */
    cmp $max,%r15b              /* see if we're done */
    jne loop                    /* loop if we're not */
    mov $0,%rdi                 /* exit status */
    mov $60,%rax                /* syscall sys_exit */
    syscall
.data
msg:
    .ascii "Loop:   \n"         /* msg+6 and msg+7 are reserved for the digits */
len = . - msg

I commented the code above, but I will also explain how it works. At the beginning, .text marks the start of the code section. .global is a directive that makes the _start symbol visible to the linker, which uses _start as the entry point of the program.

At the _start label, the program initializes the loop index in r15b to $start, which is 0. The ‘b’ suffix selects the lowest 8 bits of the r15 register. Then the loop starts. In the loop, if the index is greater than 9, the program jumps to the division label. Otherwise, it stores 32 (the ASCII code for a space) at msg+6.

At the bottom of the code, you can see the .data section. Data placed in .rodata can only be read, while data placed in .data can also be modified, which this program needs because it rewrites the message on every iteration. The msg label holds the address of the first byte of the string. Therefore, msg+6 refers to the byte at offset 6, that is, the seventh character of the string.
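The difference between writable and read-only data, and the meaning of msg+6, can be sketched in C. This is an illustration under assumptions rather than a description of the assembler itself: where a compiler places each object is typical behavior, not a guarantee, and the names ro_msg and store_at are hypothetical.

```c
#include <stddef.h>

/* Writable array: typically placed in .data, like msg in the assembly. */
char msg[] = "Loop:   \n";

/* String literal: typically placed in read-only storage such as .rodata.
 * Writing through this pointer would be undefined behavior. */
const char *ro_msg = "Loop:   \n";

/* Hypothetical helper: stores c at byte offset off from the start of msg,
 * the C counterpart of "movb ...,msg+off". */
void store_at(size_t off, char c)
{
    *(msg + off) = c;            /* identical to msg[off] = c */
}
```

Offsets start at 0, so store_at(6, '4') changes the seventh character of the string.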

Inserting the space keeps the numbers vertically aligned in the output. After inserting the space, the program moves 0 into r14b to initialize it, then adds the value of the index to r14b. To represent the number as an ASCII character, we have to add 48 (the ASCII code of the character '0') to r14b.

Then we move the value of r14b to msg+7, which writes the digit at offset 7 of the string.
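The single-digit path can be modeled step by step in C. This is a minimal sketch, assuming a 9-byte "Loop:   \n" buffer with the digit slots at offsets 6 and 7. The function name patch_single_digit is hypothetical, and the local variables are named after the registers only to make the correspondence visible.

```c
#include <stdint.h>

/* Hypothetical model of the single-digit path: index must be 0..9 and
 * msg must point at a buffer with digit slots at offsets 6 and 7. */
void patch_single_digit(char *msg, uint8_t index)
{
    uint8_t r13b = 32;                   /* movb $32,%r13b: ASCII space */
    msg[6] = (char)r13b;                 /* movb %r13b,msg+6 */

    uint8_t r14b = 0;                    /* movb $0,%r14b */
    r14b = (uint8_t)(r14b + index);      /* addb %r15b,%r14b */
    r14b = (uint8_t)(r14b + 48);         /* addb $48,%r14b: 48 is '0' */
    msg[7] = (char)r14b;                 /* movb %r14b,msg+7 */
}
```

For index 7, the byte stored at offset 7 is 55, which is the ASCII code of the character '7'.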

*Notice that the ‘b’ suffix in movb means the instruction moves 8 bits, so it replaces only one character of the msg string. A wider move would overwrite the neighboring characters as well.
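To see why the operand size matters, here is a small C sketch that compares a one-byte store with a four-byte store at the same offset. The helper names are hypothetical, and memcpy stands in for a 32-bit move such as movl. The caller is assumed to pass a buffer large enough for the wider store.

```c
#include <stdint.h>
#include <string.h>

/* Byte-sized store: only msg[off] changes, as with movb. */
void store_byte(char *msg, size_t off, uint8_t value)
{
    msg[off] = (char)value;
}

/* 32-bit store at the same offset: four bytes change, as with a wider move.
 * The caller must guarantee that msg holds at least off + 4 bytes. */
void store_dword(char *msg, size_t off, uint32_t value)
{
    memcpy(msg + off, &value, sizeof value);
}
```

On a little-endian machine such as x86_64, store_dword(buf, 7, '5') writes the digit followed by three zero bytes, so the newline at offset 8 is lost.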

If the index is greater than 9, the program jumps to the division label, which splits the number into two single digits so that a two-digit number can be written into the string. With a 64-bit div, you must zero rdx first, because the dividend is taken from rdx:rax. Here we use the 8-bit divb, whose dividend is the 16-bit ax register, so the program clears ah instead. After the division, the quotient is in al and the remainder is in ah.
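The register behavior of the 8-bit division can be modeled in C. This is a simplified sketch: the type divb_result and the function divb are hypothetical names, and the model assumes a nonzero divisor and a quotient that fits in 8 bits (the real instruction raises a divide error otherwise).

```c
#include <stdint.h>

/* Hypothetical model of "divb %r14b": the 16-bit dividend in ax is divided
 * by an 8-bit divisor; the quotient lands in al, the remainder in ah. */
typedef struct {
    uint8_t al;   /* quotient */
    uint8_t ah;   /* remainder */
} divb_result;

divb_result divb(uint16_t ax, uint8_t divisor)
{
    divb_result r;
    r.al = (uint8_t)(ax / divisor);
    r.ah = (uint8_t)(ax % divisor);
    return r;
}
```

For the index 27 and the divisor 10, al holds 2 and ah holds 7. Adding 48 to each gives the characters '2' and '7'.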

After the division, both digits are converted to ASCII and stored at msg+6 and msg+7. Then we put the length of the message in rdx and the address of msg in rsi, so rsi holds the start address of the message.

Next, at the print label, we move 1 into rdi to select standard output as the file descriptor. After moving 1 into rax, which selects sys_write, we invoke syscall, and the message is written to standard output.
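For comparison, the same output step in C might look like the sketch below. It uses the POSIX write function rather than a raw syscall instruction, on the assumption that the C library forwards the call to the kernel with the same three arguments. The helper name print_msg is hypothetical.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a NUL-terminated message to standard output.
 * Returns the number of bytes written, or -1 on error. */
ssize_t print_msg(const char *msg)
{
    size_t len = strlen(msg);                  /* rdx: message length */
    return write(STDOUT_FILENO, msg, len);     /* rdi = 1, rsi = msg, rdx = len */
}
```

The assembly version does not need strlen, because the assembler computes len at build time from the position of the msg label.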

Once the program has printed a number, it increments the index and repeats the loop until the value of r15b equals max, which is 31.
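In C terms, this loop control behaves like a do-while loop, because the body runs before the index is tested. The sketch below only counts iterations and assumes that start is less than max. The function name count_iterations is hypothetical.

```c
/* Hypothetical model of the loop control: the body runs first, then the
 * index is incremented and compared with max (inc / cmp / jne).
 * Assumes start < max; otherwise the index would have to wrap around. */
unsigned count_iterations(unsigned start, unsigned max)
{
    unsigned index = start;
    unsigned iterations = 0;
    do {
        iterations++;            /* stands in for the print step */
        index++;                 /* inc %r15b */
    } while (index != max);      /* cmp $max,%r15b ; jne loop */
    return iterations;
}
```

With start = 0 and max = 31, the body runs 31 times, once for each of the numbers 0 to 30.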

When the index reaches that limit, the program moves 0 into rdi as the exit status. System call number 60 is sys_exit, so after setting rax to 60, we invoke syscall and the program ends.

AArch64 Version

.text
.globl _start
start = 0                       // first loop index
max = 31                        // the max value for loop
zero = 48                       // ASCII code of '0'
_start:
    mov x19,start               // x19 loop index
    mov x25,10                  // x25 holds number 10
    mov x2, len                 // set message length
    adr x1, msg                 // set message address
    mov x23,32                  // set x23 to hold 32 which represents space in the ascii format
                                // setting x23 only once works because the index only increases:
                                // the div branch overwrites x23, but the single-digit branch
                                // is never taken again after that; otherwise this instruction
                                // should be located within the loop
loop:
    /* Loop start */
    mov x0, 1                   // set fd = 1 (stdout); svc overwrites x0 with its return value
    cmp x19, 9                  // if index x19 is greater than 9, branch to div
    b.gt div

    add x24,x19,48              // add 48 to the index to get the digit in the ascii format
    strb w24, [x1,len-2]        // write the digit into the string
    strb w23, [x1,len-3]        // write a space into the string to align vertically

    b print                     // jump to print (plain branch, no return address needed)
div:
    udiv x23,x19,x25            // divide the index x19 by 10 and save the quotient in x23
    msub x24,x23,x25,x19        // get the remainder: x24 = x19 - x23 * x25

    add x23,x23,48              // set number as ascii format
    add x24,x24,48              // set number as ascii format

    strb w23,[x1,len-3]         // write the tens digit
    strb w24,[x1,len-2]         // write the ones digit
print:
    mov x8, 64                  // set syscall, 64 = write
    svc 0                       // syscall
    add x19,x19,1               // increase the loop index
    /* Loop end */
    cmp x19, max                // if the index is less than max, jump to loop
    b.lt loop
    mov x0, 0                   // exit status
    mov x8, 93                  // set syscall, 93 = exit
    svc 0
.section .data
msg: .ascii "Loop:   \n"        // len-3 and len-2 are the offsets reserved for the digits
len= . - msg

The AArch64 version is quite similar to the x86_64 version, so here I will only explain the differences between the two. There are three main differences. First, the operand order is reversed: in the AT&T syntax that GNU Assembler uses for x86_64, the source comes first and the destination last, while in AArch64 the destination comes first. Second, the registers have different names and purposes. Finally, the instruction sets differ. For instance, AArch64 uses the strb instruction to store a character through an address held in a register, while x86_64 has no separate store instruction and writes to memory with mov. Therefore, instead of walking through the AArch64 code, I suggest reading it and comparing the major differences between the two architectures.
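One instruction-level difference that is easy to miss is how the remainder is obtained. The AArch64 code pairs udiv with msub, because udiv returns only the quotient. The C sketch below models that pair. The function name udiv_msub is hypothetical, and the sketch assumes a nonzero divisor.

```c
#include <stdint.h>

/* Hypothetical model of the AArch64 instruction pair
 *   udiv x23,x19,x25       x23 = x19 / x25
 *   msub x24,x23,x25,x19   x24 = x19 - x23 * x25
 * Assumes d != 0, since division by zero is undefined in C. */
void udiv_msub(uint64_t n, uint64_t d, uint64_t *quot, uint64_t *rem)
{
    *quot = n / d;               /* udiv: quotient only */
    *rem  = n - *quot * d;       /* msub: multiply, then subtract from n */
}
```

For n = 27 and d = 10 this gives the quotient 2 and the remainder 7, the same two digits that divb leaves in al and ah on x86_64.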

End…

At the beginning of this lab, I was afraid of assembly language because it is so different from high-level programming languages. After finishing this lab, I have a better idea of how the hardware works at a low level. Even though I’m still learning, I’m glad to know what goes on behind the scenes of programming languages and compilers.
