Assembly 16-bits: Printing Strings

Mr Empy
Jul 10, 2022

Introduction

Assembly is the programming language closest to machine language, and it gives us direct control over the hardware. In the early days of computing, many developers wrote their software in assembly and shipped it on floppy disks; Microsoft's own MS-DOS is one example.

In this article, you will learn how to print a string using 16-bit assembly and how this action occurs.

Requirements

Before we start, we need the right tools:

  • NASM: to assemble the code
  • QEMU: to emulate a machine that boots the assembled code

Creating the Code

We will look at two ways to print a string. For the first one, create a file called example1.asm and open it.

In the first few lines, we will add two directives:

ORG 0x7C00
BITS 16

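ORG tells the assembler that our code will be loaded at address 0x7C00, so every label is calculated relative to it. This is where the BIOS places the boot sector after the computer is turned on: it reads the first 512-byte sector of the boot disk, known as the MBR (Master Boot Record) on a partitioned disk, into memory at 0x7C00 and jumps to it. That is also why our program cannot be larger than 512 bytes.

BITS 16 tells NASM to generate 16-bit code, which is what the processor expects at this point because it starts in 16-bit real mode.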

We continue with code development:

mov ah, 0x0E
mov al, "O"
int 0x10

mov ah, 0x0E: we use the MOV instruction to set AH to 0x0E (AH = 0x0E). AH is the high byte of the 16-bit AX register, and the BIOS video interrupt reads it to decide which function to run. 0x0E is the BIOS function that writes a character in teletype (TTY) mode.

mov al, "O": we set AL to the character "O" (AL = "O"). AL is the low byte of AX, and the teletype function takes the character to print from it.

int 0x10: this executes our call. INT is an x86 instruction that takes an interrupt number as a byte value and triggers that software interrupt. Interrupt 0x10 is the BIOS video service, which looks at AH and prints the character in AL.

Now we will write the last lines:

jmp $
times 510 - ($-$$) db 0
dw 0xAA55

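jmp $: in NASM, $ is the address of the current instruction, so this jump targets itself. The result is an infinite loop that keeps the processor from running past the end of our code.

times 510 - ($-$$) db 0: $$ is the start of the section, so $-$$ is the number of bytes emitted so far. This line pads the code with zeros up to byte 510, leaving exactly two bytes for the signature so that the file is 512 bytes long.

dw 0xAA55: this marks the end of the boot sector with the magic number 0xAA55. The BIOS only treats a sector as bootable if its last two bytes are 0x55 and 0xAA.

Remember that this code is a boot sector (bootloader), not a regular program.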
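To make the padding and the signature more concrete, here is a minimal sketch of a boot sector that does nothing except loop forever. The label name start is only illustrative. It writes the signature as two separate bytes to show the order in which dw 0xAA55 stores them, since x86 is little-endian.

```nasm
ORG 0x7C00
BITS 16

start:                          ; illustrative label, not required
    jmp $                       ; loop forever

times 510 - ($ - $$) db 0       ; $ - $$ = bytes emitted so far; pad with zeros up to offset 510
db 0x55, 0xAA                   ; the same two bytes that dw 0xAA55 emits, low byte first
```

Assembled with nasm -f bin, this should produce a file of exactly 512 bytes.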

The complete code will be this:

ORG 0x7C00
BITS 16
mov ah, 0x0E
mov al, "O"
int 0x10
jmp $
times 510 - ($-$$) db 0
dw 0xAA55

Save the file and assemble the code using the command:

$ nasm -f bin -o example1.bin example1.asm

After assembling it, run it with the command:

$ qemu-system-x86_64 example1.bin

To print more characters, we can repeat those three lines and change the character each time. For example:

ORG 0x7C00
BITS 16
mov ah, 0x0E
mov al, "O"
int 0x10
mov ah, 0x0E
mov al, "K"
int 0x10
jmp $
times 510 - ($-$$) db 0
dw 0xAA55
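A slightly shorter variant is sketched below. It assumes the BIOS leaves AH unchanged when interrupt 0x10 returns, which is the same assumption the loop later in this article relies on, so AH is set only once.

```nasm
ORG 0x7C00
BITS 16

mov ah, 0x0E        ; select teletype output once
mov al, "O"
int 0x10            ; prints "O"
mov al, "K"         ; assumption: AH still holds 0x0E here
int 0x10            ; prints "K"

jmp $
times 510 - ($-$$) db 0
dw 0xAA55
```

If a BIOS did modify AH, the version above with the repeated mov ah, 0x0E would be the safer choice.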

Repeating the same lines for every character quickly becomes tedious, so let's automate it.

Create a file called example2.asm and write this code:

ORG 0x7C00
BITS 16
mov si, msg
call PrintStr
jmp $

Two of these lines deserve a closer look.

mov si, msg: SI is loaded with the address of msg, a label that marks where our message is stored. We will define it at the end.

call PrintStr: this calls PrintStr, a function we are about to write that prints the string SI points to.

Put together, in a higher-level language the call would look like "PrintStr(msg);".

Now we will write the PrintStr function.

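PrintStr:
    mov ah, 0x0E   ; BIOS teletype output
    mov al, [si]   ; AL = first character of the string
psloop:
    int 0x10       ; print the character in AL
    inc si         ; point SI at the next character
    mov al, [si]   ; load it into AL
    cmp al, 0      ; reached the terminating 0?
    jne psloop     ; no: go back and print it
    ret            ; yes: return to the caller

msg db "test", 13, 10, 0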

Look at the last line first. It defines msg as the bytes of "test", followed by 13 and 10 (carriage return and line feed, which move the cursor to the start of the next line) and a final 0 that marks the end of the string. As mentioned above, mov si, msg puts the address of these bytes in SI, not the bytes themselves.

The first two lines of PrintStr set up the BIOS call: AH selects the teletype function and AL receives the first character. Note the square brackets around SI ([si]). They mean "the byte stored at the address in SI", so only the first character is copied to AL. Without the brackets the code would not assemble, because SI is a 16-bit register and cannot be moved into the 8-bit AL.
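The difference between an address and the byte stored at it can be sketched in a few lines. This is only an illustration and prints nothing. Like the rest of the article's code, it assumes DS is 0 when the BIOS hands over control.

```nasm
ORG 0x7C00
BITS 16

mov si, msg         ; SI = address of msg (a 16-bit number)
mov al, [si]        ; AL = byte stored at that address ("t")
mov al, [msg]       ; the same byte, addressed directly through the label
; mov al, si        ; rejected by NASM: AL is 8 bits wide, SI is 16
jmp $

msg db "test", 0

times 510 - ($-$$) db 0
dw 0xAA55
```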

Next comes a label called "psloop", which marks the start of the loop that prints one character per pass until the string ends. Its first instruction triggers interrupt 0x10, which prints the character currently in AL.

inc si: INC means increment. It adds 1 to SI so that it points to the next character: "t" has just been printed, so SI now points to "e".

mov al, [si]: AL is loaded with the character SI now points to ("e").

cmp al, 0: this compares AL with 0, and the jne (jump if not equal) on the next line jumps back to "psloop" when they differ. This works because we ended msg with a 0 byte: any other value means there is still a character to print, and 0 means the string is over.

In a C-like language, the check would look like this:

if (al != 0) {
goto psloop;
}

In short, the function prints the first character, moves to the next one and checks whether it is the terminating 0. If it is not, it jumps back to psloop and prints the character that was loaded just before the comparison. If it is, the function returns.
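To see why comparing against 0 finds the end of the string, it can help to look at the bytes the message line emits. In the sketch below, msg2 is a hypothetical second label added only for comparison. It spells out the same seven bytes in hexadecimal.

```nasm
msg  db "test", 13, 10, 0
msg2 db 0x74, 0x65, 0x73, 0x74, 0x0D, 0x0A, 0x00   ; "t" "e" "s" "t" CR LF terminator
```

The loop stops at the final 0x00. The 0x0D and 0x0A before it are printed like any other character, which is what moves the cursor to the next line.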

The final code:

ORG 0x7C00
BITS 16
mov si, msg ; char* si = msg;
call PrintStr ; PrintStr(si);
jmp $

PrintStr:
mov ah, 0x0E
mov al, [si]
psloop:
int 0x10
inc si
mov al, [si]
cmp al, 0
jne psloop
ret

msg db "test", 13, 10, 0 ; char msg[] = "test\r\n";

times 510 - ($-$$) db 0
dw 0xAA55

Assembling:

$ nasm -f bin -o example2.bin example2.asm

Executing:

$ qemu-system-x86_64 example2.bin

Result:

We were able to print our message on the screen!
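The code above relies on the BIOS leaving DS at 0 and providing a usable stack, which holds in QEMU but is not guaranteed on every machine. A more defensive variant is sketched below. It is not the code used in this article, only one possible way to write it. The labels start, .next and .done, and the choice of 0x7C00 as the top of the stack, are assumptions made for this example. It also uses lodsb, which loads the byte at SI into AL and advances SI in one instruction, and it checks for the terminating 0 before printing, so an empty string prints nothing.

```nasm
ORG 0x7C00
BITS 16

start:
    xor ax, ax
    mov ds, ax          ; DS = 0 so DS:SI matches the addresses ORG 0x7C00 produces
    mov ss, ax
    mov sp, 0x7C00      ; stack grows downward, just below our code
    cld                 ; clear the direction flag so lodsb moves SI forward

    mov si, msg
    call PrintStr
    jmp $

PrintStr:
    mov ah, 0x0E        ; BIOS teletype output
.next:
    lodsb               ; AL = byte at DS:SI, then SI = SI + 1
    test al, al         ; is AL the terminating 0?
    jz .done            ; yes: stop without printing it
    int 0x10            ; no: print the character
    jmp .next
.done:
    ret

msg db "test", 13, 10, 0

times 510 - ($-$$) db 0
dw 0xAA55
```

It can be assembled and run with the same nasm and qemu-system-x86_64 commands shown above.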
