#2 Hello world (part 2 of 2 - programming)
Now that the previous blog has covered the basic knowledge we need to write our first program, it's time to start writing code!
What do you need as a novice programmer
If you want to write programs in assembly language you need at least 3 tools:
- A text editor to write your assembly,
- An assembler that assembles ("translates") your assembly language into machine code,
- An MSX computer (emulator or real) to run your program.
You can use 3 separate programs for this, and you probably will when you are more advanced, but to get started there is a perfect all-in-one solution: MSXPen! It's even web-based, so you don't have to install anything and you can use it right away on most operating systems.
Visit the website https://msxpen.com/ and you'll see the MSXPen screen.
On the left side of the screen are two tabs where you can switch between a text editor to write MSX BASIC code and a text editor for assembly language (Asm). On the right side is an emulated MSX computer. By default it is an MSX2+ computer, but as this model is backwards compatible with MSX1 you don't need to change it for now. If you want to, you can change it by clicking on the gear icon and then on Select Machine.
How does MSXPen work?
- First you write your code on either the BASIC tab, the Asm tab or on both.
- Then click on Run (or use Ctrl + Alt + R).
- The emulator reboots.
- You will find that your assembly language is assembled into machine code and stored on the virtual disk drive under the name "PROGRAM.BIN".
- The BASIC program is loaded and automatically executed.
- If you make changes on either one of the text tabs or on both, you have to press Run again to see these changes in the emulator.
Now let's create our first program. This is going to be the shortest program possible. It's just one byte long and the only thing it does is immediately return to BASIC after it is started.
We will be using the ret instruction, which is short for return. This instruction belongs together with the call instruction. The call instruction makes the program counter jump to another place in memory. The instructions that are stored at this place in memory will be executed until a ret instruction comes up. The program counter then jumps back to the instruction just after the call instruction.
If you have ever programmed in (MSX) BASIC you might notice that this works kind of like GOSUB and RETURN. These 2 instructions also let our program jump to another place in our program, execute some instructions and then return to where it left off.
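A minimal sketch of how call and ret work together. The label routine is a made-up name for this illustration, and the org and end lines follow the conventions that are explained further on in this post:

```asm
        org 0xD000

start:
        call routine    ; jump to routine; the Z80 remembers where to come back to
        ret             ; we arrive here once routine has finished, then return to BASIC

routine:
        ret             ; jump back to the instruction just after the call

        end start
```

Just like the one-byte program, this sketch shows nothing on screen. It only makes the extra round trip through routine before returning to BASIC.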
When we start a piece of machine code from BASIC the ret instruction ensures that we return to BASIC again as this machine code is also started by a call instruction. So unless we don't want to return to BASIC our last instruction will always be ret.
Now let us first enter one line of BASIC that will load our machine code program from disk and execute it. This line will be automatically executed as soon as the emulator reboots after we press Run. Make sure that the BASIC tab is selected and enter this line:
10 BLOAD"PROGRAM.BIN",R
Then select the Asm tab and enter the following lines of assembly language:
    org 0xD000      ; this is the memory address where our program is stored

start:              ; this indicates the start of our program
    ret             ; return to BASIC

    end start       ; this indicates the end of our program
Notice that there is indentation on lines 1, 4 and 6. This is done on purpose. All the orange/brown text after the semicolons is just comments. You can leave them out if you want to, but remember that comments in a program are very important as they explain what the instructions do. Especially in assembly language it might not be so clear what the instructions are for. Comments are not just for others who will read your code. When you see your code again in a couple of weeks or months it might be very hard to understand it if you don't use comments.
Now let's get this program started! Press Run and see what happens! If you did not make any typing errors the virtual MSX will reboot and then... nothing happens?
Well, actually something did happen. Our little program (just one byte long) started and then immediately returned to BASIC. Let's explain our program and see if we can find it in our virtual MSX. I will go over it line by line. As I told you, everything after a semicolon is considered a comment. It is ignored by the assembler, so I'm leaving the comments out now.
org 0xD000
The word org is a directive. It's not a Z80 instruction but it provides the assembler with information. In this case it tells the assembler the memory address at which our machine code should start.
Then it says 0xD000. This stands for the hexadecimal value D000. Remember that I told you that programs use different ways to make clear that a value is binary, decimal or hexadecimal? In MSXPen a value is considered hexadecimal when it starts with 0x.
Ok, so org 0xD000 means: store our machine code from memory address D000 (hexadecimal) and beyond. But why is 0xD000 chosen as the right address?
Well, there is a lot to say about how the RAM of an MSX is used, but I'll skip that for now. Just take my word for it that for our first experiments this is a good place in memory. We will come back to that later.
start:
This is also not a Z80 instruction but provides information to the assembler. It is called a label and it indicates a certain point in our program. In this case the label indicates where our actual code starts. So all the assembly instructions come after this label.
ret
Well, finally we are there. This is an instruction for the Z80. The instruction ret will be assembled into a code that makes sense to the Z80. In fact it will be the opcode (operation code) with the hexadecimal value C9.
end start
Sorry, again this is not a real Z80 instruction that will be assembled into machine code. It's a pseudo instruction and also provides the assembler with information. It indicates where our code ends. So all the assembly instructions must precede end start.
Ok, so if everything went well we should be able to find the byte C9 that stands for the ret instruction at memory address D000. Let's have a look. Go to the virtual MSX on the right side of the screen and enter this:
PRINT HEX$(PEEK(&HD000))
Then press <ENTER> and the MSX prints:
C9
Ok, this little line of BASIC displays the hexadecimal value of the byte stored at memory address D000 and there we indeed find the value C9.
Now can we run our little, useless piece of machine code again? Yes, it's still in memory and there are BASIC instructions to execute a piece of machine code in memory. First enter this:
DEFUSR=&HD000
Press <ENTER> again. This instruction tells BASIC that a machine-language routine starts at address D000.
To start this routine enter this:
A=USR(0)
After pressing <ENTER> the machine code at memory address D000 will be executed.
Ok, it still looks like nothing happens but that's not true. The program counter of the Z80 jumps to memory address D000, reads the value C9, executes the ret instruction and we immediately return to BASIC. We can't see it, but it does happen! Now let's expand our program to something we can see. We will display the character 'H' on the screen. Change the code on the Asm tab to this:
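Pieced together from the line-by-line explanation that follows, the program looks roughly like this. The comments and the exact spacing are an assumption:

```asm
CHPUT:  equ 0x00A2      ; address of the BIOS routine that displays a character

        org 0xD000

start:
        ld a,'H'        ; put the character code of 'H' in register a
        call CHPUT      ; let the BIOS display it
        ret             ; return to BASIC

        end start
```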
CHPUT: equ 0x00A2
This line begins with the label CHPUT and it's followed by the directive equ and a hexadecimal value of 00A2. This label does not point to a place in our program. It stands for a symbol and we can use this symbol in our assembly instructions. Every time we use CHPUT it will be replaced by 0x00A2 when our program is assembled. So this symbol CHPUT is the equivalent of the value 00A2 and its only function is to make our code more readable.
In the previous blog I wrote that there are several pieces of code in the BIOS that we can use in our own programs. They are referred to as BIOS calls. As I already explained, a call instruction jumps to another place in memory. There some instructions are executed and then the program counter jumps back as soon as it executes the ret instruction.
One of the BIOS calls is named CHPUT. It puts a character on the screen. All we have to do is provide the right character code in register a and then make a call to the address of CHPUT in the BIOS. The piece of code in the BIOS then does all the other work when we call it and as soon as it is finished the program counter returns to where it left and the rest of our code is executed.
ld a,'H'
This instruction loads the character code of 'H' into register a. The Z80 works with bytes and not characters. Therefore all characters have a specific number, a code, that can be used for example in this BIOS call. The codes an MSX computer uses are ASCII codes. You can find a table of these ASCII codes in a lot of places on the internet, for example on Wikipedia. The character 'H' has a code of 72 (or 0x48), so we could also have used ld a,72 or ld a,0x48. But our assembler will automatically use the right code when we use a character in quotation marks. Don't forget the quotation marks! If you leave them out, the assembler assumes that register h is meant (ld a,h).
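A small sketch to illustrate this: the three ways of writing the value should all assemble to the same two bytes, 3E (the opcode for loading a value into register a) followed by 48. The byte values in the comments come from the Z80 instruction set, not from the original listing:

```asm
        org 0xD000

start:
        ld a,'H'        ; 3E 48, the assembler looks up the ASCII code for us
        ld a,72         ; 3E 48, the same code written in decimal
        ld a,0x48       ; 3E 48, the same code written in hexadecimal
        ret             ; return to BASIC

        end start
```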
call CHPUT
This is the BIOS call that makes the program counter jump to the right place in memory. CHPUT equals 0x00A2 so we could also have used call 0x00A2. Again: using CHPUT as a symbol just makes it more readable. The machine code that the assembler produces is exactly the same.
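To see that both spellings are the same thing, here is a sketch that uses each of them once. Both calls should assemble to CD A2 00: the opcode CD for call, followed by the address with its low byte first. The sketch should therefore display the character H twice:

```asm
CHPUT:  equ 0x00A2

        org 0xD000

start:
        ld a,'H'
        call CHPUT      ; assembles to CD A2 00
        ld a,'H'
        call 0x00A2     ; assembles to the same bytes, CD A2 00
        ret             ; return to BASIC

        end start
```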
We now have the knowledge to finish our 'Hello world!' program. After the BIOS call for character 'H' we load the next character 'e' in register a, make the BIOS call again, proceed with the next character and so on. Like this:
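A sketch of that repetitive version, reconstructed from the description above (the exact original listing is an assumption):

```asm
CHPUT:  equ 0x00A2

        org 0xD000

start:
        ld a,'H'
        call CHPUT
        ld a,'e'
        call CHPUT
        ld a,'l'
        call CHPUT
        ld a,'l'
        call CHPUT
        ld a,'o'
        call CHPUT
        ld a,' '
        call CHPUT
        ld a,'w'
        call CHPUT
        ld a,'o'
        call CHPUT
        ld a,'r'
        call CHPUT
        ld a,'l'
        call CHPUT
        ld a,'d'
        call CHPUT
        ld a,'!'
        call CHPUT
        ret             ; return to BASIC

        end start
```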
It does work... but there are a lot of repetitions in our code. And what if we want to change the text that we display? That would take a lot of time. There is a better, shorter and more flexible way to do this. Change the code so that it works like this:
- Load a byte from a certain place in memory where our text is stored into register a
- Check if this byte is zero
- If the byte is zero, return to BASIC, if not proceed
- Use BIOS call CHPUT to display the character on screen
- Point to the next place in memory where our text is stored
- Start this loop again
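The steps above can be sketched as the following program. It is put together from the instructions that are explained one by one below, so the comments, the spacing and the line numbering are an assumption:

```asm
CHPUT:  equ 0x00A2

        org 0xD000

start:
        ld hl,text      ; hl points to the first character of our text
loop:
        ld a,(hl)       ; load the byte that hl points to into register a
        cp 0            ; is this byte zero?
        ret z           ; yes: we reached the end of the text, return to BASIC
        call CHPUT      ; no: display the character
        inc hl          ; point to the next byte of the text
        jr loop         ; start the loop again

text:   db "Hello world!",0

        end start
```

To display a different text, only the db line has to change. The loop itself stays the same.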
First let's have a look at the line that defines our text:
text: db "Hello world!",0
It starts with the label text and then there is the directive db. This stands for define byte. Because of this directive the assembler adds bytes to our program that we can use as data. All characters between quotation marks are converted to the corresponding ASCII codes and the zero after the comma is just stored like that. So in memory the twelve characters are stored as their ASCII codes, followed by a zero byte that marks the end of the text.
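As an illustration, the sketch below defines the same thirteen bytes twice: once as a string and once as the ASCII codes written out by hand. The label text_as_codes is a made-up name for this example:

```asm
        org 0xD000

start:
        ret                                 ; nothing to run, this sketch only defines data

text:
        db "Hello world!",0

text_as_codes:
        db 0x48,0x65,0x6C,0x6C,0x6F,0x20    ; H e l l o (space)
        db 0x77,0x6F,0x72,0x6C,0x64,0x21    ; w o r l d !
        db 0x00                             ; the zero that ends the text

        end start
```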
Now let's examine the other instructions.
ld hl,text
This ld instruction loads the memory address of where our text starts into register hl. We don't have to worry about what the exact value is, the assembler calculates this for us. So the value in hl is the address of where character 'H' is stored in memory.
loop:
This is another label, which we will use as the place to jump back to at the start of our loop.
ld a, (hl)
This instruction loads the byte at the memory address that's indicated by hl into register a. The round brackets are very important in assembly. They point out that we want to load a byte from memory or store a byte into memory. The memory address is the value within the brackets. So there's a big difference, for example, between these two instructions:
- ld a,l > load the value of register l into register a
- ld a,(hl) > load the byte found at the memory address in hl into register a
Don't worry if you don't understand the difference right now. We will cover this later on.
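A small sketch of the difference the brackets make, using register a. The values in the comments assume the program is assembled at 0xD000 exactly as written here:

```asm
        org 0xD000

start:
        ld hl,start     ; hl = 0xD000, the address of the first byte of this program
        ld a,l          ; no brackets: a = 0x00, the value of register l itself
        ld a,(hl)       ; brackets: a = the byte stored at address 0xD000
                        ; (here 0x21, the first byte of the ld hl instruction)
        ret             ; return to BASIC

        end start
```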
cp 0
The cp instruction is short for compare. It compares register a with the value 0. If register a equals zero then the zero flag in our flag register will be set (it gets the value of 1). Otherwise the zero flag in the flag register will not be set (it gets the value of 0).
Just to be clear: we could also compare register a to other values, not only zero. So this is for example also a valid instruction: cp 0xA1. In this example the zero flag will be set if the value in register a equals 0xA1.
ret z
This is also a ret instruction but this time it is conditional. The character z stands for zero. Therefore it will only be executed if the zero flag is set. So if in our program the previous instruction (cp 0) sets the zero flag this instruction is executed and we will return to BASIC, like this:
- If a value of zero is loaded into register a then...
- The cp instruction will set the zero flag and...
- The ret z instruction is executed.
If the zero flag is not set the ret z instruction will do nothing and the Z80 will proceed with the next instruction, in this case call CHPUT.
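A sketch that shows both outcomes of ret z, assuming CHPUT is defined as before. The first comparison fails, so the program continues. The second one succeeds, so the program returns to BASIC before the last call. If this is right, exactly one A should appear on screen:

```asm
CHPUT:  equ 0x00A2

        org 0xD000

start:
        ld a,'A'
        cp 'B'          ; 'A' is not equal to 'B', so the zero flag is not set
        ret z           ; does nothing, the Z80 proceeds with the next instruction
        call CHPUT      ; displays A

        ld a,'A'
        cp 'A'          ; equal, so the zero flag is set
        ret z           ; returns to BASIC here
        call CHPUT      ; never reached

        end start
```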
inc hl
The inc instruction is short for increment. It adds 1 to the value in register hl. This causes hl now to point to the next byte in memory. So at first register hl points to the memory address of character 'H', then to the address of character 'e', and so on.
jr loop
Our last instruction jr stands for jump relative and it's used to make small jumps in our code (at most about 128 bytes backward or forward, because the distance is stored in a single signed byte). The label loop after the instruction indicates where to jump to. So in this case we will jump backward to the place in our program that our label loop points to. The assembler will calculate how many bytes the program counter has to jump back, so we don't have to do this by hand.
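To give an idea of what the assembler calculates, here is the loop with the address and bytes of every instruction in the comments. These values were worked out by hand and assume the program is laid out exactly as in this sketch. The distance is counted from the address just after the jr instruction (D00D) back to loop (D003), which is 10 bytes backward, stored as 0xF6:

```asm
CHPUT:  equ 0x00A2

        org 0xD000

start:
        ld hl,text      ; D000: 21 0D D0
loop:
        ld a,(hl)       ; D003: 7E
        cp 0            ; D004: FE 00
        ret z           ; D006: C8
        call CHPUT      ; D007: CD A2 00
        inc hl          ; D00A: 23
        jr loop         ; D00B: 18 F6 (0xF6 means -10: from D00D back to D003)

text:   db "Hello world!",0     ; D00D: the text itself

        end start
```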
Alright! We are there! We've got our first working program in assembly for the MSX that actually does something! I hope that you understand every instruction of it. I have explained everything very extensively this time and I'm planning to keep it a bit shorter in the following blogs. But I think it is very important to really understand every part of a program.
In the next blog I intend to do more with BIOS calls and characters. And don't forget... experiment a lot and try to figure out things for yourself! You can find a list of BIOS calls here, so go ahead!