I'm learning MIPS assembly for a computer architecture course. For this course the class is using MARS. While taking in ten numbers to place in an array for an assignment, I decided to test something. I wanted to see if I could create a loop that automatically pushed integers entered by the user onto the stack. Here's my code so far:
# For loop initialization
li $s0, 0 # Set $s0 equal to zero.
li $sp, -40 # Reserve space for 10 integers on the stack.
li $t0, 0 # temp var to increment stack offset.
# Stores ten user inputted numbers onto the stack.
stackLoop:
beq $s0, 10, doneStack
li $v0, 51
la $a0, prompt
syscall
sw $t0($sp)
# Print out stack values.
li $v0, 1
lw $0($sp)
syscall
addi $t0, $t0, 4
addi $s0, $s0, 1
j stackLoop
doneStack:
I had issues when I got to the sw instruction, which threw an error when I specified two arguments because $t0($sp) was not valid. However, specifying only one argument, as can be seen above, appears to have worked. Before getting this to work, I had asked my professor about it and he said it wasn't possible to do. Why does using only one argument work? My conclusion is that sw must default to storing $v0. However, that doesn't excuse the syntax. For example, typing sw $v0, $t0($sp) throws an error at compilation.
Similarly, why does lw work when $0($sp) is used? I'm guessing that lw defaults to loading into $a0, which would explain why li $v0, 1 works. However, if that's the case, why does lw $a0, 0($sp) produce increments by 4, the number of bytes in a word? Wouldn't it refer to the data at 0($sp), which is being popped from the stack at each iteration?
I've looked through some documentation, but all of it uses two arguments for sw and lw. The textbook for my class doesn't even mention what I've done above as being possible. Apologies for the novella-style post, but my curiosity is piqued.
MARS uses a very simple parser with an even simpler tokenizer.
The source files are included in the distributed JAR; you can look at the Java class mars.mips.instructions.InstructionSet if you feel like crying.
This is perfectly acceptable given the very nature of MARS: an educational simulator.
The tokenizer first breaks the input into lines, then splits each line into tokens on whitespace, commas, and parentheses.
As a result, these syntaxes are all equivalent:
sw $t0($sp)
sw $t0 ($sp)
sw $t0, ($sp)
There is no default register with sw and lw (or any other instructions) in the MIPS ISA.
This lax parsing applies to every instruction but, in keeping with the RISC mentality, only lw and sw (and their siblings) can take parentheses (as they denote an addressing mode), which mitigates the problem (although things like or $t0 $t0 $t1 are still possible).
Finally, sw $v0, $t0($sp) is not encodable in MIPS.
sw is an I-type instruction, so it has a base register (s), a source register (t), and a 16-bit immediate displacement (i); there is no field that could hold a second register in the displacement position:
101011 sssss ttttt iiiiiiiiiiiiiiii
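As a worked example of that layout, here is how one valid store breaks down field by field. The choice of sw $v0, 0($sp) is only an illustration; the register numbers ($sp is 29, $v0 is 2) are the standard MIPS ones.

```mips
# sw $v0, 0($sp)
#   opcode          = 101011            (sw)
#   base register   = 11101             ($sp = register 29)
#   source register = 00010             ($v0 = register 2)
#   displacement    = 0000000000000000  (0)
# => 101011 11101 00010 0000000000000000 = 0xAFA20000
        .text
        sw   $v0, 0($sp)
```

The displacement field holds only a constant, which is why a register such as $t0 cannot appear in front of the parentheses.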
TL;DR: sw $t0($sp) is just a MARS artifact.
Assembly isn't a programming language like C or Java, where you have a fairly free syntax for writing a single expression.
Assembly is more like a set of "mnemonics" (names) for machine instructions, which are hardwired into the CPU by the way its designers laid out the transistors on the chip and connected it to the other parts of the computer. So if the CPU was designed to have the instruction ori $3, $3, 0x25, then you can write it in the source, and the assembler will translate it to machine code, which for a MIPS CPU is the word 0x34630025. When the MIPS CPU encounters this particular word in memory at the address in pc (program counter), it will execute ori $3, $3, 0x25 and nothing else.
You can't encode ori $3, $3, 0x25 + 0x33 without first reducing the constant to a plain 0x58; there's no machine opcode that could encode 0x25 + 0x33 as two values to be added at runtime. A smart assembler will let you write this and will assemble it as ori $3, $3, 0x58 (IIRC MARS is not that smart).
So it's not as if you can learn some kind of syntax and build instructions from it. You have to learn the instructions as they are defined by the CPU vendor, and remember what is possible and what is not. Assembly is a sort of 1:1 translation from readable mnemonics to binary machine code. (The MARS assembler does have many "pseudo instructions", which translate not to a single native opcode but to a chain of native opcodes, usually 2-3, that simulates the pseudo instruction's behaviour. In that sense it is a sort of compiler, although a very primitive one, and all the possible pseudo instructions are documented.)
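To make the pseudo-instruction point concrete, here is a small sketch. The expansions shown in the comments are what I would expect MARS to produce, not something taken from its documentation, so treat them as an assumption and compare them with the Basic column in the Execute tab.

```mips
        .text
        li   $t0, 5            # fits in 16 bits, so one basic instruction,
                               #   e.g. addiu $t0, $zero, 5
        li   $t0, 0x12345678   # needs 32 bits, so two basic instructions,
                               #   e.g. lui $at, 0x1234
                               #        ori $t0, $at, 0x5678
        move $t1, $t0          # e.g. addu $t1, $zero, $t0
```

Note how the two-instruction expansion borrows $at as its temporary, which is why that register is best left to the assembler.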
That's why the set of possible instructions is limited: you are working directly with the hardware transistors on the chip, and you have only what the CPU designers put into that design. If you want to create a new instruction, you are out of luck with fixed CPUs (although you could use an FPGA chip to create your own inner logic, but that's a completely different topic).
About lw $0($sp): that's invalid syntax. It assembles only because the MARS assembler is not the greatest software under the sun. In cases where I would like it to be a bit smarter it fails (for example, li $t1,123+34 does NOT work), and in cases where it would be much better to stop and report an error, it produces something anyway.
Your lw $0($sp) assembles as lw $0, 0($sp), i.e. MARS guesses that a comma and a displacement are missing. The whole instruction is then just a space filler: you can load anything you want into $0 (aka $zero), as that lw would, but you will always read back zero.
Run MARS, open Help (F1), and check the tabs "Basic Instructions" and "Extended (pseudo) Instructions"; those are all you have available. Unfortunately the syntax used to describe them is example-based rather than formal, so it may sometimes look as if something is available until you figure out the hard way that it is not.
Now about lw: the help says lw $t1,-100($t2). If you are a seasoned asm developer who knows assembly for several other CPUs and the syntax of several different assemblers, this is completely obvious. If you are new to assembly, then I can see how this is quite incomprehensible. The long description doesn't help much either.
But part of the trick is to check the green area above the tabs, "Operand Key for Example Instructions". Let's try to use it for that lw:
$t1, $t2, $t3                   any integer register
-100                            signed 16-bit integer (-32768 to 32767)
Load & Store addressing mode    -100($t2): sign-extended 16-bit integer added to contents of $t2
As you can see, there's no lw $t1,$t3($t2) (which would imply you can use a register in the displacement position).
So how to interpret that help: lw is "load word", it's a basic instruction, it has two operands.
The left operand is the target register. It can be any of the 32 GPRs (General Purpose Registers), i.e. $0 to $31, or their alias names such as $at, $t0, etc. This is where the word value fetched from memory will be stored.
The right operand has the form "displacement_constant($GPR)". The memory address is calculated as the content of that GPR plus displacement_constant. For example, -100($sp) takes the value in register $sp, subtracts 100 from it, and uses the result as the memory address from which to fetch the word value.
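A short sketch of that addressing mode in use, storing two words on the stack and reading them back. The values 11 and 22 and the register choices are arbitrary, picked only for illustration.

```mips
        .text
        addi $sp, $sp, -8      # make room for two words
        li   $t0, 11
        li   $t1, 22
        sw   $t0, 0($sp)       # store 11 at address $sp
        sw   $t1, 4($sp)       # store 22 at address $sp + 4
        lw   $t2, 4($sp)       # $t2 = word at $sp + 4 (22)
        lw   $t3, 0($sp)       # $t3 = word at $sp     (11)
        addi $sp, $sp, 8       # release the two words
        li   $v0, 10           # exit
        syscall
```

In every load and store the displacement is a constant and only the register inside the parentheses varies.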
This means that on MIPS, lw can address memory indirectly only through a single register plus a constant; no expressions like $t0 + $t2 are allowed. To do that, you have to calculate the address first, like:
add $at, $t0, $t2 # don't use $at unless you know what you are doing
lw $a0,0($at) # as 99% of pseudo ins. will use $at for their temporaries
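Applied to the loop in the question, the same idea (compute the address into a register, then use a constant displacement) might look like the sketch below. It rests on a few assumptions: that the intent was to lower $sp by 40 rather than load -40 into it, that the prompt text is a placeholder, and that MARS syscall 51 (InputDialogInt) returns the integer in $a0 and a status in $a1, as its help describes. The status is ignored here for brevity.

```mips
        .data
prompt: .asciiz "Enter an integer: "   # placeholder prompt text

        .text
main:
        addi $sp, $sp, -40     # reserve 10 words by lowering $sp
        li   $t0, 0            # byte offset into the reserved area
        li   $s0, 0            # loop counter
        li   $s1, 10           # loop limit

stackLoop:
        beq  $s0, $s1, doneStack
        li   $v0, 51           # MARS InputDialogInt
        la   $a0, prompt
        syscall                # integer in $a0, status in $a1 (unchecked)

        add  $t1, $sp, $t0     # $t1 = $sp + offset = address of this slot
        sw   $a0, 0($t1)       # store the integer that was read

        lw   $a0, 0($t1)       # load it back to show the slot holds it
        li   $v0, 1            # print integer in $a0
        syscall

        addi $t0, $t0, 4       # next slot
        addi $s0, $s0, 1
        j    stackLoop

doneStack:
        addi $sp, $sp, 40      # release the reserved space
        li   $v0, 10           # exit
        syscall
```

A scratch register such as $t1 is used for the address instead of $at, so the assembler's own use of $at for pseudo-instructions cannot interfere.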
Actually, after you assemble your code, MARS shows you the disassembled machine code in the "Execute" tab. That's how I figured out which opcode is used for the ori example, and what kind of abomination MARS produced from the invalid lw $0($sp). (I'm not sure whether I had to configure something to display everything, including how pseudo-instructions get translated into basic ones. I can't find anything about it in the settings, so let's hope that view is the default.)