I will implement a virtual machine in x86 and I wonder what kind of design would yield best results. What should I concentrate on to squish out the juice? I will to implement the whole virtual machine in x86 assembly.
I haven't much instructions and I can choose their form. The instructions project directly into smalltalk's syntax in blocks. I give out the instruction design I were thinking of:
^ ...       # return
^null     # return nothing
object    # address to object
... selector: ... # message pass (in this case arity:1 selector: #selector:)
var := ... # set
var # get
The sort of VM I were thinking about:
mov eax, [esi]
add esi, 2
mov ecx, eax
and eax, 0xff
and ecx, 0xff00 # *256
shr ecx, 5          # *8
jmp [ecx*4 + operations]
align 8:
    operations:
dd retnull
dd ret
# so on...
    retnull:          # jumps here at retnul
# ... retnull action
    ret:
# ... ret action
#etc.
Don't start asking why I need yet another virtual machine implementation. Interpretive routines aren't stock stuff you just pick up whenever you need them. Most virtual machines you are proposing elsewhere are weighted towards portability with the cost of performance. My goal is not the portability, my goal is the performance.
The reason this interpreter is needed at all is because smalltalk blocks doesn't end up gotten interpreted the same way:
A := B subclass: [
    def a:x [^ x*x]
    clmet b [...]
    def c [...]
    def d [...]
]
[ 2 < x ] whileTrue: [...]
(i isNeat) ifTrue: [...] ifFalse: [...]
List fromBlock: [
    "carrots"
    "apples"
    "oranges" toUpper
]
I need the real benefit coming from the interpretive routines, that is the choice of context where to read the program in. Of course, good compiler should just most of the time compile the obvious cases like: 'ifTrue:ifFalse' or 'whileTrue:', or the list example. The need for interpreter doesn't just disappear because you always may hit a case where you can't be sure the block gets the treatment you expect.
I see there is some confusion about portability here, so I feel obliged to clarify matters some. These are my humble opinions so you are, of course, free to object against them.
I assume you came accross http://www.complang.tuwien.ac.at/forth/threading/ if you consider writing a VM seriously, so I won't dwell upon the described techniques.
Already mentioned, targeting a VM has some advantages such as reduced code size, reduced compiler complexity (often translates to faster compilation), portability (note that the point of a VM is portability of the language, so it doesn't matter if the VM itself is not portable).
Considering dynamic nature of your example, your VM will resemble a JIT compiler more than other more popular ones. So, altough S.Lott missed the point in this case, his mentioning of Forth is very on the spot. If I were to design a VM for a very dynamic language, I would separate interpretation into two stages;
A producer stage which consults an AST stream on demand and transforms it to a more meaningful form (for example, taking a block, deciding whether it should be executed right away or stored in somewhere for later execution) possibly introducing new kinds of tokens. Essentially you recover context sensitive information that may be lost in parsing here.
A consumer stage fetching the generated stream from 1 and executes it blindly like any other machine. If you make it Forth like, you can just push a stored stream and be done with it instead of jumping instruction pointer around.
As you say, just mimicking how the damn processor work in another way doesn't accomplish any dynamism (or any other feature worth a damn, like security) that you require. Otherwise, you would be writing a compiler.
Of course, you can add arbitrarily comlex optimizations in stage 1.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With