
How to avoid lots of if-else statements in a parser

Tags:

assembly

I am a novice in the assembler design field. I am designing my own assembler for a machine. Currently, my assembler takes the first token (assuming that it's an instruction) and then tries to generate the corresponding object code. To do that, I need to match the token against a pool of mnemonics and then generate the object code for the one that matches. The problem is that I currently use if-else constructs, i.e.

if (strcmp(mnemonic_read, "mov") == 0) {
    // generate code for mov instr
} else if (strcmp(mnemonic_read, "cmp") == 0) {
    // generate code for cmp
}

Can I do all this without using lots of if-else statements? Can I call a function through the mnemonic_read string variable?

asked Oct 14 '25 14:10 by var


2 Answers

This is a common problem, with a common solution (which harold suggests).

You may want to look into lex / yacc, or flex / bison, which work well in a *nix environment. ANTLR does a similar job; the tool itself is written in Java, although it can generate parsers for other target languages.

For example, you can use lex (From http://dinosaur.compilertools.net/):

Lex source is a table of regular expressions and corresponding program fragments. The table is translated to a program which reads an input stream, copying it to an output stream and partitioning the input into strings which match the given expressions. As each such string is recognized the corresponding program fragment is executed.

So in lex you can specify the tokens (which are matched by regular expressions) and the code to run when each one is recognized. You can also feed the tokens into yacc ("yet another compiler compiler"), which generates a parser from a grammar and can serve as the front end of an assembler or compiler for your new language.

Here is a useful guide with examples: http://ds9a.nl/lex-yacc/cvs/lex-yacc-howto.html

answered Oct 17 '25 11:10 by superdesk


You should use some kind of hash table that maps each mnemonic to its handler, instead of dozens and dozens of if-else statements or even a switch construct.

Also, be sure to separate your "assembler logic" from the simple parser logic.
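To make the hash table idea concrete, here is a minimal sketch in C. Everything in it beyond `mnemonic_read`, `mov`, and `cmp` is hypothetical: the `encode_fn` signature, the opcode bytes 0x01 and 0x02, the table size, and the helper names are made up for illustration. It uses a small open-addressing table with the djb2 string hash, and assumes the table is never filled completely. The lookup (parser side) is kept apart from the encoders (assembler side), and the function pointer stored in the table is what lets you "call a function through the string".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder signature: writes the object code for one
 * instruction into out and returns the number of bytes written. */
typedef size_t (*encode_fn)(const char *operands, uint8_t *out);

/* Placeholder encoders; the opcode bytes are invented for this sketch. */
static size_t encode_mov(const char *operands, uint8_t *out) {
    (void)operands;          /* operand parsing is omitted here */
    out[0] = 0x01;
    return 1;
}

static size_t encode_cmp(const char *operands, uint8_t *out) {
    (void)operands;
    out[0] = 0x02;
    return 1;
}

#define TABLE_SIZE 64        /* power of two, far larger than the entry count */

struct entry {
    const char *mnemonic;    /* NULL marks an empty slot */
    encode_fn encode;
};

static struct entry table[TABLE_SIZE];

/* djb2 string hash, reduced to a slot index. */
static size_t hash(const char *s) {
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return (size_t)(h & (TABLE_SIZE - 1));
}

/* Insert with linear probing; assumes the table never becomes full. */
static void register_mnemonic(const char *mnemonic, encode_fn fn) {
    size_t i = hash(mnemonic);
    while (table[i].mnemonic != NULL)
        i = (i + 1) & (TABLE_SIZE - 1);
    table[i].mnemonic = mnemonic;
    table[i].encode = fn;
}

/* Parser side: map a mnemonic to its encoder, or NULL if it is unknown. */
encode_fn lookup_mnemonic(const char *mnemonic) {
    assert(mnemonic != NULL);
    size_t i = hash(mnemonic);
    while (table[i].mnemonic != NULL) {
        if (strcmp(table[i].mnemonic, mnemonic) == 0)
            return table[i].encode;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}

/* Call once at startup; adding an instruction means adding one line here. */
void init_mnemonics(void) {
    register_mnemonic("mov", encode_mov);
    register_mnemonic("cmp", encode_cmp);
}

/* Assembler side: dispatch through the table.
 * Returns the number of bytes written, or 0 for an unknown mnemonic. */
size_t assemble(const char *mnemonic_read, const char *operands, uint8_t *out) {
    encode_fn fn = lookup_mnemonic(mnemonic_read);
    return fn ? fn(operands, out) : 0;
}
```

After calling `init_mnemonics()` once, `assemble(mnemonic_read, operands, buf)` replaces the whole if-else chain. For a small, fixed instruction set, a sorted array searched with `bsearch` would probably do just as well as a hash table.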

answered Oct 17 '25 11:10 by akluth