I'm trying to create a FSM in C that follows this diagram:

and has output that looks like this:

I started by defining an enum that held all the possible states:
typedef enum {
Start = 0,
Build_Id = 1,
Identifier = 2,
Build_Num = 3,
Number = 4,
Error = 5
} State_type;
This is the code I currently have:
State_type analyzeData(State_type* currState, char c);
int main(int argc, char** argv) {
  State_type currState = 0;
  for (int i = 1; i < strlen(*argv); ++i) {
    analyzeData(&currState, *argv[i]);
  }
}
State_type analyzeData(State_type* currState, char c) {
  if (currState == 0) {
    if (isblank(c)) {
      *currState = (State_type) 0;
      return *currState;
    }
    else if (isdigit(c)) {
      *currState = (State_type) 3;
      return *currState;
    }
    else if (isalpha(c)) {
      *currState = (State_type) 1;
      return *currState;
    }
  }
}
My plan was to basically use a series of if-else statements for all the other possible states. I guess I'm a little confused on whether I'm even approaching this correctly. I've been trying to read the answers on other FSM questions but nothing is making sense. Can someone point me in the right direction?
A state machine is any object that behaves different based on its history and current inputs. Many embedded systems consist of a collection of state machines at various levels of the electronics or software.
A system where particular inputs cause particular changes in state can be represented using finite state machines. This example describes the various states of a turnstile. Inserting a coin into a turnstile will unlock it, and after the turnstile has been pushed, it locks again.
A finite state machine is a machine that can, at any point in time, be in a specific state from a finite set of possible states. It can move (transition) to another state by accepting an input. If the machine allows for outputs, it can produce an output.
You define an enum that lists your states - good!
typedef enum {
    Start_state = 0,
    Build_Id_state = 1,
    Identifier_state = 2,
    Build_Num_state = 3,
    Number_state = 4,
    Error_state = 5
} State_type;
Slight change to your state transition code,
int
main(int argc, char** argv) {
  State_type currState = 0;
  Action_t action;
  char* p = *argv; char symbol;
  int len = strlen(p);
  //C-strings are zero-indexed
  for (int i=0; i < len; ++i) {
    action = analyzeData(&currState, classify(symbol=*p++));
    switch(action) {
        case None_act: break;
        case Gather_act: //appropriate symbol gathering
        case Emit_act: //handle ident/number print/save
        case Stop_act: //appropriate behavior, e.g. i=len
        ...
    }
  }
}
Build a state transition table holding these entries:
typedef struct state_table_entry_s {
    State_type state;
    Transition_t trans; //could define as bit-field
    State_type nextstate;
    Action_t action; //semantic action
} state_table_entry_t;
Define your state transition table, which makes clear that you have not defined behavior for certain transitions. (Make the table two-dimensional, and you can more quickly process state&transition)
state_table_entry_t states[] = {
    {Start_state, Letter_class, None_act, Build_Id}
   ,{Start_state, Digit_class, None_act, Build_Num}
   ,{Start_state, Blank_class, None_act, Start_state}
   ,{Start_state, Semicolon_class, Stop_act, Start_state}
   ,{Build_Id_state, Letter_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Digit_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Underscore_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Blank_class, None_act, Identifier_state}
   ,{Identifier_state, Blank_class, Emit_act, Start_state}
   ,{Build_Num_state, Digit_class, Gather_act, Build_Num_state}
   ,{Build_Num_state, Blank_class, None_act, Number_state}
   ,{Number_state, Blank_class, Emit_act, Start_state}
   ,{Stop_state, <any>, Error_act, Stop_state}
   ,{Error_state, <any>, None_act, Stop_state}
};
Notice how the above 'state transition table' clearly documents your state machine? And you could (easily) load this table from a configuration file?
Stop. Did you define (appropriate) actions for every (State X Transition) pair?
//States:
Start_state
Build_Id_state
Identifier_state
Build_Num_state
Number_state
Error_state
//Transitions:
Letter_class
Digit_class
Underscore_class
Blank_class
Semicolon_class
Other_class
For the above, you need to define your state transition classes:
typedef enum {
    Letter_class
   ,Digit_class
   ,Underscore_class
   ,Blank_class
   ,Semicolon_class
   ,Other_class
} Transition_t;
And you need to define your actions:
typedef enum {
    None_act
   ,Gather_act
   ,Emit_act
   ,Stop_act
   ,Error_act
} Action_t;
Convert your characters/symbols into their transition class (you could use ctype.h and the isalpha(), isdigit() macros),
Transition_t classify(char symbol) {
    Transition_t class = Other_class;
    if (isblank(c)) {
      return(class = Blank_class); break;
    }
    else if(isdigit(symbol)) {
      return(class = Digit_class);
    }
    else if (isalpha(symbol)) {
      return(class = Letter_class); break;
    }
    else {
      switch(tolower(symbol)) {
        case ' ':
            return(class = Blank_class); break;
        case '_':
            return(class = Underscore_class); break;
        case ';':
            return(class = Semicolon_class); break;
        default :
            return(class = Other_class); break;
      }
    }
    return(class = Other_class); break;
}
Find your matching state in the state table (you can make this much more efficient), and your matching transition in the transition table, then take the semantic action,
Action_t
analyzeData(State_type& currState, Transition_t class) {
  for( int ndx=0; ndx<sizeof(states)/sizeof(states[0]); ++ndx ) {
    if( (states[ndx].state == currState)
    &&. (states[ndx].trans == class) ) { //state match
        semantic_action(states[ndx].action);
        currState = states[ndx].nextState;
        return(states[ndx].action);
    }
  }
}
You will need to define your 'semantic_action' function, and of course you will need to 'gather' your input, so that you can perform the output at appropriate action times. And your 'emit_act' will need to cleanup.
You'd probably be better off with a switch statement (or even a transition table), but the basic structure is the same.
If you are using an enumeration, use it. Don't use magic numbers. The point of defining an enumeration is to be able to use meaningful names instead of numbers.
If you are going to return the new state, there is absolutely no point using an in-out parameter. Use the prototype
State_type analyzeData(State_type currState, char c);
/* Better would be int c. See below. */
Then a typical state case might be:
case Start:
  if (isblank(c)) return Start;
  else (isdigit(c)) return Build_Num;
  else (isalpha(c)) return Build_Id;
  else return Error;
Also note that isalpha and friends take an int, not a char. If char is signed (which is common) and the value happens to be negative, Undefined Behaviour will result.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With