
Capturing words within spaces and quotation marks?

As the title says, the idea is to capture words delimited by spaces, while treating text inside quotation marks as a single word. Here's an example of the input we are dealing with:

Input:

The Brown "Fox Jumps Over" "The Lazy" Dog

Currently my code can capture words separated by spaces. As many of you know, a basic strtok() is enough for that. Here's my code so far:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main () {
   char command[BUFSIZ];
   char *token;
   fgets(command,BUFSIZ,stdin);
   
   token = strtok(command, " ");

   while( token != NULL ) {
      printf( " %s\n", token );
    
      token = strtok(NULL, " ");
   }
   
   return 0;
}

And as expected, my code prints the following:

Current Output:

The
Brown
"Fox
Jumps
Over"
"The
Lazy"
Dog

But what I actually need is the following output:

The
Brown
Fox Jumps Over
The Lazy
Dog

Any help is welcome, and thanks in advance. (PS: The headers included above are the only ones allowed.)

mcsmachado asked Oct 19 '25 00:10


1 Answer

This program works for your input. It uses a tiny state machine that prevents splitting between quotes. IMO strtok is pretty limited for anything more complicated than splitting on a single delimiter:

#include <stdio.h>
#include <stdlib.h>

void prn(char* str) {
    printf("<< %s >>\n", str);
}

int main(){
    char command[BUFSIZ];
    char state = 0;
    char *start = NULL;
    char *cur = NULL;
    
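    if (fgets(command, BUFSIZ, stdin) == NULL)
        return 1;
    /* fgets keeps the trailing newline, if any; cut the line there */
    for (cur = command; *cur && *cur != '\n'; cur++)
        ;
    *cur = 0;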
    start = cur = command;
    
    while (*cur) {
        if (state == 0 && *cur == ' ') {
            /* space outside quotes */
            *cur = 0;
            prn(start);
            start = cur+1;
            cur++;
        } else if (*cur == '"') {
            /* quote found */
            *cur = 0;
            if (state) {
                /* end quote -- print */
                prn(start);
                
                /* skip past spaces */
                cur++;
                while (*cur == ' ')
                    cur++;
            } else {
                /* in quote, move cursor forward */
                cur++;
            }
            /* flip state and reset start */
            state ^= 1;
            start = cur;
        } else {
            cur++;
        }
        if (cur - command >= BUFSIZ) {
            fprintf(stderr, "Buffer overrun\n");
            return -1;
        }
    }
    /* print the last string */
    prn(start);
    
    return 0;
}

The output:

➜ echo -n 'The Brown "Fox Jumps Over" "The Lazy" Dog' |./a.out
<< The >>
<< Brown >>
<< Fox Jumps Over >>
<< The Lazy >>
<< Dog >>

[edit: tidied following feedback, printing delimited to catch any sneaky spaces creeping through]
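
If you'd rather keep the tokens than print them as you go, the same idea can be packed into a function. The following is only a sketch under a few assumptions: split_quoted is a made-up name (not a library function), the caller supplies the output array, and only string.h is needed, which is on your allowed list. Instead of a state flag it leans on strspn() and strcspn() to skip runs of spaces and to find the closing quote:

```c
#include <string.h>

/*
 * split_quoted is a hypothetical helper, not a library function.
 * It splits `line` in place on spaces, keeping text between double
 * quotes together, and stores up to `max` token pointers in `out`.
 * Returns the number of tokens stored.
 */
static size_t split_quoted(char *line, char **out, size_t max) {
    size_t count = 0;
    char *p = line;

    /* fgets keeps the trailing newline; cut the line there if present */
    p[strcspn(p, "\n")] = '\0';

    while (count < max) {
        p += strspn(p, " ");          /* skip any run of spaces */
        if (*p == '\0')
            break;

        if (*p == '"') {
            out[count++] = ++p;       /* token starts after the opening quote */
            p += strcspn(p, "\"");    /* and runs to the closing quote */
        } else {
            out[count++] = p;         /* bare word */
            p += strcspn(p, " ");     /* runs to the next space */
        }

        if (*p != '\0')
            *p++ = '\0';              /* terminate token, step past delimiter */
    }
    return count;
}
```

To use it, read the line with fgets into command, call split_quoted(command, tokens, N) with a char *tokens[N] array, and loop over the returned count printing each entry. Two behaviours differ from the program above and may or may not be what you want: a quote in the middle of a bare word (as in ab"cd) is treated as an ordinary character, and an unclosed quote simply runs to the end of the line.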

w08r answered Oct 21 '25 15:10


