 

Last string in array of strings (parsed from strtok) has garbage

I'm a little baffled here. In main:

int main() {
    char **symbols = (char **) malloc(3 * sizeof(char *)); // allocate 3 (char *)'s
    for (int i = 0; i < 3; i++)
        symbols[i] = (char *)malloc(3); // allocate string of length 3

}

Then the user enters three string symbols, space delimited, on a single line:

111 010 101

I then parse this buffered string into an array of strings thusly:

void parseSymbols(char *line, int k, char **symbols) {
    // k == 3
    // Ignore leading spaces
    while (*line != '\0' && is_whitespace(*line))
            line++;

    char *p = strtok(line, " ");
    int cnt = 0;
    symbols[cnt++] = p;
    while (p) {
            p = strtok(NULL, " \n");
            symbols[cnt++] = p;
    }

    // Let's call this FOOBAR
    //for (int i = 0; i < k; i++)
    //        printf("%d. %s\n", i, symbols[i]);

}

Back in main, when I printf the 3 strings in symbols, I get this:

0. '111'
1. '010'
2. ' s'

But when I uncomment the last two lines of parseSymbols (the FOOBAR block), I get:

0. '111'
1. '010'
2. '101'

Why does the FOOBAR block "fix" my string array, and more importantly, how can I get parseSymbols working properly without having to print something to screen? Does symbols[2] need to be terminated with a '\0'? (But doesn't strtok do that for me?)

Lee Wang asked Nov 28 '25 02:11


1 Answer

Your first problem is that symbols[cnt++] = p; does not copy the token; it only copies the pointer that strtok returns. That overwrites the pointers you got from malloc in main, so those allocations are leaked, and every entry of symbols now points into line. If line is a buffer that goes out of scope or is reused after parseSymbols returns, the strings in symbols turn into garbage.

Next, make sure you never write past index k - 1 when assigning results to your symbol table. Your loop always stores the terminating NULL from strtok as well, so its last write goes one slot past the last token. With 3 symbols parsed, that write lands in symbols[3], which is outside the memory you allocated and causes undefined behavior.
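The two points above can be combined in a small sketch. It assumes the caller has already allocated each symbols[i] with room for width characters plus the terminator; the name parse_symbols_copy and the width parameter are hypothetical, not part of the original code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies at most k tokens from line into symbols[0..k-1].
   Assumes every symbols[i] points to at least width + 1 writable bytes.
   Returns the number of tokens actually stored. */
int parse_symbols_copy(char *line, int k, size_t width, char **symbols) {
    int cnt = 0;

    /* The cnt < k check stops the loop before it can write past the table,
       and the NULL that ends the token stream is never stored. */
    for (char *p = strtok(line, " \n"); p != NULL && cnt < k;
         p = strtok(NULL, " \n")) {
        /* Copy the characters instead of the pointer; snprintf always
           terminates and truncates tokens longer than width. */
        snprintf(symbols[cnt], width + 1, "%s", p);
        cnt++;
    }
    return cnt;
}
```

Because the tokens are copied, the strings in symbols stay valid even after the input buffer is reused or goes out of scope, and the buffers from malloc are still owned by the caller and can be freed later.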

I'd recommend correcting these things first and then trying again.

Please note that strtok modifies your original buffer by replacing the delimiter at the end of each token with '\0', so every token is already terminated and you do not need to add a '\0' yourself. Please also note that strtok skips leading and consecutive occurrences of the delimiters. So your first loop can be dropped, as long as every whitespace character you want to skip is in the delimiter string; to detect an empty line, check whether the first call to strtok returns NULL.
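A minimal sketch of that behavior, using a hypothetical count_tokens helper: runs of delimiters produce no empty tokens, and a buffer containing only delimiters yields NULL on the very first call.

```c
#include <string.h>

/* Hypothetical helper: counts the tokens strtok finds in buf.
   buf must be writable, because strtok overwrites delimiters with '\0'. */
int count_tokens(char *buf, const char *delims) {
    int n = 0;
    for (char *p = strtok(buf, delims); p != NULL; p = strtok(NULL, delims))
        n++;
    return n;
}
```

Called on a writable copy of "   111   010 101\n" with delimiters " \n", this counts three tokens; called on "   \n", it counts zero because the first strtok call already returns NULL.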

Please note that C strings always need 1 byte more than the number of characters you want to store. This extra byte holds the terminating '\0'. In your example you are parsing chunks of 3 characters while allocating only 3 bytes per chunk (it should be 4): symbols[i] = (char *)malloc(3); // allocate string of length 3
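As a sketch of the allocation with the extra byte included, assuming hypothetical helpers alloc_symbols and free_symbols that are not in the original code:

```c
#include <stdlib.h>

/* Hypothetical helper: allocates k strings, each able to hold len
   characters plus the terminating '\0'. Returns NULL on failure. */
char **alloc_symbols(size_t k, size_t len) {
    char **symbols = malloc(k * sizeof *symbols);
    if (symbols == NULL)
        return NULL;

    for (size_t i = 0; i < k; i++) {
        /* len + 1 bytes: the extra one is for '\0'. calloc zero-fills,
           so every string starts out empty and terminated. */
        symbols[i] = calloc(len + 1, 1);
        if (symbols[i] == NULL) {
            /* Undo the allocations made so far before giving up. */
            while (i-- > 0)
                free(symbols[i]);
            free(symbols);
            return NULL;
        }
    }
    return symbols;
}

/* Hypothetical helper: releases everything alloc_symbols returned. */
void free_symbols(char **symbols, size_t k) {
    if (symbols == NULL)
        return;
    for (size_t i = 0; i < k; i++)
        free(symbols[i]);
    free(symbols);
}
```

For the question's input, alloc_symbols(3, 3) would give three 4-byte buffers, enough for "111", "010" and "101" with their terminators.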

junix answered Nov 30 '25 16:11


