I have large strings that resemble the following...
some_text_token 24.325973 -20.638823 -1.964366 0.753947 -1.290811 -3.547422 0.813014 -3.547227 0.472015 3.723311 -0.719116 3.676793 other_text_token 24.325973 20.638823 -1.964366 0.753947 -1.290811 -3.547422 -1.996611 -2.877422 0.813014 -3.547227 1.632365 2.083673 0.472015 3.723311 -0.719116 3.676793 ...
...from which I'm trying to efficiently, and in the interleaved sequence they appear in the string, grab...
...but I'm having trouble.
I've tried strtod and successfully grabbed the floats from the string, but I can't seem to get a loop using strtod to report back to me the interleaved text tokens and blank lines. I'm not 100% confident strtod is the "right track" given the interleaved tokens and blank lines that I'm also interested in.
The tokens and blank lines are present in the string to give context to the floats so my program knows what the float values occurring after each token are to be used for, but strtod seems more geared, understandably, toward just reporting back floats it encounters in a string without regard for silly things like blank lines or tokens.
I know this isn't very hard conceptually, but being relatively new to C/C++ I'm having trouble judging what language features I should focus on to take best advantage of the efficiency C/C++ can bring to bear on this problem.
Any pointers? I'm very interested in why various approaches function more or less efficiently. Thanks!!!
Using C, I would do something like this (untested):
#include <stdio.h>
#define MAX 128
char buf[MAX];
while (fgets(buf, sizeof buf, fp) != NULL) {
    double d1, d2;
    if (buf[0] == '\n') {
        /* saw blank line */
    } else if (sscanf(buf, "%lf%lf", &d1, &d2) != 2) {
        /* buf has the next text token, including '\n' */
    } else {
        /* use the two doubles, d1, and d2 */
    }
}
The check for blank line is first because it's relatively inexpensive. Depending upon your needs:
1. you may need to increase MAX,
2. you may need to check whether buf ends with a newline; if it doesn't, then the line was too long (go to 1 or 3 in that case),
3. you may need to use malloc() and realloc() to dynamically allocate the buffer.

Note that sscanf() returns the number of input items successfully matched and assigned, so any return value other than 2 means the line did not hold two doubles.

I am also assuming that blank lines are really blank (just the newline character by itself). If not, you will need to skip leading white-space. isspace() in ctype.h is useful in that case.
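The two checks mentioned above (a line that only looks blank, and a line too long for the buffer) can be sketched roughly as follows. The helper names is_blank_line and line_incomplete are made up for illustration and are not part of the code above; this is a minimal sketch, not a definitive implementation.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s holds only white-space
   (spaces, tabs, "\r\n", ...) or is empty, otherwise 0. */
int is_blank_line(const char *s)
{
    while (*s != '\0') {
        /* The cast avoids undefined behavior for negative char values. */
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}

/* Hypothetical helper: returns 1 if buf, as filled by fgets(), has no
   newline. That means the line was longer than the buffer, or it was
   the last line of a file that does not end with a newline. */
int line_incomplete(const char *buf)
{
    return strchr(buf, '\n') == NULL;
}
```

In the loop above, is_blank_line(buf) could stand in for the buf[0] == '\n' test, at the cost of scanning the leading white-space of every line.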
fp is a valid FILE * object returned by fopen().
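Putting the pieces together, here is one way the loop might be packaged so it can be compiled and tried out. It assumes, as the code above does, that each line holds either a text token, exactly two floats, or nothing at all. The names line_kind, classify_line and tally_lines are hypothetical and exist only for this sketch.

```c
#include <stdio.h>

#define MAX 128

/* Hypothetical labels for the three kinds of line the input may hold. */
enum line_kind { LINE_BLANK, LINE_TOKEN, LINE_PAIR };

/* Classify one line as read by fgets().
   On LINE_PAIR, *d1 and *d2 receive the two values. */
enum line_kind classify_line(const char *buf, double *d1, double *d2)
{
    /* Cheapest test first: a truly blank line is just "\n". */
    if (buf[0] == '\n' || buf[0] == '\0')
        return LINE_BLANK;
    /* sscanf() returns the number of items converted; 2 means both doubles. */
    if (sscanf(buf, "%lf%lf", d1, d2) == 2)
        return LINE_PAIR;
    /* Anything else is treated as a text token. */
    return LINE_TOKEN;
}

/* Read fp line by line and count each kind of line.
   counts is indexed by enum line_kind. */
void tally_lines(FILE *fp, int counts[3])
{
    char buf[MAX];
    double d1, d2;

    counts[LINE_BLANK] = counts[LINE_TOKEN] = counts[LINE_PAIR] = 0;
    while (fgets(buf, sizeof buf, fp) != NULL)
        counts[classify_line(buf, &d1, &d2)]++;
}
```

A real program would act on d1 and d2 inside the loop rather than just counting. Note that a token that happens to start like a number (for example one beginning with "inf" or "nan") could be partly consumed by %lf, so this only works if tokens never look like floats.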