
C: using pointer as string: unpredictable behavior

I'm writing a C program to find the longest line in the user's input and print the line's length and the line itself. It succeeds at counting the characters but unpredictably fails at storing the line itself. Maybe I'm misunderstanding C's memory management and someone can correct me.

EDIT: followup question: I understand now that the blocks following the dummy char are unallocated and thus open range for the computer to do anything with them, but then why does the storage of some chars still work? In the second example I mention, the program stores characters in the 'unallocated' blocks even though it 'shouldn't'. Why?

Variables:

  • c holds the value returned by the most recent getchar() call
  • i is the length (so far) of the line currently being read
  • longest_i is the length of the longest line so far
  • twostr points to the beginning of the first of two strings: the first for the current line, the second for the longest line so far. When a line is discovered to be the longest, it is copied into the second string. If a future line is even longer, it overrides some of the second string but that's OK because I won't use it anymore -- the second string will now begin at a location farther to the right.
  • dummy gives twostr a place to point to

This is how I visualize the memory used by the program's variables:

 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|\n| 7|11|15|c |u |r |r |e |n |t |\0|e |s |t |\0|p |r |e |v |l |o |n |g |e |s |t |\0|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

true statements:

&c == 11
&i == 12
&longest_i == 13
&twostr == 14
&dummy == 15

program:

#include <stdio.h>

int main()
{
    char c = '\0';
    int i, longest_i;
    char *twostr;
    longest_i = i = 0;
    char dummy = '\0';
    twostr = &dummy;

    while ((c=getchar()) != EOF)
    {
        if (c != '\n')
        {
            *(twostr+i) = c;
            i++;
        }
        else
        {
            *(twostr+i) = '\0';
            if (i > longest_i)
            {
                longest_i = i;
                for (i=0; (c=*(twostr+i)) != '\0'; ++i)
                    *(twostr+longest_i+1+i) = c;
            }
            i = 0;
        }
    }

    printf("length is %d\n", longest_i);
    for (i=0; (c=*(twostr+longest_i+1+i)) != '\0'; ++i)
        putchar(c);

    return 0;
}

The contents from *(twostr+longest_i+1) up to the '\0' are unpredictable. Examples:

input:

longer line
line

output:

length is 11
@

input:

this is a line
this is a longer line
shorter line

output:

length is 21
this is a longer lineÔÿ"
asked Jun 21 '26 18:06 by Jordan



1 Answer

You're not actually allocating any memory to write into!

char dummy = '\0'; // creates a char variable and puts \0 into it
twostr = &dummy; // sets twostr to point to the address of dummy

After this, you're simply writing into the memory which comes after the char set aside by dummy, and writing over who-knows-what.
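To get a feel for what "who-knows-what" can mean, a small sketch like the one below prints how big dummy is and where each local lives. print_layout is a hypothetical helper name, not part of the original program, and the addresses and their order are not fixed by the C standard: they depend on the compiler, its flags, and the run. On many implementations the bytes next to dummy belong to other locals such as i, longest_i, or twostr itself, so some stores past dummy land in writable stack memory and appear to work, while others overwrite the very variables the loop depends on. Either way the behavior is undefined.

```c
#include <stdio.h>

/* Hypothetical demo function, not part of the original program.
 * It declares the same locals as the question's main() and prints
 * the size of dummy and the address of each variable. */
void print_layout(FILE *out)
{
    char c = '\0';
    int i = 0, longest_i = 0;
    char dummy = '\0';
    char *twostr = &dummy;

    /* dummy is a single char, so only *(twostr + 0) is valid to write. */
    fprintf(out, "sizeof dummy = %zu\n", sizeof dummy);

    /* Addresses are implementation-specific and vary between runs. */
    fprintf(out, "&c         = %p\n", (void *)&c);
    fprintf(out, "&i         = %p\n", (void *)&i);
    fprintf(out, "&longest_i = %p\n", (void *)&longest_i);
    fprintf(out, "&twostr    = %p\n", (void *)&twostr);
    fprintf(out, "&dummy     = %p\n", (void *)&dummy);
}
```

Calling print_layout(stdout) from main shows the layout for one particular build. Only the first line of output is guaranteed to be the same everywhere, because sizeof(char) is always 1.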

The easiest fix in this case would be to make dummy a pointer to a char, and then malloc a buffer to use for your strings (make it longer than the longest string you expect!)

For instance, buffer below would point to 256 bytes of memory (sizeof(char) is 1 by definition in C), allowing for a string up to 255 characters long, since the null terminator (\0) has to be stored at the end.

char *buffer = malloc(sizeof(char) * 256); // needs #include <stdlib.h>; malloc returns NULL on failure

Edit: This would allocate memory from the heap, which you should later free up by calling free(buffer); when you're done with it. The alternative is to use up space on the stack as per Anders K's solution.
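As a rough sketch of how the heap approach could look while keeping your two-strings-in-one-buffer layout, something like the following might work. The function name longest_line and its maxlen and len_out parameters are my own assumptions, not anything from your program. It also changes three things beyond the allocation: c is an int so it can be compared with EOF reliably, the copy includes the '\0' (your copy loop stops before writing the terminator, so the stored line was never terminated), and a final line without a trailing newline is treated as a line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post). Reads lines from `in`,
 * keeping at most `maxlen` characters of each, and returns the longest one
 * as a malloc'd string that the caller must free(). Returns NULL if the
 * allocation fails. The length of the returned line is stored in *len_out. */
char *longest_line(FILE *in, size_t maxlen, size_t *len_out)
{
    /* Room for two strings of up to maxlen characters, each with its '\0'. */
    char *buf = malloc(2 * (maxlen + 1));
    if (buf == NULL)
        return NULL;
    buf[0] = buf[1] = '\0';

    size_t i = 0, longest_i = 0;
    int c;  /* int, not char, so EOF can be told apart from real characters */

    do {
        c = getc(in);
        if (c != '\n' && c != EOF) {
            if (i < maxlen)          /* drop characters that would not fit */
                buf[i++] = (char)c;
        } else {
            buf[i] = '\0';           /* terminate the current line */
            if (i > longest_i) {
                longest_i = i;
                /* Copy the line and its terminator to just past itself. */
                memcpy(buf + i + 1, buf, i + 1);
            }
            i = 0;
        }
    } while (c != EOF);

    /* Slide the longest line to the front so the caller can free() it. */
    memmove(buf, buf + longest_i + 1, longest_i + 1);
    *len_out = longest_i;
    return buf;
}
```

A main could then call longest_line(stdin, 255, &n), print n and the returned string if the result is not NULL, and pass the pointer to free() afterwards. Lines longer than maxlen are truncated rather than written past the end of the buffer.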

answered Jun 24 '26 09:06 by Matt Lacey



