Writing own memmem for Windows

Tags:

c

I noticed that memmem is not available in MSVC for Windows, so I tried to write my own version. I have the following code:

void *memmem(const void *haystack_start, size_t haystack_len, const void *needle_start, size_t needle_len)
{
    const unsigned char *haystack = (const unsigned char *)haystack_start;
    const unsigned char *needle = (const unsigned char *)needle_start;
    const unsigned char *h = NULL;
    const unsigned char *n = NULL;
    size_t x = needle_len;

    /* The first occurrence of the empty string is deemed to occur at
    the beginning of the string.  */
    if (needle_len == 0) {
        return (void *)haystack_start;
    }

    /* Sanity check, otherwise the loop might search through the whole
        memory.  */
    if (haystack_len < needle_len) {
        return NULL;
    }

    for (; *haystack && haystack_len--; haystack++) {
        x = needle_len;
        n = needle;
        h = haystack;

        if (haystack_len < needle_len)
            break;

        if ((*haystack != *needle) || (*haystack + needle_len != *needle + needle_len))
            continue;

        for (; x; h++, n++) {
            x--;

            if (*h != *n)
                break;

            if (x == 0)
                return (void *)haystack;
        }
    }

    return NULL;
}

However, I do not think it works correctly. If I try something like this:

static const char haystack[24] = {
    0x4e, 0x65, 0x76, 0x65, 0x72, 0x20, 0x67, 0x6f,
    0x6e, 0x6e, 0x61, 0x20, 0x67, 0x69, 0x76, 0x65,
    0x20, 0x79, 0x6f, 0x75, 0x20, 0x75, 0x70, 0x2c,
};

static const char needle[8] = {
    0x20, 0x79, 0x6f, 0x75, 0x20, 0x75, 0x70, 0x2c
};

char *res = memmem(haystack, sizeof(haystack), needle, sizeof(needle));
printf("%s", res);

The result is null. Any ideas where the problem might be?


1 Answer

I think you are overcomplicating this.

#include <string.h> /* memcmp */

void *memmem(const void *haystack, size_t haystack_len,
    const void * const needle, const size_t needle_len)
{
    if (haystack == NULL) return NULL; // or assert(haystack != NULL);
    if (haystack_len == 0) return NULL;
    if (needle == NULL) return NULL; // or assert(needle != NULL);
    if (needle_len == 0) return NULL; // the question's version returns haystack here

    // Slide a window of needle_len bytes while enough haystack remains.
    for (const char *h = haystack;
            haystack_len >= needle_len;
            ++h, --haystack_len) {
        if (!memcmp(h, needle, needle_len)) {
            return (void *)h; // cast away const to match the return type
        }
    }
    return NULL;
}

While haystack_len is greater than or equal to needle_len, compare the needle against the current position in the haystack with memcmp. If they are equal, return the current position.
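As a quick check against the data from the question, here is a minimal sketch. The name my_memmem is a hypothetical stand-in for the function above (renamed to avoid clashing with a libc that already provides memmem), and match_offset is an illustrative helper, not part of any library. Because the haystack is not null-terminated, the result is reported as an offset rather than printed with a plain %s.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memmem above, renamed to avoid
   clashing with a libc that already provides memmem. */
void *my_memmem(const void *haystack, size_t haystack_len,
                const void *needle, size_t needle_len)
{
    if (haystack == NULL || needle == NULL) return NULL;
    if (haystack_len == 0 || needle_len == 0) return NULL;

    for (const char *h = haystack; haystack_len >= needle_len;
         ++h, --haystack_len) {
        if (memcmp(h, needle, needle_len) == 0) {
            return (void *)h;
        }
    }
    return NULL;
}

/* Illustrative helper: offset of the first match, or -1 if there is none. */
long match_offset(const void *haystack, size_t haystack_len,
                  const void *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = my_memmem(haystack, haystack_len, needle, needle_len);
    return hit ? (long)(hit - (const char *)haystack) : -1L;
}

/* The buffers from the question: "Never gonna give you up," and " you up,". */
const char q_haystack[24] = {
    0x4e, 0x65, 0x76, 0x65, 0x72, 0x20, 0x67, 0x6f,
    0x6e, 0x6e, 0x61, 0x20, 0x67, 0x69, 0x76, 0x65,
    0x20, 0x79, 0x6f, 0x75, 0x20, 0x75, 0x70, 0x2c,
};

const char q_needle[8] = {
    0x20, 0x79, 0x6f, 0x75, 0x20, 0x75, 0x70, 0x2c
};
```

With these buffers, match_offset(q_haystack, sizeof q_haystack, q_needle, sizeof q_needle) should yield 16, the last offset at which an 8-byte needle still fits in a 24-byte haystack. To print the match itself, pass an explicit length, for example printf("%.*s", (int)sizeof q_needle, res), since nothing guarantees a terminating zero byte.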

  1. In C there is no need to explicitly cast a const void * pointer: const unsigned char *haystack = (const unsigned char *)haystack_start; can be written as const unsigned char *haystack = haystack_start;
  2. As @molbdnilo said in the comments, the two halves of (*haystack != *needle) || (*haystack + needle_len != *needle + needle_len) test the same thing. This becomes obvious once you use the [] operator rather than *: haystack[0] != needle[0] || haystack[0] + needle_len != needle[0] + needle_len. Even if you meant ... != needle[needle_len], that would be an out-of-bounds access of needle.
  3. The for loop is just strange:

for (; *haystack && haystack_len--; haystack++) {
      if (haystack_len < needle_len)
            break;

Why not:

for (; *haystack && haystack_len >= needle_len; haystack_len--, haystack++)

?
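To see why the loop header matters for the test case in the question, here is a rough sketch that counts how many start offsets each style of header lets the loop body examine. Both function names are made up for illustration, and the *haystack test is left out to isolate the length logic.

```c
#include <assert.h>
#include <stddef.h>

/* Header in the style of the question: the length is post-decremented in
   the loop condition and only then compared with needle_len in the body. */
size_t offsets_examined_postdecrement(size_t haystack_len, size_t needle_len)
{
    assert(needle_len > 0);
    size_t examined = 0;
    for (; haystack_len--; ) {
        if (haystack_len < needle_len)
            break;          /* taken one offset too early */
        examined++;
    }
    return examined;
}

/* Header that tests the remaining length before decrementing it. */
size_t offsets_examined_length_first(size_t haystack_len, size_t needle_len)
{
    assert(needle_len > 0); /* a zero needle_len would never end the loop */
    size_t examined = 0;
    for (; haystack_len >= needle_len; haystack_len--) {
        examined++;
    }
    return examined;
}
```

For a 24-byte haystack and an 8-byte needle, the first function counts 16 offsets (0 through 15) and the second counts 17 (0 through 16). The needle in the question sits at offset 16, which is exactly the offset the post-decrement header skips, so the search returns NULL.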

The expression *haystack is also invalid here: you are not scanning a null-terminated string as strstr does. haystack points to arbitrary bytes in memory, which may legitimately contain zeros. Only haystack_len tells you where the haystack ends.
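The effect of stopping at a zero byte can be illustrated with a small sketch. The functions find_by_length and find_until_zero and the buffers zero_buf and zero_needle are hypothetical names introduced only for this comparison.

```c
#include <stddef.h>
#include <string.h>

/* Length-driven search: only haystack_len decides where the scan ends. */
const unsigned char *find_by_length(const unsigned char *haystack, size_t haystack_len,
                                    const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0) return NULL;
    for (; haystack_len >= needle_len; ++haystack, --haystack_len) {
        if (memcmp(haystack, needle, needle_len) == 0)
            return haystack;
    }
    return NULL;
}

/* Same search, but it also stops at the first zero byte, mimicking the
   *haystack test from the question. */
const unsigned char *find_until_zero(const unsigned char *haystack, size_t haystack_len,
                                     const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0) return NULL;
    for (; haystack_len >= needle_len && *haystack; ++haystack, --haystack_len) {
        if (memcmp(haystack, needle, needle_len) == 0)
            return haystack;
    }
    return NULL;
}

/* Binary data with a zero byte in front of the needle. */
const unsigned char zero_buf[6] = { 'a', 'b', 0x00, 'c', 'd', 'e' };
const unsigned char zero_needle[2] = { 'd', 'e' };
```

Here find_by_length locates the needle at zero_buf + 4, while find_until_zero gives up at the zero byte at offset 2 and returns NULL even though the needle is present.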

  4. You can use memcmp to compare memory, no need to write that part yourself.
KamilCuk answered Sep 22 '25 10:09