I've been writing my own naive implementation of the strcmp function in x86_64 assembly.
I've written a c test program to compare the behavior with the real strcmp from the C library.
Even if both function seems to return the same value, the condition
(strcmp(s1,s2) == my_strcmp(s1,s2)) seems to be false.
Weirdly, my_strcmp(s1,s2) == strcmp(s1,s2) seems to be true.
What am I missing ?
While troubleshooting, i've compiled with my implementation of strcmp in C and the same test program, just to verify : it works well, all behaviors are expected.
my_strcmp.s :
; Function: int my_strcmp(const char *dest, const char *src);
; Arguments:
; rdi s1
; rsi s2
; Variable:
; rcx counter
; Returned value :
section .text
global my_strcmp
my_strcmp:
; Function prologue (optional)
push rbp
mov rbp, rsp
xor rcx,rcx
loop:
mov al, byte [rdi]
mov bl, byte [rsi]
cmp al, bl
jne not_equal
test al,al
je null_byte_found
inc rdi
inc rsi
jmp loop
not_equal:
sub al, bl
movsx rax, al
jmp end
null_byte_found:
xor rax,rax
jmp end
end:
; Function epilogue (optional)
mov rsp, rbp
pop rbp
ret
main.c ( the test program) :
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
extern int my_strcmp(const char *s1, const char *s2);
void test_my_strcmp();
int main() {
test_my_strcmp();
return 0;
}
void test_my_strcmp(){
char s1[10] = "123456789";
char s2[10] = "12345AAA";
int a = strcmp(s1,s2);
int b = my_strcmp(s1,s2);
if ( a == b)
printf("a == b\n");
if (strcmp(s1,s2) == -11)
printf("strcmp return value is -11\n");
if (-11 == my_strcmp(s1,s2))
printf("my_strcmp retun value is -11\n");
if (strcmp(s1,s2) == my_strcmp(s1,s2))
printf("OK1\n");
if (my_strcmp(s1,s2) == strcmp(s1,s2))
printf("OK2\n");
}
my_strcmp.c (just to verify) :
int my_strcmp(const char *s1, const char *s2){
while (*s1 && *s2 && (*s1 == *s2))
{
s1++;
s2++;
}
return (*s1 - *s2);
}
After creating the 3 files my_strcmp.s, my_strcmp.c and main.c,
nasm -f elf64 -g my_strcmp.s -o my_strcmp_assembly.o
gcc -c my_strcmp.c
gcc main.c my_strcmp.o -o c_version.out
gcc main.c my_strcmp_assembly.o -o asm_version.out
./c_version.out && ./asm_version.out
a == b
strcmp return value is -11
ft_strcmp retun value is -11
OK1
OK2
a == b
strcmp return value is -11
ft_strcmp retun value is -11
OK2
gcc version : 11.4.0 (up to date) nasm version : 2.15.05 (up to date)
``
You are using %bl as a temp register.
%rbx during your function prolog/epilog or use another register.See: Section 3.2.1 of https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf for a list of registers and how they're used by caller/callee
In addition to what @Craig Estey and problems commented elsewhere,return (*s1 - *s2); is a problem when *s1 and *s2 differ in sign. @Nate Eldredge
strcmp() performs as if each character was an unsigned char.
For all functions in this subclause, each character shall be interpreted as if it had the type
unsigned char(and therefore every possible object representation is valid and has a different value). C2x dr § 7.26.1 4
A better re-invented strmcp()
int my_strcmp(const char *s1, const char *s2){
const unsigned char *uc1 = (const unsigned char *) s1;
const unsigned char *uc2 = (const unsigned char *) s2;
while ((*uc1 == *uc2) && *uc1) { // Do an unsigned access and compare
uc1++;
uc2++;
}
return ((*uc1 > *uc2) - (*uc1 < *uc2)); // No overflow possible
}
Pedantically, this also fixes potential overflow of subtracting 2 int values that may overflow and avoids (nearly extinct) problems with non-2's complement encoding of a signed char.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With