Background
I was writing code that uses functions from ctype.h to classify characters in strings. I accidentally passed the string (char*) to functions that take an int, causing the program to segfault. It was easy enough to see that I forgot to dereference the string pointer, but GCC gave me no warnings even when compiling with the following arguments:
gcc -o main main.c -Wall -Wextra -Werror -pedantic -pedantic-errors -std=c99 -Wconversion
I am using Debian GNU/Linux bookworm 12.5 x86_64 and gcc (Debian 12.2.0-14) 12.2.0, all up to date. Here's an example of the problem:
/* main.c */
#include <ctype.h>
#include <stdio.h>
int main(void)
{
char msg[] = "hello";
int res = isspace(msg); // char* gets cast to int without warning
// It should be `isspace(*msg)`
// This also segfaults
printf("%i\n", res);
return 0;
}
Questions
You're passing in a value that is outside the range of values the function expects. Doing so triggers undefined behavior, as per section 7.4p1 of the C standard regarding functions defined in ctype.h:
The header <ctype.h> declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
And since this is undefined behavior, crashing is one possible outcome.
As for why there's no warning generated by the compiler, we need to look at the preprocessor output. The call to isspace gets converted to the following after the preprocessor:
int res = ((*__ctype_b_loc ())[(int) ((msg))] & (unsigned short int) _ISspace);
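From this, we can see that isspace is implemented as a macro which uses a lookup table with the given argument as an index, and we can see that the argument is explicitly cast to int. Because the conversion is explicit, the usual diagnostic for an implicit pointer-to-integer conversion does not apply. GCC can still warn about a cast from a pointer to an integer of a different size, but it suppresses warnings that originate in macros defined in system headers unless -Wsystem-headers is given, which is most likely why nothing is reported here.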
The above also explains the crash, since a pointer value will likely be far out of the bounds of this lookup table and therefore attempt to access memory it doesn't have access to.
Library functions being implemented as macros does in fact conform to the C standard. Additionally, such functions defined as macros must also be defined as an actual function. This is dictated by section 7.1.4p1 of the C standard:
Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro. 185)
- This means that an implementation shall provide an actual function for each library function, even if it also provides a macro for that function.
The above also mentions that the macro version of a function can be suppressed by putting parentheses around the function name:
int res = (isspace)(msg);
And in this case the compiler will produce a warning for pointer-to-integer conversion.
Likely, with your compiler, isspace() is implemented as a macro that includes a typecast of whatever argument it gets to char or int.
Obviously, when the compiler sees a cast, it will just assume, "well, he said so". Macros are not type-checked at all (you can't give a macro parameter a type, so how could the compiler check it?).