This is an ANSI C question. I have the following code.
#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main()
{
    if (!setlocale(LC_CTYPE, "")) {
        printf( "Can't set the specified locale! "
                "Check LANG, LC_CTYPE, LC_ALL.\n");
        return -1;
    }
    wint_t c;
    while((c=getwc(stdin))!=WEOF)
    {
        printf("%lc",c);
    }
    return 0;
}
I need full UTF-8 support, but even at this simplest level, can I improve this somehow? Why is wint_t used, and not wchar_t, with appropriate changes?
wint_t is capable of storing any valid value of wchar_t. A wint_t is also capable of taking on the result of evaluating the WEOF macro (note that a wchar_t might be too narrow to hold the result).
As @musiphil so nicely put in his comment, which I'll try to expand here, there is a conceptual difference between wint_t and wchar_t.
Any difference in their size or signedness is a technical aspect that derives from the fact that each has very distinct semantics:
wchar_t is large enough to store characters, or codepoints if you prefer. Whether it is signed or unsigned is implementation-defined. It is analogous to char, which on virtually all platforms is 8 bits wide and so limited to 256 values. So wide-character strings are naturally arrays of, or pointers to, this type.
Now enter the wide-character I/O functions, some of which need to be able to return any wchar_t plus an additional status. So their return type must have room for at least one value that is not a valid wide character. So wint_t is used, which can express any wide char and also WEOF. The narrow-character EOF is negative, but the C standard does not mandate that for WEOF, nor does it require wint_t to be signed or wider than wchar_t; some implementations define wint_t as an unsigned type. Regardless of sign, status values need to be outside the range of valid wide characters. They are only useful as return values, and are never meant to be stored as characters.
The analogy with "classic" char and int helps clear up any confusion: strings are not of type int [], they are char var[] (or char *var). And not because char is "half the size of int", but because that's what a string is.
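To make the analogy concrete, here is a minimal sketch that puts the narrow and wide read loops side by side. The function names count_narrow and count_wide are made up for this illustration; only getc, getwc, EOF and WEOF come from the standard library.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters read from a byte-oriented stream.
   getc returns int so that it can report EOF in addition to any char value. */
size_t count_narrow(FILE *fp)
{
    int r;                      /* return-value type, not a character type */
    size_t n = 0;

    assert(fp != NULL);
    while ((r = getc(fp)) != EOF) {
        char ch = (char)r;      /* r is known to be a real character here */
        (void)ch;
        n++;
    }
    return n;
}

/* Hypothetical helper: the same loop for a wide-oriented stream.
   getwc returns wint_t so that it can report WEOF in addition to any wchar_t value. */
size_t count_wide(FILE *fp)
{
    wint_t r;                   /* return-value type, not a character type */
    size_t n = 0;

    assert(fp != NULL);
    while ((r = getwc(fp)) != WEOF) {
        wchar_t wc = (wchar_t)r; /* r is known to be a real wide character here */
        (void)wc;
        n++;
    }
    return n;
}
```

In both loops the wider "return" type only lives long enough to be compared against the end marker; what gets stored or processed afterwards is the character type.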
Your code looks correct: c is used to check the result of getwc(), so it is wint_t. And if its value is not WEOF, as your while loop tests, then it's safe to assign it to a wchar_t character (or to an element of a wchar_t array, through a pointer, etc).
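As a rough sketch of that last point, the helper below reads into a wint_t, compares it against WEOF, and only then stores it in a wchar_t buffer. The name read_wide_line and its stop-at-newline behaviour are assumptions made for this example, not something from your code.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads wide characters from fp into buf until a newline,
   WEOF (end of input or a read error), or until cap - 1 characters are stored.
   The newline is consumed but not stored. Returns the number of characters
   stored; buf is always null-terminated. */
size_t read_wide_line(FILE *fp, wchar_t *buf, size_t cap)
{
    size_t n = 0;
    wint_t r;                       /* holds the raw result of getwc */

    assert(fp != NULL && buf != NULL && cap > 0);
    while (n + 1 < cap && (r = getwc(fp)) != WEOF) {
        if (r == (wint_t)L'\n')
            break;
        buf[n++] = (wchar_t)r;      /* not WEOF, so it is a real wide character */
    }
    buf[n] = L'\0';
    return n;
}
```

A caller would typically run setlocale(LC_CTYPE, "") first, as your program does, and then call something like read_wide_line(stdin, buf, 64) with a wchar_t buf[64].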