
wchar_t argv in C -- Unicode

Does GCC support an equivalent of Microsoft's wmain()? I'm writing a C program and need to use Unicode throughout. If not, can char be converted to wchar_t?

David Watson asked Dec 06 '25 13:12


2 Answers

You don't need wchar_t for Unicode: a plain char array can hold the UTF-8 encoding of any Unicode text. The size of wchar_t also varies between platforms. On Windows it is 16 bits, but on many Linux/Unix platforms it is 32 bits.

For more info specific to GCC, see this post I found via a Google search:

http://article.gmane.org/gmane.comp.gnu.mingw.user/22962

(According to that, the answer to your question of whether GCC supports wmain is "no".)
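On the second half of the question (converting char to wchar_t), here is a minimal sketch using the standard mbstowcs function. The helper name to_wide is hypothetical, and the sketch assumes the C library accepts a NULL destination as a length query, as POSIX specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* to_wide is a hypothetical helper, not a standard function.
   It converts a multibyte string in the current locale's encoding into a
   newly allocated wide string. Returns NULL if s contains an invalid
   sequence or if allocation fails. The caller frees the result. */
wchar_t *to_wide(const char *s)
{
    assert(s != NULL);

    /* With a NULL destination, mbstowcs only reports the required length
       (excluding the terminator); POSIX specifies this behaviour. */
    size_t n = mbstowcs(NULL, s, 0);
    if (n == (size_t)-1)
        return NULL;

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return NULL;

    /* Room for n characters plus the terminating L'\0'. */
    mbstowcs(w, s, n + 1);
    return w;
}
```

An ordinary main(int argc, char **argv) could call setlocale(LC_ALL, "") from <locale.h> once at startup and then pass each argv[i] to to_wide. Without the setlocale call the program stays in the "C" locale, where non-ASCII bytes may be rejected.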

dappawit answered Dec 08 '25 07:12


Many of C's standard string functions are encoding-agnostic: they work on bytes and treat only the terminating NUL specially. You can store UTF-8 encoded strings in char* and use them safely with:

strcpy strncpy strcat strncat strcmp strncmp strdup strchr 
strrchr strcspn strspn strpbrk strstr strtok
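As a small illustration of the byte-wise functions listed above, the sketch below searches a UTF-8 string with strstr. The helper name utf8_find is hypothetical, and both arguments are assumed to be valid UTF-8. The search is safe because UTF-8 is self-synchronizing: a validly encoded needle cannot match starting in the middle of another character.

```c
#include <assert.h>
#include <string.h>

/* utf8_find is a hypothetical helper, not a standard function.
   Returns the BYTE offset (not the character index) of the first
   occurrence of needle in haystack, or -1 if it does not occur.
   Both strings are assumed to be valid, NUL-terminated UTF-8. */
long utf8_find(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    /* strstr compares raw bytes, so it needs no knowledge of the encoding. */
    const char *p = strstr(haystack, needle);
    return p ? (long)(p - haystack) : -1;
}
```

Note that the result counts bytes: in "na\xc3\xafve caf\xc3\xa9" (the UTF-8 bytes of "naïve café") the word "café" starts at byte 7 even though it is the seventh character (index 6).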

Some other functions will not give you the results you might expect with Unicode strings. For example, strlen always counts bytes, not characters, and the length limits of strncpy, strncat and strncmp are byte counts too, so a limit can fall in the middle of a multibyte character. The number of characters can be counted in C in a portable way using mbstowcs(NULL,s,0). It returns the number of characters in s that translate to wchar_t, or (size_t)-1 if s contains an invalid sequence. This works for UTF-8 as for any other supported encoding, as long as the appropriate locale has been selected.
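A minimal sketch of the counting technique follows. The helper name count_chars is hypothetical, and the locale name passed to it is an assumption: "C.UTF-8" exists on many Linux systems but is not guaranteed to be installed everywhere.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* count_chars is a hypothetical helper, not a standard function.
   It switches LC_CTYPE to locale_name (a global side effect that is not
   undone) and returns the number of characters in s under that locale's
   encoding. Returns (size_t)-1 if the locale is unavailable or if s
   contains an invalid multibyte sequence. */
size_t count_chars(const char *locale_name, const char *s)
{
    assert(locale_name != NULL && s != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return (size_t)-1;          /* locale not installed */

    /* NULL destination: only count, store nothing. */
    return mbstowcs(NULL, s, 0);
}
```

For the string "h\xc3\xa9llo" (the UTF-8 bytes of "héllo"), strlen reports 6 bytes, while count_chars("C.UTF-8", ...) reports 5 characters where that locale is available. A real program would more likely call setlocale(LC_ALL, "") once at startup and rely on the user's environment instead of naming a locale.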

If you want to do advanced operations on Unicode strings, such as complex code page conversions, regular expressions, or text wrapping on word boundaries, I suggest you use a good library like ICU.

Refer: Using Unicode in C/C++.

Vijay Mathew answered Dec 08 '25 06:12


