I'm looking at the IsCharAlphaNumeric Windows API function. Since it takes only a single TCHAR, it obviously can't make any decisions about surrogate pairs in UTF-16 content. Does that mean that there are no alphanumeric characters that are surrogate pairs?
Characters outside the BMP can be letters. (Michael Kaplan recently discussed a bug in the classification of the character U+1F48C.) But IsCharAlphaNumeric cannot see characters outside the BMP (for the reasons you noted), so you cannot obtain classification information for them that way.
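To make the limitation concrete, here is a minimal sketch in portable C of how a UTF-16 surrogate pair maps to a code point. The helper names (is_high_surrogate, is_low_surrogate, combine_surrogates) are made up for illustration and are not part of the Windows API. U+1F48C is stored as the two code units 0xD83D 0xDC8C, so a function that receives one TCHAR only ever sees one half of the pair.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not Windows API functions. */

/* True if the UTF-16 code unit is a high (leading) surrogate. */
static bool is_high_surrogate(uint16_t u)
{
    return u >= 0xD800 && u <= 0xDBFF;
}

/* True if the UTF-16 code unit is a low (trailing) surrogate. */
static bool is_low_surrogate(uint16_t u)
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

/* Combine a valid high/low pair into the code point it encodes.
   The caller is assumed to have checked both halves first. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u
         + ((uint32_t)(hi - 0xD800u) << 10)
         + (uint32_t)(lo - 0xDC00u);
}
```

For example, combine_surrogates(0xD83D, 0xDC8C) yields 0x1F48C. Either code unit on its own is just a surrogate, not a character, which is why a per-TCHAR classifier has nothing meaningful to report for it.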
If you have a surrogate pair, call GetStringType with cchSrc = 2 and check for C1_ALPHA and C1_DIGIT.
Edit: The second half of this answer is incorrect. GetStringType does not support surrogate pairs.