http://msdn.microsoft.com/en-us/library/1x308yk8.aspx
This allows me to do this:
var str = "string ";
Char.IsWhiteSpace(str, 6);
Rather than:
Char.IsWhiteSpace(str[6]);
This seemed unusual, so I looked at the decompiled code:
[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public static bool IsWhiteSpace(char c)
{
    if (char.IsLatin1(c))
    {
        return char.IsWhiteSpaceLatin1(c);
    }
    return CharUnicodeInfo.IsWhiteSpace(c);
}

[SecuritySafeCritical]
public static bool IsWhiteSpace(string s, int index)
{
    if (s == null)
    {
        throw new ArgumentNullException("s");
    }
    if (index >= s.Length)
    {
        throw new ArgumentOutOfRangeException("index");
    }
    if (char.IsLatin1(s[index]))
    {
        return char.IsWhiteSpaceLatin1(s[index]);
    }
    return CharUnicodeInfo.IsWhiteSpace(s, index);
}
Three things struck me:
The upper bound is checked explicitly and throws ArgumentOutOfRangeException, while an index below 0 would fall through to the string indexer and give its standard IndexOutOfRangeException.
SecuritySafeCriticalAttribute: I've read the general blurb about it, but it is still unclear to me what it is doing here and whether it is linked to the upper bound check.
TargetedPatchingOptOutAttribute is not present on the other Is...(char) methods, for example IsLetter, IsNumber, etc.

Because not every character fits in a C# char. For instance, "𠀀" takes 2 C# chars, and you couldn't get any information about that character with just a char overload. Given a string and an index, the methods can check whether the char at index i is a high surrogate, read the low surrogate at the next index, combine the two according to the UTF-16 decoding algorithm, and retrieve information about the code point U+20000.
This is how UTF-16 can encode more than a million different code points: it is a variable-width encoding. Each character takes 2 or 4 bytes to encode, that is, 1 or 2 C# chars.
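To make the surrogate-pair point concrete, here is a minimal sketch. The class name SurrogateDemo and the helper CodePointAt are made up for illustration and are not part of the framework; the helper only mirrors the standard UTF-16 decoding arithmetic, which the built-in char.ConvertToUtf32(string, int) already provides.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: decodes the code point that starts at s[index],
    // combining a high/low surrogate pair by hand when one is present.
    public static int CodePointAt(string s, int index)
    {
        char high = s[index];
        if (char.IsHighSurrogate(high)
            && index + 1 < s.Length
            && char.IsLowSurrogate(s[index + 1]))
        {
            char low = s[index + 1];
            // Standard UTF-16 decoding: 10 bits from each surrogate, offset by 0x10000.
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
        }
        // A BMP character (or an unpaired surrogate) is returned as-is.
        return high;
    }

    public static void Main()
    {
        string s = "\U00020000"; // U+20000, stored as two chars

        Console.WriteLine(s.Length);                          // number of chars, not characters
        Console.WriteLine(char.IsHighSurrogate(s[0]));        // the first char is only half of the pair
        Console.WriteLine(char.IsLetter(s[0]));               // char overload sees a lone surrogate
        Console.WriteLine(char.IsLetter(s, 0));               // string overload can read the whole pair
        Console.WriteLine(CodePointAt(s, 0).ToString("X"));   // hand-rolled decoding
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // built-in equivalent
    }
}
```

The char overload cannot answer the question for such a character, because s[0] on its own is just a surrogate; only the (string, int) overload has access to both halves.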