I have a JavaScript regex that tokenizes a sentence into words. It looks like this:
/\\[^]|\.+|\w+|[^\w\s]/g
For example, if the sentence entered is
Hello World.
the above regex will tokenize it into:
Hello, World, .
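To make the expected behavior concrete before porting, here is a minimal JavaScript sketch that runs the pattern from the question. The helper name tokenize and the second sample string are assumptions added for illustration; only the regex itself comes from the question.

```javascript
// The original pattern from the question, unchanged.
const tokenRegex = /\\[^]|\.+|\w+|[^\w\s]/g;

// Hypothetical helper: returns every match, or an empty array when none.
function tokenize(sentence) {
  return sentence.match(tokenRegex) || [];
}

// Whitespace is skipped because no alternative matches it.
console.log(tokenize("Hello World."));     // [ 'Hello', 'World', '.' ]

// A backslash plus the following char is kept together by \\[^].
console.log(tokenize("Wait... a\\n b!"));  // [ 'Wait', '...', 'a', '\\n', 'b', '!' ]
```

The four alternatives are tried left to right: an escaped character, a run of dots, a run of word characters, and finally any single character that is neither a word character nor whitespace.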
I am trying to convert the above regex to C#, but it does not produce these matches. I have tried removing the leading / and the trailing /g in order to make it compatible with the .NET regex engine, but it's still not working.
Below is the C# code I am trying:
public static void Main()
{
    string pattern = @"\\[^]|\.+|\w+|[^\w\s]";
    string input = @"hello world.";
    foreach (Match m in Regex.Matches(input, pattern, RegexOptions.ECMAScript))
    {
        Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
    }
}
Can anyone help me convert the above regex to C#?
Note that RegexOptions.ECMAScript mainly makes sure the shorthand character classes (here, \w and \s) only match ASCII letters, digits and whitespace. You can't expect this option to "convert" the whole pattern for use with the .NET regex library.
Here, the [^] construct is used in the JS regex to match any char. In .NET, you may use . with the RegexOptions.Singleline option instead of [^] (you will then have to remove RegexOptions.ECMAScript, because the two options cannot be combined), or just use [\s\S] to match any char:
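using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        // '.' replaces the JS-only [^]; with Singleline it also matches newlines.
        string pattern = @"\\.|\.+|\w+|[^\w\s]";
        string input = @"hello world.";
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Singleline))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}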
Output of the C# code above:
'hello' found at index 0.
'world' found at index 6.
'.' found at index 11.
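As a sanity check on the [\s\S] alternative mentioned above, the following JavaScript sketch compares the original pattern with a variant that uses [\s\S] in place of [^]. The names tokenizeWith and tokensMatch, and the sample strings, are assumptions made for this illustration. It only demonstrates the JavaScript side; it does not run the .NET engine.

```javascript
// Original JS pattern and a variant that avoids the JS-only [^] construct.
const original = /\\[^]|\.+|\w+|[^\w\s]/g;
const portable = /\\[\s\S]|\.+|\w+|[^\w\s]/g;

// Hypothetical helper: all matches of a global regex, or [] when none.
function tokenizeWith(regex, text) {
  return text.match(regex) || [];
}

// Hypothetical helper: true when both patterns produce identical tokens.
function tokensMatch(text) {
  const a = tokenizeWith(original, text);
  const b = tokenizeWith(portable, text);
  return JSON.stringify(a) === JSON.stringify(b);
}

// The second sample has a backslash followed by a real newline, the case
// where a plain '.' without Singleline would behave differently.
const samples = ["hello world.", "a\\\nb", "x\\y... z!"];
for (const s of samples) {
  console.log(JSON.stringify(tokenizeWith(portable, s)), tokensMatch(s));
}
```

If the two patterns agree on inputs like these, the [\s\S] form is a reasonable candidate to carry over to .NET unchanged.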
NOTE: \w and \s are Unicode-aware in .NET regex; they match all Unicode letters, including some diacritics. If you only want to handle ASCII word characters (as JavaScript's \w does), use
string pattern = @"\\.|\.+|[A-Za-z0-9_]+|[^A-Za-z0-9_\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]";
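The following JavaScript sketch shows the behavior that the explicit character classes are meant to reproduce: in JavaScript, \w covers only ASCII word characters, while \s also covers some non-ASCII whitespace such as the no-break space. The helper name jsTokens and the sample inputs are assumptions chosen for illustration.

```javascript
// The original pattern from the question.
const jsPattern = /\\[^]|\.+|\w+|[^\w\s]/g;

// Hypothetical helper: all matches, or [] when none.
function jsTokens(text) {
  return text.match(jsPattern) || [];
}

// U+00E9 (e with acute) is not matched by JS \w, so it becomes its own token.
console.log(jsTokens("caf\u00e9"));   // [ 'caf', 'é' ]

// U+00A0 (no-break space) is matched by JS \s, so it is skipped like a space.
console.log(jsTokens("a\u00a0b"));    // [ 'a', 'b' ]
```

A .NET pattern using the default Unicode-aware \w would keep the accented letter inside the word, which is why the classes are spelled out explicitly when JavaScript-identical tokens are required.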
More details: \w in .NET regex, \s in .NET regex