I have a string like
a,[1,2,3,{4,5},6],b,{c,d,[e,f],g},h
After splitting by , I expect to get 5 items; the commas inside braces or brackets should be ignored:
a
[1,2,3,{4,5},6]
b
{c,d,[e,f],g}
h
There is no whitespace in the string. Is there a regular expression that can make this happen?
You could use this:
var input = "a,[1,2,3,{4,5}],b,{c,d,[e,f]},g";
var result =
    (from Match m in Regex.Matches(input, @"\[[^]]*]|\{[^}]*}|[^,]+")
     select m.Value)
    .ToArray();
This will find matches of any of these three kinds:
[ followed by any characters other than ], then terminated by ]
{ followed by any characters other than }, then terminated by }
One or more characters other than ,
This will work for your sample input, but it cannot handle groups of the same kind nested inside each other, like [1,[2,3],4] or {1,{2,3},4}. For that, I'd recommend something a bit more powerful than regular expressions. Since you've mentioned in your comments that you're trying to parse JSON, I'd recommend you check out the excellent Json.NET library.
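To make the limitation concrete, here is a small sketch that wraps the pattern above in a helper and runs it on one input that works and one that does not. The class name FlatSplitDemo and its Split method are made up for this illustration; only the pattern comes from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class FlatSplitDemo
{
    // Hypothetical helper (not from the answer): applies the same pattern
    // and returns the matched values in order.
    public static string[] Split(string input)
    {
        return Regex.Matches(input, @"\[[^]]*]|\{[^}]*}|[^,]+")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    static void Main()
    {
        // Braces inside brackets are fine, because the bracket alternative
        // only stops at the first ']'.
        Console.WriteLine(string.Join(" | ", Split("a,[1,2,3,{4,5},6],b")));
        // prints: a | [1,2,3,{4,5},6] | b

        // Brackets inside brackets break: the match ends at the inner ']'.
        Console.WriteLine(string.Join(" | ", Split("[1,[2,3],4]")));
        // prints: [1,[2,3] | 4]
    }
}
```

The second call shows why the pattern stops being reliable as soon as a group contains another group of the same kind.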
Regular expressions* cannot be used to parse nested structures**.
(* True regular expressions, without non-regular extensions)
(** Nested structures of arbitrary depth and interleaving)
But parsing by hand is not that difficult. First you need to find the , that are not in brackets or braces.
string input = "a,[1,2,3,{4,5},6],b,{c,d,[e,f],g},h";
var delimiterPositions = new List<int>();
int bracesDepth = 0;
int bracketsDepth = 0;
for (int i = 0; i < input.Length; i++)
{
    switch (input[i])
    {
        case '{':
            bracesDepth++;
            break;
        case '}':
            bracesDepth--;
            break;
        case '[':
            bracketsDepth++;
            break;
        case ']':
            bracketsDepth--;
            break;
        default:
            if (bracesDepth == 0 && bracketsDepth == 0 && input[i] == ',')
            {
                delimiterPositions.Add(i);
            }
            break;
    }
}
And then split the string at these positions.
public List<string> SplitAtPositions(string input, List<int> delimiterPositions)
{
    var output = new List<string>();
    // Start index of the item currently being collected.
    int start = 0;
    foreach (int position in delimiterPositions)
    {
        output.Add(input.Substring(start, position - start));
        start = position + 1;
    }
    // Remainder after the last delimiter, or the whole input if there was
    // no top-level delimiter at all (calling Last() on an empty list would throw).
    output.Add(input.Substring(start));
    return output;
}
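The two steps above can also be folded into a single pass. The following is only a sketch under a simplifying assumption: it tracks one combined depth counter for both [] and {} instead of two separate ones, and it does not check that the input is well-formed. The names TopLevelSplitter and Split are invented for this example.

```csharp
using System;
using System.Collections.Generic;

static class TopLevelSplitter
{
    // Hypothetical helper: splits at commas that are not nested inside
    // any [] or {} group. Assumes brackets and braces are balanced.
    public static List<string> Split(string input)
    {
        var output = new List<string>();
        int depth = 0;  // combined nesting depth of [] and {}
        int start = 0;  // start index of the current item

        for (int i = 0; i < input.Length; i++)
        {
            switch (input[i])
            {
                case '[':
                case '{':
                    depth++;
                    break;
                case ']':
                case '}':
                    depth--;
                    break;
                case ',':
                    if (depth == 0)
                    {
                        output.Add(input.Substring(start, i - start));
                        start = i + 1;
                    }
                    break;
            }
        }

        // Whatever follows the last top-level comma (or the whole input).
        output.Add(input.Substring(start));
        return output;
    }

    static void Main()
    {
        foreach (var item in Split("a,[1,2,3,{4,5},6],b,{c,d,[e,f],g},h"))
        {
            Console.WriteLine(item);
        }
        // prints, one per line: a  [1,2,3,{4,5},6]  b  {c,d,[e,f],g}  h
    }
}
```

Keeping two separate counters, as in the answer, makes it easier to add validation later (for example, rejecting a ] that closes a {); the single counter is just the shortest form that produces the expected five items.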
Even if it looks ugly and there is no regex involved (not sure if it's a requirement or a nice-to-have in the original question), this alternative should work:
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        var input = "a,[1,2,3,{4,5}],b,{c,d,[e,f]},g";
        var output = "<root><n>" +
            input.Replace(",", "</n><n>")
            .Replace("[", "<n1><n>")
            .Replace("]", "</n></n1>")
            .Replace("{", "<n2><n>")
            .Replace("}", "</n></n2>") +
            "</n></root>";
        var elements = XDocument
            .Parse(output, LoadOptions.None)
            .Root.Elements()
            .Select(e =>
            {
                if (!e.HasElements)
                    return e.Value;
                else
                {
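                    // DisableFormatting keeps the XML on a single line, so no
                    // space or newline clean-up is needed afterwards (the default
                    // ToString() indents and uses platform-specific newlines).
                    return e.ToString(SaveOptions.DisableFormatting)
                        .Replace("</n><n>", ",")
                        .Replace("<n1>", "[")
                        .Replace("</n1>", "]")
                        .Replace("<n2>", "{")
                        .Replace("</n2>", "}")
                        .Replace("<n>", "")
                        .Replace("</n>", "");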
                }
            }).ToList();
    }
}