
Split String - the Cartesian way

Given the following string:

"foo bar-baz-zzz"

I want to split it at the characters " " and "-". Each delimiter should either act as a split point (and be dropped) or stay inside the token as-is, and I want every combination of those choices.

I want to get a two-dimensional array containing:

{{"foo", "bar", "baz", "zzz"}
,{"foo bar", "baz", "zzz"}
,{"foo", "bar-baz", "zzz"}
,{"foo bar-baz", "zzz"}
,{"foo", "bar", "baz-zzz"}
,{"foo bar", "baz-zzz"}
,{"foo", "bar-baz-zzz"}
,{"foo bar-baz-zzz"}}

Is there any built-in method in Java to split the string this way? Maybe in a library like Apache Commons? Or do I have to write a wall of for-loops?


2 Answers

Here is a working recursive solution. I used a List<List<String>> rather than a two-dimensional array because the results vary in length. The code is a bit ugly and could probably be tidied up a little.

Sample output:

$ java Main foo bar-baz-zzz
Processing: foo bar-baz-zzz
[foo, bar, baz, zzz]
[foo, bar, baz-zzz]
[foo, bar-baz, zzz]
[foo, bar-baz-zzz]
[foo bar, baz, zzz]
[foo bar, baz-zzz]
[foo bar-baz, zzz]
[foo bar-baz-zzz]

Code:

import java.util.*;

public class Main {
  public static void main(String[] args) {
    // First build a single string from the command line args (String.join needs Java 8+).
    String input = String.join(" ", args);

    process(input);
  }

  protected static void process(String str) {
    System.err.println("Processing: " + str);
    List<List<String>> results = new LinkedList<List<String>>();

    // Invoke the recursive method that does the magic.
    process(str, 0, results, new LinkedList<String>(), new StringBuilder());

    for (List<String> result : results) {
      System.err.println(result);
    }
  }

  protected static void process(String str, int pos, List<List<String>> resultsSoFar, List<String> currentResult, StringBuilder sb) {
    if (pos == str.length()) {
      // Base case: Reached end of string so add buffer contents to current result
      // and add current result to resultsSoFar.
      currentResult.add(sb.toString());
      resultsSoFar.add(currentResult);
    } else {
      // Step case: Inspect character at pos and then make recursive call.
      char c = str.charAt(pos);

      if (c == ' ' || c == '-') {
        // When we encounter a ' ' or '-' we recurse twice; once where we treat
        // the character as a delimiter and once where we treat it as a 'normal'
        // character.
        List<String> copy = new LinkedList<String>(currentResult);
        copy.add(sb.toString());
        process(str, pos + 1, resultsSoFar, copy, new StringBuilder());

        sb.append(c);
        process(str, pos + 1, resultsSoFar, currentResult, sb);
      } else {
        sb.append(c);
        process(str, pos + 1, resultsSoFar, currentResult, sb);
      }
    }
  }
}
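
Since the recursion above makes a two-way choice at each of the n delimiters, there are 2^n results, and the same enumeration can be sketched without recursion by letting each bit of a counter decide whether the corresponding delimiter is a split point. The names below (BitmaskSplit, splits) are made up for illustration, and the sketch assumes fewer than 31 delimiters so the mask fits in an int. It is an alternative technique, not a tidied version of the code above.

```java
import java.util.ArrayList;
import java.util.List;

class BitmaskSplit {
  // Hypothetical helper: enumerates every way of splitting str at ' ' or '-'
  // by treating each delimiter as one bit of a mask.
  static List<List<String>> splits(String str) {
    // Record the positions of all delimiter characters.
    List<Integer> delims = new ArrayList<>();
    for (int i = 0; i < str.length(); i++) {
      char c = str.charAt(i);
      if (c == ' ' || c == '-') {
        delims.add(i);
      }
    }

    List<List<String>> results = new ArrayList<>();
    // Each mask value selects one subset of delimiters to split at.
    // Assumption: fewer than 31 delimiters, so 1 << size fits in an int.
    for (int mask = 0; mask < (1 << delims.size()); mask++) {
      List<String> parts = new ArrayList<>();
      int start = 0;
      for (int bit = 0; bit < delims.size(); bit++) {
        if ((mask & (1 << bit)) != 0) {
          int pos = delims.get(bit);
          parts.add(str.substring(start, pos));
          start = pos + 1; // drop the delimiter we split at
        }
      }
      // Whatever follows the last chosen split point is the final token.
      parts.add(str.substring(start));
      results.add(parts);
    }
    return results;
  }

  public static void main(String[] args) {
    for (List<String> parts : splits("foo bar-baz-zzz")) {
      System.out.println(parts);
    }
  }
}
```

For "foo bar-baz-zzz" this yields 8 lists, the same set as the recursive version but in a different order.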
Adamski answered Jun 07 '26 11:06


Here's a much shorter version, written in a recursive style. I apologize for only being able to write it in Python. I like how concise it is; surely someone here will be able to make a Java version.

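def rec(h, t):
  # h: token built so far; t: remaining input.
  if len(t) < 2: return [[h + t]]
  # Ordinary character: extend the current token.
  if t[0] != ' ' and t[0] != '-': return rec(h + t[0], t[1:])
  # Delimiter: either keep it in the token, or split here and drop it.
  return rec(h + t[0], t[1:]) + [[h] + x for x in rec('', t[1:])]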

and the result:

>>> rec('',"foo bar-baz-zzz")
[['foo bar-baz-zzz'], ['foo bar-baz', 'zzz'], ['foo bar', 'baz-zzz'],
 ['foo bar', 'baz', 'zzz'], ['foo', 'bar-baz-zzz'], ['foo', 'bar-baz', 'zzz'],
 ['foo', 'bar', 'baz-zzz'], ['foo', 'bar', 'baz', 'zzz']]
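
For anyone wanting the Java version asked for above, here is one possible line-by-line port of rec, as a rough sketch rather than a polished implementation. The class name Rec is made up for illustration. Like the Python, it stops when fewer than two characters remain, so a delimiter in the very last position stays attached to the final token.

```java
import java.util.ArrayList;
import java.util.List;

class Rec {
  // Hypothetical Java port of the Python rec(h, t):
  // h is the token built so far, t is the remaining input.
  static List<List<String>> rec(String h, String t) {
    List<List<String>> out = new ArrayList<>();
    if (t.length() < 2) {
      // Base case: at most one character left, so finish the current token.
      List<String> last = new ArrayList<>();
      last.add(h + t);
      out.add(last);
      return out;
    }
    char c = t.charAt(0);
    String rest = t.substring(1);
    // Branch 1 (always taken): keep c inside the current token.
    out.addAll(rec(h + c, rest));
    if (c == ' ' || c == '-') {
      // Branch 2 (delimiters only): split here, so h becomes a finished
      // token, c is dropped, and a fresh token starts with the rest.
      for (List<String> tail : rec("", rest)) {
        List<String> parts = new ArrayList<>();
        parts.add(h);
        parts.addAll(tail);
        out.add(parts);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    for (List<String> parts : rec("", "foo bar-baz-zzz")) {
      System.out.println(parts);
    }
  }
}
```

The results come out in the same order as the Python output, starting with the unsplit string and ending with the fully split one.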
redtuna answered Jun 07 '26 11:06