I need to split each line of text into an array using a loop. The problem is that there's no obvious delimiter to use given the formatting of the text file (which I can't change):
Adam Rippon New York, NY 77.58144.6163.6780.94
Brandon Mroz Broadmoor, CO 70.57138.1266.8471.28
Stephen Carriere Boston, MA 64.42138.8368.2770.56
Grant Hochstein New York, NY 64.62133.8867.4468.44
Keegan Messing Alaska, AK 61.15136.3071.0266.28
Timothy Dolensky Atlanta, AL 61.76123.0861.3063.78
Max Aaron Broadmoor, CO 86.95173.4979.4893.51
Jeremy Abbott Detroit, MI 99.86174.4193.4280.99
Jason Brown Skokie Value,IL 87.47182.6193.3489.27
Joshua Farris Broadmoor, CO 78.37169.6987.1783.52
Richard Dornbush All Year, CA 92.04144.3465.8278.52
Douglas Razzano Coyotes, AZ 75.18157.2580.6976.56
Ross Miner Boston, MA 71.94152.8772.5380.34
Sean Rabbit Glacier, CA 60.58122.7656.9066.86
Lukas Kaugars Broadmoor, CO 64.57114.7550.4766.28
Philip Warren All Year, CA 55.80113.2457.0258.22
Daniel Raad Southwest FL 52.98108.0358.6151.42
Scott Dyer Brooklyn, OH 55.78100.9744.3357.64
Robert PrzepioskiRochester, NY 47.00100.3449.2651.08
Ideally I would like each name to be in [0] (or first name in [0] and last name in [1]), each location to be in [2] (or in two separate indexes for city and state), and then each score in its own index. Each person has four separate numbers. For example, Adam Rippon's scores are 77.58, 144.61, 63.67, and 80.94.
I can't split by spaces because some of the city names contain a space (New York would be split into New and York in two different array elements, while Broadmoor would be in one element). I can't split cities by commas because Southwest FL has no comma. I also can't split the numbers by decimal point because the resulting numbers would be wrong. Is there an easy way to do this, perhaps a way to split the numbers by the number of decimal places?
It looks like there is a fixed size for each column. So in your case, column 1 is 17 characters long, the second column is 16 characters long and the last one is 21 characters long.
Now you can simply iterate through the lines and make use of the substring() method. Something like...
String firstColumn = line.substring(0, 17).trim();
String secondColumn = line.substring(17, 33).trim();
String thirdColumn = line.substring(33, line.length()).trim();
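The three substring() calls can be wrapped in a small helper so that every line goes through the same code. This is only a sketch: the class name FixedWidthColumns and the method splitColumns are made up for illustration, and the widths 17 and 16 are the ones counted above, so they are assumptions about the real file. The sample line is rebuilt with String.format because the padding is not visible in the listing in the question.

```java
public class FixedWidthColumns {
    // Column widths as counted in the answer above (assumptions about the real file).
    static final int NAME_WIDTH = 17;
    static final int LOCATION_WIDTH = 16;

    // Returns {name, location, scores}, each with surrounding whitespace trimmed.
    static String[] splitColumns(String line) {
        int nameEnd = NAME_WIDTH;
        int locationEnd = NAME_WIDTH + LOCATION_WIDTH;
        if (line.length() < locationEnd) {
            throw new IllegalArgumentException("Line shorter than expected: " + line);
        }
        return new String[] {
            line.substring(0, nameEnd).trim(),
            line.substring(nameEnd, locationEnd).trim(),
            line.substring(locationEnd).trim()
        };
    }

    public static void main(String[] args) {
        // Pad name and location to the assumed widths to mimic one line of the file.
        String line = String.format("%-17s%-16s%s",
                "Robert Przepioski", "Rochester, NY", "47.00100.3449.2651.08");
        for (String column : splitColumns(line)) {
            System.out.println("[" + column + "]");
        }
    }
}
```

The length check is there because substring() throws a StringIndexOutOfBoundsException on a line that is too short, for example a blank line at the end of the file.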
To extract the numbers, we could use a regular expression that searches for all numbers with two decimal places.
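import java.util.regex.Matcher;
import java.util.regex.Pattern;

// One or more digits, a dot, then exactly two decimal digits.
Pattern pattern = Pattern.compile("(\\d+\\.[0-9]{2})");
Matcher matcher = pattern.matcher(thirdColumn);
while (matcher.find()) {
    // Each match is a single score, e.g. "47.00".
    System.out.println(matcher.group());
}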
So in this case, the input 47.00100.3449.2651.08 will print:
47.00
100.34
49.26
51.08
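If you want the scores as numbers in a list rather than printed, the same regular expression can feed Double.parseDouble. A minimal sketch, assuming every score in the file has exactly two decimal places (that is what lets the pattern find the boundaries); the class name ScoreExtractor and the method extractScores are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScoreExtractor {
    // Digits, a dot, then exactly two digits: one score with two decimal places.
    private static final Pattern SCORE = Pattern.compile("\\d+\\.\\d{2}");

    // Collects every two-decimal number found in the run-together scores column.
    static List<Double> extractScores(String scoresColumn) {
        List<Double> scores = new ArrayList<>();
        Matcher matcher = SCORE.matcher(scoresColumn);
        while (matcher.find()) {
            scores.add(Double.parseDouble(matcher.group()));
        }
        return scores;
    }

    public static void main(String[] args) {
        // Adam Rippon's scores column from the question.
        System.out.println(extractScores("77.58144.6163.6780.94"));
        // prints [77.58, 144.61, 63.67, 80.94]
    }
}
```

The match always stops two digits after a dot, so the next match starts exactly where the next score begins. If a score could ever have one or three decimal places, this approach would cut the numbers in the wrong place.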
It looks like each column has a fixed size (number of characters). As you already said, you cannot split by tabs or spaces, because in the last line there is no tab or space between the name and the city.
I suggest reading one line at a time and cutting it into columns with line.substring(startIndex, endIndex). For example, line.substring(0, 17) gives the name: the longest name, Robert Przepioski, is 17 characters long and runs straight into the city. You can then split the name into first and last name by using the space as the delimiter.
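Once the name and location columns have been cut out, they can be split further. The sketch below uses made-up names (NameAndLocation, splitName, splitLocation). Splitting the location is not covered by the answer above: it assumes every location ends in a two-letter state code, which holds for the sample data, including Southwest FL (no comma) and Skokie Value,IL (no space after the comma).

```java
public class NameAndLocation {
    // Splits "First Last" at the first run of whitespace.
    // A name without a space comes back as a single element.
    static String[] splitName(String name) {
        return name.trim().split("\\s+", 2);
    }

    // Assumption: the last two characters are always the state code.
    // Returns {city, state}, with any trailing comma or spaces removed from the city.
    static String[] splitLocation(String location) {
        String trimmed = location.trim();
        if (trimmed.length() < 2) {
            throw new IllegalArgumentException("Location too short: " + location);
        }
        String state = trimmed.substring(trimmed.length() - 2);
        String city = trimmed.substring(0, trimmed.length() - 2)
                             .replaceAll("[,\\s]+$", "");
        return new String[] { city, state };
    }

    public static void main(String[] args) {
        String[] name = splitName("Adam Rippon");
        String[] location = splitLocation("New York, NY");
        System.out.println(name[0] + " | " + name[1] + " | " + location[0] + " | " + location[1]);
        // prints Adam | Rippon | New York | NY
    }
}
```

Taking the state from the end of the string avoids relying on the comma, which is missing or spaced differently in a few rows.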