Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove row in csv format containing cell 'NULL' in Java

Tags:

java

I am working on a solution in Java that removes a row if one of the cells contain a NULL, such as row with the name Jenny. The table is given in CSV format.Every two consecutive cells in each row are separated by a single comma ',' symbol. Every two consecutive rows are separated by a new-line '\n' symbol. For example, the table from the task statement, written in CSV format, is the single string:

"id,name,age,score\n1,Jack,NULL,12\n17,Betty,28,11". 

You may assume that each row has the same number of cells. Need to watch out for words such as 'ANNULLED'.

  1. Given S = "id,name,age,score\n1,Jenny,NULL,14\n17,Daryll,31,11", your function should return "id,name,age,score\n17,Betty,28,11".
+----+-------+------+-------+
| id | name  | age  | score |
+----+-------+------+-------+
| 1b | Jenny | NULL | 14    |
+----+-------+------+-------+
| 17 | Daryll| 32   | 11    |
+----+-------+------+-------+

My solution has been marked as only 33 out of 100 efficiency wise and I wonder if anyone could advise how I can improve my solution please.

public static void main(String[] args) {
    String S = "id,name,age,score\n1,Jack,NULL,12\n17,Betty,28,11";
    System.out.println(solution(S));
}


public static String solution(String S) {
    ArrayList<String> al = new ArrayList<String>();
    String[] rows = S.split("\n");
    String finalString = "";

    for (int i = 0; i < rows.length; i++) {
        al.add(rows[i]);
    }

    for (int i = 0; i < al.size(); i++) {
        if (al.get(i).contains(",NULL,") || al.get(i).contains("NULL\n") || al.get(i).contains(",NULL")) {
            al.remove(i);
        }
    }

    for (int i = 0; i < al.size(); i++) {
        finalString += al.get(i) + "\n";
    }
    
    String finalerString = finalString.substring(0, finalString.length() - 1);
    return finalerString;
}
like image 336
KeenLearnerA Avatar asked Apr 23 '26 19:04

KeenLearnerA


2 Answers

In terms of efficiency, the most efficient solution would be iterating over the input String, S once and, writing the solution into a stream. Now let's compare it with your implementation.

  • S.split("\n"); requires full iteration over S pulse you create additional memory of the size of S
  • al.add(rows[i]); you convert the String array to List, I don't see any need for that, but I don't think it affects the performance much.
  • al.get(i).contains(",NULL,") that iterate over the entire String, and you got three such conditions, so you iterate over S three times here(in the worst case since if the first or condition is true the others will not be checked).
  • finalString += al.get(i) + "\n"; Here you are reconstructing the solution without the filtered rows. The problem with += on Strings is that it each time creates a new instance of the String. This makes your implementation O(N^2) instead of O(N). Switch to a StringBuilder and you will return into the O(N) space. With modern ADE like IntelliJ, you will get a warning and it will update your code automatically.
  • finalString.substring(0, finalString.length() - 1); Again in here you are creating a new instance of the String so it costs you S

To implement a simple-efficient solution I would do the following

  • Split S by \n as you did
  • Iterate over the String array and filter out the rows with Null
  • Then use a StringBuilder to combine the results

For the filtering I would do something like:

al.get(i).startsWith("NULL\n") || al.get(i).endsWith(",NULL") || al.get(i).contains(",NULL,") 

The methods startsWith, endsWith are O(1) compare to contains that is O(N), so the filtering will be O(1) in the best case and O(N) in the worst case.

As I said earlier this will not be the most efficient solution, but it is pretty simple and pretty efficient. It requires only two iterations over S in the worst case.

like image 128
Ilya Gazman Avatar answered Apr 25 '26 09:04

Ilya Gazman


The most efficient way to do this is to just work on the string form.

Here's how to perform the requested operation with a regular expression, which I admit is not the most efficient way to work on the string, but still shouldn't be too bad:

import java.util.regex.Pattern;

class Test {
    public static void main(String[] args) {

        String data = "id,name,age,score\n1,Jenny,NULL,14\n17,Daryll,31,11";

        Pattern exp = Pattern.compile("^(.*,)*NULL(,.*)*(\\n|$)", Pattern.MULTILINE);
        String r = exp.matcher(data).replaceAll("");
        System.out.println(r);
    }
}

I'm not much of a regex weenie. There's likely a more concise regular expression than this, but this gives you the idea and it does work.

Result:

id,name,age,score
17,Daryll,31,11
like image 32
CryptoFool Avatar answered Apr 25 '26 07:04

CryptoFool