I have to write a PHP script that will import data from a given CSV file into MySQL database. The given CSV file can contain up to 200,000 rows. I tried the following but problems arise :
Now, I am thinking of a solution by splitting the CSV file into smaller chunks, then inserting them asynchronously. I am already done with the splitting of CSV, but I currently have no idea how to asynchronously insert into my database for quick and safe way. But I heard that I will be using Ajax here.
Any solution you can recommend? Thanks a lot in advance!
Thanks to everyone who gave answers to this question. I have discovered a solution! Just wanted to share it, in case someone needs to create a PHP script that will import a huge CSV file into MySQL database (asynchronously and fast!) I have tested my code with 400,000 rows and the importing is done in seconds. I believe it would work with larger files, you just have to modify maximum upload file size.
In this example, I will be importing a CSV file that contains two columns (name, contact_number) into a MySQL DB that contains the same columns.
Your CSV file should look like this :
Ana, 0906123489
John, 0908989199
Peter, 0908298392
...
...
So, here's the solution.
First, create your table
CREATE TABLE `testdb`.`table_test`
( `id` INT NOT NULL AUTO_INCREMENT ,
`name` VARCHAR(100) NOT NULL ,
`contact_number` VARCHAR(100) NOT NULL ,
PRIMARY KEY (`id`)) ENGINE = InnoDB;
Second, I have 4 PHP files. All you have to do is place this into a single folder. PHP files are as follows :
index.php
<form action="upload.php" method="post" enctype="multipart/form-data">
<input type="file" name="csv" value="" />
<input type="submit" name="submit" value="Save" /></form>
connect.php
<?php
//modify your connections here
$servername = "localhost";
$username = "root";
$password = "";
$dbname = "testDB";
$conn = new mysqli($servername, $username, $password, $dbname);
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} 
?>
senddata.php
<?php
include('connect.php');
$data = $_POST['file'];
$handle = fopen($data, "r");
$test = file_get_contents($data);
if ($handle) {
    $counter = 0;
    //instead of executing query one by one,
    //let us prepare 1 SQL query that will insert all values from the batch
    $sql ="INSERT INTO table_test(name,contact_number) VALUES ";
    while (($line = fgets($handle)) !== false) {
      $sql .= "($line),";
      $counter++;
    }
    $sql = substr($sql, 0, strlen($sql) - 1);
     if ($conn->query($sql) === TRUE) {
    } else {
     }
    fclose($handle);
} else {  
} 
//unlink CSV file once already imported to DB to clear directory
unlink($data);
?>
upload.php
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.1/jquery.js"></script>
<script>
//Declaration of function that will insert data into database
 function senddata(filename){
        var file = filename;
        $.ajax({
            type: "POST",
            url: "senddata.php",
            data: {file},
            async: true,
            success: function(html){
                $("#result").html(html);
            }
        })
        }
 </script>
<?php
$csv = array();
$batchsize = 1000; //split huge CSV file by 1,000, you can modify this based on your needs
if($_FILES['csv']['error'] == 0){
    $name = $_FILES['csv']['name'];
    $ext = strtolower(end(explode('.', $_FILES['csv']['name'])));
    $tmpName = $_FILES['csv']['tmp_name'];
    if($ext === 'csv'){ //check if uploaded file is of CSV format
        if(($handle = fopen($tmpName, 'r')) !== FALSE) {
            set_time_limit(0);
            $row = 0;
            while(($data = fgetcsv($handle)) !== FALSE) {
                $col_count = count($data);
                //splitting of CSV file :
                if ($row % $batchsize == 0):
                    $file = fopen("minpoints$row.csv","w");
                endif;
                $csv[$row]['col1'] = $data[0];
                $csv[$row]['col2'] = $data[1];
                $min = $data[0];
                $points = $data[1];
                $json = "'$min', '$points'";
                fwrite($file,$json.PHP_EOL);
                //sending the splitted CSV files, batch by batch...
                if ($row % $batchsize == 0):
                    echo "<script> senddata('minpoints$row.csv'); </script>";
                endif;
                $row++; 
            }
            fclose($file);
            fclose($handle);
        }
    }
    else
    {
        echo "Only CSV files are allowed.";
    }
    //alert once done.
    echo "<script> alert('CSV imported!') </script>";
}
?>
That's it! You already have a pure PHP script that can import multiple number of rows in seconds! :) (Thanks to my partner who taught and gave me an idea on how to use ajax)
The main slowness comes from sending every single line as it's own request. I would suggest to send the query with every 1000 or 500 rows in the same format used by mysqldump --opt, so build a long string in the way
 insert into datatable (name, prename, commen) 
   values ('wurst', 'hans', 'someone')
   , ('bush', 'george', 'otherone')
   , ...
   ;
You should check how long your lines are allowed to be or if the MySQL- Server is in your control you could extend the maximal query length.
If this is still too long (I mean 200K is not much at all), then you could try to improve the csv-reading.
It is a bit work splitting into those chunks, but you could write a small chunk-class for this, so adding the rows gets a bit easier.
The usage of this class looked like
$chunk->prepare("insert into datatable (name, prename, comment) values");
$chunk->setSize(1000);
foreach ($row...){
   if($query = $chunk->addRow(...)){
       callUpdate($query);
   }
}
if($query = $chunk->clear()){
  callUpdate($query);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With