I have several thousand records (stored in a MySQL table) that I need to "batch process." Every record contains a large JSON document. In some cases, the JSON is over 1MB (yes, my DB is well over 1GB).
I have a function that grabs a record, decodes the JSON, changes some data, re-encodes the PHP array back to JSON, and saves it back to the db. Pretty simple. FWIW, this is within the context of a CakePHP app.
Given an array of IDs, I'm attempting to do something like this (very simple mock code):
foreach ($ids as $id) {
    $this->Model->id = $id;
    $data = $this->Model->read();
    $newData = processData($data);
    $this->Model->save($newData);
}
The issue is that, very quickly, PHP runs out of memory. When running a foreach like this, it's almost as if PHP moves from one record to the next, without releasing the memory required for the preceding operations.
Is there any way to run the loop so that memory is freed before moving on to the next iteration, so that I can actually process this massive amount of data?
Edit: Adding more code. This function takes my JSON, converts it to a PHP array, does some manipulation (namely, reconfiguring data based on what's present in another array), and replaces values in the original array. The JSON is many layers deep, hence the extremely long foreach loops.
function processData($theData) {
    // NOTE: $newFolderList (the lookup array described above) is not defined inside this
    // function as posted; it has to be brought into scope for the isset() checks below to match.
    $toConvert = json_decode($theData['Program']['data'], true); // true = decode to associative arrays
    foreach ($toConvert['cycles'] as $cycle => $val) {
        foreach ($toConvert['cycles'][$cycle]['days'] as $day => $val) {
            // The sections loop appeared twice in a row in the original; one pass is enough.
            foreach ($toConvert['cycles'][$cycle]['days'][$day]['sections'] as $section => $val) {
                foreach ($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'] as $exercise => $val) {
                    // Replace the folder object with its new ID, if one is known.
                    if (isset($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder'])) {
                        $folderName = $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder']['folderName'];
                        if (isset($newFolderList['Folders'][$folderName])) {
                            $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder'] = $newFolderList['Folders'][$folderName]['id'];
                        }
                    }
                    // Replace the file object with its new ID, looked up by file name.
                    if (isset($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile'])) {
                        $fileName = basename($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile']['fileURL']);
                        if (isset($newFolderList['Exercises'][$fileName])) {
                            $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile'] = $newFolderList['Exercises'][$fileName]['id'];
                        }
                    }
                }
            }
        }
    }
    return $toConvert;
}
Model->read() essentially just tells Cake to pull a record from the db and return it as an array. There's plenty happening behind the scenes; someone more knowledgeable would have to explain that.
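One way to test the suspicion that memory is not released between iterations is to record memory_get_usage() after each pass of the loop. The following is only a rough sketch: processOne() is a hypothetical stand-in for the read/process/save step (it is not CakePHP code), and the payload size is made up.

```php
<?php
// Hypothetical stand-in for read() + processData() + save():
// builds a JSON string, decodes it, changes it, and re-encodes it.
function processOne(int $id): int
{
    $json = json_encode(['id' => $id, 'payload' => str_repeat('x', 100000)]);
    $data = json_decode($json, true);
    $data['payload'] = strtoupper($data['payload']);
    return strlen(json_encode($data));
}

// Runs the batch and records memory usage after every iteration.
function runBatch(array $ids): array
{
    $usage = [];
    foreach ($ids as $id) {
        processOne($id);
        gc_collect_cycles();              // collect any cyclic garbage left behind
        $usage[$id] = memory_get_usage(); // bytes currently allocated by PHP
    }
    return $usage;
}

$usage = runBatch(range(1, 20));
```

If the recorded values stay roughly flat with a stand-in like this but climb steadily once the real model calls are swapped in, the growth is more likely coming from something that holds on to data between iterations than from the foreach itself.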
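The first step I would take is to make sure everything is passed by reference.
E.g.,
foreach ($ids as $id) {
    // ... load $data for $id as before ...
    processData($data); // changes $data in place, so nothing needs to be returned
}
function processData(&$d) {
    // modify $d directly instead of building and returning a changed copy
}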
http://php.net/manual/en/language.references.pass.php
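A minimal sketch of what this suggestion could look like for the nested structure in the question, assuming PHP 7.1 or later. The function name remapInPlace(), its second parameter, and the sample data are all hypothetical; only the folder lookup is shown, and the file lookup would follow the same pattern.

```php
<?php
// Hypothetical helper: rewrites the decoded program array in place.
// $program is taken by reference, so no modified copy is returned.
function remapInPlace(array &$program, array $newFolderList): void
{
    foreach ($program['cycles'] as &$cycle) {
        foreach ($cycle['days'] as &$day) {
            foreach ($day['sections'] as &$section) {
                foreach ($section['exercises'] as &$exercise) {
                    if (isset($exercise['selectedFolder']['folderName'])) {
                        $name = $exercise['selectedFolder']['folderName'];
                        if (isset($newFolderList['Folders'][$name])) {
                            $exercise['selectedFolder'] = $newFolderList['Folders'][$name]['id'];
                        }
                    }
                }
                unset($exercise); // drop the reference once the loop is done
            }
            unset($section);
        }
        unset($day);
    }
    unset($cycle);
}

// Made-up sample data, only to exercise the function.
$program = ['cycles' => [
    ['days' => [
        ['sections' => [
            ['exercises' => [
                ['selectedFolder' => ['folderName' => 'Legs']],
                ['selectedFolder' => ['folderName' => 'Unknown']],
            ]],
        ]],
    ]],
]];
$lookup = ['Folders' => ['Legs' => ['id' => 7]]];

remapInPlace($program, $lookup);
```

Iterating with &$cycle, &$day and so on also removes the long repeated index chains. Note that PHP copies arrays lazily, on write, so any saving here comes from changing the decoded array in place rather than from the function call itself.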