
XML DOMDocument optimization

I have a 5 MB XML file.

I'm using the following code to read the nodeValue of every game element:

$dom = new DOMDocument('1.0', 'UTF-8');
if (!$dom->load($url)) {
    return; // the file could not be loaded or parsed
}

$games = $dom->getElementsByTagName("game");
foreach ($games as $game) {
    // read $game->nodeValue here
}

This takes 76 seconds, and there are around 2,000 game tags. Is there any optimization, or another way to get the data?

OHLÁLÁ asked Jun 21 '26 12:06


2 Answers

I once wrote a blog article about loading huge XML files with XMLReader; you can probably use some of it.

Using DOM or SimpleXML is not an option, since both load the whole document into memory.
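
As a rough illustration of the streaming approach, here is a minimal sketch using XMLReader. It is not taken from the blog article mentioned above. The function name `gameValuesFromReader` and the sample XML are made up for the example, and it assumes each game element holds the text you want.

```php
<?php
/**
 * Walk the stream node by node and collect the text of every <game> element.
 * Only the current node is held in memory, not the whole document.
 * (Hypothetical helper name, chosen for this sketch.)
 */
function gameValuesFromReader(XMLReader $reader): array
{
    $values = [];
    while ($reader->read()) {
        // Only react to opening <game> tags; skip text, whitespace and end tags.
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'game') {
            $values[] = $reader->readString();
        }
    }
    return $values;
}

// Hypothetical sample data standing in for the real 5 MB file.
$reader = new XMLReader();
$reader->XML('<games><game id="1">Chess</game><game id="2">Go</game></games>');

echo implode(', ', gameValuesFromReader($reader)), PHP_EOL; // prints "Chess, Go"
$reader->close();
```

For a file or URL, `$reader->open($url)` would take the place of the `XML()` call. If each game has child elements you need to inspect, `$reader->expand()` returns the current element as a DOMNode.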

cweiske answered Jun 24 '26 03:06


You can use DOMXPath for querying, which is much faster than the DOMDocument::getElementsByTagName() method.

<?php
// $dom is the DOMDocument that has already been loaded
$xpath = new \DOMXPath($dom);
$games = $xpath->query("//game");

foreach ($games as $game) {
    // Code here
}

In one of my tests with a fairly large file, this approach took < 1 sec to iterate over 24k elements, whereas the DOMDocument::getElementsByTagName() method took ~27 min (and the time taken to move to the next element grew exponentially).
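
To check how the two approaches compare on your own setup, a small timing sketch like the one below may help. The 2,000 generated elements are made-up test data, not the original file, and `timeCollect` is a hypothetical helper. The timings will vary with the PHP and libxml versions, so no figures are claimed here.

```php
<?php
// Hypothetical test data: 2,000 generated <game> elements.
$xml = '<games>' . str_repeat('<game>x</game>', 2000) . '</games>';
$dom = new DOMDocument();
$dom->loadXML($xml);

/** Run $collect and return [number of values collected, elapsed seconds]. */
function timeCollect(callable $collect): array
{
    $start = microtime(true);
    $values = $collect();
    return [count($values), microtime(true) - $start];
}

// Iterate the node list returned by getElementsByTagName().
$viaTagName = timeCollect(function () use ($dom) {
    $values = [];
    foreach ($dom->getElementsByTagName('game') as $game) {
        $values[] = $game->nodeValue;
    }
    return $values;
});

// Iterate the result of an XPath query over the same document.
$viaXPath = timeCollect(function () use ($dom) {
    $xpath = new DOMXPath($dom);
    $values = [];
    foreach ($xpath->query('//game') as $game) {
        $values[] = $game->nodeValue;
    }
    return $values;
});

printf("getElementsByTagName: %d nodes in %.4f s\n", $viaTagName[0], $viaTagName[1]);
printf("DOMXPath::query:      %d nodes in %.4f s\n", $viaXPath[0], $viaXPath[1]);
```

Raising the repeat count toward the size of the real document should show whether the gap described above appears in your environment.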

paul.ago answered Jun 24 '26 03:06


