Say I have this XML and I need to remove empty elements (elements that don't contain data at all) such as:
...
<data>
    <!-- keep oneDay -->
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <!-- remove range entirely -->
    <range>
        <startDate/>
        <endDate/>
    </range>
    <!-- remove deadline entirely -->
    <deadline>
        <date/>
    </deadline>
</data>
...
The output then should be
...
<oneDay>
    <startDate>1450288800000</startDate>
    <endDate>1449086400000</endDate>
</oneDay>
...
I'm looking for a dynamic solution that would work on any cases like this regardless of the literal name of the element.
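It turns out that //*[not(normalize-space())] selects every element whose text content, including the text of all its descendants, is empty or whitespace-only. A parent that contains only empty children is therefore matched as well, so no recursion is needed.
// $xpath is a DOMXPath built on the loaded document
foreach ($xpath->query('//*[not(normalize-space())]') as $node) {
    $node->parentNode->removeChild($node);
}
Check out @har07's solution for more details.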
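For reference, here is a minimal self-contained sketch of that loop, assuming the sample XML from the question with a `<data>` root. The function name `removeEmptyElements` is made up for illustration and is not part of any library. The matched nodes are copied into an array first, so the list is fixed before anything is detached.

```php
<?php
// Hypothetical helper (not from the original answers): strips every element
// whose text content, including that of its descendants, is empty or
// whitespace-only.
function removeEmptyElements(string $xml): string
{
    $doc = new DOMDocument;
    $doc->preserveWhiteSpace = false;
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    // Snapshot the matches so removals cannot affect the iteration.
    $empty = iterator_to_array($xpath->query('//*[not(normalize-space())]'));
    foreach ($empty as $node) {
        // A child of an already-removed parent is simply detached from
        // that parent, which is harmless.
        $node->parentNode->removeChild($node);
    }

    $doc->formatOutput = true;
    return $doc->saveXML();
}

$xml = '<data>
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <range><startDate/><endDate/></range>
    <deadline><date/></deadline>
</data>';

echo removeEmptyElements($xml);
```

With this input, only `oneDay` and its two children should remain under the root. Note that the expression looks at text content only, so an element that carries attributes but no text would be removed too.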
The XPath approach provided by @manuelbc works, but only on leaf elements: the empty children are removed, while their parents stay behind, now empty themselves.
The version below instead repeats the query until the XML document contains no more empty nodes.
$doc = new DOMDocument;
// Drop whitespace-only text nodes, so an element left with nothing but
// indentation counts as having no child nodes
$doc->preserveWhiteSpace = false;
$doc->loadXML('<XML STRING GOES HERE>');
$xpath = new DOMXPath($doc);
// Each pass removes the current childless elements; parents emptied by
// that pass are picked up by the next query
while (($notNodes = $xpath->query('//*[not(node())]')) && ($notNodes->length)) {
    foreach ($notNodes as $node) {
        $node->parentNode->removeChild($node);
    }
}
$doc->formatOutput = true;
echo $doc->saveXML();
The XPath in the other answer only returns elements that are empty in the sense of having no child node of any kind (no element node, no text node, nothing). To get all empty elements according to your definition, that is, elements with no non-whitespace text content anywhere inside them, try the following XPath instead:
//*[not(normalize-space())]
eval.in demo
Output:
<?xml version="1.0"?>
<data>
  <!-- keep oneDay -->
  <oneDay>
    <startDate>1450288800000</startDate>
    <endDate>1449086400000</endDate>
  </oneDay>
  <!-- remove range entirely -->
  <!-- remove deadline entirely -->
</data>
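To make the difference between the two expressions concrete, the following sketch lists the element names each one selects on a compact version of the sample document. The helper `matchedNames` is a made-up name used only for this illustration.

```php
<?php
$doc = new DOMDocument;
$doc->loadXML('<data>
  <oneDay><startDate>1450288800000</startDate><endDate>1449086400000</endDate></oneDay>
  <range><startDate/><endDate/></range>
  <deadline><date/></deadline>
</data>');
$xpath = new DOMXPath($doc);

// Hypothetical helper for this demo: names of the elements an expression selects.
function matchedNames(DOMXPath $xpath, string $expr): array
{
    $names = [];
    foreach ($xpath->query($expr) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

// Elements with no child nodes at all: only the leaves.
$leafOnly = matchedNames($xpath, '//*[not(node())]');
// Elements with no non-whitespace text anywhere inside: leaves and their parents.
$noText = matchedNames($xpath, '//*[not(normalize-space())]');

echo implode(', ', $leafOnly), "\n"; // startDate, endDate, date
echo implode(', ', $noText), "\n";   // range, startDate, endDate, deadline, date
```

The second expression also picks up `range` and `deadline`, which is why a single pass is enough with it.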
You can do it with XPath:
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML('<data>
    <!-- keep oneDay -->
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <!-- remove range entirely -->
    <range>
        <startDate/>
        <endDate/>
    </range>
    <!-- remove deadline entirely -->
    <deadline>
        <date/>
    </deadline>
</data>');
$xpath = new DOMXPath($doc);
// Matches only elements that have no child nodes at all
foreach ($xpath->query('//*[not(node())]') as $node) {
    $node->parentNode->removeChild($node);
}
$doc->formatOutput = true;
echo $doc->saveXML();
See original solution here: Remove empty tags from a XML with PHP