 

Remove empty elements from XML in PHP

Say I have this XML and I need to remove empty elements (elements that don't contain data at all) such as:

...
<data>
    <!-- keep oneDay -->
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <!-- remove range entirely -->
    <range>
        <startDate/>
        <endDate/>
    </range>
    <!-- remove deadline entirely -->
    <deadline>
        <date/>
    </deadline>
</data>
...

The output should then be:

...
<oneDay>
    <startDate>1450288800000</startDate>
    <endDate>1449086400000</endDate>
</oneDay>
...

I'm looking for a dynamic solution that works in any case like this, regardless of the literal names of the elements.

SOLUTION (UPDATED)

It turns out that //*[not(normalize-space())] matches every element that has no non-whitespace text content, including parents whose descendants are all empty, so a single pass is enough (no need for recursion).

foreach($xpath->query('//*[not(normalize-space())]') as $node ) {
    $node->parentNode->removeChild($node);
} 

Check out @har07's solution for more details
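
For completeness, here is a minimal, self-contained sketch of the one-pass approach above. The function name removeEmptyElements and the sample $xml string are hypothetical, added only for illustration; the XPath expression and the DOM calls are the ones already used in this post.

```php
<?php
// Hypothetical helper (not from the original post): removes every element
// whose text content is empty or whitespace-only.
function removeEmptyElements(string $xml): string
{
    $doc = new DOMDocument();
    // Drop whitespace-only text nodes so formatOutput can re-indent the result.
    $doc->preserveWhiteSpace = false;
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);

    // Copy the matches into an array before modifying the tree.
    $empty = iterator_to_array($xpath->query('//*[not(normalize-space())]'));
    foreach ($empty as $node) {
        // A child of an already-removed element still has its (detached)
        // parent as parentNode, so this call is safe for every match.
        $node->parentNode->removeChild($node);
    }

    $doc->formatOutput = true;
    return $doc->saveXML();
}

// Sample input modelled on the XML in the question.
$xml = <<<XML
<data>
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <range>
        <startDate/>
        <endDate/>
    </range>
    <deadline>
        <date/>
    </deadline>
</data>
XML;

echo removeEmptyElements($xml);
```

One caveat: attributes do not count as text content, so an element such as <flag enabled="true"/> would also be removed. If that matters for your data, a predicate like //*[not(normalize-space()) and not(.//@*)] is one possible way to keep elements that carry (or contain) attributes.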

SOLUTION

The XPath approach provided by @manuelbcd works, but only on the innermost elements: the empty children are removed, while their parent nodes stay behind, now empty themselves.

However, repeating the query until it returns no more matches removes those parents as well, so the loop below runs until the XML document is out of empty nodes.

$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML('<XML STRING GOES HERE>');

$xpath = new DOMXPath($doc);

while (($notNodes = $xpath->query('//*[not(node())]')) && ($notNodes->length)) {
  foreach($notNodes as $node) {
    $node->parentNode->removeChild($node);
  }
}

$doc->formatOutput = true;
echo $doc->saveXML();
Asked by samd, Jun 25 '26 18:06



2 Answers

The XPath in the other answer only matches elements that are empty in the sense of having no child node of any kind (no element node, no text node, nothing). To match all empty elements according to your definition, that is, elements with no non-whitespace text content, use the following XPath instead:

//*[not(normalize-space())]

eval.in demo

Output:

<?xml version="1.0"?>
<data>
  <!-- keep oneDay -->
  <oneDay>
    <startDate>1450288800000</startDate>
    <endDate>1449086400000</endDate>
  </oneDay>
  <!-- remove range entirely -->
  <!-- remove deadline entirely -->
</data>
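
To make the difference between the two expressions concrete, the sketch below lists which elements each one matches on the question's XML. The helper name matchedNames is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper for illustration: returns the names of all elements
// matched by an XPath expression, in document order.
function matchedNames(string $xml, string $expression): array
{
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false;
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);

    $names = [];
    foreach ($xpath->query($expression) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

// The question's XML, written on one line for brevity.
$xml = '<data><!-- keep oneDay --><oneDay><startDate>1450288800000</startDate>'
     . '<endDate>1449086400000</endDate></oneDay>'
     . '<range><startDate/><endDate/></range>'
     . '<deadline><date/></deadline></data>';

// Elements with no child nodes at all: only the innermost empty elements.
$noChildren = matchedNames($xml, '//*[not(node())]');

// Elements with no non-whitespace text content: the empty parents match too.
$noText = matchedNames($xml, '//*[not(normalize-space())]');

echo implode(', ', $noChildren), PHP_EOL; // startDate, endDate, date
echo implode(', ', $noText), PHP_EOL;     // range, startDate, endDate, deadline, date
```

Because range and deadline contain child elements, not(node()) skips them on the first pass, which is why that expression needs to be applied repeatedly while not(normalize-space()) does not.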
Answered by har07, Jun 27 '26 07:06



You can do it with XPath:

<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
// The root element must open and close with the same name, otherwise loadXML() fails to parse it.
$doc->loadXML('<data>
    <!-- keep oneDay -->
    <oneDay>
        <startDate>1450288800000</startDate>
        <endDate>1449086400000</endDate>
    </oneDay>
    <!-- remove range entirely -->
    <range>
        <startDate/>
        <endDate/>
    </range>
    <!-- remove deadline entirely -->
    <deadline>
        <date/>
    </deadline>
</data>');

$xpath = new DOMXPath($doc);

// Matches elements that have no child nodes at all, so a single pass
// removes only the innermost empty elements.
foreach ($xpath->query('//*[not(node())]') as $node) {
    $node->parentNode->removeChild($node);
}

$doc->formatOutput = true;
echo $doc->saveXML();

See original solution here: Remove empty tags from a XML with PHP

Answered by manuelbcd, Jun 27 '26 09:06



