I want to print the child elements of the root node. This is my XML file.
    <?xml version="1.0"?>
    <!-- Comment-->
    <company>
        <staff id="1001">
            <firstname>yong</firstname>
            <lastname>mook kim</lastname>
            <nickname>mkyong</nickname>
            <salary>100000</salary>
        </staff>
        <staff id="2001">
            <firstname>low</firstname>
            <lastname>yin fong</lastname>
            <nickname>fong fong</nickname>
            <salary>200000</salary>
        </staff>
    </company>

As I understand it, the root node is 'company' and its child nodes should be the two 'staff' elements. But when I read them through my Java code, I get 5 child nodes. Where are the 3 extra text nodes coming from?
Java Code:
    package com.training.xml;

    import java.io.File;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class ReadingXML {

        public static void main(String[] args) {
            try {
                File file = new File("D:\\TestFile.xml");
                DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
                Document doc = dBuilder.parse(file);
                doc.getDocumentElement().normalize();

                System.out.println("root element: " + doc.getDocumentElement().getNodeName());

                Node rootNode = doc.getDocumentElement();
                System.out.println("root: " + rootNode.getNodeName());

                NodeList nList = rootNode.getChildNodes();
                for (int i = 0; i < nList.getLength(); i++) {
                    System.out.println("node name: " + nList.item(i).getNodeName());
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

OUTPUT:
    root element: company
    root: company
    node name: #text
    node name: staff
    node name: #text
    node name: staff
    node name: #text

Why are the three text nodes showing up here?
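They're the whitespace between the child elements: the line breaks and indentation before the first staff element, between the two staff elements, and after the last one each become a text node. If you only want the child elements, you should just ignore nodes of other types: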
    for (int i = 0; i < nList.getLength(); i++) {
        Node node = nList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println("node name: " + node.getNodeName());
        }
    }

Or you could change your document to not have that whitespace.
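To see the effect of the whitespace directly, here is a minimal sketch that parses two versions of a cut-down document from strings: one indented, one with no whitespace between the elements. The class name `WhitespaceDemo`, the helper names, and the shortened XML (empty staff elements) are made up for illustration and are not from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class WhitespaceDemo {

    // Parses an XML string with the default (non-validating) parser
    // and returns the root element.
    public static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    // Returns the node names of all direct children, text nodes included.
    public static List<String> childNames(Element parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, shortened versions of the question's document.
        String indented =
                "<company>\n  <staff id=\"1001\"/>\n  <staff id=\"2001\"/>\n</company>";
        String compact =
                "<company><staff id=\"1001\"/><staff id=\"2001\"/></company>";

        System.out.println(childNames(parseRoot(indented))); // [#text, staff, #text, staff, #text]
        System.out.println(childNames(parseRoot(compact)));  // [staff, staff]
    }
}
```

The indented version reproduces the five children from the question; the compact version has only the two staff elements, because there is nothing between the tags for the parser to turn into text nodes.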
Or you could use a different XML API which allows you to easily ask for just elements. (The DOM API is a pain in various ways.)
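One option that stays inside the JDK is XPath (`javax.xml.xpath`), which lets you select only elements. This is just a sketch of one possibility, not necessarily the kind of API meant above; third-party XML libraries offer similar shortcuts. The class and method names are made up for illustration, and the XML is a shortened stand-in for the question's file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathChildren {

    // Selects every element that is a direct child of the root element.
    // In XPath, the "*" node test matches elements only, so whitespace
    // text nodes are never part of the result.
    public static List<String> rootChildElementNames(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "/*/*", new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, shortened version of the question's document.
        String xml =
                "<company>\n  <staff id=\"1001\"/>\n  <staff id=\"2001\"/>\n</company>";
        System.out.println(rootChildElementNames(xml)); // [staff, staff]
    }
}
```

If you know the element names up front, an expression such as `/company/staff` is more explicit than `/*/*`.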
If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace.
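A caveat, as far as I understand it: the parser can only flag whitespace as element content whitespace when it knows the content model, which usually means the document has a DTD (or schema). The file in the question has none, so the flag would most likely be false for its text nodes. The sketch below adds a hypothetical inline DTD to a shortened document to show the difference; the class name, method name, and DTD are assumptions for illustration only.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

public class ElementContentWhitespaceDemo {

    // Counts the root's direct children that the parser flagged as
    // element content whitespace.
    public static int countFlaggedWhitespace(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: validating against the DTD so the content model is applied.
        factory.setValidating(validating);

        NodeList children = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getChildNodes();

        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Text && ((Text) node).isElementContentWhitespace()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String body =
                "<company>\n  <staff id=\"1001\"/>\n  <staff id=\"2001\"/>\n</company>";

        // Hypothetical DTD: company may contain only staff elements.
        String withDtd =
                "<!DOCTYPE company [\n"
              + "  <!ELEMENT company (staff*)>\n"
              + "  <!ELEMENT staff EMPTY>\n"
              + "  <!ATTLIST staff id CDATA #REQUIRED>\n"
              + "]>\n"
              + body;

        System.out.println("with DTD:    " + countFlaggedWhitespace(withDtd, true));
        System.out.println("without DTD: " + countFlaggedWhitespace(body, false));
    }
}
```

With the JDK's default parser I would expect this to report 3 flagged nodes with the DTD and 0 without it. If your documents have no DTD, the node-type check shown earlier is the simpler route.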