
How to parse nested XML data and extract only the userid from it?

I have the XML layout below, from which I am supposed to extract every "userid" value inside <key> </key> and load them into a HashSet in Java.

<?xml version="1.0" encoding="UTF-8"?>
<response>
   <plds>
      <fld>consumerid</fld>
      <fld>last_set</fld>
   </plds>
   <record>
      <data>934463448   1417753752</data>
      <key_data>
         <key>
            <name>userid</name>
            <value>934463448</value>
         </key>
      </key_data>
   </record>
   <record>
      <data>1228059948  1417753799</data>
      <key_data>
         <key>
            <name>userid</name>
            <value>1228059948</value>
         </key>
      </key_data>
   </record>
</response>

I will be getting the above XML data from a URL, and the file may be large. What is the best way to parse it, extract all the "userid" values, and load them into a HashSet in Java?

This is what I have so far:

public static Set<String> getUserList(String host, String count) {

    Set<String> usrlist = new HashSet<String>();
    String url = "urlA"; // this url will return me above XML data
    InputStream is = new URL(url).openStream();
    BufferedReader rd = new BufferedReader(new InputStreamReader(is, Charset.forName("UTF-8")));

    // not sure what I should do here which can
    // parse my above xml and extract all the
    // userid and load it into usrlist hash set

    return usrlist;
}

UPDATE:-

This is what I have tried -

public static Set<String> getUserList() {

    Set<String> usrlist = new HashSet<String>();
    String url = "urlA"; // this url will return me above XML data
    InputStream is = new URL(url).openStream();
    BufferedReader rd = new BufferedReader(new InputStreamReader(is, Charset.forName("UTF-8")));

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(new URL(url).openStream());

    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile("//record/key_data/key[name='userid']/value");
    NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < nodes.getLength(); i++) {
        usrlist.add(nodes.item(i).getNodeValue());
    }

    return usrlist;
}

But I am not getting any user IDs in the usrlist object. Am I doing anything wrong here?

AKIWEB asked Dec 06 '25 18:12


2 Answers

StAX is an efficient way to parse large XML documents, because it streams the input instead of loading the whole document into memory:

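    // "is" is the InputStream opened from the URL in the question
    XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(is);
    while (r.hasNext()) {
        // react to every <value> start tag; other elements are skipped
        if (r.next() == XMLStreamReader.START_ELEMENT && r.getLocalName().equals("value")) {
            String value = r.getElementText();
            System.out.println(value);
        }
    }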
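
Building on the loop above, here is a minimal sketch of how the values could be collected into a HashSet while keeping only the keys whose <name> is "userid". The class name StaxUserIds and the method extractUserIds are hypothetical names chosen for illustration, and the sketch assumes that <name> always appears before <value> inside each <key>, as it does in the sample XML.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original answer.
class StaxUserIds {

    // Streams the XML and collects the <value> of every <key> whose <name> is "userid".
    // Assumes <name> precedes <value> inside each <key>.
    static Set<String> extractUserIds(InputStream in) throws XMLStreamException {
        Set<String> userIds = new HashSet<>();
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            String currentName = null; // text of the <name> seen in the current <key>
            while (r.hasNext()) {
                if (r.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String tag = r.getLocalName();
                if (tag.equals("key")) {
                    currentName = null; // a new key starts, forget the previous name
                } else if (tag.equals("name")) {
                    currentName = r.getElementText().trim();
                } else if (tag.equals("value") && "userid".equals(currentName)) {
                    userIds.add(r.getElementText().trim());
                }
            }
        } finally {
            r.close();
        }
        return userIds;
    }

    public static void main(String[] args) throws Exception {
        // Shortened copy of the sample document from the question.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<response>"
                + "<record><data>934463448   1417753752</data><key_data><key>"
                + "<name>userid</name><value>934463448</value></key></key_data></record>"
                + "<record><data>1228059948  1417753799</data><key_data><key>"
                + "<name>userid</name><value>1228059948</value></key></key_data></record>"
                + "</response>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractUserIds(in));
    }
}
```

In the question's method, the stream returned by new URL(url).openStream() could be passed to extractUserIds directly. Passing the raw InputStream rather than a Reader lets the parser pick up the encoding from the XML declaration.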
Evgeniy Dorofeev answered Dec 08 '25 07:12


If the above document is relatively small, you could load the entire document into memory and then apply the XPath below to extract the userid values:

//record/key_data/key[name='userid']/value

Edit

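I think you have a bug: use getTextContent() rather than getNodeValue() to obtain the element's text. Your XPath selects <value> Element nodes, and getNodeValue() returns null for an Element; the text lives in its child text node: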

for (int i = 0; i < nodes.getLength(); i++) {
    usrlist.add( nodes.item(i).getTextContent());
}
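
To illustrate the difference between the two calls, here is a small self-contained sketch. The class name NodeValueDemo and the helper firstValueElement are hypothetical and exist only for this demonstration; the DOM calls themselves are the standard org.w3c.dom API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class NodeValueDemo {

    // Parses the XML string and returns the first <value> element node.
    static Node firstValueElement(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("value").item(0);
    }

    public static void main(String[] args) throws Exception {
        Node value = firstValueElement(
                "<key><name>userid</name><value>934463448</value></key>");

        // Element nodes have no node value of their own, so this prints "null".
        System.out.println(value.getNodeValue());

        // getTextContent() concatenates the text of all descendants: 934463448
        System.out.println(value.getTextContent());

        // The string is stored in the child Text node: 934463448
        System.out.println(value.getFirstChild().getNodeValue());
    }
}
```

This is why the loop in the question compiled and ran without an exception but never produced a usable user id.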

Debug Code:

Set<String> usrlist = new HashSet<String>();
String myXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<response>\n" +
        ...
        "</response>";
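// encode and decode with the same charset (requires java.nio.charset.StandardCharsets)
InputStream is = new ByteArrayInputStream(myXml.getBytes(StandardCharsets.UTF_8));
BufferedReader rd = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));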

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(rd));

XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//record/key_data/key[name='userid']/value");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    usrlist.add( nodes.item(i).getTextContent());
}

return usrlist;
StuartLC answered Dec 08 '25 07:12


