xml2js: how is the output?

Question

I m trying to use the node.js module xml2js

My code is quite simple:

function testparse(pathname, callback) {
    var parser = require('xml2js').Parser(),
        util = require('util'),
        fs = require('fs'),
    fs.readFile(pathname, function (err, data) {
        parser.parseString(data, function(err, result) {
            console.log('Complete result:');
            console.log(util.inspect(result, {depth: null})); //Work
            console.log('Try to access element:');
            console.log(result.smil.body); //Work
            console.log(result.smil.body.update); //Undefined
        });
    });
}

My xml file is as:

<?xml version="1.0"?>
<smil>
    <head/>
    <body>
        <update /*some field*//>
        <stream name="name"/>
        <playlist /*some field*/>
            <video /*some field*//>
            <video /*some field*//>
            <video /*some field*//>
        </playlist>
    </body>
</smil>

The output give me:

Complete result:
{ smil:
    { head: [''],
      body:
        [ { update: [[Object]],
            stream: [[Object]],
            playlist: [[Object]] } ] } }
Try to access element:
[Object]
Undefined

I have succeed in accessing body by trying, but now I m stuck, is there a template or example of how xml2js output the parsed xml somewhere?

Garth Kidd · Accepted Answer

TL;DR

It's harder than it looks. Read the Open311 JSON and XML Conversion page for details of other JSON-side representations. All of them "use and abuse" arrays, extra layers of objects, members with names that didn't appear in the original XML, or all three.

Long Answer

xml2js has an un-enviable task: convert XML to JSON in a way that can be reversed, without knowing the schema in advance. It seems obvious, at first:

<name>Fred</name> → { name: "Fred" }
<chacha /> → { chacha: null }

Easy so far, right? How about this, though?

<x><y>z</y><x>

Removing the human friendly names drives home the uncertainty facing xml2js. At first, you might think this is quite reasonable:

{ x: { y: "z" } }

Later, you trip over this XML text and realise your guessed-at schema was wrong:

<x><y>z</y><y>z2</y></x>

Uh oh. Maybe we should have used an array. At least all the members have the same tag:

{ x: [ "z", "z2" ] }

Inevitably, though, that turns out to be short-sighted:

<x><y>z</y><y>z2</y><m>n</m>happy</x>

Uh...

{ x: [ { y: "z" }, { y : "z2" }, { m: "n" }, "happy" ] }

... and then someone polishes you off with some attributes and XML namespaces.

The way to construct a more concise output schema feels obvious to you. You can infer details from the tag and attribute names. You understand it.

The library does not share that understanding.

If the library doesn't know the schema, it must either "use and abuse" arrays, extra layers of objects, special attribute names, or all three.

The only alternative is to employ a variable output schema. That keeps it simple at first, as we saw above, but you'll quickly find yourself writing a great deal of conditional code. Consider what happens if children with the same tag name are collapsed into a list, but only if there are more than one:

if (Array.isArray(x.y)) {
    processTheYChildren(x.y);
} else if (typeof(x.y) === 'object') {
    // only one child; construct an array on the fly because my converter didn't
    processTheYChildren([x.y]);
} else ...

xml2js: how is the output?

Tags:

node.js

xml

xml-parsing

DrakaSAN

1 Answers

TL;DR

Long Answer

Garth Kidd

Recent Activity

Donate For Us

xml2js: how is the output?

Tags:

node.js

xml

xml-parsing

DrakaSAN

1 Answers

TL;DR

Long Answer

Garth Kidd

Related questions

Recent Activity

Donate For Us