How do I use python to print/dump the "absolute path" and value for an XML document?
for example:
<A>
<B>foo</B>
<C>
<D>On</D>
</C>
<E>Auto</E>
<F>
<G>
<H>shoo</H>
<I>Off</I>
</G>
</F>
</A>
to
/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off
from lxml import etree
root = etree.XML(your_xml_string)
def print_path_of_elems(elem, elem_path=""):
for child in elem:
if not child.getchildren() and child.text:
# leaf node with text => print
print "%s/%s, %s" % (elem_path, child.tag, child.text)
else:
# node with child elements => recurse
print_path_of_elems(child, "%s/%s" % (elem_path, child.tag))
print_path_of_elems(root, root.tag)
Another way to do this would be something like:
from lxml import etree
XMLDoc = etree.parse(open('file.xml'))
for Node in XMLDoc.xpath('//*'):
if not Node.getchildren() and Node.text:
print XMLDoc.getpath(Node), Node.text
Depending on the structure of your document you may get node numbers in the xpath which you might have to strip out.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With