Let's say I have a matrix stored in an XML file in the following format:
<?xml version="1.0"?>
<Matrix>
  <Value Col="0" Row="0">0.19343</Value>
  <Value Col="1" Row="0">0.95079</Value>
  <Value Col="2" Row="0">0.89542</Value>
  <Value Col="0" Row="1">0.14391</Value>
  <Value Col="1" Row="1">0.094629</Value>
  <Value Col="2" Row="1">0.52303</Value>
</Matrix>
What is the best way to parse these values into a numpy array using xml.etree in Python without knowing the dimensions of the matrix in advance? If I did know them, I guess I could simply do:
import xml.etree.ElementTree as ET
import numpy as np
rowcnt = 2
colcnt = 3
xmltree = ET.parse('some_xmlfile.xml')
matrix = np.zeros(shape=(rowcnt, colcnt))
for m in xmltree.iter('Matrix'):
    for v in m.iter('Value'):
        col = int(v.attrib['Col'])
        row = int(v.attrib['Row'])
        matrix[row, col] = float(v.text)
print(matrix)
I'm not claiming that this is the best way to create a numpy array from your XML file, but it should work for an arbitrary number of rows and columns (as long as every row has the same length) and for arbitrarily ordered <Value> elements.
import numpy as np
import xml.etree.ElementTree as ET
from collections import defaultdict

root = ET.parse('some_xmlfile.xml').getroot()
data = defaultdict(list)

# group into rows of (col, val) tuples
for val in root.iter('Value'):
    data[int(val.attrib['Row'])].append((int(val.attrib['Col']), val.text))

# sort rows and columns, and format each row into a space-separated string
rows = []
for row in sorted(data):
    rows.append(' '.join(text for _, text in sorted(data[row])))

# build array from matrix string (np.matrix parses 'a b;c d' style strings)
matrix = np.array(np.matrix(';'.join(rows)))
>>> matrix
array([[ 0.19343 ,  0.95079 ,  0.89542 ],
       [ 0.14391 ,  0.094629,  0.52303 ]])
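For what it's worth, a variant that skips the string round-trip could infer the shape from the largest Row and Col indices and then fill the array directly. This is only a sketch: the function name matrix_from_xml is made up for illustration, it assumes 0-based indices as in the sample file, and filling missing cells with NaN is my own choice rather than anything the XML format requires.

```python
import xml.etree.ElementTree as ET
import numpy as np


def matrix_from_xml(root):
    """Build a 2-D array from <Value Row=".." Col=".."> elements under root.

    Hypothetical helper: assumes 0-based Row/Col attributes.
    """
    # Collect (row, col, value) triples in a single pass over the tree.
    entries = [(int(v.attrib['Row']), int(v.attrib['Col']), float(v.text))
               for v in root.iter('Value')]
    if not entries:
        return np.empty((0, 0))

    # Infer the shape from the largest indices seen.
    n_rows = max(r for r, _, _ in entries) + 1
    n_cols = max(c for _, c, _ in entries) + 1

    # Start from NaN so that cells missing from the XML stay visible.
    matrix = np.full((n_rows, n_cols), np.nan)
    for r, c, value in entries:
        matrix[r, c] = value
    return matrix


# The sample data from the question, inlined so the snippet runs on its own.
xml_text = """<Matrix>
  <Value Col="0" Row="0">0.19343</Value>
  <Value Col="1" Row="0">0.95079</Value>
  <Value Col="2" Row="0">0.89542</Value>
  <Value Col="0" Row="1">0.14391</Value>
  <Value Col="1" Row="1">0.094629</Value>
  <Value Col="2" Row="1">0.52303</Value>
</Matrix>"""

matrix = matrix_from_xml(ET.fromstring(xml_text))
print(matrix.shape)  # (2, 3)
```

To read from a file instead, pass ET.parse('some_xmlfile.xml').getroot() to the function. Because each value is written to its own (row, col) position, the order of the <Value> elements does not matter, and ragged or incomplete rows show up as NaN instead of raising an error.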