I am trying to parse some open data to build a database. Here is what I have so far:
# -*- coding: utf-8 -*-
import urllib
import xml.etree.ElementTree as ET
url = 'http://opendata.cwb.gov.tw/govdownload?dataid=C-A0008-001&authorizationkey=rdec-key-123-45678-011121314'
root = ET.parse(urllib.urlopen(url)).getroot()
locations = root.findall('dataset/location')
print type(locations)
print "Counts:", len(locations)
It returned:
Counts: 0
I tried parsing some other XML data (by changing the URL) and it worked fine.
The XML data I'm working with looks roughly like this:
<?xml version="1.0" encoding="UTF-8"?><cwbopendata xmlns="urn:cwb:gov:tw:cwbcommon:0.1">
<identifier>0f819d32-297a-4512-9654-990a565bd080</identifier>
<sender>[email protected]</sender>
<sent>2016-05-23T16:07:06+08:00</sent>
<status>Actual</status>
<msgType>Issue</msgType>
<dataid>CWB_A0008</dataid>
<scope>Public</scope>
<dataset>
<location>
<stationId>72C44</stationId>
<time>
<dataTime>105 4_2</dataTime>
</time>
<weatherElement>
<elementName>平均氣溫</elementName>
<elementValue>
<value>21.1</value>
</elementValue>
.
.
.
</location>
<location>
.
.
.
Sorry, I'm new to Python and ElementTree. Any advice would be appreciated, thanks.
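Your XML declares a default namespace whose URI is 'urn:cwb:gov:tw:cwbcommon:0.1'. Every element without a prefix inside the element that declares the default namespace (here the root, cwbopendata, so the whole document) belongs to that namespace. A path like 'dataset/location' only matches elements in no namespace, which is why it finds nothing. Map a prefix to the URI and use it in the path: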
>>> ns = {'d': 'urn:cwb:gov:tw:cwbcommon:0.1'}
>>> locations = root.findall('d:dataset/d:location', ns)
>>> print "Counts:", len(locations)
Counts: 17
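To try this without hitting the live URL, here is a minimal sketch that runs against an inline sample. The sample is a cut-down stand-in shaped like the XML in the question, not the real feed: the second station ID is made up, and the variable names are only for illustration. It also shows the equivalent `{uri}tag` spelling, which needs no prefix mapping.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the feed in the question (assumption: the real
# document has the same nesting). "72C45" is a made-up station ID.
SAMPLE = """<cwbopendata xmlns="urn:cwb:gov:tw:cwbcommon:0.1">
  <dataset>
    <location><stationId>72C44</stationId></location>
    <location><stationId>72C45</stationId></location>
  </dataset>
</cwbopendata>"""

root = ET.fromstring(SAMPLE)

# The prefix "d" is arbitrary; only the URI has to match the document.
ns = {"d": "urn:cwb:gov:tw:cwbcommon:0.1"}

# Unqualified names only match elements in no namespace: nothing found.
unqualified = root.findall("dataset/location")

# Prefixed names, resolved through the ns mapping.
prefixed = root.findall("d:dataset/d:location", ns)

# Same query with the namespace URI written inline in braces.
uri = "{urn:cwb:gov:tw:cwbcommon:0.1}"
braced = root.findall(uri + "dataset/" + uri + "location")

# Child lookups need the namespace too.
station_ids = [loc.find("d:stationId", ns).text for loc in prefixed]

print("unqualified: %d" % len(unqualified))
print("prefixed: %d" % len(prefixed))
print("braced: %d" % len(braced))
print("stations: %s" % ", ".join(station_ids))
```

Printing `root.tag` is a quick way to spot this situation: when a default namespace is in play, the tag comes back with the URI in braces in front of the element name.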
Related: Parsing XML with namespace in Python via 'ElementTree'