How to make Pandas unpack JSON data into proper DataFrame instead of list of dicts

Tags: python, pandas

I'm trying to parse the data at http://dev.hsl.fi/tmp/citybikes/stations_20170503T071501Z into a Pandas DataFrame. Using read_json gives me a list of dicts instead of a proper DataFrame with the variable names as columns:

In [1]:

import pandas as pd
data = pd.read_json("http://dev.hsl.fi/tmp/citybikes/stations_20170503T071501Z")
print(data)

Out[1]:

                                                result
0    {'name': '001 Kaivopuisto', 'coordinates': '60...
1    {'name': '002 Laivasillankatu', 'coordinates':...
..                                                 ...
149  {'name': '160 Nokkala', 'coordinates': '60.147...
150  {'name': '997 Workshop Helsinki', 'coordinates...

[151 rows x 1 columns]

This happens with every orient option. I've also tried json_normalize() and a few other things I found here, to no avail. How can I turn this into a sensible DataFrame? Thanks!

Asked by basse, Oct 26 '25 05:10


1 Answer

Option 1
Use pd.DataFrame on the list of dictionaries

pd.DataFrame(data['result'].values.tolist())

   avl_bikes          coordinates  free_slots                    name  operative style  total_slots
0         12  60.155411,24.950391          18         001 Kaivopuisto       True    CB           30
1          3  60.159715,24.955212           9     002 Laivasillankatu       True                 12
2          0  60.158172,24.944808          16  003 Kapteeninpuistikko       True                 16
3          0  60.160944,24.941859          14           004 Viiskulma       True                 14
4         16  60.157935,24.936083          16           005 Sepänkatu       True                 32
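
For anyone who wants to try Option 1 without hitting the URL, here is a minimal sketch that rebuilds the same shape in memory. The two records are copied from the output above; the empty style value for the second station is an assumption (the column is simply blank in the printed table), and records is a name made up for this example.

```python
import pandas as pd

# Two records copied from the output above. The empty "style" for the
# second station is an assumption: the printed table just shows a blank.
records = [
    {"avl_bikes": 12, "coordinates": "60.155411,24.950391", "free_slots": 18,
     "name": "001 Kaivopuisto", "operative": True, "style": "CB", "total_slots": 30},
    {"avl_bikes": 3, "coordinates": "60.159715,24.955212", "free_slots": 9,
     "name": "002 Laivasillankatu", "operative": True, "style": "", "total_slots": 12},
]

# Mimic what read_json returned: a single "result" column holding dicts.
data = pd.DataFrame({"result": records})

# Turn the column into a plain list of dicts and let the DataFrame
# constructor create one column per key.
df = pd.DataFrame(data["result"].tolist())
print(df)
```

data["result"].tolist() should give the same list as data['result'].values.tolist(); it just skips the intermediate NumPy array.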

Option 2
Use apply

data.result.apply(pd.Series)

   avl_bikes          coordinates  free_slots                    name  operative style  total_slots
0         12  60.155411,24.950391          18         001 Kaivopuisto       True    CB           30
1          3  60.159715,24.955212           9     002 Laivasillankatu       True                 12
2          0  60.158172,24.944808          16  003 Kapteeninpuistikko       True                 16
3          0  60.160944,24.941859          14           004 Viiskulma       True                 14
4         16  60.157935,24.936083          16           005 Sepänkatu       True                 32
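
Options 1 and 2 should give the same table. A rough sketch to check that on a small made-up sample (the records below are trimmed to two keys to keep it short):

```python
import pandas as pd

# Trimmed sample in the same shape as the question's data:
# one "result" column whose cells are dicts.
records = [
    {"name": "001 Kaivopuisto", "avl_bikes": 12},
    {"name": "002 Laivasillankatu", "avl_bikes": 3},
]
data = pd.DataFrame({"result": records})

# Option 2: build a Series from every dict, row by row.
via_apply = data["result"].apply(pd.Series)

# Option 1: hand the whole list of dicts to the constructor at once.
via_list = pd.DataFrame(data["result"].tolist())

# Compare the contents record by record.
same_rows = via_apply.to_dict("records") == via_list.to_dict("records")
print(same_rows)
```

Because apply(pd.Series) creates one Series per row, it tends to be noticeably slower than Option 1 on large inputs, so Option 1 is probably the better default.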

Option 3
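Or you could fetch the JSON yourself and pull out the list stored under the result key

import json
import urllib.request  # "import urllib" alone does not load the request submodule

import pandas as pd

url = "http://dev.hsl.fi/tmp/citybikes/stations_20170503T071501Z"
# Download the raw JSON and parse it into a Python dict
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read())

# The station records are the list under the top-level "result" key
df = pd.DataFrame(data['result'])
df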

   avl_bikes          coordinates  free_slots                    name  operative style  total_slots
0         12  60.155411,24.950391          18         001 Kaivopuisto       True    CB           30
1          3  60.159715,24.955212           9     002 Laivasillankatu       True                 12
2          0  60.158172,24.944808          16  003 Kapteeninpuistikko       True                 16
3          0  60.160944,24.941859          14           004 Viiskulma       True                 14
4         16  60.157935,24.936083          16           005 Sepänkatu       True                 32
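
Since the question mentions json_normalize(): once you have the parsed dict rather than a DataFrame, pd.json_normalize with record_path should also work. A minimal sketch, assuming pandas 1.0 or later and using a shortened inline payload (raw below is made up for illustration and carries only three of the keys):

```python
import json

import pandas as pd

# Shortened stand-in for the downloaded text: a top-level "result" key
# holding a list of station records.
raw = """
{"result": [
    {"name": "001 Kaivopuisto", "coordinates": "60.155411,24.950391", "avl_bikes": 12},
    {"name": "002 Laivasillankatu", "coordinates": "60.159715,24.955212", "avl_bikes": 3}
]}
"""
payload = json.loads(raw)

# record_path tells json_normalize which key holds the list of records.
df = pd.json_normalize(payload, record_path="result")
print(df)
```

This expects the parsed dict (or a list of dicts), so calling it on the DataFrame that read_json returned is presumably why it did not help in the question.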

Answered by piRSquared, Oct 28 '25 19:10


