I am retrieving multiple data frames in CSV format from a website. I save the data frames in an empty list and then read them one by one. I cannot append them into a single data frame since they have different column names and column orders. So I have the following questions:
Can I create a data frame with a different name inside the loop I use to read the files, so that instead of saving them to a list I create a new data frame for every file retrieved? If this is not possible or not advisable, is there a way to iterate over my list to extract the data frames? Currently I read one data frame at a time, but I would love to automate this code to create something like data_1, data_2, etc. Right now my code is not terribly time consuming since I only have 4 data frames, but this can become burdensome with more data. Here is my code:
import pandas as pd
import urllib2
import csv
#we write the names of the files in a list so we can iterate to download the files
periods=['2012-1st-quarter','2012-2nd-quarter', '2012-3rd-quarter', '2012-4th-quarter']
general=[]
#we generate a loop to read the files from the capital bikeshare website
for i in periods:
    url = 'https://www.capitalbikeshare.com/assets/files/trip-history-data/'+i+'.csv'
    response = urllib2.urlopen(url)
    x=pd.read_csv(response)
    general.append(x)
q1=pd.DataFrame(general[0])
Thanks!
It would be better to use a dict. You can also pass a URL directly to pandas.read_csv, so the simplified code would look like this:
import pandas as pd
periods = ['2012-1st-quarter','2012-2nd-quarter', '2012-3rd-quarter', '2012-4th-quarter']
url = 'https://www.capitalbikeshare.com/assets/files/trip-history-data/{}.csv'
d = {period: pd.read_csv(url.format(period)) for period in periods}
Then you can access a specific DataFrame like this:
d['2012-4th-quarter']
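If you would rather have keys of the form data_1, data_2 as the question describes, the same dict approach works with generated names. The following is only a sketch: the `sample_csvs` strings are made-up stand-ins for the downloaded files (so the example runs without network access), and `frames` is a hypothetical name.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the downloaded files.
# The column names and orders differ on purpose.
sample_csvs = [
    "Duration,Start date\n10,2012-01-01\n",
    "Start date,Bike#,Duration\n2012-04-01,W001,20\n",
]

# Generate the keys data_1, data_2, ... instead of separate variables.
frames = {
    "data_{}".format(n): pd.read_csv(io.StringIO(text))
    for n, text in enumerate(sample_csvs, start=1)
}

print(sorted(frames))
print(frames["data_2"].columns.tolist())
```

Keeping the names as dict keys rather than as separate variables means the rest of the code does not need to change when more files are added.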
To iterate through all the DataFrames:
for period, df in d.items():
    print(period)
    print(df)
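On the original problem of differing column names and orders: pandas.concat aligns frames by column name, so column order alone does not prevent combining them, and mismatched names can be mapped with rename first. This is a rough sketch, not a description of the real files: the sample data and the `renames` mapping are invented for illustration, and the actual column names would need to be checked.

```python
import io

import pandas as pd

# Hypothetical sample frames standing in for the dict d built from the URLs.
d = {
    "2012-1st-quarter": pd.read_csv(
        io.StringIO("Duration,Start date\n10,2012-01-01\n")
    ),
    "2012-2nd-quarter": pd.read_csv(
        io.StringIO("Start date,Duration (sec)\n2012-04-01,20\n")
    ),
}

# Assumed mapping from variant column names to a common name.
renames = {"Duration (sec)": "Duration"}

# Rename, tag each row with its period, then stack the frames.
# concat matches columns by name, not by position.
combined = pd.concat(
    [df.rename(columns=renames).assign(period=period) for period, df in d.items()],
    ignore_index=True,
    sort=False,
)

print(combined)
```

Columns that exist in only some of the files would be filled with NaN in the combined frame, so it is worth inspecting each frame's columns before deciding on the mapping.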