I am trying to create a Pandas dataframe from a series of lists of unequal lengths. Ideally, what I'd like to do is have the values from the shorter lists repeat so that they match the longer lists that I'm trying to column bind together.
Here is an example of the input lists and the output I'm trying to get:
name = ['acme corp']
id_num = ['123456']
year = ['2017']
vendors = ['toyota','honda']
paymets = ['100','5000']
name | id_num | year | vendor | payment
acme corp | 123456 | 2017 | toyota | 100
acme corp | 123456 | 2017 | honda | 5000
In case it matters, I am running this process in a for loop that is extracting data from 1.8 million xml files and then appending the data from each into a csv. Thanks for any pointers you can offer me!
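Pass the lists to the data parameter of pd.DataFrame. Each list becomes a row, and the shorter rows are padded with None. Then transpose with .T so each list becomes a column, and forward-fill with .ffill() so the single values repeat down their columns: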
pd.DataFrame(data=[name, id_num, year, vendors, paymets])
Out[99]:
           0      1
0  acme corp   None
1     123456   None
2       2017   None
3     toyota  honda
4        100   5000
pd.DataFrame(data=[name, id_num, year, vendors, paymets]).T.ffill()
Out[100]:
           0       1     2       3     4
0  acme corp  123456  2017  toyota   100
1  acme corp  123456  2017   honda  5000
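For reference, here is a minimal self-contained sketch of the same approach that also labels the columns to match the table in the question. The helper name lists_to_frame and the columns list are made up for illustration and are not part of pandas. It assumes the shorter lists should simply repeat their last value downward.

```python
import pandas as pd

# Sample lists from the question (the variable name `paymets` is kept as posted).
name = ['acme corp']
id_num = ['123456']
year = ['2017']
vendors = ['toyota', 'honda']
paymets = ['100', '5000']

# Hypothetical column labels, taken from the desired output in the question.
columns = ['name', 'id_num', 'year', 'vendor', 'payment']


def lists_to_frame(lists, columns):
    """Hypothetical helper: build one frame from lists of unequal length."""
    # Each list becomes a row; shorter rows are padded with missing values.
    df = pd.DataFrame(data=lists)
    # Transpose so each list becomes a column, then forward-fill so the
    # last seen value in each column repeats into the padded cells.
    df = df.T.ffill()
    df.columns = columns
    return df


df = lists_to_frame([name, id_num, year, vendors, paymets], columns)
print(df)
```

One caveat: ffill() cannot tell padding from a genuinely missing value, so if one of the longer lists contains None in the middle, that gap would be filled from the row above as well.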