I've been searching extensively but can't get my head around this issue:
I have a dataframe in pandas that looks like this:
date ticker Name NoShares SharePrice Volume Relation
2/1/10 aaa zzz 1 1 1 d
2/1/10 aaa yyy 1 2 5 o
2/1/10 aaa zzz 2 5 2 d
2/5/10 bbb xxx 5 5 1 do
2/5/10 ccc www 5 5 1 d
2/5/10 ccc www 5 5 1 d
2/5/10 ddd vvv 5 5 1 o
2/6/10 aaa zzz 1 1 3 d
Requirements
so my output would look like this:
date ticker Name NoShares SharePrice Volume Relation
2/1/10 aaa zzz 3 3.6 1 d
2/1/10 aaa yyy 1 2 5 o
2/5/10 bbb xxx 5 5 1 do
2/5/10 ccc www 10 5 1 d
2/5/10 ddd vvv 5 5 1 o
2/6/10 aaa zzz 1 1 3 d
I tried the documentation and other answers on stack overflow but don't seem to be able to get it right. Appreciate the help. Cheers.
here's my solution:
grpby = df.groupby(['date','Name'])
a = grpby.apply(lambda x: np.average(a = x['SharePrice'],weights=x['NoShares'])).to_frame(name='SharePrice')
b = grpby.agg({'NoShares':'sum','Volume':'mean','Relation':'max'})
print b.join(a)
Volume Relation NoShares SharePrice
date Name
2/1/10 yyy 5.0000 o 1 2.0000
zzz 1.5000 d 3 3.6667
2/5/10 vvv 1.0000 o 5 5.0000
www 1.0000 d 10 5.0000
xxx 1.0000 do 5 5.0000
2/6/10 zzz 3.0000 d 1 1.0000
just reset_index() afterwards.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With