I have a column that holds the ticketID for a show (each family member uses the same ticketID). I want to create a new column, family size, by counting how many times each ticketID is repeated.
ticketID
113796
2543
19950
382653
349211
3101297
PC 17562
113503
113503
Try mapping each ticketID to its count from value_counts():
In [123]: df = pd.DataFrame({'ticketID':np.random.randint(0, 3, 5)})
In [124]: df
Out[124]:
ticketID
0 1
1 2
2 1
3 1
4 2
In [125]: df['family_size'] = df.ticketID.map(df.ticketID.value_counts())
In [126]: df
Out[126]:
ticketID family_size
0 1 3
1 2 2
2 1 3
3 1 3
4 2 2
You could also use groupby with transform('size'), which returns the group size aligned to the original rows:
In [152]: df
Out[152]:
ticketID
0 1
1 2
2 1
3 1
4 2
In [153]: df['family_size'] = df.groupby('ticketID')['ticketID'].transform('size')
In [154]: df
Out[154]:
ticketID family_size
0 1 3
1 2 2
2 1 3
3 1 3
4 2 2
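For reference, here is a minimal self-contained sketch that applies both approaches to the ticket IDs from the question. It assumes the IDs are stored as strings (the "PC 17562" value suggests they are); the column name family_size_transform is only illustrative and does not come from the answers above.

```python
import pandas as pd

# Sample ticket IDs from the question, assumed to be stored as strings.
df = pd.DataFrame({
    "ticketID": ["113796", "2543", "19950", "382653", "349211",
                 "3101297", "PC 17562", "113503", "113503"]
})

# Approach 1: count each ticketID, then map the counts back onto the rows.
df["family_size"] = df["ticketID"].map(df["ticketID"].value_counts())

# Approach 2: group by ticketID and broadcast each group's size to its rows.
# "family_size_transform" is a hypothetical name used for comparison only.
df["family_size_transform"] = (
    df.groupby("ticketID")["ticketID"].transform("size")
)

print(df)
```

Both columns should agree: every ticket that appears once gets 1, and the two rows with 113503 get 2.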