I have df:
domain            orgid
csyunshu.com      108299
dshu.com          108299
bbbdshu.com       108299
cwakwakmrg.com    121303
ckonkatsunet.com  121303
I would like to add a new column that replaces the domain column with a numeric id per orgid:
domain            orgid   domainid
csyunshu.com      108299  1
dshu.com          108299  2
bbbdshu.com       108299  3
cwakwakmrg.com    121303  1
ckonkatsunet.com  121303  2
I have already tried this line but it does not give the result I want:
df.groupby('orgid')['domain'].count().reset_index()
Can anybody help?
You can call rank on the groupby object and pass method='first', which numbers the rows within each orgid group in their order of appearance:
In [61]:
df['domainId'] = df.groupby('orgid')['orgid'].rank(method='first')
df
Out[61]:
             domain   orgid  domainId
0      csyunshu.com  108299         1
1          dshu.com  108299         2
2       bbbdshu.com  108299         3
3    cwakwakmrg.com  121303         1
4  ckonkatsunet.com  121303         2
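To reproduce this from scratch, here is a minimal self-contained sketch. It assumes the frame is built by hand from the sample data in the question. Note that rank returns floats, so the `.astype(int)` cast is an addition of mine for cases where integer ids are wanted; it is safe here only because no group produces NaN.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "domain": ["csyunshu.com", "dshu.com", "bbbdshu.com",
               "cwakwakmrg.com", "ckonkatsunet.com"],
    "orgid": [108299, 108299, 108299, 121303, 121303],
})

# Every value within a group is tied (same orgid), and method='first'
# breaks ties by order of appearance, giving 1, 2, 3, ... per group.
ranks = df.groupby("orgid")["orgid"].rank(method="first")

# rank() yields float64; cast to int for integer ids (assumes no NaN)
df["domainId"] = ranks.astype(int)
print(df)
```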
If you want to overwrite the domain column instead of adding a new one, you can do:
df['domain'] = df.groupby('orgid')['orgid'].rank(method='first')
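An alternative that is not part of the answer above, but may be worth trying: `GroupBy.cumcount()` numbers the rows of each group starting from 0, so adding 1 should give the same per-orgid ids directly as integers. A minimal sketch, assuming the sample frame from the question and the `domainid` column name it asks for:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "domain": ["csyunshu.com", "dshu.com", "bbbdshu.com",
               "cwakwakmrg.com", "ckonkatsunet.com"],
    "orgid": [108299, 108299, 108299, 121303, 121303],
})

# cumcount() counts rows within each orgid group from 0, in row order;
# + 1 shifts the numbering so that it starts at 1.
df["domainid"] = df.groupby("orgid").cumcount() + 1
print(df)
```

Both approaches depend on the current row order within each group, so sort the frame first if the ids should follow some other ordering.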