How can I convert a list of sets of categories into an indicator (0/1) DataFrame?
For example:
A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]
To:
   a  b  c  d  e
1  1  0  1  0  0
2  1  1  0  0  0
3  0  1  0  1  0
4  0  0  0  0  1
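Option 1: explode the Series, then crosstab the row labels against the values:
import pandas as pd
s = pd.Series(A).explode()  # one row per (original position, category) pair
pd.crosstab(s.index, s)     # count each category per original position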
Output:
col_0 a b c d e
row_0
0 1 0 1 0 0
1 1 1 0 0 0
2 0 1 0 1 0
3 0 0 0 0 1
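The crosstab result carries the axis names row_0 and col_0 and a 0-based index, while the layout asked for in the question is numbered from 1. A minimal sketch of one way to tidy that up, assuming the same A as in the question (the variable name table is only illustrative):

```python
import pandas as pd

A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]

# One row per (original position, category) pair
s = pd.Series(A).explode()
table = pd.crosstab(s.index, s)

# Drop the automatic axis names "row_0" / "col_0"
table = table.rename_axis(index=None, columns=None)

# Shift the 0-based positions to the 1-based labels used in the question
table.index = table.index + 1

print(table)
```

Both steps are cosmetic: the counts themselves are unchanged.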
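Option 2: get_dummies on the exploded Series, then sum per original row. The sum(level=0) shortcut was removed in pandas 2.0, so group by the index level instead:
pd.get_dummies(pd.Series(A).explode()).groupby(level=0).sum()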
Output:
a b c d e
0 1 0 1 0 0
1 1 1 0 0 0
2 0 1 0 1 0
3 0 0 0 0 1
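For reuse, the get_dummies approach can be wrapped in a small helper. This is only a sketch: the function name sets_to_indicators is hypothetical, each set is sorted into a list before exploding so the intermediate order is deterministic, and the final astype(int) is there so the result holds plain integers regardless of the dtype get_dummies returns.

```python
import pandas as pd

A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]

def sets_to_indicators(sets):
    """Return a 0/1 DataFrame: one row per set, one column per category.

    Hypothetical helper, not part of pandas.
    """
    # Sorting turns each set into a list with a deterministic element order
    s = pd.Series([sorted(x) for x in sets]).explode()
    # One dummy row per (position, category) pair, collapsed back to one row per set
    return pd.get_dummies(s).groupby(level=0).sum().astype(int)

df = sets_to_indicators(A)
print(df)
```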