I have a Pandas dataframe that looks as follows:
name1 country1 name2 country2
A GER B USA
C GER E GER
D GER Y AUS
E GER A USA
I want to get a new dataframe with two columns name and country that contains the unique pairs of (name1, country1) and (name2,country2).
The expected result should look like this:
name country
A GER
C GER
D GER
E GER
B USA
A USA
Y AUS
I have found something similar for single columns here. However, I do not know how to transform this solution to my problem, that is, pairs of columns.
First filter columns by filter, transpose, flatten values and create new DataFrame by constructor:
a = df.filter(like='name').values.T.ravel()
b = df.filter(like='country').values.T.ravel()
df = pd.DataFrame({'name':a, 'country':b}, columns=['name','country'])
print (df)
name country
0 A GER
1 C GER
2 D GER
3 E GER
4 B USA
5 E GER
6 Y AUS
7 A USA
Another solution with undocumented function lreshape:
df = pd.lreshape(df, {'name':['name1','name2'],
'country':['country1','country2']})
print (df)
name country
0 A GER
1 C GER
2 D GER
3 E GER
4 B USA
5 E GER
6 Y AUS
7 A USA
And last for unique pairs use drop_duplicates:
df = df.drop_duplicates()
print (df)
name country
0 A GER
1 C GER
2 D GER
3 E GER
4 B USA
6 Y AUS
7 A USA
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With