Pythonic way to create pairs of values in a column in dataframe

Question

I have a Dataframe that looks like this:

OwnerID    Value
1            A
1            B
1            C
1            D

This is a shortened version; I have thousands of values for OwnerID. I'd like to pair each Value with every other Value that shares the same OwnerID, and get the result as a list of pairs.

For example, for the OwnerID 1, the resultset should be the following lists:

[A,B]

[A,C]

[A,D]

[B,C]

[B,D]

[C,D]

I could write two nested for loops to achieve this, but that wouldn't be very efficient or Pythonic. Does anyone know a better way to achieve this?

Any help would be much appreciated.

MaxU - stop WAR against UA · Accepted Answer

Pandas solution (using .merge() and .query() methods):

Data:

In [10]: df
Out[10]:
   OwnerID Value
0        1     A
1        1     B
2        1     C
3        1     D
4        2     X
5        2     Y
6        2     Z

Solution:

In [9]: pd.merge(df, df, on='OwnerID', suffixes=['','2']).query("Value != Value2")
Out[9]:
    OwnerID Value Value2
1         1     A      B
2         1     A      C
3         1     A      D
4         1     B      A
6         1     B      C
7         1     B      D
8         1     C      A
9         1     C      B
11        1     C      D
12        1     D      A
13        1     D      B
14        1     D      C
17        2     X      Y
18        2     X      Z
19        2     Y      X
21        2     Y      Z
22        2     Z      X
23        2     Z      Y

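If you need only the pairs as an array (note that the self-merge yields both orderings of each pair, e.g. both [A, B] and [B, A]):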

In [17]: pd.merge(df, df, on='OwnerID', suffixes=['','2']) \
           .query("Value != Value2") \
           .filter(like='Value').values
Out[17]:
array([['A', 'B'],
       ['A', 'C'],
       ['A', 'D'],
       ['B', 'A'],
       ['B', 'C'],
       ['B', 'D'],
       ['C', 'A'],
       ['C', 'B'],
       ['C', 'D'],
       ['D', 'A'],
       ['D', 'B'],
       ['D', 'C'],
       ['X', 'Y'],
       ['X', 'Z'],
       ['Y', 'X'],
       ['Y', 'Z'],
       ['Z', 'X'],
       ['Z', 'Y']], dtype=object)
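
The question asks for each pair only once ([A,B] but not [B,A]). A minimal sketch of one way to adapt the merge approach, assuming the values within an OwnerID are unique and can be ordered with `<` (true for the strings in this example), is to filter with `Value < Value2` instead of `Value != Value2`. The variable name `pairs` is just for illustration:

```python
import pandas as pd

# Sample data matching the frame shown in this answer
df = pd.DataFrame({
    "OwnerID": [1, 1, 1, 1, 2, 2, 2],
    "Value": ["A", "B", "C", "D", "X", "Y", "Z"],
})

pairs = (
    pd.merge(df, df, on="OwnerID", suffixes=("", "2"))  # self-join within each OwnerID
      .query("Value < Value2")   # keep one ordering per pair; also drops self-pairs
      .filter(like="Value")      # keep only the Value and Value2 columns
      .to_numpy()
      .tolist()                  # plain Python list of [Value, Value2] lists
)
print(pairs)
```

Keep in mind that the self-merge still materialises every combination for an owner before filtering, so memory use grows quadratically with the number of values per OwnerID.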

Vaishali · Answer

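import itertools  # not aliased as `iter`, which would shadow the built-in iter()

# For each OwnerID, build every unordered pair of its Values
df2 = df.groupby('OwnerID').Value.apply(lambda x: list(itertools.combinations(x, 2)))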

This returns the desired pairs, as a list of tuples, for each unique OwnerID:

OwnerID
1    [(A, B), (A, C), (A, D), (B, C), (B, D), (C, D)]
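
If lists rather than tuples are needed, as in the question, the same `itertools.combinations` idea can be sketched without `apply`. This is only one possible arrangement; the names `pairs_by_owner` and `flat_pairs` are made up for illustration, and the sample frame is the two-owner one from the accepted answer:

```python
import itertools

import pandas as pd

df = pd.DataFrame({
    "OwnerID": [1, 1, 1, 1, 2, 2, 2],
    "Value": ["A", "B", "C", "D", "X", "Y", "Z"],
})

# Map each OwnerID to its list of two-element lists
pairs_by_owner = {
    owner: [list(pair) for pair in itertools.combinations(values, 2)]
    for owner, values in df.groupby("OwnerID")["Value"]
}

# Flatten into a single list of pairs across all owners
flat_pairs = [pair for pairs in pairs_by_owner.values() for pair in pairs]

print(pairs_by_owner[1])
print(len(flat_pairs))
```

Because `combinations` emits each pair once, in the order the values appear in the column, no extra filtering step is needed.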

Tags:

python

pandas

Patthebug
