
Create a list of all unique possible combinations based on a condition in a DataFrame in Python

I have the following dataset:

d = {
'Company':['A','A','A','A','B','B','B','B','C','C','C','C','D','D','D','D'],
'Individual': [1,2,3,4,1,5,6,7,1,8,9,10,10,11,12,13]
}

Now, I need to create a list in Python of all pairs of 'Company' values that share the same value in 'Individual'.

E.g. the output for the dataset above should be: ((A,B),(A,C),(B,C),(C,D)). The first three tuples are there because Individual 1 is affiliated with A, B and C, and the last one because Individual 10 is affiliated with C and D.

Further explanation: for Individual=1, the dataset above has the values 'A', 'B' and 'C'. I want to create all unique combinations of these three values as tuples, so the list should contain (A,B), (A,C) and (B,C). Next is Individual=2, which only has the value 'A', so there is no tuple to append to the list. Each of the following individuals also has only one corresponding company, hence no further pairs. The only other tuple that has to be added is for Individual=10, since it has the values 'C' and 'D', which adds the tuple (C,D) to the list.

Jan Ohlenbusch asked Mar 21 '26 13:03


1 Answer

Here is a solution to your refined question. It groups the companies by individual and builds the sorted pairs for each one:

from collections import defaultdict
from itertools import combinations

data = {'Company':['A','A','A','A','B','B','B','B','C','C','C','C','D','D','D','D'],
        'Individual': [1,2,3,4,1,5,6,7,1,8,9,10,10,11,12,13]}

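# Map each individual to the set of companies they are affiliated with
d = defaultdict(set)

for i, j in zip(data['Individual'], data['Company']):
    d[i].add(j)

# For each individual, build every sorted pair of their companies
res = {k: sorted(map(sorted, combinations(v, 2))) for k, v in d.items()}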

# {1: [['A', 'B'], ['A', 'C'], ['B', 'C']],
#  2: [],
#  3: [],
#  4: [],
#  5: [],
#  6: [],
#  7: [],
#  8: [],
#  9: [],
#  10: [['C', 'D']],
#  11: [],
#  12: [],
#  13: []}
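
The dictionary above keeps the pairs grouped per individual, while the question asks for a single list of unique tuples. A minimal sketch of one way to flatten it, assuming the same input data; the names `companies_by_individual`, `pairs` and `unique_pairs` are illustrative and not part of the original answer:

```python
from collections import defaultdict
from itertools import combinations

data = {'Company': ['A','A','A','A','B','B','B','B','C','C','C','C','D','D','D','D'],
        'Individual': [1,2,3,4,1,5,6,7,1,8,9,10,10,11,12,13]}

# Group the companies by the individual affiliated with them
companies_by_individual = defaultdict(set)
for individual, company in zip(data['Individual'], data['Company']):
    companies_by_individual[individual].add(company)

# Collect the pairs in a set so a pair shared by several individuals is kept once.
# Sorting the companies first makes every tuple come out in alphabetical order.
pairs = set()
for companies in companies_by_individual.values():
    pairs.update(combinations(sorted(companies), 2))

unique_pairs = sorted(pairs)
print(unique_pairs)
# [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D')]
```

Individuals with a single company contribute nothing, because `combinations` of a one-element sequence taken two at a time is empty.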
jpp answered Mar 24 '26 06:03