I have a list of skills as follows:
skills = ['Listening', 'Written_Expression','Clerical',
'Night_Vision', 'Accounting']
I have a separate list of sets, each of which contains the skills related to a particular job:
job_skills = [{'Listening','Written_Expression','Clerical','Night_Vision'},
              {'Chemistry','Written_Expression','Clerical','Listening'},
              .
              .
             ]
I want to count the frequency with which each combination of 2 unique skills is a subset of a set in job_skills and return a list of lists/sets with the combinations and frequencies as follows:
skill_pairs = [{'Listening', 'Written_Expression', 2},
{'Listening', 'Clerical', 2},
.
.
{'Night_Vision', 'Accounting', 0}]
At the moment I'm doing the following:
skill_combos = []
for idx, i in enumerate(skills):
    for jdx, j in enumerate(skills[idx+1:]):
        temp = []
        # check every job for the pair (i, j)
        for job in range(len(job_skills)):
            temp.append(set([i,j]).issubset(job_skills[job]))
        skill_combos.append([i,j,sum(temp)])
This gets the job done, but it's slow given that I have approximately half a million skill combinations. Is there a faster way of doing this, ideally without three nested loops?
Thanks
You only need to count the combinations that actually occur in job_skills; every other combination has a count of zero. For example:
from collections import Counter
from itertools import combinations
job_skills = [{'Listening', 'Written_Expression', 'Clerical', 'Night_Vision'},
              {'Chemistry', 'Written_Expression', 'Clerical', 'Listening'}]
counts = Counter(combo for skill_set in job_skills for combo in combinations(skill_set, 2))
for key, value in counts.items():
    print(key, value)
Output
('Clerical', 'Written_Expression') 2
('Clerical', 'Listening') 2
('Clerical', 'Night_Vision') 1
('Written_Expression', 'Listening') 2
('Written_Expression', 'Night_Vision') 1
('Listening', 'Night_Vision') 1
('Clerical', 'Chemistry') 1
('Written_Expression', 'Chemistry') 1
('Listening', 'Chemistry') 1
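One caveat about the snippet above: sets have no guaranteed iteration order, so the same two skills can come out as ('A', 'B') from one job and ('B', 'A') from another, which would split the count across two keys. A possible refinement, not part of the original snippet, is to sort each set before generating the pairs so that every key is in alphabetical order. The name pair_counts is just an illustrative choice:

```python
from collections import Counter
from itertools import combinations

job_skills = [
    {'Listening', 'Written_Expression', 'Clerical', 'Night_Vision'},
    {'Chemistry', 'Written_Expression', 'Clerical', 'Listening'},
]

# Sorting each set first makes every pair come out in alphabetical order,
# so the same two skills always map to the same key.
pair_counts = Counter(
    pair
    for skill_set in job_skills
    for pair in combinations(sorted(skill_set), 2)
)

print(pair_counts[('Listening', 'Written_Expression')])  # 2
print(pair_counts[('Clerical', 'Night_Vision')])         # 1
```

With this variant, lookups need to use the alphabetically ordered tuple as well.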
See the documentation for itertools.combinations and collections.Counter. A Counter already returns 0 for keys it has never seen. If you prefer a defaultdict that behaves the same way, build one from counts:
from collections import defaultdict
total = defaultdict(int)
total.update(counts)
print(total[('Night_Vision', 'Accounting')])
Output
0
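To get back to the list format asked for in the question, one option is to key the counts by frozenset (so pair order does not matter) and then walk over the pairs of the original skills list. This is only a sketch under a few assumptions: skills contains no duplicates, skills outside that list (such as 'Chemistry') should be ignored, and the names wanted and skill_pairs are illustrative.

```python
from collections import Counter
from itertools import combinations

skills = ['Listening', 'Written_Expression', 'Clerical',
          'Night_Vision', 'Accounting']
job_skills = [
    {'Listening', 'Written_Expression', 'Clerical', 'Night_Vision'},
    {'Chemistry', 'Written_Expression', 'Clerical', 'Listening'},
]

wanted = set(skills)

# Count each unordered pair once per job, ignoring skills not in `skills`.
counts = Counter(
    frozenset(pair)
    for job in job_skills
    for pair in combinations(job & wanted, 2)
)

# One row per pair from `skills`, in list order; a Counter returns 0
# for pairs that never occurred together in any job.
skill_pairs = [
    [a, b, counts[frozenset((a, b))]]
    for a, b in combinations(skills, 2)
]

for row in skill_pairs:
    print(row)
```

The counting pass only touches pairs that actually occur in a job, so the per-pair scan over all jobs from the question is replaced by a single dictionary lookup per pair.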