I have 2 lists of dictionaries. List A is 34,000 long, list B is 650,000 long. I am essentially inserting all the List B dicts into the List A dicts based on a key match. Currently, I am doing the obvious, but its taking forever (seriously, like a day). There must be a quicker way!
for a in listA:
a['things'] = []
for b in listB:
if a['ID'] == b['ID']:
a['things'].append(b)
from collections import defaultdict
dictB = defaultdict(list)
for b in listB:
dictB[b['ID']].append(b)
for a in listA:
a['things'] = []
for b in dictB[a['ID']]:
a['things'].append(b)
this will turn your algorithm from O(n*m) to O(m)+O(n), where n=len(listA), m=len(listB)
basically it avoids looping through each dict in listB for each dict in listA by 'precalculating' what dicts from listB match each 'ID'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With