
Use numpy to get count of intersecting elements in list of arrays (avoid for loop)

Tags: python, numpy

I have an array of values called MyFruits like:

[apple, orange, banana, apple, pear]

Then I have a list of arrays like:

[apple, orange]
[blueberry, watermelon, pear]
[grape, orange, grape, orange]
[]
[cantaloupe]

For each array in the list, I want the number of its elements that also appear in the MyFruits array, divided by the total number of elements in that array. So the output would be:

2 / 2 = 1
1 / 3 = 0.33333
2 / 4 = 0.5
0 / 0 = (in this case 0)
0 / 1 = 0

essentially:

[1, 0.33333, 0.5, 0, 0]

I've been doing this in Python with for loops, but the data set is huge and it's incredibly slow. Someone suggested using numpy, but I'm having difficulty understanding.

MAR asked Nov 24 '25 18:11


1 Answer

Suppose you have two lists, one of length M and another of length N. With straightforward linear searches, finding which elements appear in both takes O(M * N) string comparisons.

You can improve on that with Python sets. Convert the lists to Python sets and use set intersection (&) to find their common elements. Because set lookups take O(1) time on average, the complexity drops to O(M + N).
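As a minimal sketch of the set idea applied to the question's data, assuming the arrays are plain Python lists of strings: the function name `intersect_ratios` is hypothetical, and it converts only MyFruits to a set and tests membership, rather than using `&` on both sides, because a full set intersection would collapse duplicates (the question counts `orange` twice in the third array).

```python
def intersect_ratios(my_fruits, arrays):
    """Fraction of each array's elements that also appear in my_fruits."""
    fruit_set = set(my_fruits)  # O(M) to build, O(1) average lookup
    ratios = []
    for arr in arrays:
        if len(arr) == 0:
            # Empty array: the question defines this case as 0.
            ratios.append(0.0)
            continue
        # Count every element (duplicates included) found in fruit_set.
        hits = sum(1 for item in arr if item in fruit_set)
        ratios.append(hits / len(arr))
    return ratios


my_fruits = ["apple", "orange", "banana", "apple", "pear"]
arrays = [
    ["apple", "orange"],
    ["blueberry", "watermelon", "pear"],
    ["grape", "orange", "grape", "orange"],
    [],
    ["cantaloupe"],
]
ratios = intersect_ratios(my_fruits, arrays)
print(ratios)  # [1.0, 0.3333333333333333, 0.5, 0.0, 0.0]
```

This still loops over the arrays, but each membership test is a hash lookup instead of a scan of MyFruits, which is usually where the time goes.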

Pascal Getreuer answered Nov 26 '25 11:11


