Question description
Let's say we have two simple arrays:
import numpy as np
query = np.array([100, 4000, 500, 700, 400, 100])
match = np.array([6, 100, 4000, 100, 10, 8, 10])
I want to find the indexes of all matching values between the query and match. So in this case the result would be:
value   query   match
100        0    1
100        0    3
100        5    1
100        5    3
4000       1    2
In reality these arrays will contain millions of items.
"Stupid" loop solution
qs = []
query_locs = []
match_locs = []
for i in np.arange(query.size):
    q = query[i]
    # Get matching indexes in "match"
    match_loc = np.where(match == q)[0]
    n = match_loc.size
    # Update location arrays
    match_locs.extend(match_loc)
    query_locs.extend(np.repeat(i,n))
    # Store the matching value
    qs.extend(np.repeat(q,n))
result = np.vstack((qs, query_locs, match_locs)).T
print(result)
[[ 100    0    1]
 [ 100    0    3]
 [4000    1    2]
 [ 100    5    1]
 [ 100    5    3]]
(Maybe numba could make this loop pretty fast; however, when I tried it I got some errors about the signatures, so I'm not sure about that.)
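The loop above rescans all of match for every element of query, so the work grows with query.size * match.size. A minimal sketch of one way to avoid that, assuming plain Python is acceptable: record the positions of each value in match once, then look them up per query element. The function name match_indices_dict is hypothetical and used here only for illustration.

```python
from collections import defaultdict


def match_indices_dict(query, match):
    """Return (value, query_index, match_index) triples for all equal pairs."""
    # Map each value in `match` to the list of positions where it occurs.
    positions = defaultdict(list)
    for j, m in enumerate(match):
        positions[m].append(j)

    rows = []
    for i, q in enumerate(query):
        # .get avoids inserting empty lists for values that never match.
        for j in positions.get(q, ()):
            rows.append((q, i, j))
    return rows


query = [100, 4000, 500, 700, 400, 100]
match = [6, 100, 4000, 100, 10, 8, 10]
rows = match_indices_dict(query, match)
print(rows)
```

This visits each element of match once and each matching pair once, but it is still a Python-level loop, so it is not necessarily fast enough for millions of items.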
NumPy built-ins
There are quite a few built-in NumPy functions that solve this problem for unique values, such as searchsorted and intersect1d. However, as also described in the docs, they "Return the sorted, unique values" and thus do not take duplicates into account. Some examples on StackOverflow for this problem with unique values:
I imagine there is a faster way to do this with NumPy than a loop, so I'm curious to see an answer!
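For reference, searchsorted can be made to handle duplicates by sorting match once and asking for both the left and the right insertion point of every query value; the gap between the two is the number of matches. The following is only a sketch under that idea, not a benchmarked solution, and the function name match_indices_sorted is hypothetical.

```python
import numpy as np


def match_indices_sorted(query, match):
    """Return an array of [value, query_index, match_index] rows."""
    # Sort `match` once, remembering the original positions.
    order = np.argsort(match, kind="stable")
    sorted_match = match[order]

    # For each query value, [left, right) is its run inside sorted_match.
    left = np.searchsorted(sorted_match, query, side="left")
    right = np.searchsorted(sorted_match, query, side="right")
    counts = right - left

    # Repeat each query index once per matching element.
    query_idx = np.repeat(np.arange(query.size), counts)

    # Offset of each output row within its own run: 0, 1, ..., count - 1.
    run_starts = np.repeat(np.cumsum(counts) - counts, counts)
    within = np.arange(counts.sum()) - run_starts

    # Map positions in the sorted array back to positions in `match`.
    match_idx = order[np.repeat(left, counts) + within]
    return np.column_stack((query[query_idx], query_idx, match_idx))


query = np.array([100, 4000, 500, 700, 400, 100])
match = np.array([6, 100, 4000, 100, 10, 8, 10])
result = match_indices_sorted(query, match)
print(result)
```

The rows come out ordered by query index, which is the same order the loop version produces.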
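You can convert the 1-D arrays to DataFrames and do an inner join on the values, like this:
import numpy as np
import pandas as pd
query = np.array([100, 4000, 500, 700, 400, 100])
match = np.array([6, 100, 4000, 100, 10, 8, 10])
# Use the values as the index and the positions as the only column
dfquery = pd.DataFrame(range(len(query)), index=query, columns=['query'])
dfmatch = pd.DataFrame(range(len(match)), index=match, columns=['match'])
# The inner join pairs every query position with every match position of the same value
dfquery.join(dfmatch, how='inner')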
Result:
    query   match
100     0       1
100     0       3
100     5       1
100     5       3
4000    1       2
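If the matched value is wanted as a regular column, as in the table from the question, a variant of the same idea is to keep the values in a column and use merge instead of an index join. This is a sketch only; the column name "value" is an assumption chosen for illustration, and the explicit sort is there so the row order does not depend on how merge orders its output.

```python
import numpy as np
import pandas as pd

query = np.array([100, 4000, 500, 700, 400, 100])
match = np.array([6, 100, 4000, 100, 10, 8, 10])

# Keep the values in a "value" column next to their positions.
dfquery = pd.DataFrame({"value": query, "query": np.arange(query.size)})
dfmatch = pd.DataFrame({"value": match, "match": np.arange(match.size)})

# Inner merge on the value, then sort so the row order is deterministic.
result = (
    dfquery.merge(dfmatch, on="value", how="inner")
    .sort_values(["query", "match"])
    .reset_index(drop=True)
)
triples = result.to_numpy()
print(result)
```

Here triples holds the same value, query index, match index rows as the loop solution in the question.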