I have two one-dimensional NumPy arrays X and Y. I need to calculate the mean absolute difference between each element of X and each element of Y. The naive way is to use a nested for loop:
import numpy as np
np.random.seed(1)
X = np.random.randint(10, size=10)
Y = np.random.randint(10, size=10)
s = 0
for x in X:
    for y in Y:
        s += abs(x - y)
mean = s / (X.size * Y.size)
#3.4399999999999999
Question: Does NumPy provide a vectorized, faster version of this solution?
Edited: I need the mean absolute difference (always non-negative). Sorry for the confusion.
If I understand your definition correctly, you can use broadcasting: X[:, None] turns X into a column, so subtracting Y produces every pairwise difference at once.
np.mean(np.abs(X[:, None] - Y))
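To make the broadcasting step concrete, here is a minimal sketch that inspects the intermediate shape and cross-checks the result against `np.subtract.outer`, which builds the same pairwise-difference matrix. The array sizes, the seed, and the names `diff`, `mad`, and `mad_outer` are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Illustrative inputs of different lengths (sizes and seed chosen arbitrarily)
rng = np.random.default_rng(0)
X = rng.integers(10, size=4)
Y = rng.integers(10, size=6)

# X[:, None] has shape (4, 1); broadcasting against Y of shape (6,)
# yields a (4, 6) matrix with diff[i, j] == X[i] - Y[j]
diff = X[:, None] - Y
mad = np.abs(diff).mean()

# np.subtract.outer builds the same pairwise matrix explicitly
mad_outer = np.abs(np.subtract.outer(X, Y)).mean()

print(diff.shape)
print(mad, mad_outer)
```

Note that the intermediate matrix holds X.size * Y.size elements, so for very large inputs memory use grows accordingly.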
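If you tile the two arrays along opposite axes, you can take the absolute value of their difference directly:
# Repeat X as rows and Y as columns so both have shape (Y.size, X.size);
# these tile counts also work when X and Y differ in length
x = np.tile(X, (Y.size, 1))
y = np.transpose(np.tile(Y, (X.size, 1)))
mean_diff = np.sum(np.abs(x - y)) / (X.size * Y.size)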
A quick check that this matches the nested loop:
import numpy as np
X = np.random.randint(10, size=10)
Y = np.random.randint(10, size=10)
s = 0
for x in X:
    for y in Y:
        s += abs(x - y)
mean = s / (X.size * Y.size)
print(mean)
# Tile so both arrays have shape (Y.size, X.size)
x = np.tile(X, (Y.size, 1))
y = np.transpose(np.tile(Y, (X.size, 1)))
print(np.sum(np.abs(x - y)) / (X.size * Y.size))
3.48
3.48
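For inputs of different lengths, the three approaches in this thread can be compared side by side. This is only a sketch: the helper names `mad_loop`, `mad_broadcast`, and `mad_tile` are hypothetical, and the tiled version assumes the tile counts are chosen so that both intermediate arrays have shape (Y.size, X.size).

```python
import numpy as np

def mad_loop(X, Y):
    # Reference implementation: explicit nested loop over all pairs
    s = 0
    for x in X:
        for y in Y:
            s += abs(x - y)
    return s / (X.size * Y.size)

def mad_broadcast(X, Y):
    # Column vector minus row vector gives all pairwise differences
    return np.abs(X[:, None] - Y).mean()

def mad_tile(X, Y):
    # Both tiled arrays end up with shape (Y.size, X.size)
    x = np.tile(X, (Y.size, 1))
    y = np.tile(Y, (X.size, 1)).T
    return np.abs(x - y).mean()

# Small hand-checkable input: the six absolute differences sum to 30
X = np.arange(3)          # [0, 1, 2]
Y = np.array([0, 10])
print(mad_loop(X, Y), mad_broadcast(X, Y), mad_tile(X, Y))  # 5.0 5.0 5.0
```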