Say I have a list of vectors and I want to generate a distance matrix from it. What is a neat way to do it? E.g., I have a list of 3 vectors:
k = [[2, 4, 7], [3, 4, 7], [5,1,3]]
distance = pairwise_distances(v1, v2, metric='cosine', n_jobs=-1)
Desired output: A numpy array of cosine-distances of the given vector list.
array([[ 1. , 0.00638545, 0.28778769],
[ 0.00638545, 1. , 0.21402251],
[ 0.28778769, 0.21402251, 1. ]])
This is what I have done: I got all the pairs using itertools.combinations and computed the distance for each pair. It then becomes a little messy to place each distance in the "right" cell, because that needs the indices of the original vectors in the list.
import itertools
combs = list(itertools.combinations(k, 2))
print combs
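One way to avoid the index bookkeeping is to iterate over pairs of indices instead of pairs of vectors. This is only a minimal sketch: `cosine_dist` is a hypothetical helper written here for illustration (it is not the function from the question), and `D` is just a name chosen for the result.

```python
import itertools
import numpy as np

k = [[2, 4, 7], [3, 4, 7], [5, 1, 3]]

def cosine_dist(u, v):
    # Hypothetical helper (not from the question): 1 - cosine similarity.
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

n = len(k)
D = np.zeros((n, n))  # the diagonal stays at zero
# Combinations of indices give each unordered pair (i, j) with i < j once.
for i, j in itertools.combinations(range(n), 2):
    D[i, j] = D[j, i] = cosine_dist(k[i], k[j])
```

Because each pair carries its own `(i, j)`, every distance lands in the right cell and its mirror without any extra lookup.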
Is there a "neater" or "pythonic" way of getting at the final distance matrix?
Based on @Divakar's suggestion, I was able to get what I wanted. Here is a snippet for those who are looking for an answer:
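import numpy as np
from scipy.spatial.distance import squareform
# cosine_distance is not defined in this post; it is assumed to take two 2-D inputs and return a 1x1 array
distance_vectors = [cosine_distance([pair[0]], [pair[1]]) for pair in combs]
print distance_vectors
# unwrap each 1x1 result into a plain number
distance_vectors = [x[0][0] for x in distance_vectors]
print distance_vectors
# expand the condensed pair distances into a symmetric matrix with a zero diagonal
X = squareform(np.array(distance_vectors))
print X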
And, yes (thanks to @Warren), the diagonal of a distance matrix is zero.
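For anyone who wants to skip the explicit pair loop, the same matrix can probably be built in two calls with SciPy. This is a sketch under the assumption that SciPy is available; it swaps in `pdist` for the per-pair computation above, and `condensed` is just a name picked here.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

k = [[2, 4, 7], [3, 4, 7], [5, 1, 3]]

# pdist returns the condensed distances for pairs (0,1), (0,2), (1,2),
# the same order that itertools.combinations produces.
condensed = pdist(np.asarray(k, dtype=float), metric='cosine')

# squareform expands them into a symmetric matrix with zeros on the diagonal.
X = squareform(condensed)
```

If scikit-learn is already in use, calling `pairwise_distances(k, metric='cosine')` with the whole list as the single input should give the same square matrix directly.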