
compute distance matrix from list of vectors

Say I have a list of vectors and I want to generate a distance matrix from it. What is a neat way to do it? For example, I have a list of 3 vectors, and this is how I compute the distance for a single pair:

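from sklearn.metrics import pairwise_distances
k = [[2, 4, 7], [3, 4, 7], [5, 1, 3]]
v1, v2 = [k[0]], [k[1]]  # one pair, each wrapped as a single-row 2-D input
distance = pairwise_distances(v1, v2, metric='cosine', n_jobs=-1)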

Desired output: a NumPy array of cosine distances for the given list of vectors.

array([[ 1.        ,  0.00638545,  0.28778769],
       [ 0.00638545,  1.        ,  0.21402251],
       [ 0.28778769,  0.21402251,  1.        ]])

This is what I have done so far: I got all the pairs using itertools.combinations and computed the distance for each pair. It then becomes a little messy to place each distance in the "right" cell, because that requires the indices of the original vectors in the list.

import itertools
combs = list(itertools.combinations(k, 2))
print(combs)

Is there a "neater" or "pythonic" way of getting at the final distance matrix?

asked Sep 08 '25 12:09 by user1717931



1 Answer

Based on @Divakar's suggestion, I was able to get what I wanted. Here is a snippet for anyone looking for an answer:

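import numpy as np
from scipy.spatial.distance import squareform
# Assumption: cosine_distance in the original refers to scikit-learn's cosine_distances
from sklearn.metrics.pairwise import cosine_distances

# combs is the list of pairs from itertools.combinations(k, 2) in the question.
# Each call returns a 1x1 array, so take element [0][0] to get the scalar distance.
distance_vectors = [cosine_distances([a], [b])[0][0] for a, b in combs]
print(distance_vectors)

# squareform expands the condensed distances into a symmetric matrix with a zero diagonal
X = squareform(np.array(distance_vectors))
print(X)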

And, yes (thanks to @Warren), the diagonal of a distance matrix is zero.
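
For completeness, the same matrix can probably be built without listing the pairs by hand. The following is only a minimal sketch, assuming SciPy is available and that the vectors are the three from the question. It uses scipy.spatial.distance.pdist, which returns the pairwise distances in the same order as itertools.combinations(k, 2), and squareform to expand them:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# The three example vectors from the question
k = [[2, 4, 7], [3, 4, 7], [5, 1, 3]]

# pdist returns the condensed distances in the order (0,1), (0,2), (1,2),
# which is the order itertools.combinations(k, 2) produces.
condensed = pdist(np.array(k, dtype=float), metric="cosine")

# squareform expands them into a symmetric matrix with zeros on the diagonal.
X = squareform(condensed)
print(X)
```

The off-diagonal entries should match the values in the question, while the diagonal is 0 rather than 1, since each vector has zero distance to itself.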

answered Sep 10 '25 03:09 by user1717931
