I'm writing a genetic algorithm to find feature weights to apply to the Euclidean distance in scikit-learn's KNN, with two goals: improving the classification rate and removing some features from the dataset (a feature is removed by setting its weight to 0). I'm using Python and sklearn's KNeighborsClassifier. This is how I'm using it:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def w_dist(x, y, **kwargs):
    # weighted squared Euclidean distance
    return sum(kwargs["weights"] * ((x - y) * (x - y)))

KNN = KNeighborsClassifier(n_neighbors=1, metric=w_dist,
                           metric_params={"weights": w})
KNN.fit(X_train, Y_train)

# nearest neighbor of each training point, excluding the point itself
neighbors = KNN.kneighbors(n_neighbors=1, return_distance=False)
Y_n = Y_train[neighbors.ravel()]

# leave-one-out classification rate
tot = 0
for (a, b) in zip(Y_train, Y_n):
    if a == b:
        tot += 1

# fraction of features removed (weight forced to 0)
reduc_rate = (X_train.shape[1] - np.count_nonzero(w)) / X_train.shape[1]
class_rate = tot / X_train.shape[0]
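As an aside, the counting loop can be written as a one-line NumPy reduction (assuming Y_train and Y_n are NumPy arrays, as above):

# vectorized equivalent of the counting loop
class_rate = np.mean(Y_train == Y_n)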
It's working really well, but it's very slow. I have been profiling my code, and the slowest part is the evaluation of the custom distance.
I want to ask if there is a different way to tell KNN to use weights in the distance (I must use the Euclidean distance, but I drop the square root, so it's really the squared Euclidean distance).
Thanks!
There is indeed another way, and it's built into scikit-learn (so it should be quicker, since the built-in metrics run in compiled code rather than calling back into Python for every pair of points). You can use the wminkowski metric with weights. Below is an example with random weights for the features in your training set.
knn = KNeighborsClassifier(metric='wminkowski', p=2,
                           metric_params={'w': np.random.random(X_train.shape[1])})
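One caveat if you are on a recent scikit-learn: to my knowledge, the wminkowski metric has since been deprecated and removed, and the plain minkowski metric now accepts the weight vector via metric_params={'w': ...}. The two conventions also differ slightly: wminkowski applies the weights before raising to the power p (sum(abs(w*(x-y))**p)**(1/p)), while minkowski weights the p-th powers (sum(w*abs(x-y)**p)**(1/p)). With p=2 the latter is exactly the square root of your sum(w*(x-y)**2), and since the square root is monotonic, it finds the same nearest neighbors; to reproduce your weights with the old wminkowski you would pass np.sqrt(w) instead. A minimal sketch, using the iris data and random weights purely for illustration:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# illustrative data and weights; your GA would supply the real w
X_train, Y_train = load_iris(return_X_y=True)
w = np.random.random(X_train.shape[1])

# on recent scikit-learn versions, pass the weights to 'minkowski' directly
knn = KNeighborsClassifier(n_neighbors=1, metric='minkowski', p=2,
                           metric_params={'w': w})
knn.fit(X_train, Y_train)

# leave-one-out check as in the question; the square root in the metric
# does not change which neighbor is nearest
neighbors = knn.kneighbors(n_neighbors=1, return_distance=False)
class_rate = np.mean(Y_train == Y_train[neighbors.ravel()])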