Is k nearest neighbours regression inherently slow?

Question

I am trying to use the k nearest neighbours implementation from scikit-learn on a fairly large dataset. The problem is that predictions take a very long time, almost as long as training, which doesn't make sense to me. Is this an issue with the algorithm itself, or is it because scikit-learn isn't made for large datasets (no GPU support)?

For further information, I am trying to predict lidar intensity based on x, y, z and object label. Each lidar scan has ~100,000 points, so I'm trying to predict the intensity for each point.

rth · Accepted Answer

Things to try to make scikit-learn's k nearest neighbours estimators run faster (KNeighborsRegressor for this question; KNeighborsClassifier takes the same parameters):

Try a different value for the algorithm parameter: kd_tree or ball_tree for low-dimensional data, brute for high-dimensional data.
Set the n_jobs parameter to run the neighbour search in parallel. A larger n_jobs doesn't necessarily make things faster; sometimes it does the opposite.
Make sure you are using the latest version: there were performance improvements in v0.22, and some optimizations are not yet merged (scikit-learn#14543).
Use an external approximate nearest neighbours library (e.g. Annoy) together with pre-computed sparse distances, passed in via metric="precomputed".
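The suggestions above can be tried out with a rough sketch like the one below. Everything about the data is an assumption: the points are random stand-ins for a lidar scan (x, y, z and an integer object label), the sizes are kept small so it runs quickly, and n_neighbors=5 is arbitrary. The first part times predict for each algorithm and two n_jobs settings. The second part illustrates metric="precomputed" using scikit-learn's own KNeighborsTransformer (available from v0.22) to build the sparse distance graph. That transformer is exact, so here it only stands in for an approximate library such as Annoy, which would need its own wrapper producing the same kind of sparse graph.

```python
import time

import numpy as np
from sklearn.neighbors import KNeighborsRegressor, KNeighborsTransformer
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Assumed sizes, far smaller than a real scan so the sketch runs quickly.
n_train, n_query = 20_000, 5_000


def make_points(n):
    """Random stand-in for lidar points: columns are x, y, z, object label."""
    xyz = rng.normal(size=(n, 3))
    label = rng.integers(0, 10, size=n)
    return np.column_stack([xyz, label])


X_train, X_query = make_points(n_train), make_points(n_query)
y_train = rng.random(n_train)  # fake intensities in [0, 1)

# Part 1: time prediction for each exact search algorithm and n_jobs setting.
predictions = {}
for algorithm in ("kd_tree", "ball_tree", "brute"):
    for n_jobs in (1, -1):
        model = KNeighborsRegressor(
            n_neighbors=5, algorithm=algorithm, n_jobs=n_jobs
        )
        model.fit(X_train, y_train)
        start = time.perf_counter()
        predictions[(algorithm, n_jobs)] = model.predict(X_query)
        elapsed = time.perf_counter() - start
        print(f"{algorithm:>9} n_jobs={n_jobs:>2}: predict took {elapsed:.3f} s")

# Part 2: feed a precomputed sparse distance graph to the regressor.
# The transformer computes the neighbour graph; the regressor only reads it.
precomputed_model = make_pipeline(
    KNeighborsTransformer(n_neighbors=5, mode="distance"),
    KNeighborsRegressor(n_neighbors=5, metric="precomputed"),
)
precomputed_model.fit(X_train, y_train)
precomputed_pred = precomputed_model.predict(X_query)
```

Timings depend heavily on the machine, the number of points and the dimensionality, so it is worth rerunning this on a sample of the real data before settling on a configuration.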

Tags: python, machine-learning, scikit-learn, knn

Ivan Novikov
