
What is the difference between fit() and fit_predict() in SpectralClustering

I am trying to understand and use spectral clustering from sklearn. Say we have an input matrix X and create a SpectralClustering object as follows:

from sklearn.cluster import SpectralClustering

clustering = SpectralClustering(n_clusters=2,
                                assign_labels="discretize",
                                random_state=0)

Then we call fit_predict on that object:

clusters = clustering.fit_predict(X)

What confuses me is when 'the affinity matrix for X using the selected affinity' is actually created. According to the documentation, the fit_predict() method 'Performs clustering on X and returns cluster labels.' It does not explicitly say that it also computes the affinity matrix for X before clustering, the way the description of fit() does.

I appreciate any help or tips.

Asked by ewalel on Oct 19 '25 at 13:10


1 Answer

As another answer already implies, fit_predict is just a convenience method that runs fit and then returns the cluster labels. According to the documentation, fit

Creates an affinity matrix for X using the selected affinity, then applies spectral clustering to this affinity matrix.

while fit_predict

Performs clustering on X and returns cluster labels.

Here, 'Performs clustering on X' should be read as shorthand for everything described under fit, i.e. 'Creates an affinity matrix [...]' and then clusters on it.
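To make the "convenience method" idea concrete, here is a minimal sketch of the pattern. ToyClusterer and its thresholding rule are hypothetical stand-ins, not scikit-learn code; the only point is that fit_predict delegates to fit and then returns labels_, which is how I understand scikit-learn's shared clustering base class to behave.

```python
class ToyClusterer:
    """Hypothetical estimator, used only to illustrate the pattern."""

    def fit(self, X):
        # Stand-in for the real work (affinity matrix + clustering):
        # label each value by whether it exceeds the mean.
        mean = sum(X) / len(X)
        self.labels_ = [int(x > mean) for x in X]
        return self

    def fit_predict(self, X):
        # Convenience wrapper: run the full fit, then hand back labels_.
        self.fit(X)
        return self.labels_


labels = ToyClusterer().fit_predict([1, 2, 10, 11])
print(labels)  # [0, 0, 1, 1]
```

Under that pattern, nothing that fit does can be skipped by fit_predict, because fit_predict has no clustering logic of its own.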

It is not difficult to verify that calling fit_predict is equivalent to getting the labels_ attribute from the object after fit; using some dummy data, we have

from sklearn.cluster import SpectralClustering
import numpy as np

X = np.array([[1, 2], [1, 4], [10, 0],
               [10, 2], [10, 4], [1, 0]])

# 1st way - use fit and get the labels_
clustering = SpectralClustering(n_clusters=2,
     assign_labels="discretize",
     random_state=0)

clustering.fit(X)
clustering.labels_
# array([1, 1, 0, 0, 0, 1])

# 2nd way - using fit_predict
clustering2 = SpectralClustering(n_clusters=2,
     assign_labels="discretize",
     random_state=0)

clustering2.fit_predict(X)
# array([1, 1, 0, 0, 0, 1])

np.array_equal(clustering.labels_, clustering2.fit_predict(X))
# True
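
To check directly that fit_predict also builds the affinity matrix, one can inspect the fitted affinity_matrix_ attribute afterwards. A small check along those lines, assuming the default affinity="rbf" and the same dummy data as above (the variable names are my own):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

X = np.array([[1, 2], [1, 4], [10, 0],
              [10, 2], [10, 4], [1, 0]])

model = SpectralClustering(n_clusters=2,
                           assign_labels="discretize",
                           random_state=0)

# Only fit_predict is called; fit is never called explicitly.
labels = model.fit_predict(X)

# The affinity matrix exists anyway: one row and one column per sample.
affinity = model.affinity_matrix_
print(affinity.shape)  # (6, 6)

# The returned labels are the same ones stored on the fitted object.
print(np.array_equal(labels, model.labels_))  # True
```

If the affinity matrix were not computed inside fit_predict, the affinity_matrix_ attribute would not be available at this point.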

Answered by desertnaut on Oct 21 '25 at 01:10


