I'm trying to get the variances from the eigenvectors.
What is the difference between explained_variance_ratio_ and explained_variance_ in PCA?
The fraction of the total variance explained by each component (often quoted as a percentage) is:
explained_variance_ratio_
The variance explained by each component, i.e. the corresponding eigenvalue of the covariance matrix, is:
explained_variance_
Formula:
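explained_variance_ratio_ = explained_variance_ / np.sum(explained_variance_)
(This holds when all components are kept. In general the denominator is the total variance of X, so with fewer components the ratios sum to less than 1.)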
Example:
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = PCA(n_components=2)
>>> pca.fit(X)
>>> pca.explained_variance_  # the actual eigenvalues (variances)
array([7.93954312, 0.06045688])
>>> pca.explained_variance_ratio_  # the fraction of the total variance
array([0.99244289, 0.00755711])
Also based on the above formula:
7.93954312 / (7.93954312 + 0.06045688) = 0.99244289
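To see where these numbers come from, here is a minimal NumPy-only sketch that rebuilds both quantities from the covariance matrix of the same X. It assumes the sample covariance with divisor n - 1, which is what np.cov uses by default. The variable names (cov, eigenvalues, ratio) are just illustrative.

```python
import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)

# Sample covariance matrix of the features (rows are observations, divisor n - 1).
cov = np.cov(X, rowvar=False)

# eigh returns the eigenvalues of a symmetric matrix in ascending order,
# so reverse them to get the largest first.
eigenvalues = np.linalg.eigh(cov)[0][::-1]

# Divide by the total variance, i.e. the sum of all eigenvalues.
ratio = eigenvalues / eigenvalues.sum()

print(eigenvalues)  # compare with pca.explained_variance_
print(ratio)        # compare with pca.explained_variance_ratio_
```

The eigenvalues should agree with explained_variance_ above, and dividing them by their sum should reproduce explained_variance_ratio_.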
From the documentation:
explained_variance_ : array, shape (n_components,) The amount of variance explained by each of the selected components.
Equal to n_components largest eigenvalues of the covariance matrix of X.
New in version 0.18.
explained_variance_ratio_ : array, shape (n_components,) Percentage of variance explained by each of the selected components.
If n_components is not set then all components are stored and the sum of the ratios is equal to 1.0.
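As a quick check of that last sentence, the sketch below fits the same X once with all components and once with n_components=1. The names pca_full, pca_one and total_variance are made up for illustration, and total_variance is computed by hand here as the sum of the per-feature sample variances.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])

pca_full = PCA().fit(X)               # n_components not set: all components kept
pca_one = PCA(n_components=1).fit(X)  # only the first component kept

# Total variance of X: sum of the per-feature sample variances (divisor n - 1).
total_variance = X.var(axis=0, ddof=1).sum()

print(pca_full.explained_variance_ratio_.sum())      # ratios over all components
print(pca_one.explained_variance_ratio_.sum())       # ratios over one component only
print(pca_one.explained_variance_ / total_variance)  # ratio rebuilt by hand
```

With all components the ratios should sum to 1.0. With one component the sum is smaller, because the ratio is taken against the total variance of X rather than against the variance of the kept components only.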