In machine learning, adding more features (dimensions) can decrease a model’s accuracy, because the model has to generalize over a much larger feature space from the same amount of data. This is known as the curse of dimensionality.
Dimensionality reduction is a way to reduce the complexity of a model and help avoid overfitting. Principal Component Analysis (PCA) compresses a dataset onto a lower-dimensional feature subspace, which reduces the complexity of the model.
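As a rough illustration of what "compressing onto a lower-dimensional subspace" means, here is a minimal sketch of PCA built on NumPy's SVD. The function name `pca_project` and the synthetic data are assumptions made for this example, not something from the answer above. In practice a library implementation such as scikit-learn's `PCA` would be the usual choice.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top n_components principal directions."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    X_centered = X - mean  # PCA assumes centered data
    # SVD of the centered data: rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured along each kept direction.
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance

# Synthetic data (an assumption for the demo): 200 samples with 5 features,
# where almost all of the variance comes from 2 hidden directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Z, components, explained_variance = pca_project(X, n_components=2)
print(Z.shape)  # (200, 2)

# Map the compressed data back to the original 5 features.
X_reconstructed = Z @ components + X.mean(axis=0)
```

Here the 5 original features are replaced by 2 derived ones, and because the data was built from 2 hidden directions, the reconstruction stays close to the original.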
When and how should I decide that my dataset has too many features and that I should consider PCA for dimensionality reduction?
Let me provide another view into this.
In general, you can use Principal Component Analysis for two main reasons:
For compression:
For visualization purposes, using 2 or 3 components.
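For the compression case, one common heuristic (an assumption here, not something stated in the answer) is to keep the smallest number of components whose cumulative explained variance reaches some threshold, such as 95%. The sketch below uses NumPy only; `explained_variance_ratio`, `components_for_variance`, and the synthetic data are hypothetical names and inputs chosen for illustration.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of the total variance captured by each principal component."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Only the singular values are needed; they come back in descending order.
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

def components_for_variance(X, threshold=0.95):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    ratios = explained_variance_ratio(X)
    cumulative = np.cumsum(ratios)
    # searchsorted gives the first index where cumulative >= threshold;
    # add 1 to turn the index into a count, and cap it at the number available.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratios))

# Synthetic data (an assumption for the demo): 10 features driven by 3 hidden factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 10))

ratios = explained_variance_ratio(X)
k = components_for_variance(X, threshold=0.95)
print(k, np.round(ratios, 3))
```

For visualization the choice is simpler: the number of components is fixed at 2 or 3 regardless of the threshold, and the same ratios show how much of the variance the plot actually represents.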