I am trying to create a correlation matrix of the variables in the IMDB movie prediction dataset from Kaggle. When I plot the correlation matrix, some of the cells show question marks.
All the variables are numeric. How do I interpret the question marks?
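library(corrplot)  # provides corrplot()
# Keep only the numeric columns of the data frame
numeric_col <- sapply(df, is.numeric)
movie_numeric <- df[, numeric_col]
# Compute and plot the correlation matrix
Correlation <- cor(movie_numeric)
corrplot(Correlation)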
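As @neilfws said in his comment, the question marks represent NA values in the correlation matrix. With its default settings, cor() returns NA for any pair of variables where at least one of the columns contains a missing value.
You can avoid the NA values by using only pairwise-complete observations when computing the correlation matrix: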
Correlation <- cor(movie_numeric, use="pairwise.complete.obs")
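To see the difference on a small scale, here is a minimal sketch using a made-up data frame. The name toy, its columns, and its values are hypothetical stand-ins for the numeric movie columns, not taken from the Kaggle data. It counts the missing values per column and then compares the default cor() result with the pairwise-complete one.

```r
# Toy data standing in for the numeric movie columns (hypothetical values).
toy <- data.frame(
  budget = c(10, 20, NA, 40, 50),
  gross  = c(12, 25, 31, NA, 58),
  score  = c(6.1, 7.0, 6.5, 7.8, 8.2)
)

# Count the missing values in each column.
na_counts <- colSums(is.na(toy))

# Default (use = "everything"): off-diagonal entries that involve
# a column with missing values come out as NA.
cor_default <- cor(toy)

# Pairwise-complete: each coefficient is computed from the rows
# where both variables of that pair are present.
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

print(na_counts)
print(cor_default)
print(cor_pairwise)
```

Keep in mind that with pairwise deletion each coefficient may be based on a different subset of rows, so it is worth checking colSums(is.na(movie_numeric)) to see how much data is actually missing.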