I'm creating a model with several thousand variables, all of which have a majority of values equal to NA. I am able to successfully run logistic regression on some variables but not others.
Here's my code to build the formula from the large number of variables:
model_vars <- names(dataset[100:4000])
vars<- paste("DP ~ ", paste(model_vars, collapse= " + "))
This builds a formula string with the dependent variable on the left and the independent variables joined by "+" on the right. I then pass it to the glm function:
glm(vars, data = training, family = binomial)
Here is the error I get when certain variables are included:
Error in family$linkfun(mustart) :
Argument mu must be a nonempty numeric vector
I cannot figure out why this is occurring or why the regression works for certain variables and not others. I can't see any pattern in the variables that cause the error. Could someone clarify why this error shows up?
Data must be numeric (no NA, Inf, NaN, True, False, etc.)!
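As a rough way to find which columns contain such values, one could test every column with is.finite(), which returns FALSE for NA, NaN, Inf and -Inf. This is only a sketch on a made-up data frame; the column names are hypothetical:

```r
# Toy data: one clean column and three with problematic values
df <- data.frame(
  x_ok  = c(1, 2, 3),
  x_na  = c(1, NA, 3),
  x_inf = c(1, Inf, 3),
  x_nan = c(0, NaN, 1)
)

# TRUE for a column only if every value in it is finite
all_finite <- vapply(df, function(col) all(is.finite(col)), logical(1))

# Names of the columns that contain NA, NaN, Inf or -Inf
problem_columns <- names(all_finite)[!all_finite]
print(problem_columns)
```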
I had the error:
Error in family$linkfun(mustart) :
Argument mu must be a nonempty numeric vector
when using logistic regression with glm(), like:
glm(y~x,data=df, family='binomial')
after subsetting and standardizing data frames in a loop.
It turned out that (some of) the subsetted and standardized data frames contained NA, which caused the error.
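One way to check for this before fitting is to count how many rows survive once rows with missing values are dropped. With R's factory-default na.action option (na.omit), glm() discards every row that has an NA in any model variable, so a model with many mostly-NA predictors can end up with no rows at all. A minimal sketch on a made-up data frame (the names are hypothetical):

```r
# Toy data: every row has an NA in at least one predictor
df <- data.frame(
  y  = c(0, 1, 0, 1),
  x1 = c(1.2, NA, 0.7, NA),
  x2 = c(NA, 3.1, NA, 2.4)
)

# Number of NA values in each column
na_per_column <- colSums(is.na(df))

# Number of rows with no NA in any column
n_complete <- sum(complete.cases(df))

print(na_per_column)
print(n_complete)
```

If `n_complete` is 0, glm() has no rows left to fit, even though each column on its own still holds some data.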
For others who run into that cryptic error message: perhaps the data frame is empty?
This reproduces the message:
d=data.frame(x=c(NA),y=c(NA))
d=d[complete.cases(d),]
m=glm(y~.,d,family = 'binomial')
Error in family$linkfun(mustart) : Argument mu must be a nonempty numeric vector
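A guard along these lines can replace the cryptic message with a clearer one. This is only a sketch: `safe_glm` is a hypothetical helper name, not part of any package, and it assumes NA rows are meant to be dropped:

```r
safe_glm <- function(formula, data, ...) {
  # Accept either a formula object or a formula given as a string
  formula <- as.formula(formula)

  # Build the model frame the same way glm() would, dropping rows with NA
  mf <- model.frame(formula, data = data, na.action = na.omit)

  # Fail early with a readable message if nothing is left to fit
  if (nrow(mf) == 0) {
    stop("No complete rows left after removing NA values; cannot fit the model.")
  }
  glm(formula, data = data, ...)
}

# Usage: an all-NA data frame now produces the readable message
d <- data.frame(x = c(NA, NA), y = c(NA, NA))
result <- tryCatch(
  safe_glm(y ~ x, data = d, family = binomial),
  error = function(e) conditionMessage(e)
)
print(result)
```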