
How to run an lm regression for every column in R

I have a data frame like this:

df=data.frame(x=rnorm(100),y1=rnorm(100),y2=rnorm(100),y3=...)

I want to run a loop which regresses each column starting from the second column on the first column:

for(i in names(df[,-1])){
    model = lm(i~x, data=df)
}

But this fails. I want to loop over the columns and run one regression per column, and some column names are just numbers (e.g. 404.1). I cannot find a way to loop over each column using the code above.

Sheldon asked Oct 17 '25 14:10


2 Answers

Your code looks fine, except that inside lm the loop variable i is a character string (for example "y1"), not the column itself, so the formula i ~ x cannot be fitted. Wrapping it in get(i) looks up the column whose name is stored in i.

df=data.frame(x=rnorm(100),y1=rnorm(100),y2=rnorm(100),y3=rnorm(100))

# list to hold one fitted model per response column
storage <- list()
for(i in names(df)[-1]){
  # get(i) fetches the column named by i from df
  storage[[i]] <- lm(get(i) ~ x, df)
}

I create an empty list, storage, and fill it with one fitted model per iteration of the loop. It's a personal preference, but I'd also advise against the way your current loop is written:

for(i in names(df[,-1])){
    model = lm(i~x, data=df)
}

You will overwrite model on every iteration, so only the last fit survives. I suggest storing the results in a list, or in a matrix that you fill in as the loop runs.
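
As a rough sketch of the matrix option (not part of the original answer), the loop below stores the intercept and slope of each fit as one row of a matrix. It builds each formula from the column name wrapped in backticks, which is one way to handle a name that is just a number. The column `404.1` is made up for illustration, `check.names = FALSE` is assumed so that data.frame does not rename it, and `responses` and `coefs` are placeholder names.

```r
set.seed(1)  # only so the illustration is reproducible

# Hypothetical data: check.names = FALSE keeps the non-syntactic name "404.1"
df <- data.frame(x = rnorm(100), y1 = rnorm(100), `404.1` = rnorm(100),
                 check.names = FALSE)

responses <- setdiff(names(df), "x")

# One row per response column, one column per coefficient
coefs <- matrix(NA_real_, nrow = length(responses), ncol = 2,
                dimnames = list(responses, c("intercept", "slope")))

for (nm in responses) {
  # Backticks let the formula refer to names such as 404.1
  f <- as.formula(paste0("`", nm, "` ~ x"))
  coefs[nm, ] <- coef(lm(f, data = df))
}

coefs
```

Each row of coefs can then be read by column name, for example coefs["404.1", "slope"].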

Hope that helps

Gin_Salmon answered Oct 20 '25 02:10


Another solution with broom and tidyverse:

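library(tidyverse)
library(broom)
df <- data.frame(x=rnorm(100),y1=rnorm(100),y2=rnorm(100))

result <- df %>%
  # reshape to long form: one row per (x, measure, value)
  pivot_longer(-x, names_to = "measure", values_to = "value") %>%
  # one nested data frame per response column
  nest(data = -measure) %>%
  mutate(fit = map(data, ~ lm(value ~ x, data = .x)),
         tidied = map(fit, tidy)) %>%
  unnest(tidied)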
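
For comparison, here is a rough base-R sketch that builds a similar long table of coefficients without extra packages. It is not the answer's method: `rows` and `result_base` are illustrative names, and the column labels come from summary() of an lm fit rather than from broom, so they differ from the tidy output.

```r
set.seed(1)  # only so the illustration is reproducible
df <- data.frame(x = rnorm(100), y1 = rnorm(100), y2 = rnorm(100))

# Fit one model per response column and keep its coefficient table
rows <- lapply(setdiff(names(df), "x"), function(nm) {
  fit <- lm(df[[nm]] ~ x, data = df)
  tab <- coef(summary(fit))  # matrix: estimate, std. error, t value, p value
  data.frame(measure = nm, term = rownames(tab), tab,
             row.names = NULL, check.names = FALSE)
})

# Stack the per-column tables into one long data frame
result_base <- do.call(rbind, rows)
result_base
```

The slope rows alone can be pulled out with result_base[result_base$term == "x", ].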

Anastasia Vishnyakova answered Oct 20 '25 02:10


