I'd like to do a polynomial feature expansion for a data frame. For example, a quadratic expansion of a df with (x1, x2, x3) should give a df with (x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3). I'm currently using poly(df$x1, df$x2, df$x3, degree=2, raw=T), but this requires an unnecessary amount of typing if I have a large number of columns. (And poly(df[,1:20], degree=2, raw=T) doesn't work.) What's the best way to do this?
Edit: I have too many columns for poly (it fails with a "vector is too large" error). I got it to work with a simple for loop:
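polyexp = function(df) {
    # Start from the original columns and append every pairwise product.
    df.polyexp = df
    # Named new.names so the colnames() function is not shadowed.
    new.names = colnames(df)
    for (i in seq_len(ncol(df))) {
        # j starts at i, so squares are included and each pair appears once.
        for (j in i:ncol(df)) {
            new.names = c(new.names, paste0(names(df)[i], '.', names(df)[j]))
            df.polyexp = cbind(df.polyexp, df[, i] * df[, j])
        }
    }
    names(df.polyexp) = new.names
    return(df.polyexp)
}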
Just add additional loops to compute higher-order terms.
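Instead of writing one more nested loop per degree, the same idea can be sketched for an arbitrary degree. This is only a rough sketch, not a tuned implementation: poly_expand is a hypothetical helper name, it assumes all columns are numeric, and it enumerates every index tuple before filtering, so it gets slow when both the column count and the degree are large.

```r
# Hypothetical helper: raw polynomial expansion up to `degree`, built from
# column-index tuples with repetition instead of hand-written nested loops.
poly_expand <- function(df, degree = 2) {
  p <- ncol(df)
  out <- df
  for (d in seq_len(degree)[-1]) {            # d = 2, ..., degree
    # Every length-d tuple of column indices ...
    idx <- unname(as.matrix(expand.grid(rep(list(seq_len(p)), d))))
    # ... keeping only non-decreasing ones, so each monomial appears once.
    keep <- apply(idx, 1, function(r) !is.unsorted(r))
    idx <- idx[keep, , drop = FALSE]
    for (k in seq_len(nrow(idx))) {
      cols <- idx[k, ]
      term <- Reduce(`*`, df[cols])           # elementwise product of the chosen columns
      out[[paste(names(df)[cols], collapse = ".")]] <- term
    }
  }
  out
}

# Small usage example with made-up data.
d <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)
names(poly_expand(d, degree = 2))
```

The new columns follow the same naming scheme as the loop version (for example x1.x2), although their order may differ.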
You could do this with do.call:
do.call(poly, c(lapply(1:20, function(x) dat[,x]), degree=2, raw=TRUE))
do.call takes the function to be called as its first argument (poly in your case) and a list as its second argument. Each element of this list is then passed as an argument to your function, with named elements passed as named arguments. Here we build a list containing all of the columns you want to process (I've used lapply to get that list without too much typing), followed by the two additional arguments you want to pass.
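To make the mechanics concrete, here is a tiny illustration that uses paste in place of poly. The vectors and the separator are made up for the example; the point is only that unnamed list elements become positional arguments and named ones become named arguments.

```r
# c() on a list appends further elements; `sep` keeps its name.
args <- c(list(1:3, 4:6), sep = "-")

# This builds the same call as paste(1:3, 4:6, sep = "-").
via_do_call <- do.call(paste, args)
direct      <- paste(1:3, 4:6, sep = "-")

identical(via_do_call, direct)
```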
To see it working on a simple example:
dat <- data.frame(x=1:5, y=1:5, z=2:6)
do.call(poly, c(lapply(1:3, function(x) dat[,x]), degree=2, raw=TRUE))
#      1.0.0 2.0.0 0.1.0 1.1.0 0.2.0 0.0.1 1.0.1 0.1.1 0.0.2
# [1,]     1     1     1     1     1     2     2     2     4
# [2,]     2     4     2     4     4     3     6     6     9
# [3,]     3     9     3     9     9     4    12    12    16
# [4,]     4    16     4    16    16     5    20    20    25
# [5,]     5    25     5    25    25     6    30    30    36
# attr(,"degree")
# [1] 1 2 1 2 2 1 2 2 2
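As a possible shortcut, the documentation for poly says that its first argument can also be a matrix, so converting the data frame with as.matrix may avoid the do.call step entirely. The sketch below assumes that behaviour in your R version and reuses the same toy data. It will not help with the "vector is too large" error from the question's edit, since the same terms are computed either way.

```r
dat <- data.frame(x = 1:5, y = 1:5, z = 2:6)

# The do.call form from above.
via_do_call <- do.call(poly, c(lapply(1:3, function(x) dat[, x]),
                               degree = 2, raw = TRUE))

# Assumed equivalent: pass the columns as a single numeric matrix.
via_matrix <- poly(as.matrix(dat), degree = 2, raw = TRUE)

# Compare the numeric values only, ignoring attributes.
all.equal(unclass(via_do_call), unclass(via_matrix), check.attributes = FALSE)
```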