
How to split a "formula" in R

Tags: r, r-package

I'm trying to make a small R package with my limited knowledge in R programming. I am trying to use the following argument:

formula=~a+b*X

where X is a vector and 'a' and 'b' are constants in a function call.

Once the formula has been supplied, I want to extract (a, b) and X separately and use them for further data manipulation inside the function. Is there a way to do this in R?

I would really appreciate any guidance.

Note: Edited my question for clarity

I'm looking for something similar to the output of model.matrix(). The formula above can be generalized to accommodate n variables, say,

~2+3*X +4*Y+...+2*Z

In the output, I need the coefficients (2 3 4 ...2) as a vector and [1 X Y ... Z] as a covariate matrix.

Vineetha asked Dec 03 '25 13:12


2 Answers

The question is not completely clear, so we will assume it is this: given a formula in standard formula syntax, how do we parse out the variable names (or, in the second solution, the variable names and constants) as a character vector?

1) all.vars Try this:

fo <- ~ a + b * X  # input: note the leading ~, which makes this a formula
all.vars(fo)

giving:

[1] "a" "b" "X"

2) strapplyc Also we could do it with string manipulation. In this case it also parses out the constants.

library(gsubfn)
fo <- ~ 25 + 35 * X  # input
strapplyc(gsub(" ", "", format(fo)), "-?[0-9.]+|[a-zA-Z0-9._]+", simplify = unlist)

giving:

[1] "25" "35" "X" 
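If installing gsubfn is not an option, the same tokenizing idea can be sketched in base R with gregexpr() and regmatches(). This is only an illustration reusing the regular expression above; it splits the formula purely as text, so it has the same limitations (for example, it knows nothing about operator precedence or parentheses).

```r
fo <- ~ 25 + 35 * X  # input

# Turn the formula into a single string and drop the spaces
s <- gsub(" ", "", paste(deparse(fo), collapse = ""))

# Pull out every numeric constant or name that matches the pattern
tokens <- regmatches(s, gregexpr("-?[0-9.]+|[a-zA-Z0-9._]+", s))[[1]]
tokens
# [1] "25" "35" "X"
```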

Note: If all you are trying to do is to evaluate the RHS of the formula as an R expression then it is just:

X <- 1:3
fo <- ~ 1 + 2 * X
eval(fo[[2]])

giving:

[1] 3 5 7
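Inside a package function the variables usually arrive in a list or data frame rather than sitting in the global environment. A minimal sketch of that case, assuming a hypothetical container called dat: eval() accepts a list as its envir argument and falls back to the formula's own environment for anything not found there.

```r
fo <- ~ 1 + 2 * X
dat <- list(X = 1:3)  # hypothetical data supplied by the caller

# Look up X in dat first, then in the environment where fo was created
res <- eval(fo[[2]], envir = dat, enclos = environment(fo))
res
# [1] 3 5 7
```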

Update: Fixed and added second solution and Note.

G. Grothendieck answered Dec 05 '25 03:12


A formula is a call, and a call behaves like a list of symbols and/or other calls, so its elements can be accessed through normal indexing operations, e.g.

f <- ~a+bX
f[[1]]
#`~`
f[[2]]
#a + bX
f[[2]][[1]]
#`+`
f[[2]][[2]]
#a

However, notice that in your formula bX is a single symbol; you probably meant b * X instead.

f <- ~a + b * X

Then a and b would typically be stored as an unevaluated call to list.

vars <- call('list', f[[2]][[2]], f[[2]][[3]][[2]])
vars
#list(a, b)

vars would then be passed to eval at some point, once a and b have values.
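The same indexing idea can be extended to the generalized formula in the question. The following is a rough sketch, not a robust parser: split_linear is a hypothetical helper name, and it assumes every non-intercept term is written as coefficient * covariate, with any bare term treated as the intercept coefficient.

```r
# Hypothetical helper: split ~ c0 + c1 * X + c2 * Y + ... into a
# coefficient vector and a covariate matrix.
split_linear <- function(fo, data = NULL) {
  env <- environment(fo)

  # Flatten nested binary `+` calls into a list of terms, left to right
  flatten <- function(e) {
    if (is.call(e) && identical(e[[1]], as.name("+")) && length(e) == 3) {
      c(flatten(e[[2]]), flatten(e[[3]]))
    } else {
      list(e)
    }
  }
  terms <- flatten(fo[[2]])

  # A term that is a call to `*` is read as coefficient * covariate;
  # anything else is assumed to be the intercept coefficient
  is_prod <- vapply(terms, function(e) {
    is.call(e) && identical(e[[1]], as.name("*"))
  }, logical(1))
  coef_expr <- lapply(seq_along(terms), function(i) {
    if (is_prod[i]) terms[[i]][[2]] else terms[[i]]
  })
  cov_expr <- lapply(seq_along(terms), function(i) {
    if (is_prod[i]) terms[[i]][[3]] else 1
  })

  # Evaluate both sides, looking in `data` first and then in the
  # environment where the formula was created
  coefs <- vapply(coef_expr, function(e) as.numeric(eval(e, data, env)),
                  numeric(1))
  cols <- lapply(cov_expr, function(e) eval(e, data, env))

  # Recycle the intercept column to the common length and bind columns
  n <- max(lengths(cols))
  covariates <- do.call(cbind, lapply(cols, rep_len, length.out = n))
  colnames(covariates) <- vapply(cov_expr, function(e) {
    paste(deparse(e), collapse = "")
  }, character(1))

  list(coefficients = coefs, covariates = covariates)
}

X <- 1:3
Y <- c(10, 20, 30)
res <- split_linear(~ 2 + 3 * X + 4 * Y)
res$coefficients                           # 2 3 4
res$covariates                             # columns named "1", "X", "Y"
drop(res$covariates %*% res$coefficients)  # 45 88 131
```

Symbolic constants such as a and b also work as long as they have values in data or in the formula's environment when the helper is called. Terms using other operators (subtraction, parentheses, a covariate with no coefficient) are outside what this sketch handles.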

Ernest A answered Dec 05 '25 03:12


