
Temporarily store variable in series of pipes dplyr

Tags:

r

dplyr

Is there a way to pause a series of pipes to store a temporary variable that can be used later in the pipe sequence?

I found this question but I'm not sure that it was doing the same thing I am looking for.

Here's a sample dataframe:

library(dplyr)
library(tidyr)  # provides gather(), used below
set.seed(123)
df <- tibble(Grp = c("Apple","Boy","Cat","Dog","Edgar","Apple","Boy","Cat","Dog","Edgar"),
             a = sample(0:9, 10, replace = TRUE),
             b = sample(0:9, 10, replace = TRUE),
             c = sample(0:9, 10, replace = TRUE),
             d = sample(0:9, 10, replace = TRUE),
             e = sample(0:9, 10, replace = TRUE),
             f = sample(0:9, 10, replace = TRUE),
             g = sample(0:9, 10, replace = TRUE))

I am going to convert df to long format but, after having done so, I will need to use the number of rows that df had before the gather.

This is what my desired output looks like. In this case, storing the number of rows before the pipe begins would look like:

n <- nrow(df)

df %>% 
  gather(var, value, -Grp) %>% 
  mutate(newval = value * n)
# A tibble: 70 x 4
   Grp   var   value newval
   <chr> <chr> <int>  <int>
 1 Apple a         2     20
 2 Boy   a         7     70
 3 Cat   a         4     40
 4 Dog   a         8     80
 5 Edgar a         9     90
 6 Apple a         0      0
 7 Boy   a         5     50
 8 Cat   a         8     80
 9 Dog   a         5     50
10 Edgar a         4     40
# ... with 60 more rows

In my real-world problem, I have a long chain of pipes, and it would be a lot easier if I could perform this action within the pipe structure. I would like to do something that looks like this:

df %>% 
  { "n = nrow(.)" } %>% # temporary variable is created here but df is passed on
  gather(var, value, -Grp) %>% 
  mutate(newval = value * n)

I could do something like the following, but it seems really sloppy.

df %>% 
  mutate(n = nrow(.)) %>% 
  gather(var, value, -Grp, -n) %>% 
  mutate(newval = value * mean(n))

Is there a way to do this or perhaps a good workaround?

asked Sep 19 '25 06:09 by hmhensen


1 Answer

You could wrap the steps in a braced code block and create a local variable inside it. This would look like:

df %>% 
  {
    n <- nrow(.)
    gather(., var, value, -Grp) %>% 
      mutate(newval = value * n)
  }

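Notice that inside the braces the piped data frame is no longer inserted as the first argument automatically, so we have to pass the . to gather explicitly; the pipe then continues inside the block. You can still put other steps after the closing brace: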

df %>% 
  {
    n <- nrow(.)
    gather(., var, value, -Grp) %>% 
      mutate(newval = value * n)
  } %>% 
  select(newval)
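
As a quick check of the approach above, here is a minimal sketch that compares the braced-block version against the version that stores the row count outside the pipe. The three-row data frame and the names `n_outside`, `expected`, and `result` are made up for illustration and are not part of the question.

```r
library(dplyr)
library(tidyr)

# Small illustrative data frame (hypothetical values, not the question's df)
df <- tibble(Grp = c("Apple", "Boy", "Cat"),
             a = c(1L, 2L, 3L),
             b = c(4L, 5L, 6L))

# Reference: row count stored in a variable before the pipe starts
n_outside <- nrow(df)
expected <- df %>%
  gather(var, value, -Grp) %>%
  mutate(newval = value * n_outside)

# Same computation, with the row count captured inside a braced block
result <- df %>%
  {
    n <- nrow(.)                     # row count before reshaping
    gather(., var, value, -Grp) %>%  # `.` must be passed explicitly here
      mutate(newval = value * n)
  } %>%
  select(Grp, var, newval)           # the pipe continues after the block

# Both approaches should give the same newval column
all.equal(result$newval, expected$newval)
```

If the same pattern is needed in several places, the body of the block could also be moved into a small named function that takes the data frame as its argument, which keeps the temporary variable out of the pipe entirely.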

answered Sep 20 '25 22:09 by MrFlick