
R: define distinct pattern from values of multiple variables [duplicate]

Tags: r, dplyr

Here's what I have:

df <- data.frame(x=c(0,0,0,1,1,1), y=c(0,0,1,0,1,1))

  x y
1 0 0
2 0 0
3 0 1
4 1 0
5 1 1
6 1 1

Here's what I want:

data.frame(x=c(0,0,0,1,1,1), y=c(0,0,1,0,1,1), pattern=c(1,1,2,3,4,4))

  x y pattern
1 0 0       1
2 0 0       1
3 0 1       2
4 1 0       3
5 1 1       4
6 1 1       4

That is, I have a bunch of columns (not just two), and thousands of rows. I want to go through each row, figure out what the distinct combinations of x, y, z, etc. are, call each one a distinct pattern, and return that pattern for each row.

(Context: I have gene expression data for several genes over many time points. I want to see which genes oscillate similarly over time by defining patterns based on whether each gene is up- or down-regulated at a particular time point.)

Thanks.

Stephen Turner asked Feb 03 '26 07:02


1 Answer

You can use dplyr::group_indices():

NSE (non-standard evaluation) version

group_indices(df, x, y)
# [1] 1 1 2 3 4 4

SE (standard evaluation) version

group_indices_(df, .dots = names(df))
# [1] 1 1 2 3 4 4

The drawback of this function is that it doesn't work inside mutate() (yet), so you have to assign the result directly:

df$pattern <- group_indices(df, x, y)
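
For readers on a newer dplyr, here is a minimal sketch of the same idea that does run inside mutate(). It assumes dplyr 1.0.0 or later, where cur_group_id() and across() are available; the name `out` is only illustrative. Grouping on across(everything()) is one way to cover every column without listing them by name, which matters when there are more than two.

```r
library(dplyr)

df <- data.frame(x = c(0, 0, 0, 1, 1, 1), y = c(0, 0, 1, 0, 1, 1))

# Assumption: dplyr >= 1.0.0.
# Group on every column, then label each row with its group's integer id.
out <- df %>%
  group_by(across(everything())) %>%
  mutate(pattern = cur_group_id()) %>%
  ungroup()

out$pattern
# [1] 1 1 2 3 4 4
```

The ids follow the sorted order of the grouping keys, so (0, 0) is pattern 1 and (1, 1) is pattern 4 regardless of row order.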

According to the linked answer, although the non-standard evaluation version doesn't work inside mutate(), the standard evaluation version does:

df %>% mutate(pattern = group_indices_(df, .dots = c('x', 'y')))
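
If depending on dplyr is not desirable, the same labelling can be sketched in base R. This is not part of the approach above, just an alternative under stated assumptions: each row is collapsed into a single string key (the "\r" separator is an arbitrary choice, assumed not to occur in the data), and match() against the unique keys yields the pattern id. The variable name `key` is illustrative.

```r
df <- data.frame(x = c(0, 0, 0, 1, 1, 1), y = c(0, 0, 1, 0, 1, 1))

# Build one string key per row from all columns.
# unname() keeps column names from being read as arguments of paste().
key <- do.call(paste, c(unname(as.list(df)), sep = "\r"))

# Number each distinct key by the order in which it first appears.
df$pattern <- match(key, unique(key))

df$pattern
# [1] 1 1 2 3 4 4
```

Here the ids are assigned in order of first appearance rather than in sorted key order, so on unsorted data the numbering can differ from what dplyr returns even though the grouping is the same.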

Psidom answered Feb 05 '26 23:02