
Add an ID based on groups of zeroes

Tags: r

My dataset is large, containing many observations (Dependent variable = DV) on individuals (Name) across set periods (Period) of a testing session. A small example of my dataset is as follows:

ExampleData <- data.frame(Name = c("Tom","Tom","Tom","Tom","Tom","Tom","Tom","Tom", "Tom", "Tom", 
                                   "Ben","Ben","Ben","Ben","Ben","Ben","Ben","Ben", "Ben", "Ben"),
                          Period = c(0,0,1,1,1,0,0,0,1,1, 
                                      0,0,0,1,1,1,0,0,1,1),
                          DV = runif(20, 1.5, 2.8))

When ExampleData$Period == 1, an individual is undergoing an exercise test, which varies in length. Breaks between tests are represented by ExampleData$Period == 0. To avoid manually marking when a person is undergoing a test and numbering the periods by hand, I wish to add a column that labels each group of 1's, separated by a group of 0's, as a new period, with the count restarting for each person. How do I go about doing this?

My anticipated output would be:

ExampleData$Descriptor <- c(NA,NA,"Period One", "Period One","Period One",NA,NA,NA,"Period Two","Period Two",
                        NA,NA,NA,"Period One","Period One","Period One",NA,NA,"Period Two","Period Two")

My question is similar to another of mine, located here, although I now have multiple entries for each individual. I have tried the dplyr syntax of:

library(dplyr)

Test_df <- ExampleData %>%
  mutate(
    Descriptor = case_when(
      Period > 0 ~ "Period",
      Period == 0 ~ "Rest"),
    rleid = cumsum(Descriptor != lag(Descriptor, 1, default = "NA")), 
    Descriptor = case_when(
      Descriptor == "Period" ~ paste0(Descriptor, rleid %/% 2),
      TRUE ~ "Rest"),
    rleid = NULL
  )

However, this numbers the periods across the whole dataset. How do I restart the numbering for each individual (Name)?

Thank you.

user2716568 asked Dec 31 '25 15:12


2 Answers

Here's an alternative approach with dplyr

library(dplyr)

ExampleData %>% 
  group_by(Name) %>% 
  mutate(Descriptor = with(rle(Period == 1), 
             rep(replace(paste("Period", cumsum(values)), !values, NA), lengths)))

# # A tibble: 20 x 4
# # Groups:   Name [2]
# Name Period       DV Descriptor
# <fctr>  <dbl>    <dbl>      <chr>
#   1    Tom      0 2.641044       <NA>
#   2    Tom      0 2.692745       <NA>
#   3    Tom      1 1.515797   Period 1
#   4    Tom      1 2.601471   Period 1
#   5    Tom      1 1.669399   Period 1
#   6    Tom      0 2.700371       <NA>
#   7    Tom      0 1.993971       <NA>
#   8    Tom      0 2.203379       <NA>
#   9    Tom      1 2.488742   Period 2
#  10    Tom      1 1.596458   Period 2
#  11    Ben      0 2.578924       <NA>
#  12    Ben      0 1.916804       <NA>
#  13    Ben      0 2.676466       <NA>
#  14    Ben      1 2.508759   Period 1
#  15    Ben      1 2.447217   Period 1
#  16    Ben      1 2.728756   Period 1
#  17    Ben      0 2.326854       <NA>
#  18    Ben      0 1.748016       <NA>
#  19    Ben      1 1.703044   Period 2
#  20    Ben      1 1.783434   Period 2
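
To make the one-liner above easier to follow, here is a rough step-by-step sketch of what the rle() expression does for a single person. It uses Tom's Period values from the question; the intermediate names (period, runs, run_label, descriptor) are only illustrative and are not part of the original answer.

```r
# Tom's Period values, copied from the question
period <- c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1)

# Run-length encode the test/rest indicator:
# one entry per consecutive run of TRUE (test) or FALSE (rest)
runs <- rle(period == 1)

# cumsum() over the run values counts how many test runs have started so far,
# so every test run gets its own sequence number
run_label <- paste("Period", cumsum(runs$values))

# Rest runs should carry no label
run_label[!runs$values] <- NA

# Expand the per-run labels back to one label per row
descriptor <- rep(run_label, runs$lengths)
print(descriptor)
```

Inside group_by(Name), mutate() evaluates the same expression separately on each person's rows, which is why the numbering restarts at 1 for Ben.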
talat answered Jan 03 '26 04:01


Here is an option using data.table

library(data.table)
setDT(ExampleData)[ , grp := rleid(Period == 1), .(Name)][Period == 1, 
    Descriptor := paste("Period", match(grp, unique(grp))), Name][, grp := NULL][]
#     Name Period       DV Descriptor
# 1:  Tom      0 2.764916         NA
# 2:  Tom      0 1.537837         NA
# 3:  Tom      1 1.848110   Period 1
# 4:  Tom      1 2.621724   Period 1
# 5:  Tom      1 2.206875   Period 1
# 6:  Tom      0 1.715299         NA
# 7:  Tom      0 1.882378         NA
# 8:  Tom      0 2.244155         NA
# 9:  Tom      1 2.094944   Period 2
#10:  Tom      1 1.713493   Period 2
#11:  Ben      0 1.794261         NA
#12:  Ben      0 1.608199         NA
#13:  Ben      0 2.053490         NA
#14:  Ben      1 1.791563   Period 1
#15:  Ben      1 1.652090   Period 1
#16:  Ben      1 2.510483   Period 1
#17:  Ben      0 2.345984         NA
#18:  Ben      0 2.754110         NA
#19:  Ben      1 1.675527   Period 2
#20:  Ben      1 1.709622   Period 2
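
If neither package is available, the same per-person idea can probably be sketched in base R. This is a swapped-in technique, not what the answer above does: label_runs is a hypothetical helper introduced here for illustration, and the DV column is left out because it is random in the question.

```r
# Name and Period columns from the question (DV omitted: it is random)
ExampleData <- data.frame(
  Name   = rep(c("Tom", "Ben"), each = 10),
  Period = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1,
             0, 0, 0, 1, 1, 1, 0, 0, 1, 1)
)

# Hypothetical helper: label each run of 1s within one person's Period vector
label_runs <- function(p) {
  # TRUE on the first row of every run of 1s
  starts <- p == 1 & c(TRUE, head(p, -1) != 1)
  # Number the runs with cumsum(); rest rows get NA
  ifelse(p == 1, paste("Period", cumsum(starts)), NA_character_)
}

# Apply the helper per Name, then put the pieces back in the original row order
pieces <- lapply(split(ExampleData$Period, ExampleData$Name), label_runs)
ExampleData$Descriptor <- unsplit(pieces, ExampleData$Name)
print(ExampleData)
```

split() and unsplit() play the role that the by-Name grouping plays in the data.table call, so the period counter restarts for each individual.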
akrun answered Jan 03 '26 05:01


