
Complex conditional mutating: Create new variable with conditional mutating using only past observations for a given participant?

I have a dataframe (fbwb) with multiple assessments of bullying (1-6) using multiple measures (1-3) in a group of participants. The df looks like this:

fbwb <- read.table(text="id year bully1 bully2 bully3 cbully bully_ever 
100 1 NA 1 NA 1 1
100 2 1 1 NA 1 1
100 3 NA 0 NA 0 1
101 1 NA NA 1 1 1
102 1 NA 1 NA 1 1
102 2 NA NA NA NA 1
102 3 NA 1 1 1 1
102 4 0 0 0 0 1
103 1 NA 1 NA 1 1
103 2 NA 0 0 0 1", header=TRUE)

Where bully1, bully2, and bully3 are binary variables that each = 1 if bullying was reported on the respective measure. cbully is binary and = 1 if any of the 3 bullying variables = 1 for a given year. bully_ever is binary and = 1 if bullying was reported on any measure in any year for a given participant.

I want to create a new binary variable in my df called bully_past. bully_past represents the case when cbully = 1 in ANY PAST YEAR. This is subtly different from bully_ever. For example, if a participant has been assessed 4 times:

  • bully_past should use info from years 3, 2, and 1 AT YEAR 4.
  • bully_past should use info from years 2 and 1 AT YEAR 3.
  • bully_past should use info from year 1 AT YEAR 2.
  • bully_past should be NA at year 1.

I have tried quite a few things, but the most recent rendition is the following:

fbwb <- fbwb %>%
  dplyr::group_by(id) %>%
  dplyr::mutate(bully_past = case_when(cbully == 1 & year == (year - 1) |
                                         cbully == 1 & year == (year - 2) |
                                         cbully == 1 & year == (year - 3) |
                                         cbully == 1 & year == (year - 4) |
                                         cbully == 1 & year == (year - 5) ~ 1,
                                       (is.na(cbully) & year == (year - 1) &
                                         is.na(cbully) & year == (year - 2) &
                                         is.na(cbully) & year == (year - 3) &
                                         is.na(cbully) & year == (year - 4) &
                                         is.na(cbully) & year == (year - 5)) ~ NA_real_,
                                       TRUE ~ 0)) %>%
  dplyr::ungroup()

This does not work because the syntax for indicating which years to use is not correct, so it generates a column of NA values. I have made other attempts, but I have not managed to take into account observations from ALL PREVIOUS YEARS.

It can be done in Stata using this code:

gen bullyingever = bullying
sort iid time
replace bullyingever = 1 if bullying[_n - 1]==1 & iid[_n - 1]==iid
replace bullyingever = 1 if bullying[_n - 2]==1 & iid[_n - 2]==iid
replace bullyingever = 1 if bullying[_n - 3]==1 & iid[_n - 3]==iid
replace bullyingever = 1 if bullying[_n - 4]==1 & iid[_n - 4]==iid
replace bullyingever = 1 if bullying[_n - 5]==1 & iid[_n - 5]==iid

I appreciate any input on how to accomplish this in R, preferably using dplyr.

dam4l10 asked Dec 14 '25 10:12


2 Answers

Here we can write a helper function that looks at previous events by combining cumsum() (a running count of events, which lets you look into the past) with lag() (which shifts the result down one row, so only values strictly before the current one are considered). So we have

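# TRUE once any earlier row had x == 1; NA on the first row (no past yet)
had_previous_event <- function(x) {
  # dplyr::lag shifts values down by one; stats::lag does not do this for plain vectors
  dplyr::lag(cumsum(!is.na(x) & x == 1) > 0)
}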
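
To see why this works, it may help to trace the helper step by step on a single participant. The sketch below uses the cbully values of participant 102 from the question; the intermediate names reported, so_far and past are only for illustration and are not part of the answer's code.

```r
library(dplyr)

# cbully for participant 102 in the sample data, years 1 to 4
x <- c(1, NA, 1, 0)

# Step 1: flag rows where bullying was reported; NA counts as "not reported"
reported <- !is.na(x) & x == 1   # TRUE FALSE TRUE FALSE

# Step 2: running count of reports up to and including the current row
so_far <- cumsum(reported)       # 1 1 2 2

# Step 3: shift down one row so each year only sees strictly earlier years
past <- dplyr::lag(so_far > 0)   # NA TRUE TRUE TRUE
```

The first element is NA because year 1 has no earlier year to look at, which matches the behaviour requested in the question.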

You can then use that with your dplyr chain

library(dplyr)

fbwb %>%
  arrange(id, year) %>%   # rows must be in time order within each participant
  group_by(id) %>%
  mutate(bully_past = had_previous_event(cbully))

This returns TRUE/FALSE (and NA for each participant's first year, which has no past). If you want zero/one instead, you can change that to

  mutate(bully_past = as.numeric(had_previous_event(cbully)))
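
In the sample data every participant reports bullying in year 1, so bully_past never comes out as 0. As a rough check of the other cases, here is a minimal sketch on made-up data; the data frame toy and participants 200 and 201 are hypothetical and do not come from the question.

```r
library(dplyr)

had_previous_event <- function(x) {
  dplyr::lag(cumsum(!is.na(x) & x == 1) > 0)
}

# Hypothetical data: participant 200 first reports bullying in year 3,
# participant 201 never reports it
toy <- data.frame(
  id     = c(200, 200, 200, 200, 201, 201),
  year   = c(1, 2, 3, 4, 1, 2),
  cbully = c(0, NA, 1, 0, 0, 0)
)

res <- toy %>%
  arrange(id, year) %>%
  group_by(id) %>%
  mutate(bully_past = as.numeric(had_previous_event(cbully))) %>%
  ungroup()

res$bully_past
# NA 0 0 1 NA 0
```

Participant 200 only gets bully_past = 1 in year 4, the first year that has a report strictly before it, and the missing assessment in year 2 is treated as "no report".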

MrFlick answered Dec 16 '25 00:12


Another option is to use dplyr with ifelse, lag and cumsum. Note that this gives 0, rather than NA, for each participant's first year:

library(dplyr)

fbwb %>%
  arrange(id, year) %>%
  group_by(id) %>%
  # cbully of the immediately preceding year, with NA treated as 0
  mutate(bully_past_year = ifelse(is.na(lag(cbully)), 0L, lag(cbully))) %>%
  # 1 once any preceding year reported bullying
  mutate(bully_past = ifelse(cumsum(bully_past_year) > 0L, 1L, 0L)) %>%
  select(-bully_past_year) %>%
  as.data.frame()

  #     id year bully1 bully2 bully3 cbully bully_ever bully_past
  # 1  100    1     NA      1     NA      1          1          0
  # 2  100    2      1      1     NA      1          1          1
  # 3  100    3     NA      0     NA      0          1          1
  # 4  101    1     NA     NA      1      1          1          0
  # 5  102    1     NA      1     NA      1          1          0
  # 6  102    2     NA     NA     NA     NA          1          1
  # 7  102    3     NA      1      1      1          1          1
  # 8  102    4      0      0      0      0          1          1
  # 9  103    1     NA      1     NA      1          1          0
  # 10 103    2     NA      0      0      0          1          1  
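
The question asks for NA at year 1, while the output above has 0 there. One possible adjustment, offered only as a sketch, is to count reports strictly before the current row and blank out each participant's first row; the helper column reported and the result name out are illustrative assumptions, not part of the answer above.

```r
library(dplyr)

fbwb <- read.table(text = "id year bully1 bully2 bully3 cbully bully_ever
100 1 NA 1 NA 1 1
100 2 1 1 NA 1 1
100 3 NA 0 NA 0 1
101 1 NA NA 1 1 1
102 1 NA 1 NA 1 1
102 2 NA NA NA NA 1
102 3 NA 1 1 1 1
102 4 0 0 0 0 1
103 1 NA 1 NA 1 1
103 2 NA 0 0 0 1", header = TRUE)

out <- fbwb %>%
  arrange(id, year) %>%
  group_by(id) %>%
  mutate(
    # treat a missing assessment as "no report" when counting
    reported = replace(cbully, is.na(cbully), 0L),
    # running total minus the current row = reports in strictly earlier years
    bully_past = if_else(row_number() == 1L, NA_integer_,
                         as.integer(cumsum(reported) - reported > 0))
  ) %>%
  select(-reported) %>%
  ungroup()

out$bully_past
# NA 1 1 NA NA 1 1 1 NA 1
```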

MKR answered Dec 15 '25 22:12


