Finding Mean Values in a Column by Intervals

Question

I'm tracking violent events around a treatment-week to see if the treatment has an effect on the event count around it. The table is broken into country-week and records violent events for most countries between 1989-2019. It is similar to this, but with a total of approx. 120 countries and 70k rows:

    Country  Year  Week  Event_Count  Treatment
    -------  ----  ----  -----------  ---------
    Angola   1995  33    3            NA
    Angola   1995  34    2            NA
    Angola   1995  35    3            NA
    Angola   1995  36    0            NA
    Angola   1995  37    4            1
    Angola   1995  38    1            NA
    Angola   1995  39    0            NA
    Angola   1995  40    2            NA
    Angola   1995  41    3            NA

I'd like to find the average weekly event count broken into four-week intervals around the treatment week (in this case, week 37 denoted by the "1" in the Treatment column) up until twelve weeks in both directions. In other words, weeks -1 to -4 would be 2 events as this is the average weekly event count for weeks 33 thru 36. Week 0 (in this case, week 37) would be 4. Weeks 1 to 4 (38 thru 41) would be 1.5. And weeks -5 to -8 (weeks 29 to 32), 5 to 8 (42 to 45), -9 to -12 (25 to 28), and 9 to 12 (46 to 49) would be the average weekly values within their respective bins. I'm using a regression discontinuity design, so I'd like to do the same thing for countries that just missed receiving the treatment, denoted by "0" in the treatment column. All other values in the treatment column are "NA". I was thinking about coding like this:

    aggregate(df[row_interval, column_interval], list(df$Country), mean)

but since I'll be looking at several treatment weeks within the same country, and the treatment weeks change regularly (e.g., Angola, 1995 Week 37; Algeria, 1998 Week 12, etc.), this wouldn't work. Ideally, I'd like to place the output for treatments of both "0" and "1" in the same row as the treatment week. For example:

    Country Year Week Event_Count Treatment -12to-9 -8to-5 -4to-1 0 1to4 5to8 9to12
    ------- ---- ---- ----------- --------- ------- ------ ------ - ---- ---- -----
    Algeria 2002  14     4           0         3      0.5    1    4  2    5    2

I'm trying to put the outputs for similar weeks in the same column, regardless of receiving "0" or "1" for treatment. E.g., all values for weeks 1 to 4 after the treatment would fall under the "1to4" column.

Thank you in advance for your help! I'm fairly new to stack and r, so I apologize for any confusion.

Park · Accepted Answer

I write longer data that contains total 25 weeks.

Country  Year  Week Event_Count Treatment
   <chr>   <dbl> <dbl>       <dbl>     <dbl>
 1 Angola   1995    25           3        NA
 2 Angola   1995    26           2        NA
 3 Angola   1995    27           4        NA
 4 Angola   1995    28           1        NA
 5 Angola   1995    29           0        NA
 6 Angola   1995    30           4        NA
 7 Angola   1995    31           1        NA
 8 Angola   1995    32           0        NA
 9 Angola   1995    33           3        NA
10 Angola   1995    34           2        NA
# ... with 15 more rows

Same as your example, Week 37 has Treatment. Define function gola as,

gola <- function(df) {
  idx <- which(!is.na(df$Treatment))
  res <- c()
  for (i in idx){
    trt <- df$Event_Count
    x.3 <- mean(trt[(i-12):(i-9)])
    x.2 <- mean(trt[(i-8):(i-5)])
    x.1 <- mean(trt[(i-4):(i-1)])
    x <- trt[i]
    x1 <- mean(trt[(i+1):(i+4)])
    x2 <- mean(trt[(i+5):(i+8)])
    x3 <- mean(trt[(i+9):(i+12)])
    res <- rbind(res,c(x.3, x.2, x.1, x, x1, x2, x3))
    
  }
  colnames(res) <- c("_12to_9", "_8to_5", "_4to_1", "0", "1to4", "5to8", "9to12")
  res <- cbind(df[idx,], res)
  return(res) 
}

Then, gola(df) will return

  Country Year Week Event_Count Treatment _12to_9 _8to_5 _4to_1 0 1to4 5to8 9to12
1  Angola 1995   37           4         1     2.5   1.25      2 4  1.5    2  1.75

Finding Mean Values in a Column by Intervals

Tags:

r

mean

Dylan Forrester

1 Answers

Park

Recent Activity

Donate For Us

Finding Mean Values in a Column by Intervals

Tags:

r

mean

Dylan Forrester

1 Answers

Park

Related questions

Recent Activity

Donate For Us