I have a data frame and I would like to keep the rows that match some condition, along with the N rows that follow each match. For example, consider a data frame which contains an hour and a minutes column (representing a timestamp per row). Let's say I would like the first two records starting at the 0th and 6th hours. Is it possible to do this in a nice way?
library(dplyr)

set.seed(3)
df <- 
    data.frame(hour = 0:11, minutes = runif(12, 0, 59), count = rpois(12, 3)) %>%
    arrange(hour, minutes)
which produces
> df
   hour   minutes count
1     0  9.914450     3
2     1 47.643468     3
3     2 22.711599     5
4     3 19.336325     5
5     4 35.523940     1
6     5 35.659249     4
7     6  7.353373     5
8     7 17.381455     2
9     8 34.078985     2
10    9 37.227777     0
11   10 30.208938     1
12   11 29.796411     1
The normal filter returns two rows:
> df %>%
+     filter(hour%%6 == 0)
  hour  minutes count
1    0 9.914450     3
2    6 7.353373     5
However, the answer should be:
  hour   minutes count
1    0  9.914450     3
2    1 47.643468     3
3    6  7.353373     5
4    7 17.381455     2
In this case it is possible to use modulo arithmetic on the column used for filtering, but in the general case this may not be possible.
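For the general case, one sketch (my own addition, not from the original post) is to work with row indices instead: find the positions where an arbitrary condition holds, expand each to itself plus the next N positions, and subset. The helper name keep_with_following and its n_after argument are hypothetical.

# Hypothetical helper: keep each row where cond is TRUE plus the
# next n_after rows, clipping indices at the end of the data frame.
keep_with_following <- function(data, cond, n_after = 1) {
    hits <- which(cond)
    idx  <- unique(unlist(lapply(hits, function(i) i + 0:n_after)))
    data[sort(idx[idx <= nrow(data)]), ]
}

# Reproduces the desired output above:
keep_with_following(df, df$hour %% 6 == 0, n_after = 1)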
The original example is provided below, where I wanted the first two records in each even hour. In this case, Akrun's answer is good and exploits the group structure in the data. E.g.
library(dplyr)
set.seed(0)
df <- 
    data.frame(hour = rep(0:11, 3), minutes = runif(36, 0, 59), count = rpois(36, 3)) %>%
    arrange(hour, minutes)
which looks like:
   hour    minutes count
1     0  7.4077507     2
2     0 10.4168484     3
3     0 52.9051348     4
4     1 15.6650111     4
5     1 15.7660195     5
6     1 40.5343480     4
7     2 21.9553101     1
8     2 22.6621194     4
9     2 22.7807315     2
10    3  0.7900297     3
11    3 33.7983484     4
12    3 45.4206438     3
...
One could do
df %>% 
    mutate(is_even_hour = ifelse(hour %% 2 == 0, 1, 0)) %>%
    filter(is_even_hour == 1) %>%
    group_by(hour, is_even_hour) %>%
    filter(row_number() <= 2) %>%
    ungroup() %>%
    select(-is_even_hour)
which gives
    hour   minutes count
   <int>     <dbl> <int>
1      0  7.407751     2
2      0 10.416848     3
3      2 21.955310     1
4      2 22.662119     4
5      4 22.560889     2
6      4 29.364255     5
7      6 20.080591     2
8      6 53.004991     3
9      8 35.374384     4
10     8 38.987070     3
11    10  3.645390     4
12    10 10.986838     5
Here is a base R solution using sapply. The idea is to find the indices of the rows whose hour is divisible by 6 and then use seq to generate the subsequent indices to select. Since you want 2 rows starting at each matching index, length.out is 2; if in the future you want more (as mentioned in the comments), you can change this to whatever number you want.
y <- which(df$hour %% 6 == 0)
df[sapply(y, function(x) seq(x, length.out = 2)), ]
#   hour   minutes count
#1    0  9.914450     3
#2    1 47.643468     3
#7    6  7.353373     5
#8    7 17.381455     2
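To make the "change length.out to whatever number you want" point concrete, a parameterised sketch (my own addition) could look like this, with a guard so the generated indices never run past the last row:

n <- 2                                     # total rows to keep per matching hour
y <- which(df$hour %% 6 == 0)
idx <- c(sapply(y, function(x) seq(x, length.out = n)))
df[idx[idx <= nrow(df)], ]                 # drop any index beyond the last row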
After grouping by 'hour', we can do this in a single filter step
df %>%
     group_by(hour) %>%
     filter(!hour %% 2 & row_number() < 3)
#     hour   minutes count
#    <int>     <dbl> <int>
#1      0  7.407751     2
#2      0 10.416848     3
#3      2 21.955310     1
#4      2 22.662119     4
#5      4 22.560889     2
#6      4 29.364255     5
#7      6 20.080591     2
#8      6 53.004991     3
#9      8 35.374384     4
#10     8 38.987070     3
#11    10  3.645390     4
#12    10 10.986838     5
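An equivalent formulation (my addition, not part of the original answer, using the 36-row df from the original example) filters the even hours first and then uses slice() to take the first two rows of each remaining group:

df %>%
     filter(!hour %% 2) %>%   # keep only the even hours
     group_by(hour) %>%
     slice(1:2) %>%           # first two rows within each hour
     ungroup()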
For the updated post, we take the matching hours, expand each to itself plus the next hour, and match back against df$hour to get the row indices:
i1 <- df %>% 
          filter(hour %% 6 == 0) %>%
          .$hour %>% 
          {rep(., each = 2) + 0:1} %>%   # braces ensure the + applies before match()
          match(df$hour) 
df[i1, ]
#   hour   minutes count
#1    0  9.914450     3
#2    1 47.643468     3
#7    6  7.353373     5
#8    7 17.381455     2
Or this can be done in a compact way with data.table
library(data.table)
setDT(df)[df[, rep(which(!hour %% 6), each = 2) + 0:1]]
#   hour   minutes count
#1:    0  9.914450     3
#2:    1 47.643468     3
#3:    6  7.353373     5
#4:    7 17.381455     2
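A possible generalisation of the data.table line to n rows per match (my own sketch, assuming the 12-row df used above) is:

library(data.table)
n <- 2                                               # rows to keep per matching hour
idx <- setDT(df)[, rep(which(!hour %% 6), each = n) + 0:(n - 1)]
df[idx[idx <= nrow(df)]]                             # clip at the last row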