I have a data frame DF with three columns and n rows, as shown below:
Month Year Default
1 2015 T
2 2015 T
3 2015 F
4 2015 T
5 2015 T
6 2015 T
7 2015 F
I would like to check whether there are 3 or more T values in a row, and then put the starting year and month of each such run into a new DF.
For the data above, the output should look like:
Month Year
4 2015
Here's an attempt using the development version of data.table on GitHub and its new rleid function:
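library(data.table) # v 1.9.5+
# Give each run of consecutive identical Default values its own index
setDT(df)[, indx := rleid(Default)]
# Among the TRUE rows, keep the first row of every run longer than 2
df[(Default), if(.N > 2) .SD[1L], by = indx]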
# indx Month Year Default
# 1: 3 4 2015 TRUE
What we are basically doing here is assigning a unique index to each run of consecutive identical values in Default. Then, looking only at the rows where Default == TRUE, we check for each group whether its size is bigger than 2 and, if so, select the first row of that group.
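For readers without data.table, the same run-index idea can be sketched in base R with rle(). This is only an illustrative sketch, not part of the original answer: it assumes Default is stored as a logical column, and the names runs, starts, keep and result are made up for the example.

```r
# Sample data from the question, with Default stored as logical (assumption)
df <- data.frame(
  Month   = 1:7,
  Year    = rep(2015L, 7),
  Default = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Lengths and values of each run of consecutive identical Default values
runs <- rle(df$Default)

# Row number at which each run begins
starts <- cumsum(c(1L, head(runs$lengths, -1L)))

# Runs that are TRUE and at least 3 rows long
keep <- runs$values & runs$lengths > 2L

# First row of every qualifying run, keeping only Month and Year
result <- df[starts[keep], c("Month", "Year")]
print(result)
```

Here rle() plays the role of rleid(): instead of labelling every row with a run index, it returns one length and one value per run, so the start of each run has to be recovered with cumsum().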
A shorter version (proposed by @Arun) would be
setDT(df)[, if(Default && .N > 2L) .SD[1L], by = .(indx = rleid(Default), Default)]