
Transition tables from longitudinal data in the long format using R

Tags: dataframe, r, dplyr

This question is about how to generate frequency transition tables from longitudinal data in long format, using base R functions or commonly used packages such as dplyr. Consider the longitudinal data:

id <- c(1,1,2,2,3,3,4,4)
state <- c("C","A", "A", "A", "B", "A", "C", "A")
period <- rep(c("Start", "End"), 4)
df <- data.frame(id, state, period)
df

  id state period
1  1      C  Start
2  1      A    End
3  2      A  Start
4  2      A    End
5  3      B  Start
6  3      A    End
7  4      C  Start
8  4      A    End

and the expected output:

    transition freq
1     A to A    1
2     A to B    0
3     A to C    0
4     B to B    0
5     B to A    1
6     B to C    0
7     C to C    0
8     C to A    2
9     C to B    0

I can generate the above output using the function statetable.msm in the msm package. However, I would like to know if this could be generated by base functions in R or other packages such as dplyr. Help is much appreciated!

T Richard asked Oct 19 '25 15:10


1 Answer

A solution entirely within base R could be something like:

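# Requires R >= 4.1 for the native pipe |>
# Assumes each id's rows are ordered Start, then End
do.call("c",
  split(df$state, df$id) |>                # states grouped by id
  lapply(paste, collapse = " to ")) |>     # e.g. "C to A"
  # use every possible pair as a level so unobserved transitions count as 0
  factor(levels = sort(c(outer(unique(df$state), unique(df$state),
                               FUN = paste, sep = " to ")))) |>
  table() |>
  as.data.frame() |>
  setNames(c("transition", "freq"))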
#>   transition freq
#> 1     A to A    1
#> 2     A to B    0
#> 3     A to C    0
#> 4     B to A    1
#> 5     B to B    0
#> 6     B to C    0
#> 7     C to A    2
#> 8     C to B    0
#> 9     C to C    0
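
The pipeline above pastes each id's states together in row order, so it depends on the Start row coming before the End row for every id. As a possible variation (a minimal sketch, not part of the original answer), the period column can be used explicitly so that row order does not matter. The names start, end, wide, lv, tab and res are illustrative, and the sketch assumes exactly one Start row and one End row per id:

```r
id <- c(1, 1, 2, 2, 3, 3, 4, 4)
state <- c("C", "A", "A", "A", "B", "A", "C", "A")
period <- rep(c("Start", "End"), 4)
df <- data.frame(id, state, period)

# Separate the Start and End rows, then align them by id
start <- df[df$period == "Start", c("id", "state")]
end   <- df[df$period == "End",   c("id", "state")]
wide  <- merge(start, end, by = "id", suffixes = c("_start", "_end"))

# Shared factor levels so that unobserved transitions are counted as 0
lv  <- sort(unique(df$state))
tab <- table(from = factor(wide$state_start, levels = lv),
             to   = factor(wide$state_end,   levels = lv))

# Flatten the two-way table into one row per transition
res <- as.data.frame(tab, responseName = "freq")
res <- res[order(res$from, res$to), ]
res$transition <- paste(res$from, "to", res$to)
res <- res[, c("transition", "freq")]
rownames(res) <- NULL
res
```

Keeping the intermediate two-way table tab is a deliberate choice: it is the matrix form of the same counts, which is often what is wanted for further analysis of the transitions.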
Allan Cameron answered Oct 22 '25 05:10


