
Expanding a matrix to include rows for each element in an interval [duplicate]

Tags: r

I have a data frame containing information on incidents for countries around the world. The structure of the data frame is similar to the following example:

a <- data.frame(country = c("AAA" , "BBB" , "CCC") ,
                incident = rep("disaster" , times = 3) ,
                'start year' = c(1990 , 1995 , 2011) ,
                'end year' = c(1993 , 1995 , 2012))

giving a:

  country incident start.year end.year
1     AAA disaster       1990     1993
2     BBB disaster       1995     1995
3     CCC disaster       2011     2012

I would like to transform this so that each row holds the incident for a single year rather than for the whole interval. Ideally, it would look something like this:

  country incident year
1     AAA disaster 1990
2     AAA disaster 1991
3     AAA disaster 1992
4     AAA disaster 1993
5     BBB disaster 1995
6     CCC disaster 2011
7     CCC disaster 2012

Is there an efficient way to expand each interval into one row per year, from the start year through the end year?

Adrian asked Oct 15 '25 15:10



1 Answer

We can use map2 to build the sequence of years between the two columns as a list column, and then unnest that list column so that each year gets its own row.

library(dplyr)
library(purrr)
library(tidyr)
a %>%
   transmute(country, incident, year = map2(start.year, end.year, `:`)) %>%
   unnest(year)

Output:

# A tibble: 7 × 3
  country incident  year
  <chr>   <chr>    <int>
1 AAA     disaster  1990
2 AAA     disaster  1991
3 AAA     disaster  1992
4 AAA     disaster  1993
5 BBB     disaster  1995
6 CCC     disaster  2011
7 CCC     disaster  2012
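
To make the two steps easier to follow, here is a minimal sketch that stops after map2 and inspects the list column before unnesting. The names nested and expanded are only illustrative, and the data frame is rebuilt with the already-mangled column names (start.year, end.year) so the block runs on its own.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Same example data as in the question
a <- data.frame(country = c("AAA", "BBB", "CCC"),
                incident = rep("disaster", times = 3),
                start.year = c(1990, 1995, 2011),
                end.year = c(1993, 1995, 2012))

# Step 1: build only the list column; each element is start.year:end.year
nested <- a %>%
  transmute(country, incident, year = map2(start.year, end.year, `:`))

# Number of years held in each list element, one entry per input row
lengths(nested$year)   # 4 1 2

# Step 2: unnest() gives every element of the list column its own row
expanded <- nested %>% unnest(year)
```

Note that `:` counts downward when the start is larger than the end, so rows where end.year is smaller than start.year would be expanded in reverse rather than dropped.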

If the 'country' column is unique, we can also expand the rows with group_by/summarise, or with rowwise

a %>% 
   group_by(country) %>%
   summarise(incident, year = start.year:end.year, .groups = 'drop')
# A tibble: 7 × 3
  country incident  year
  <chr>   <chr>    <int>
1 AAA     disaster  1990
2 AAA     disaster  1991
3 AAA     disaster  1992
4 AAA     disaster  1993
5 BBB     disaster  1995
6 CCC     disaster  2011
7 CCC     disaster  2012
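
As a hedged variation on the grouped approach: in dplyr 1.1.0 and later, a summarise() call that returns more than one row per group raises a deprecation warning, and reframe() is the verb intended for that case. The sketch below assumes dplyr >= 1.1.0; by_country and by_row are illustrative names. The second pipeline shows the rowwise route mentioned above, which does not depend on 'country' being unique.

```r
library(dplyr)

# Same example data as in the question
a <- data.frame(country = c("AAA", "BBB", "CCC"),
                incident = rep("disaster", times = 3),
                start.year = c(1990, 1995, 2011),
                end.year = c(1993, 1995, 2012))

# reframe() may return any number of rows per group and always ungroups;
# .by groups by country for this call only (requires unique countries)
by_country <- a %>%
  reframe(incident, year = start.year:end.year, .by = country)

# rowwise() treats every row as its own group, so duplicated countries are fine
by_row <- a %>%
  rowwise() %>%
  reframe(country, incident, year = start.year:end.year)
```

Both results should contain the same seven country/incident/year rows as the summarise version.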

Or use uncount to expand the data

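a %>%
   # repeat each row once per year in its interval
   uncount(end.year - start.year + 1) %>%
   group_by(country) %>%
   # row_number() restarts within each country, so this counts up from start.year
   mutate(year = start.year + row_number() - 1, .keep = 'unused',
          end.year = NULL) %>%
   ungroup()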
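
The same "repeat each row, then fill in the years" idea can be sketched in base R without any packages. This is only an illustrative equivalent, not part of the original answer; n_years and expanded are hypothetical names.

```r
# Same example data as in the question
a <- data.frame(country = c("AAA", "BBB", "CCC"),
                incident = rep("disaster", times = 3),
                start.year = c(1990, 1995, 2011),
                end.year = c(1993, 1995, 2012))

# Number of years covered by each interval (both endpoints included)
n_years <- a$end.year - a$start.year + 1

expanded <- data.frame(
  # repeat each country/incident once per year in its interval
  country  = rep(a$country, times = n_years),
  incident = rep(a$incident, times = n_years),
  # build each interval's years and flatten them into one vector
  year     = unlist(Map(seq, a$start.year, a$end.year))
)
```

Like uncount, rep() fails if any interval has a negative length, so this sketch assumes end.year is never smaller than start.year.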
akrun answered Oct 17 '25 04:10



