Suppose you have something like this:
Col1 Col2
a odd from 1 to 9
b even from 2 to 14
c even from 30 to 50
...
I would like to expand each row by splitting its interval into individual rows, so:
Col1 Col2
a 1
a 3
a 5
...
b 2
b 4
b 6
...
c 30
c 32
c 34
...
Note that when it says "even from", the lower and upper bounds are also even, and the same goes for odd numbers.
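Separate Col2 into individual columns, then create the sequence for each row:
library(dplyr)
library(tidyr)
DF %>%
  # split on non-alphanumeric characters; convert = TRUE makes from/to numeric
  separate(Col2, into = c("parity", "X1", "from", "X2", "to"), convert = TRUE) %>%
  group_by(Col1) %>%
  # one sequence per group, stepping by 2 from the lower to the upper bound
  do(data.frame(Col2 = seq(.$from, .$to, 2))) %>%
  ungroup()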
The input DF in reproducible form is assumed to be:
DF <- structure(list(Col1 = c("a", "b", "c"), Col2 = c("odd from 1 to 9",
"even from 2 to 14", "even from 30 to 50")), .Names = c("Col1",
"Col2"), row.names = c(NA, -3L), class = "data.frame")
The next version of tidyr supports NA in the into vector to denote fields to ignore, so the separate statement above could be written:
separate(Col2, into = c("parity", NA, "from", NA, "to"), convert = TRUE) %>%
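For reference, here is a minimal sketch of the whole pipeline with the NA form of into. It rests on a few assumptions that are not part of the answer above: a tidyr version that accepts NA in into, dplyr 1.1.0 or later so that reframe() can be used in place of do(), and unique values in Col1 so that each group holds exactly one row. The name out is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same input as assumed in the answer
DF <- data.frame(
  Col1 = c("a", "b", "c"),
  Col2 = c("odd from 1 to 9", "even from 2 to 14", "even from 30 to 50"),
  stringsAsFactors = FALSE
)

out <- DF %>%
  # keep the 1st, 3rd and 5th pieces; NA drops the words "from" and "to"
  separate(Col2, into = c("parity", NA, "from", NA, "to"), convert = TRUE) %>%
  # reframe() is swapped in for do(): one sequence per Col1 value
  # (assumes each Col1 value appears once, so from/to are scalars)
  reframe(Col2 = seq(from, to, by = 2), .by = Col1)
```

If Col1 can repeat, this sketch would need a row-wise approach instead, since seq() expects a single lower and upper bound.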
Here is an option using base R. We extract the numeric elements in 'Col2' into a list using gregexpr/regmatches, then get the sequence of elements by 2 with seq, and stack the result into a data.frame:
res <- stack(setNames(lapply(regmatches(DF$Col2, gregexpr("\\d+", DF$Col2)), function(x)
seq(as.numeric(x[1]), as.numeric(x[2]), by = 2)), DF$Col1))[2:1]
colnames(res) <- colnames(DF)
head(res)
# Col1 Col2
#1 a 1
#2 a 3
#3 a 5
#4 a 7
#5 a 9
#6 b 2
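The one-liner above packs several steps together. Unpacked, it might look like the sketch below, which uses the same regmatches/gregexpr extraction but builds the result with rep() and unlist() rather than stack(). The names bounds, seqs and res2 are illustrative only.

```r
# Same input as assumed above
DF <- data.frame(
  Col1 = c("a", "b", "c"),
  Col2 = c("odd from 1 to 9", "even from 2 to 14", "even from 30 to 50"),
  stringsAsFactors = FALSE
)

# Step 1: pull every run of digits out of each string
# (a list with one character vector of two bounds per row)
bounds <- regmatches(DF$Col2, gregexpr("\\d+", DF$Col2))

# Step 2: turn each pair of bounds into a sequence with step 2
seqs <- lapply(bounds, function(x) seq(as.numeric(x[1]), as.numeric(x[2]), by = 2))

# Step 3: repeat each Col1 label once per generated value and bind the columns
res2 <- data.frame(
  Col1 = rep(DF$Col1, lengths(seqs)),
  Col2 = unlist(seqs),
  stringsAsFactors = FALSE
)
```

One difference from the stack() version: here Col1 stays a character column, whereas stack() returns its label column as a factor.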