I have a data.frame that holds time series values in columns a, b, and c. I would like to build a random time series that, for each row (i.e. date), picks the value from one randomly chosen column.
For example, given the following df:
df <- data.frame(date = c(as.Date("2018-08-01"),as.Date("2018-09-01"), as.Date("2018-10-01")), a = c(1.0, 1.5, 1.8), b=c(-1.0, -2.0, 3.0), c=c(-2.0, -15.0, 1.7))
#> df
#         date   a  b     c
# 1 2018-08-01 1.0 -1  -2.0
# 2 2018-09-01 1.5 -2 -15.0
# 3 2018-10-01 1.8  3   1.7
A possible random sample would look like this (here a was picked for the first month, b for the second, and c for the third):
df.random.sample <- data.frame(date = c(as.Date("2018-08-01"),as.Date("2018-09-01"), as.Date("2018-10-01")), random = c(1.0, -2.0, 1.7))
#> df.random.sample
#         date random
# 1 2018-08-01    1.0
# 2 2018-09-01   -2.0
# 3 2018-10-01    1.7
Most importantly, I have many different columns, so I would like this to work with column indexes so that I do not need to specify each column name.
To sample one value per row, use apply over the rows of the value columns:
cbind(df[1], random = apply(df[-1], 1, sample, size = 1))
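Or use a vectorized approach with row/column matrix indexing. Draw one column index per row with replacement, so that each row is sampled independently and the code also works when the number of rows differs from the number of value columns:
cbind(df[1], random = as.matrix(df[-1])[cbind(seq_len(nrow(df)), sample(ncol(df) - 1, nrow(df), replace = TRUE))])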
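To make the matrix-indexing idea easier to reuse, here is a minimal sketch wrapped in a function. The helper name `sample_per_row` and its `value_cols` argument are made up for illustration and are not part of the original answer; the sketch assumes the first column holds the dates and all value columns are numeric.

```r
# Hypothetical helper (name and arguments are illustrative assumptions):
# pick one value per row from the columns selected by index.
sample_per_row <- function(df, value_cols = 2:ncol(df)) {
  values <- as.matrix(df[value_cols])   # candidate values as a numeric matrix
  # One column index per row, drawn independently (with replacement)
  picked <- sample(ncol(values), nrow(values), replace = TRUE)
  # A two-column (row, column) matrix extracts exactly one element per row
  idx <- cbind(seq_len(nrow(values)), picked)
  cbind(df[1], random = values[idx])
}

df <- data.frame(
  date = as.Date(c("2018-08-01", "2018-09-01", "2018-10-01")),
  a = c(1.0, 1.5, 1.8),
  b = c(-1.0, -2.0, 3.0),
  c = c(-2.0, -15.0, 1.7)
)

set.seed(1)  # only to make the illustration reproducible
res <- sample_per_row(df)
print(res)
```

Because the columns are addressed by position, the same call should work unchanged for a data.frame with many value columns, and `value_cols` can be narrowed (for example `2:5`) to sample from a subset.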