I was hoping someone could help me with this problem. I’m working with time series data that has records every five years. The data frame df is an example, with the time step in df$Time and record values in df$A.
df
Time A
5 4.8
10 19.6
15 27.5
20 39.7
What I need to do is convert df to an annual time series and populate all the new records with new values based on a calculation. I have successfully converted df to annual data df2 using the following:
df2 <- data.frame("Time" = c(5:20), "A" = c(5:20))
df2$A[] <- sapply(df2$A, function(x) df$A[match(x, df$Time)])
df2[is.na(df2)] <- 0
df2
Time A
5 4.8
6 0
7 0
8 0
9 0
10 19.6
11 0
12 0
13 0
14 0
15 27.5
16 0
17 0
18 0
19 0
20 39.7
What I can’t figure out is how to calculate new values for df2$A. The calculation depends on the relative positions of other row/record values – for example, fordf2$A[2] the calculation would be df2$A[2] <- df2$A[1] + (df2$A[6] - df2$A[1])/5. It also changes throughout the data frame, for example df2$A[7] <- df2$A[6] + (df2$A[11] – df2$A[6]) / 5.
for loops have yielded only frustration, and I'm trying to do this without doing a line of code for every new record (the actual data has hundreds of records). I'm trying to get to df3 in an efficient way - thank you!
df3
Time A
5 4.8
6 7.76
7 10.72
8 13.68
9 16.64
10 19.6
11 21.18
12 22.76
13 24.34
14 25.92
15 27.5
16 29.94
17 32.38
18 34.82
19 37.26
20 39.7
What you are looking for is called linear interpolation. In R you can use the approx function like this:
df <- data.frame("Time" = c(5, 10, 15, 20), "A" = c(4.8, 19.6, 27.5, 39.7))
df2 <- as.data.frame(approx(x = df$Time, y = df$A, xout = 5:20))
names(df2) <- names(df)
Result:
> df2
Time A
1 5 4.80
2 6 7.76
3 7 10.72
4 8 13.68
5 9 16.64
6 10 19.60
7 11 21.18
8 12 22.76
9 13 24.34
10 14 25.92
11 15 27.50
12 16 29.94
13 17 32.38
14 18 34.82
15 19 37.26
16 20 39.70
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With