
Replace values over threshold with values from another column

My real data is a much larger version of the example shown below.

Code <- c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B")

Date <- as.Date(c("2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05", "2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05"))

Max <- c(2.4, 4.2, 3.7, 10.6, 5.2, 8.7, 3.9, 4.8, 14.5, 3.2)

Mean <- c(2.2, 3.9, 3.1, 5.3, 4.3, 4.8, 3.6, 4.2, 6.0, 2.8)

Limit <- c(6.5, 6.5, 6.5, 6.5, 6.5, 10.5, 10.5, 10.5, 10.5, 10.5)

df <- data.frame(Code, Date, Max, Mean, Limit)

What I would like to do is to replace any Max values greater than the Limit with the Mean value, but only for Code == "A". So in this example the fourth Max value (10.6) would be replaced with 5.3.

I have been really struggling to wrap my head around this but I'm sure you lot on here can help!

I have tried separating the codes A and B into their own data frames and replacing them but I can't think of a way to do this. I also tried creating a new column using mutate with an ifelse statement inside but could not work it out.

I mainly use dplyr so if you could explain it using that that would be great! Thanks a lot, let me know if you require more information and I will add it.

EllisR8 asked Dec 06 '25 13:12


2 Answers

library(dplyr)

df %>%
  # Replace Max with Mean only where Code is "A" and Max exceeds Limit
  mutate(
    Max = if_else(Code == "A" & Max > Limit, Mean, Max)
  )
   Code       Date  Max Mean Limit
1     A 2018-01-01  2.4  2.2   6.5
2     A 2018-01-02  4.2  3.9   6.5
3     A 2018-01-03  3.7  3.1   6.5
4     A 2018-01-04  5.3  5.3   6.5
5     A 2018-01-05  5.2  4.3   6.5
6     B 2018-01-01  8.7  4.8  10.5
7     B 2018-01-02  3.9  3.6  10.5
8     B 2018-01-03  4.8  4.2  10.5
9     B 2018-01-04 14.5  6.0  10.5
10    B 2018-01-05  3.2  2.8  10.5
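
For comparison, the same conditional replacement can be sketched in base R with a logical index, which may help if the `if_else()` call is hard to picture. This is only a minimal sketch built on the example data from the question; the name `over` is an illustrative helper, not something from the original post.

```r
# Rebuild the example data from the question
df <- data.frame(
  Code  = rep(c("A", "B"), each = 5),
  Date  = rep(as.Date("2018-01-01") + 0:4, times = 2),
  Max   = c(2.4, 4.2, 3.7, 10.6, 5.2, 8.7, 3.9, 4.8, 14.5, 3.2),
  Mean  = c(2.2, 3.9, 3.1, 5.3, 4.3, 4.8, 3.6, 4.2, 6.0, 2.8),
  Limit = rep(c(6.5, 10.5), each = 5)
)

# TRUE only for rows where Code is "A" and Max exceeds Limit
over <- df$Code == "A" & df$Max > df$Limit

# Overwrite Max with Mean on those rows; all other rows are left alone
df$Max[over] <- df$Mean[over]
```

The condition is the same one passed to `if_else()` above; here it selects the rows to overwrite instead of choosing between two values row by row.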
Vinícius Félix answered Dec 08 '25 06:12


Here is a data.table option

> library(data.table)
> setDT(df)[Code == "A" & Max > Limit, Max := Mean][]
    Code       Date  Max Mean Limit
 1:    A 2018-01-01  2.4  2.2   6.5
 2:    A 2018-01-02  4.2  3.9   6.5
 3:    A 2018-01-03  3.7  3.1   6.5
 4:    A 2018-01-04  5.3  5.3   6.5
 5:    A 2018-01-05  5.2  4.3   6.5
 6:    B 2018-01-01  8.7  4.8  10.5
 7:    B 2018-01-02  3.9  3.6  10.5
 8:    B 2018-01-03  4.8  4.2  10.5
 9:    B 2018-01-04 14.5  6.0  10.5
10:    B 2018-01-05  3.2  2.8  10.5
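
One thing to keep in mind with this approach: `setDT()` converts `df` to a data.table by reference and `:=` updates `Max` in place, so the original `df` is changed. If you would rather keep the original data frame untouched, a minimal sketch is to work on a deep copy made with `copy()`. The name `dt` is only illustrative.

```r
library(data.table)

# Rebuild the example data from the question
df <- data.frame(
  Code  = rep(c("A", "B"), each = 5),
  Date  = rep(as.Date("2018-01-01") + 0:4, times = 2),
  Max   = c(2.4, 4.2, 3.7, 10.6, 5.2, 8.7, 3.9, 4.8, 14.5, 3.2),
  Mean  = c(2.2, 3.9, 3.1, 5.3, 4.3, 4.8, 3.6, 4.2, 6.0, 2.8),
  Limit = rep(c(6.5, 10.5), each = 5)
)

# Deep-copy first, then convert the copy, so df itself is not modified
dt <- setDT(copy(df))

# Update Max by reference, only on rows matching the condition
dt[Code == "A" & Max > Limit, Max := Mean]
```

After this, `dt` holds the replaced values while `df` still contains the original `Max` column.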
ThomasIsCoding answered Dec 08 '25 08:12


