
R Get minimum value in dataframe selecting rows on 2 columns [duplicate]

I have a data frame like the simplified one below. I want to first select the rows that share the same value in column X, then within that selection select the rows that share the same value in column Y. From each resulting group, I want to take the minimum value of Z. I'm currently using a for loop, but it seems there must be an easier way. Thanks!

set.seed(123)    
data<-data.frame(X=rep(letters[1:3], each=4),Y=rep(c(1,2), each=2),Z=sample(1:100, 12))
data
   X Y  Z
1  a 1 76
2  a 1 22
3  a 2 32
4  a 2 23
5  b 1 14
6  b 1 40
7  b 2 39
8  b 2 35
9  c 1 15
10 c 1 13
11 c 2 21
12 c 2 42

Desired outcome:

   X Y  Z
2  a 1 22
4  a 2 23
5  b 1 14
8  b 2 35
10 c 1 13
11 c 2 21
joffie asked Jan 18 '26 21:01


1 Answer

Here is a data.table solution:

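library(data.table)
data = data.table(data)   # convert the data.frame to a data.table
# minimum of Z within each X/Y group; the result column is named V1
# by default, write .(Z = min(Z)) instead of min(Z) to keep the name Z
data[, min(Z), by=c("X", "Y")]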
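
For comparison, the same grouping can be sketched in base R with aggregate(). This is only an illustration, not part of the original answer: the data frame df is typed in by hand from the table printed in the question (so it does not depend on the random seed), and the names df and res are made up for this example.

```r
# Values copied from the table printed in the question.
df <- data.frame(
  X = rep(c("a", "b", "c"), each = 4),
  Y = rep(c(1, 2), each = 2, times = 3),
  Z = c(76, 22, 32, 23, 14, 40, 39, 35, 15, 13, 21, 42)
)

# Minimum of Z for every combination of X and Y.
res <- aggregate(Z ~ X + Y, data = df, FUN = min)

# aggregate() lets X vary fastest, so reorder to match the
# X-then-Y layout of the desired outcome.
res <- res[order(res$X, res$Y), ]
rownames(res) <- NULL
print(res)
```

On this hand-typed data the Z column of res matches the desired outcome in the question (22, 23, 14, 35, 13, 21), although the original row names are not preserved.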

EDIT based on OP's comment:

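If there is an NA value in one of the columns we group by, NA is treated as a group of its own, so an additional row is created. Note that na.rm only affects NA values in Z, not in the grouping columns:

data[2,2] <- NA
data[, min(Z, na.rm = TRUE), by=c("X", "Y")]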

   X  Y V1
1: a  1 31
2: a NA 79
3: a  2 14
4: b  1 31
5: b  2 14
6: c  1 50
7: c  2 25
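
If the extra NA group is not wanted, one option is to filter those rows out before grouping. The sketch below is an assumption about what the OP might want, not something stated in the thread; the data are typed in from the table in the question, and the names dt and res are hypothetical.

```r
library(data.table)

# Values copied from the table printed in the question.
dt <- data.table(
  X = rep(c("a", "b", "c"), each = 4),
  Y = rep(c(1, 2), each = 2, times = 3),
  Z = c(76, 22, 32, 23, 14, 40, 39, 35, 15, 13, 21, 42)
)

# Introduce a missing grouping value in row 2, as in the EDIT above.
dt[2, Y := NA_real_]

# Keep only rows where Y is known, then take the minimum of Z per group.
res <- dt[!is.na(Y), .(Z = min(Z)), by = .(X, Y)]
print(res)
```

Here the row with the missing Y is dropped entirely, so its Z value (22) no longer counts towards any group, and the minimum for X = a, Y = 1 becomes 76.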
otwtm answered Jan 20 '26 13:01


