I want to calculate the area under the curve (AUC) for several features measured at multiple concentrations for a group of subjects. The MESS auc function (described here: Calculate the Area under a Curve in R) gives me the AUC for a single curve, but I can't figure out how to apply it to every column (feature) for all subjects in my data file.
My data is basically organized like this:
rowname id conc feature1 feature2 feature3 ...
s1 ccr01 5 18575 80337 100496
s2 ccr01 4 18161 65723 109037
s3 ccr01 3 18092 99807 105363
s4 ccr01 2 5196 71520 84113
s5 ccr01 1 3940 50236 77145
s6 ccr02 5 1878 21812 10306
s7 ccr02 4 3660 18437 13408
s8 ccr02 3 4439 28379 25899
s9 ccr02 2 2710 22960 28080
s10 ccr02 1 1970 23557 22409
.
.
.
I want to return a matrix/df of feature AUCs (columns) ordered by unique subject IDs (rows):
rowname feature1 feature2 feature3
ccr01 52338.61 300823.6 388368.2
ccr02 12914.41 91486.32 84316.82
Any suggestions would be greatly appreciated!
Using the trapezoidal calculation from the linked post together with ddply from plyr, this might work (the data frame is assumed to be named dat):
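library(zoo)   # rollmean()
library(plyr)  # ddply()
# Trapezoidal AUC of each column named in fs, integrated over x$conc
AUC <- function(x, fs)
  sapply(fs, function(f) sum(diff(x$conc) * rollmean(x[, f], 2)))
ddply(dat, .(id), function(x) {
  x <- x[order(x$conc), ]  # sort by concentration within each subject
  AUC(x, grep("feature", names(x), value = TRUE))
})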
# id feature1 feature2 feature3
# 1 ccr01 52706.5 302336.5 387333.5
# 2 ccr02 12733.0 92460.5 83744.5
Here, fs holds the names of the columns that contain the string "feature", so the AUC function is applied to just those columns, grouped by id.
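For anyone who would rather avoid the zoo and plyr dependencies, the same trapezoidal calculation can be sketched in base R. This is only a minimal sketch, not the code from the linked post: the helper name trapz is made up for illustration, and dat is rebuilt here from the ten sample rows shown in the question.

```r
# Sample rows copied from the question (first three features only)
dat <- data.frame(
  id   = rep(c("ccr01", "ccr02"), each = 5),
  conc = rep(5:1, times = 2),
  feature1 = c(18575, 18161, 18092, 5196, 3940, 1878, 3660, 4439, 2710, 1970),
  feature2 = c(80337, 65723, 99807, 71520, 50236, 21812, 18437, 28379, 22960, 23557),
  feature3 = c(100496, 109037, 105363, 84113, 77145, 10306, 13408, 25899, 28080, 22409)
)

# Hypothetical helper: trapezoidal rule, i.e. the sum over intervals of
# (interval width) * (mean of the two adjacent y values)
trapz <- function(x, y) {
  o <- order(x)          # integrate in increasing order of x
  x <- x[o]
  y <- y[o]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Names of the feature columns
feats <- grep("^feature", names(dat), value = TRUE)

# Split by subject, compute one AUC per feature, then transpose so that
# rows are subjects and columns are features
auc <- t(sapply(split(dat, dat$id), function(d) {
  sapply(feats, function(f) trapz(d$conc, d[[f]]))
}))
auc
```

On the sample rows this gives the same numbers as the ddply output above (for example 52706.5 for feature1 of ccr01). Those differ slightly from the figures in the question, which may have been computed with a different interpolation setting.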
A dplyr solution:
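library(zoo)    # rollmean()
library(dplyr)
# Return a one-row data frame so that do() can bind the results per group
AUC <- function(x, fs)
  setNames(as.data.frame(
    # x[[f]] extracts a plain vector, even when x is a tibble
    lapply(fs, function(f) sum(diff(x$conc) * rollmean(x[[f]], 2)))),
    fs)
dat %>%
  group_by(id) %>%
  arrange(conc) %>%
  do(AUC(., grep("feature", names(.), value = TRUE)))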