I have the following dataframe:
df=read.table(text="A B C D
1,2 . 1,3 1,4
2,1 1,1 1,1 2,3 ", header=TRUE)
A B C D
1,2 . 1,3 1,4
2,1 1,1 1,1 2,3
I want to compute row means after splitting the values: for each row, the mean of the first values (before the comma), then the mean of the second values (after the comma), and so on. The result should be:
Value_1_mean Value_2_mean
1 (3/3) 3 (9/3)
1.5 (6/4) 1.5 (6/4)
The parentheses only show where the means come from; they are not needed in the output.
Or, in words:
Value_1_mean Value_2_mean
Mean of the first values (before the comma) in row 1, mean of the second values (after the comma) in row 1
Mean of the first values (before the comma) in row 2, mean of the second values (after the comma) in row 2
I've tried some code, but I think I'm far off.
In base R:
n <- paste0('value_',1:2,'_mean')
t(apply(df, 1,\(x) colMeans(read.table(text=x, sep=',', comment.char = '.', col.names = n))))
value_1_mean value_2_mean
[1,] 1.0 3.0
[2,] 1.5 1.5
read.table(text=unlist(df), sep = ',',
na.strings = '.', fill = TRUE, col.names = n) |>
cbind(rn= rownames(df))|>
aggregate(.~rn, mean, na.action = identity, na.rm = TRUE)
rn value_1_mean value_2_mean
1 1 1.0 3.0
2 2 1.5 1.5
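For readers who want the split-then-average step spelled out, here is a minimal sketch using strsplit(). It is not part of the answers above: the helper name kth_mean is made up for illustration, and it assumes that "." is the only missing-value marker and that every other cell is a comma-separated list of numbers.

```r
# Same sample data as in the question; "." marks a missing cell.
df <- read.table(text = "A B C D
1,2 . 1,3 1,4
2,1 1,1 1,1 2,3", header = TRUE)

# Hypothetical helper: for each row, the mean of the k-th comma-separated value.
kth_mean <- function(d, k) {
  apply(d, 1, function(row) {
    # Split every cell of the row on the comma.
    parts <- strsplit(row, ",", fixed = TRUE)
    # Take the k-th piece of each cell, or NA if the cell has fewer pieces.
    vals <- vapply(parts,
                   function(p) if (length(p) >= k) p[k] else NA_character_,
                   character(1))
    # Treat the "." placeholder as missing (assumption stated above).
    vals[vals %in% "."] <- NA
    mean(as.numeric(vals), na.rm = TRUE)
  })
}

res <- data.frame(value_1_mean = kth_mean(df, 1),
                  value_2_mean = kth_mean(df, 2))
res
# value_1_mean is 1.0 and 1.5; value_2_mean is 3.0 and 1.5
```

Unlike the comment.char = '.' trick, this version should not drop cells that contain a decimal point, at the cost of being more verbose.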
Here is a data.table approach:
library(data.table)
# set to data.table and melt to long format
df.long <- melt(setDT(df, keep.rownames = TRUE), id.vars = "rn")
# split to numeric values
df.long[, c("val1", "val2") := lapply(tstrsplit(value, ","), as.numeric)]
# summarise mean by rownumber column (rn)
df.long[, lapply(.SD, mean, na.rm = TRUE), by = .(rn), .SDcols = c("val1", "val2")]
# rn val1 val2
# 1: 1 1.0 3.0
# 2: 2 1.5 1.5