
How to compare several matrices and calculate the percentage difference in R

Tags: r, matrix

Suppose this is my example data:

nrow<-4
ncol<-5
m1 <- matrix(rbinom(nrow*ncol,1,.5),nrow,ncol)
m2 <- matrix(rbinom(nrow*ncol,1,.5),nrow,ncol)
m3 <- matrix(rbinom(nrow*ncol,1,.5),nrow,ncol)

I need to compare the 3 matrices pair by pair, according to the following principle. For example:

m1

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    1    0    1
[2,]    1    1    1    0    1
[3,]    1    0    0    1    1
[4,]    0    0    0    1    0

m2

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    1
[2,]    0    1    1    0    0
[3,]    1    0    0    0    0
[4,]    0    1    0    1    0

Now count the number of matches in each column. Take the first column of both matrices:

    m1 m2

    1 1 matched values
    1 0 values did not match
    1 1 matched values
    0 0 matched values

In total, 3 of the 4 values in the first columns of m1 and m2 coincide, so 75% of the values match.

Then take the second column:

 m1   m2
[,2] [,2] 
0     0 matched values
1     1 matched values
0     0 matched values
0     1 values did not match

Here the situation is similar: 3 of the 4 values coincide.

In other words, the desired output should be something like this, with the percentage of matching values under each column:

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    1    0    1
[2,]    1    1    1    0    1
[3,]    1    0    0    1    1
[4,]    0    0    0    1    0

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    1
[2,]    0    1    1    0    0
[3,]    1    0    0    0    0
[4,]    0    1    0    1    0
       75   75   75   75   50

Now calculate the average of these percentages, which is 70%:

(75+75+75+75+50)/5=70

What is the simplest way to calculate this percentage for every pair of matrices: first between m1 and m2, then between m1 and m3, and lastly between m2 and m3?

Thank you for your help

psysky asked Oct 25 '25 05:10

1 Answer

Here are two base R options:

  • combn + mean

combn generates every pair of matrices, and mean gives the proportion of matching entries in each pair (the \(x) lambda shorthand requires R 4.1 or later).

combn(
    lst,
    2,
    \(x) mean(do.call(`==`,x))
)

This returns the match proportions for the pairs (m1, m2), (m1, m3) and (m2, m3), in that order:

[1] 0.5 0.4 0.5
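
As a sanity check, the worked example from the question can be reproduced directly. The sketch below hard-codes the m1 and m2 printed in the question (not the seeded data used in this answer) and assumes that "percentage" means the share of equal entries. Because every column has the same number of rows, the overall mean of `m1 == m2` equals the average of the per-column percentages.

```r
# Matrices copied from the question's printed example (filled row by row)
m1 <- matrix(c(1, 0, 1, 0, 1,
               1, 1, 1, 0, 1,
               1, 0, 0, 1, 1,
               0, 0, 0, 1, 0), nrow = 4, byrow = TRUE)
m2 <- matrix(c(1, 0, 0, 0, 1,
               0, 1, 1, 0, 0,
               1, 0, 0, 0, 0,
               0, 1, 0, 1, 0), nrow = 4, byrow = TRUE)

# m1 == m2 is a logical matrix; colMeans gives the share of TRUE per column
col_pct <- colMeans(m1 == m2) * 100
col_pct               # 75 75 75 75 50

# Average of the column percentages, and the equivalent overall mean
mean(col_pct)         # 70
mean(m1 == m2) * 100  # 70
```
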

  • adist + toString

This approach encodes each matrix as a string and uses the edit distance between the strings to build a symmetric matrix of pairwise match proportions.

> 1 - adist(unlist(lapply(lst, toString))) / lengths(lst)
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.4
[2,]  0.5  1.0  0.5
[3,]  0.4  0.5  1.0
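
Both options above return a single overall proportion per pair. If the per-column percentages described in the question are also wanted, one possible sketch is shown here. The names `mats`, `pairs` and `col_pct` are illustrative only, and the list is given names so that each result column can be labelled with its pair.

```r
# Same seeded data as in the Data section of this answer
set.seed(0)
nrow <- 4
ncol <- 5
m1 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)
m2 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)
m3 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)

# Named list, so pairs can be referred to by name
mats <- list(m1 = m1, m2 = m2, m3 = m3)

# 2 x 3 character matrix: one column per pair of matrix names
pairs <- combn(names(mats), 2)

# For each pair, the percentage of matching values in every column
col_pct <- apply(pairs, 2, function(p) {
  colMeans(mats[[p[1]]] == mats[[p[2]]]) * 100
})
colnames(col_pct) <- apply(pairs, 2, paste, collapse = " vs ")

col_pct            # one row per matrix column, one column per pair
colMeans(col_pct)  # average percentage per pair
```

Dividing `colMeans(col_pct)` by 100 should agree with the combn + mean result, since all columns have the same length.
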

Data

set.seed(0)
nrow <- 4
ncol <- 5
m1 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)
m2 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)
m3 <- matrix(rbinom(nrow * ncol, 1, .5), nrow, ncol)

lst <- list(m1, m2, m3)

ThomasIsCoding answered Oct 26 '25 19:10