I want to create a 100*4 matrix of 0s and 1s, such that each row has only one 1 and each column has at least two 1s, in R.
library(Matrix)  # provides rsparsematrix()
MyMat <- as.matrix(rsparsematrix(nrow = 100, ncol = 4, nnz = 100))
I am thinking of using rsparsematrix, but I am not sure how to apply my required conditions.
Edit: My other attempt was dummy_cols, but either way I am still stuck on applying the two conditions. I guess there must be a more straightforward way of creating such a matrix.
1) A matrix consisting of 25 4x4 identity matrices stacked one on top of each other satisfies these requirements
m <- matrix(1, 25) %x% diag(4)
2) Exchanging the two arguments of %x% also works and gives a different matrix that satisfies the same conditions.
3) Any permutation of the rows and columns of the solution matrices in (1) and (2) also satisfies the conditions, for example:
m[sample(100), sample(4)]
4) If the objective is to generate a random table containing 0/1 values whose row sums are each 1 and whose column sums are each 25 then use r2dtable:
r <- r2dtable(1, rep(1, 100), rep(25, 4))[[1]]
5) or if it is desired to allow any column sums of at least 2 then:
rsums <- rep(1, 100)
csums <- rmultinom(1, 92, rep(0.25, 4)) + 2
r <- r2dtable(1, rsums, csums)[[1]]
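Both conditions are easy to check mechanically, which helps confirm that each construction above behaves as described. The following is a minimal sketch: check_mat is a hypothetical helper name (not from any package), the seed is arbitrary, and drop() is only there to turn the one-column matrix returned by rmultinom into a plain vector.

```r
# Hypothetical helper: TRUE if x contains only 0/1, has exactly one 1 per row,
# and has at least min_col 1s in every column.
check_mat <- function(x, min_col = 2) {
  all(x %in% c(0, 1)) &&
    all(rowSums(x) == 1) &&
    all(colSums(x) >= min_col)
}

set.seed(1)  # arbitrary seed, only for reproducibility

m1 <- matrix(1, 25) %x% diag(4)                     # (1) stacked identity matrices
m2 <- diag(4) %x% matrix(1, 25)                     # (2) arguments exchanged
m3 <- m1[sample(100), sample(4)]                    # (3) permuted rows and columns
r4 <- r2dtable(1, rep(1, 100), rep(25, 4))[[1]]     # (4) column sums all equal to 25
csums <- drop(rmultinom(1, 92, rep(0.25, 4))) + 2   # (5) column sums of at least 2
r5 <- r2dtable(1, rep(1, 100), csums)[[1]]

# Named logical vector: one entry per candidate matrix
checks <- sapply(list(m1 = m1, m2 = m2, m3 = m3, r4 = r4, r5 = r5), check_mat)
checks
```

Every entry of checks should be TRUE if the constructions meet the two conditions.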
Stochastically, with two rules:
1. each row has exactly one 1; and
2. each column has at least two 1s.
I control the first implicitly by construction; I test against the second.
nr <- 100 ; nc <- 4
set.seed(42)
lim <- 10000
while (lim > 0) {
  lim <- lim - 1
  M <- t(replicate(nr, sample(c(1, rep(0, nc-1)))))
  if (all(colSums(M > 0) >= 2)) break
}
head(M)
#      [,1] [,2] [,3] [,4]
# [1,]    1    0    0    0
# [2,]    0    0    0    1
# [3,]    0    0    0    1
# [4,]    0    1    0    0
# [5,]    0    0    0    1
# [6,]    0    1    0    0
colSums(M)
# [1] 25 30 21 24
lim
# [1] 9999
My use of lim is hardly needed in this example, but is there as a mechanism to stop this from running infinitely: if you change the dimensions and/or the rules, it might become highly unlikely or infeasible to meet all rules, so this keeps the execution time limited. (10000 is completely arbitrary.)
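One thing to watch with the loop above: if lim ever reaches 0, M still holds the last rejected attempt, so the failure is silent unless lim is inspected afterwards. A possible variation, sketched here with the hypothetical names random_onehot, min_col and max_tries, wraps the same rejection sampling in a function that stops with an error instead. It assumes nc is at least 2.

```r
# Hypothetical wrapper around the rejection-sampling loop (assumes nc >= 2).
# Raises an error instead of returning a matrix that failed the column rule.
random_onehot <- function(nr, nc, min_col = 2, max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    # Rule 1 by construction: each row is a random permutation of one 1 and nc - 1 zeros
    M <- t(replicate(nr, sample(c(1, rep(0, nc - 1)))))
    # Rule 2 by testing: accept only if every column has at least min_col 1s
    if (all(colSums(M) >= min_col)) return(M)
  }
  stop("no valid matrix found in ", max_tries, " tries")
}

set.seed(42)  # arbitrary seed, only for reproducibility
M <- random_onehot(100, 4)
dim(M)
colSums(M)
```

With dimensions that cannot satisfy rule 2, for example random_onehot(3, 4), the call ends in an error after max_tries attempts rather than handing back an invalid matrix.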
My point in the comment is that it would be rather difficult to find a 100x4 matrix that satisfies rule 1 but not rule 2. Since the probabilities of a 0 or a 1 in any one cell are 0.75 and 0.25, respectively, the probability that a given column (across its 100 rows) contains fewer than two 1s is around 1.1e-11.
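That figure can be reproduced from the binomial distribution: under the construction above, the number of 1s in a given column is Binomial(100, 0.25), so "fewer than two" means 0 or 1 successes. A short sketch follows; the variable names are mine, and the last quantity is only a union-bound upper limit for all four columns, since the column counts are not independent.

```r
n <- 100    # number of rows
p <- 1 / 4  # probability that a given row puts its 1 in a given column

# P(a given column contains 0 or 1 ones), via the binomial CDF
p_col <- pbinom(1, size = n, prob = p)

# The same probability written out term by term
p_manual <- (1 - p)^n + n * p * (1 - p)^(n - 1)

# Union bound on P(at least one of the 4 columns has fewer than two 1s)
p_any <- 4 * p_col

signif(c(p_col = p_col, p_manual = p_manual, p_any = p_any), 2)
```

Both per-column values come out at about 1.1e-11, and the bound for any of the four columns is about 4.4e-11, which is why the loop almost always exits on its first pass.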