I have a dataframe named df as follows:
Genes ID Type
CFH MB-0002 Gain
CFHR3 MB-0002 Gain
DEFB131 MB-0003 Gain
UNC93B5 MB-0003 Loss
CCDC125 MB-0004 Loss
CCNB1 MB-0002 Gain
CFH MB-0004 Loss
CCNB1 MB-0003 Gain
I want to build a matrix, say Mat, and write it into a csv file where I will have the Genes as rows and the IDs as columns. I want to put:
1 if the corresponding type is Gain,
-1 if the corresponding type is Loss,
0 in all other places. An example of my matrix would be:
MB-0002 MB-0003 MB-0004
CFH 1 0 -1
CFHR3 1 0 0
DEFB131 0 1 0
UNC93B5 0 -1 0
CCDC125 0 0 -1
CCNB1 1 1 0
Try:
xtabs(c(1L, -1L)[Type] ~ ., data=df)
# ID
#Genes MB-0002 MB-0003 MB-0004
# CCDC125 0 0 -1
# CCNB1 1 1 0
# CFH 1 0 -1
# CFHR3 1 0 0
# DEFB131 0 1 0
# UNC93B5 0 -1 0
xtabs() is similar to table() except that it can take a variable containing the frequency counts for each combination of levels. You can convert the result back to a data frame with as.data.frame(), which returns it in long format (one row per Genes/ID combination).
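To cover the "write it into a csv file" part of the question, here is a minimal sketch. It assumes df is built as in the question with Type stored as a factor (levels Gain, Loss); the object names long, wide and out_file, and the file name Mat.csv, are chosen here for illustration only. as.data.frame.matrix() is used for the wide layout because plain as.data.frame() on a table gives the long one.

```r
# Rebuild the example data from the question.
# Assumption: Type is a factor with levels in the order Gain, Loss.
df <- data.frame(
  Genes = c("CFH", "CFHR3", "DEFB131", "UNC93B5",
            "CCDC125", "CCNB1", "CFH", "CCNB1"),
  ID    = c("MB-0002", "MB-0002", "MB-0003", "MB-0003",
            "MB-0004", "MB-0002", "MB-0004", "MB-0003"),
  Type  = factor(c("Gain", "Gain", "Gain", "Loss",
                   "Loss", "Gain", "Loss", "Gain"),
                 levels = c("Gain", "Loss"))
)

# Same table as in the answer, with the right-hand side spelled out
Mat <- xtabs(c(1L, -1L)[Type] ~ Genes + ID, data = df)

# Long format: one row per Genes/ID pair, values in a column named Freq
long <- as.data.frame(Mat)

# Wide format: Genes as row names, one column per ID
wide <- as.data.frame.matrix(Mat)

# Write the wide matrix to a CSV file (hypothetical location and name)
out_file <- file.path(tempdir(), "Mat.csv")
write.csv(wide, out_file)
```

Note that xtabs() orders the rows by factor level (alphabetically here), so the row order differs from the example in the question even though the values are the same.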
The left-hand side of the formula gives the "counts" (in this case the values that the contingency table is to be populated with). It uses a known trick to convert a factor to a numeric vector using indexing (see ?factor), so Type must be a factor rather than a character column. The . on the right-hand side is a shortcut for "the rest of the variables in the data frame", which in this case is equivalent to Genes + ID.
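The indexing trick can be seen in isolation with a small sketch. The vectors Type, scores, lookup and scores2 below are made up for illustration. Since R 4.0.0, data.frame() and read.csv() no longer convert strings to factors by default, so a name-based lookup is shown as one possible alternative for character columns.

```r
# A small factor with levels in the order Gain, Loss
Type <- factor(c("Gain", "Loss", "Loss", "Gain"),
               levels = c("Gain", "Loss"))

# Indexing by a factor uses its integer codes (Gain = 1, Loss = 2),
# so each Gain picks the first element and each Loss the second
scores <- c(1L, -1L)[Type]
scores
# [1]  1 -1 -1  1

# The same index as a character vector finds no matching names
# and yields NA for every element
c(1L, -1L)[as.character(Type)]

# A named lookup vector works for character data
lookup <- c(Gain = 1L, Loss = -1L)
scores2 <- unname(lookup[as.character(Type)])
```

If Type arrives as a character column, wrapping it as factor(Type, levels = c("Gain", "Loss")) before calling xtabs() should give the same result as the lookup.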