
Using sparse matrix as an input to ranger package in R

Overview

To avoid memory issues, I converted my document-term matrix to a sparse matrix with the “Matrix” package, using the code below:

library(Matrix)
documentTermMatrixFrame <- Matrix(documentTermMatrixFrame, sparse = TRUE)

But when I try to use this matrix as input to the ranger() function of the “ranger” package, using the code below:

library(ranger)
trainSet <- documentTermMatrixFrame[1:750,]
testSet <- documentTermMatrixFrame[751:999,]
fit <- ranger(trainingColumnNames, data=trainSet,write.forest=TRUE)

I get the following error:

Error in as.data.frame.default(data) : 
cannot coerce class "structure("dgCMatrix", package = "Matrix")" to a data.frame

Dataset

This is a sample of the dataset I am using:

  
 
<html>
  <table style="width:100%">
    <tr>
      <th>nitemid</th>
      <th>sUnSpsc</th>
      <th>productDescription</th>
    </tr>
    <tr>
      <td>7460893</td>
      <td>26121609Network cable </td>
      <td>Category 6A, Advanced MaTriX, 4-pair, 23 AWG, U/UTP copper cable, Plenum (CMP) Rated, White, 1000ft/305m ""</td>
    </tr>
    <tr>
      <td>7460456</td>
      <td>26121709Network cable </td>
      <td>Shielded marine MUD-resistant armored copper cable, category 7 S/FTP, low smoke zero halogen (LSZH), 4-pair, conductors are 22 AWG construction with foamed PE insulation, twisted in pairs</td>
    </tr>
    <tr>
      <td>7460856</td>
      <td>26121890Inter connect cable </td>
      <td>1 PC. = 100 M 2 X 1.5 QMM, 100M SPECIAL DESIGN TO UL CLASS 2 YELLOW TPE OIL-RESISTANT AS-INTERFACE SHAPED CABLE</td>
    </tr>
  </table>
</html>

After preprocessing the descriptions in the dataset (stopword removal, punctuation removal, stemming, etc.), a document-term matrix is created, which is in turn converted to a sparse matrix.

Sample of the document-term matrix for the dataset

terms
doc   advance  category ..... ..... ....... ....... ....... twist
 1      1         1                                           0
 2      0         1                                           1
 3      0         0                                           0
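
As a small illustration of the table above, the following sketch builds the same three documents directly in sparse form with sparseMatrix() from the Matrix package. Only the three terms visible in the sample are used; the elided columns are left out rather than guessed, and the variable names (terms, doc, term, count, dtm_sparse) are made up for this example.

```r
library(Matrix)

# Only the three terms shown in the sample table; elided columns are omitted.
terms <- c("advance", "category", "twist")

# Triplet form: one entry per non-zero cell of the sample table.
doc   <- c(1, 1, 2, 2)   # row (document) index of each non-zero count
term  <- c(1, 2, 2, 3)   # column (term) index of each non-zero count
count <- c(1, 1, 1, 1)   # the counts themselves

# Document 3 has no non-zero entries, so it only appears through dims.
dtm_sparse <- sparseMatrix(i = doc, j = term, x = count,
                           dims = c(3, length(terms)),
                           dimnames = list(as.character(1:3), terms))

print(class(dtm_sparse))
print(dtm_sparse)
```

Building from triplets stores only the non-zero cells, so the full dense matrix never has to exist in memory.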

Question

How can I use a sparse matrix as input to the ranger() function?

Could anyone please help?

Thanks in advance.

BHANUMATHI H M asked Dec 01 '25 06:12



1 Answer

Since version 0.7.2, sparse matrices such as those from the Matrix package can be passed to ranger; see the discussion here. Adding to what is said in that thread: sparse matrices are now also supported in the CRAN version and do not need additional parameters, unlike the initial GitHub version.
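
A minimal sketch of what this could look like, under some assumptions: the data is made-up toy data, not the asker's dataset; the response is stored as a numeric 0/1 column named "label" (a hypothetical name) inside the sparse matrix; and the model is specified through dependent.variable.name rather than a formula, since as far as I know ranger accepts sparse input only through its non-formula interface. Treat it as an illustration, not as the only way to call ranger.

```r
library(Matrix)
library(ranger)

set.seed(1)

# Hypothetical toy data standing in for a document-term matrix:
# 100 documents, 20 binary term indicators, mostly zeros.
n_docs  <- 100
n_terms <- 20
counts <- matrix(rbinom(n_docs * n_terms, size = 1, prob = 0.1),
                 nrow = n_docs,
                 dimnames = list(NULL, paste0("term", seq_len(n_terms))))

# Hypothetical numeric 0/1 label, stored as one more column of the matrix.
label <- as.numeric(counts[, "term1"] > 0)
dtm <- Matrix(cbind(counts, label = label), sparse = TRUE)  # a dgCMatrix

trainSet <- dtm[1:75, ]
testSet  <- dtm[76:100, ]

# Name the response column instead of passing a formula.
# classification = TRUE because the response is numeric.
fit <- ranger(dependent.variable.name = "label", data = trainSet,
              classification = TRUE, num.trees = 50, write.forest = TRUE)

# Predict on the held-out rows, with the label column dropped.
pred <- predict(fit, data = testSet[, colnames(testSet) != "label"])
table(predicted = pred$predictions, actual = testSet[, "label"])
```

The row subsets stay sparse, so nothing here converts the matrix to a data.frame, which is the step that failed in the question.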

zerweck answered Dec 03 '25 23:12



