I am using the xgboost function in R, and I get the following error message:
bst <- xgboost(data = germanvar, label = train$Creditability, max.depth = 2, eta = 1,nround = 2, objective = "binary:logistic")
Error in xgb.get.DMatrix(data, label, missing, weight) :
xgboost only support numerical matrix input,
use 'data.matrix' to transform the data.
In addition: Warning message:
In xgb.get.DMatrix(data, label, missing, weight) :
xgboost: label will be ignored.
Here is my full code:
credit<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE)
library(caret)
set.seed(1000)
intrain<-createDataPartition(y=credit$Creditability, p=0.7, list=FALSE)
train<-credit[intrain, ]
test<-credit[-intrain, ]
germanvar<-train[,2:21]
str(germanvar)
bst <- xgboost(data = germanvar, label = train$Creditability, max.depth = 2, eta = 1,
nround = 2, objective = "binary:logistic")
The data contains a mixture of continuous and categorical variables.
Because the error message says that only numerical input is supported, I had all the variables treated as continuous, but the error message still appears.
How can I solve this problem?
If your categorical variables are represented as numbers, that is not an ideal representation, but with deep enough trees you can get away with it, because the trees will eventually partition the values. I don't prefer that approach, but it keeps your column count minimal and can succeed given the right setup.
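If you would rather not treat the category codes as ordered numbers, one common alternative is to one-hot encode them before training. The sketch below uses base R's model.matrix on a small made-up data frame; the column names amount and purpose are hypothetical and are not taken from the German credit data.

```r
# Toy data frame; column names are made up for illustration
df <- data.frame(
  amount  = c(1000, 2500, 1800, 4000),
  purpose = c(1L, 3L, 2L, 3L)   # categorical codes stored as integers
)

# Tell R the codes are categories, not quantities
df$purpose <- factor(df$purpose)

# "- 1" drops the intercept, so every level of the factor
# gets its own indicator column
X <- model.matrix(~ . - 1, data = df)

print(colnames(X))
print(typeof(X))
```

The result is a plain numeric matrix with one column per category level, which is the kind of input the xgboost call expects. The trade-off is a wider matrix.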
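Note that xgboost takes a numeric matrix as data and a numeric vector as label. In your code germanvar is a data frame, not a matrix, which is why the error tells you to use 'data.matrix'.
Numeric here means double, NOT integer :)
The following code trains with the inputs cast properly: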
credit<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE)
library(caret)
set.seed(1000)
intrain<-createDataPartition(y=credit$Creditability, p=0.7, list=FALSE)
train<-credit[intrain, ]
test<-credit[-intrain, ]
germanvar<-train[,2:21]
label <- as.numeric(train$Creditability) ## make it a numeric NOT integer
data <- as.matrix(germanvar) # to matrix
mode(data) <- 'double' # to numeric i.e double precision
bst <- xgboost(data = data, label = label, max.depth = 2, eta = 1,
nround = 2, objective = "binary:logistic")
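To see what the casts above actually change, here is a minimal sketch on a made-up integer data frame (the names df, m and y are illustrative only). It uses only base R, so it runs without xgboost installed.

```r
# Toy stand-in for germanvar: every column is read as integer
df <- data.frame(a = c(1L, 2L, 3L), b = c(0L, 1L, 0L))

m <- as.matrix(df)
type_before <- typeof(m)        # storage type right after as.matrix

storage.mode(m) <- "double"     # same effect as mode(m) <- 'double' here
type_after <- typeof(m)

y <- as.numeric(c(1L, 0L, 1L))  # integer label converted to double

print(c(before = type_before, after = type_after, label = typeof(y)))
```

Checking typeof() on the matrix and the label before calling xgboost is a cheap way to confirm both are stored as double.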