Which model is the Python equivalent of the PySpark 'GBTRegressor' model?
Brief background: I'm trying to re-create a PySpark model as a plain Python model. The existing pipeline uses GBTRegressor. I would like to know which Python model is equivalent so that I can reuse similar parameters and deploy the model in Python instead.
GBTRegressor is Spark ML's gradient-boosted trees regressor. The closest Python equivalent is scikit-learn's GradientBoostingRegressor. The following is a simple example of gradient-boosted tree regression on the diabetes dataset:
from sklearn import datasets, ensemble
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
# Load the data and hold out 10% for testing
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=13)
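# The squared-error loss was called 'ls' before scikit-learn 1.0;
# it is 'squared_error' in current releases.
params = {'n_estimators': 500,
          'max_depth': 4,
          'min_samples_split': 5,
          'learning_rate': 0.01,
          'loss': 'squared_error'}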
# Fit the model
reg = ensemble.GradientBoostingRegressor(**params)
reg.fit(X_train, y_train)
# Evaluate on the held-out test set
mse = mean_squared_error(y_test, reg.predict(X_test))
print("The mean squared error (MSE) on test set: {:.4f}".format(mse))
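Since the question asks about reusing similar parameters, here is a minimal sketch of how the PySpark GBTRegressor settings could be translated into scikit-learn keyword arguments. The helper name translate_gbt_params and the mapping tables are hypothetical, written only for illustration, and the correspondence is approximate: Spark bins continuous features (maxBins) while GradientBoostingRegressor does not, so the two models will generally not give identical predictions even with matching settings.

```python
# Hypothetical lookup tables: approximate correspondence between
# pyspark.ml.regression.GBTRegressor params and the keyword arguments
# of sklearn.ensemble.GradientBoostingRegressor.
SPARK_TO_SKLEARN = {
    "maxIter": "n_estimators",                  # number of boosting rounds
    "maxDepth": "max_depth",                    # depth of each tree
    "stepSize": "learning_rate",                # shrinkage per round
    "subsamplingRate": "subsample",             # row fraction per tree
    "minInstancesPerNode": "min_samples_leaf",  # minimum rows in each child
    "seed": "random_state",
}

# Spark's lossType values mapped to scikit-learn (>= 1.0) loss names.
LOSS_MAP = {"squared": "squared_error", "absolute": "absolute_error"}


def translate_gbt_params(spark_params):
    """Return scikit-learn kwargs for the Spark params that have a counterpart.

    Params without an entry in the tables above (for example maxBins)
    are silently skipped.
    """
    sklearn_params = {}
    for name, value in spark_params.items():
        if name == "lossType":
            sklearn_params["loss"] = LOSS_MAP[value]
        elif name in SPARK_TO_SKLEARN:
            sklearn_params[SPARK_TO_SKLEARN[name]] = value
    return sklearn_params


# Example input: values as they might be read from the existing Spark model.
spark_params = {
    "maxIter": 20,
    "maxDepth": 5,
    "stepSize": 0.1,
    "subsamplingRate": 1.0,
    "minInstancesPerNode": 1,
    "lossType": "squared",
    "maxBins": 32,  # no direct counterpart, so it is dropped
    "seed": 42,
}
sklearn_params = translate_gbt_params(spark_params)
print(sklearn_params)
```

The resulting dictionary can be passed the same way as params in the example above, i.e. ensemble.GradientBoostingRegressor(**sklearn_params). It is worth comparing predictions from both models on a shared validation set before switching the deployment over.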