
Python equivalent for PySpark model

Which model is the Python equivalent of the PySpark 'GBTRegressor' model?

Brief background: I'm trying to re-create a PySpark model as a plain Python model. The existing pipeline uses GBTRegressor. I would like to know which Python model is equivalent so that I can use similar parameters and deploy the model in Python instead.

Prabhat Mangina asked Mar 06 '26 06:03



1 Answer

GBTRegressor is PySpark's gradient-boosted trees model for regression. To implement it in plain Python, you can use scikit-learn's GradientBoostingRegressor. The following is a simple example of gradient-boosted tree regression on the diabetes dataset:

from sklearn import datasets, ensemble
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load the diabetes dataset
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=13)

# 'ls' was renamed to 'squared_error' in scikit-learn 1.0 and removed in 1.2
params = {'n_estimators': 500,
          'max_depth': 4,
          'min_samples_split': 5,
          'learning_rate': 0.01,
          'loss': 'squared_error'}

# Fit the model on the training split
reg = ensemble.GradientBoostingRegressor(**params)
reg.fit(X_train, y_train)

mse = mean_squared_error(y_test, reg.predict(X_test))
print("The mean squared error (MSE) on test set: {:.4f}".format(mse))
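
Since the question asks about reusing similar parameters, here is a rough sketch of how the PySpark GBTRegressor parameter names could be translated to scikit-learn keyword arguments. The helper spark_gbt_params_to_sklearn and the two lookup tables are hypothetical names made up for illustration, and the mapping is my reading of the two APIs rather than an official equivalence. Parameters with no obvious counterpart (for example maxBins, since Spark bins continuous features) are simply skipped, so predictions will probably not match exactly.

```python
# Hypothetical lookup: PySpark GBTRegressor param -> scikit-learn keyword.
# This is an approximate correspondence, not an official one.
SPARK_TO_SKLEARN = {
    "maxIter": "n_estimators",            # number of boosting iterations / trees
    "maxDepth": "max_depth",              # maximum depth of each tree
    "stepSize": "learning_rate",          # shrinkage applied to each tree
    "subsamplingRate": "subsample",       # fraction of rows used per tree
    "minInstancesPerNode": "min_samples_leaf",
    "seed": "random_state",
}

# Spark's lossType values and their scikit-learn (>= 1.0) names.
SPARK_LOSS_TO_SKLEARN = {
    "squared": "squared_error",
    "absolute": "absolute_error",
}


def spark_gbt_params_to_sklearn(spark_params):
    """Translate a dict of Spark GBTRegressor params to sklearn kwargs.

    Parameters without a listed counterpart (e.g. maxBins) are dropped.
    """
    sk_params = {}
    for name, value in spark_params.items():
        if name == "lossType":
            sk_params["loss"] = SPARK_LOSS_TO_SKLEARN[value]
        elif name in SPARK_TO_SKLEARN:
            sk_params[SPARK_TO_SKLEARN[name]] = value
    return sk_params


# Example input: values chosen to mirror Spark's documented defaults.
spark_params = {
    "maxIter": 20,
    "maxDepth": 5,
    "stepSize": 0.1,
    "subsamplingRate": 1.0,
    "lossType": "squared",
    "maxBins": 32,  # no direct counterpart here, so it is skipped
}

sk_params = spark_gbt_params_to_sklearn(spark_params)
print(sk_params)
```

The resulting dictionary can be passed straight to the regressor, as in ensemble.GradientBoostingRegressor(**sk_params), and then checked against the Spark model's predictions on a held-out set.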
Raha Moosavi answered Mar 08 '26 20:03
