
Grid search with LightGBM regression

I want to train a regression model using LightGBM, and the following code works fine:

import lightgbm as lgb
from sklearn.metrics import mean_absolute_error

d_train = lgb.Dataset(X_train, label=y_train)
params = {}
params['learning_rate'] = 0.1
params['boosting_type'] = 'gbdt'
params['objective'] = 'gamma'
params['metric'] = 'l1'
params['sub_feature'] = 0.5
params['num_leaves'] = 40
params['min_data'] = 50
params['max_depth'] = 30

lgb_model = lgb.train(params, d_train, 1000)

# Prediction
y_pred = lgb_model.predict(X_test)
mae_error = mean_absolute_error(y_test, y_pred)

print(mae_error)

But when I move on to GridSearchCV, I run into problems. I am not completely sure how to set this up correctly. I found useful sources, for example here, but they all seem to work with a classifier.

1st try:

import lightgbm as lgb
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import GridSearchCV
score_func = make_scorer(mean_absolute_error, greater_is_better=False)

model = lgb.LGBMClassifier( 
    boosting_type="gbdt",
    objective='regression',
    is_unbalance=True, 
    random_state=10, 
    n_estimators=50,
    num_leaves=30, 
    max_depth=8,
    feature_fraction=0.5,  
    bagging_fraction=0.8, 
    bagging_freq=15, 
    learning_rate=0.01,    
)

params_opt = {'n_estimators':range(200, 600, 80), 'num_leaves':range(20,60,10)}
gridSearchCV = GridSearchCV(estimator = model, 
    param_grid = params_opt, 
    scoring=score_func)
gridSearchCV.fit(X_train,y_train)
gridSearchCV.grid_scores_, gridSearchCV.best_params_, gridSearchCV.best_score_

This gives me a long traceback, ending with:

"ValueError: Unknown label type: 'continuous'"

UPDATE: I got the code to run by replacing LGBMClassifier with LGBMModel. Should I try LGBMRegressor too, or does it not matter? (source: https://lightgbm.readthedocs.io/en/latest/_modules/lightgbm/sklearn.html)

Helen asked Oct 29 '25 09:10


1 Answer

First of all, it is unclear what the nature of your data is, and thus which type of model fits best. You use the L1 metric, so I assume you have some sort of regression problem. If not, please correct me and explain why you use the L1 metric. If so, it is unclear why you use LGBMClassifier at all, since it is meant for classification problems (as @bakka has already pointed out). That is also why it rejects your continuous targets with "Unknown label type: 'continuous'".

Note that in practice LGBMModel behaves the same as LGBMRegressor (you can see this in the source code). However, there is no guarantee that this will remain so in the long term. So if you want to write good, maintainable code, do not use the base class LGBMModel unless you know very well what you are doing, why, and what the consequences are.

Regarding the parameter ranges: see this answer on github

Mischa Lisovyi answered Oct 31 '25 12:10


