 

Is there a way to save the preprocessing objects in scikit-learn? [duplicate]

I am building a neural net to make predictions on new data in the future. I first preprocess the training data using sklearn.preprocessing, then train the model, make some predictions, and close the program. In the future, when new data comes in, I have to use the same preprocessing scales to transform the new data before feeding it into the model. Currently, I have to load all of the old data, fit the preprocessor, and then transform the new data with that fitted preprocessor. Is there a way for me to save the preprocessing objects (like sklearn.preprocessing.StandardScaler) so that I can just load the old objects rather than having to remake them?
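For context, a simplified sketch of my current workflow (variable names like X_train and X_new are just placeholders):

from sklearn.preprocessing import StandardScaler

# At training time: fit the scaler on the training data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
# ... train the model on X_train_scaled, then the program exits ...

# Later, for new data: I currently have to reload the old data and refit
scaler = StandardScaler()
scaler.fit(X_train)  # requires keeping all of the old training data around
X_new_scaled = scaler.transform(X_new)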

asked Jan 20 '26 by user1367204

1 Answer

I think that besides pickle, you can also use joblib to do this. As stated in scikit-learn's manual, 3.4. Model persistence:

In the specific case of scikit-learn, it may be better to use joblib’s replacement of pickle (dump & load), which is more efficient on objects that carry large numpy arrays internally as is often the case for fitted scikit-learn estimators, but can only pickle to the disk and not to a string:

from joblib import dump, load
dump(clf, 'filename.joblib') 

Later you can load back the pickled model (possibly in another Python process) with:

clf = load('filename.joblib') 
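The same approach works for preprocessing objects such as StandardScaler, not just fitted estimators. A minimal sketch, assuming the scaler was fitted on X_train and the file name is just an example:

from joblib import dump, load
from sklearn.preprocessing import StandardScaler

# Fit once on the training data and persist the fitted scaler to disk
scaler = StandardScaler().fit(X_train)
dump(scaler, 'scaler.joblib')

# In a later session: reload and transform new data with the same scale
scaler = load('scaler.joblib')
X_new_scaled = scaler.transform(X_new)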

For more information, refer to these related posts: Saving StandardScaler() model for use on new datasets and Save MinMaxScaler model in sklearn.

answered Jan 23 '26 by sikisis


