I am building a neural net to make predictions on new data in the future. I first preprocess the training data using sklearn.preprocessing, then train the model, make some predictions, and close the program. In the future, when new data comes in, I have to use the same preprocessing scales to transform the new data before feeding it into the model. Currently, I have to load all of the old data, fit the preprocessors, and then transform the new data with them. Is there a way to save the preprocessing objects (like sklearn.preprocessing.StandardScaler) so that I can just load them later rather than having to remake them?
I think that besides pickle, you can also use joblib to do this. As stated in scikit-learn's manual, 3.4. Model persistence:
In the specific case of scikit-learn, it may be better to use joblib’s replacement of pickle (dump & load), which is more efficient on objects that carry large numpy arrays internally as is often the case for fitted scikit-learn estimators, but can only pickle to the disk and not to a string:
from joblib import dump, load
dump(clf, 'filename.joblib')
Later you can load back the pickled model (possibly in another Python process) with:
clf = load('filename.joblib')
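Applied to the scaler from the question, a minimal sketch might look like this (the file name and the training data here are only illustrative):

import numpy as np
from joblib import dump, load
from sklearn.preprocessing import StandardScaler

# Fit the scaler once on the original training data, then persist it.
X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # illustrative data
scaler = StandardScaler().fit(X_train)
dump(scaler, 'scaler.joblib')

# Later (possibly in another Python process): load the fitted scaler and
# transform the new data using the original training statistics.
scaler = load('scaler.joblib')
X_new = np.array([[1.5, 250.0]])
X_new_scaled = scaler.transform(X_new)

Because the fitted mean and standard deviation are stored inside the scaler object, you no longer need the old training data at prediction time.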
For more information, refer to these related posts: Saving StandardScaler() model for use on new datasets, and Save MinMaxScaler model in sklearn.
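If you prefer the standard library, plain pickle works the same way for a fitted scaler. A hedged sketch (file name and data are illustrative):

import pickle
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # illustrative data
scaler = StandardScaler().fit(X_train)

# Save the fitted scaler to disk.
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

# Later: load it back and transform new data.
with open('scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)
X_new_scaled = scaler.transform(np.array([[1.5, 250.0]]))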