I have a numeric vector train that I use in the training data set for a model. Assume I want to cut it into 5 bins. I know I can do that with cut(train, 5) from CategoricalArrays.jl. How can I apply the same binning to the test vector from the model's test data set?
Perhaps there is a better solution, but this would work: compute the quantile breaks from train once, then pass those same breaks to cut for both vectors:
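using CategoricalArrays, Statistics
nbins = 5
# Interior cut points (the 20%, 40%, 60% and 80% quantiles), computed from train only
breaks = quantile(train, (1:nbins-1) ./ nbins)
binlabels = string.("BIN_", 1:nbins)
# extend=true adds outer breaks at the data's minimum and maximum where needed
cat_train = cut(train, breaks; extend=true, labels=binlabels)
cat_test = cut(test, breaks; extend=true, labels=binlabels)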
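One caveat, as far as I understand the extend keyword: it adds the outer breaks from the minimum and maximum of the vector being cut, so for test they depend on test itself. If test has no value below the first break or above the last one, fewer than 5 intervals result and the 5 labels should no longer match. A possible variant, sketched below under that assumption, fixes the outer edges at -Inf and Inf so both vectors always get the same 5 bins. The train and test values here are made up purely for illustration, and binlabels and inner are just names chosen for this sketch.

```julia
using CategoricalArrays, Statistics

# Hypothetical sample data, for illustration only.
train = collect(1.0:100.0)
test = [25.0, 30.0, 45.0, 50.0]   # nothing falls in the lowest or highest training bin

nbins = 5
# Interior cut points learned from the training data only.
inner = quantile(train, (1:nbins-1) ./ nbins)
# Fixed outer edges: nbins + 1 breaks define nbins intervals covering every real value.
breaks = [-Inf; inner; Inf]
binlabels = string.("BIN_", 1:nbins)

# No extend needed: the breaks already cover any value in either vector.
cat_train = cut(train, breaks; labels=binlabels)
cat_test = cut(test, breaks; labels=binlabels)
```

Because the breaks no longer depend on the vector being cut, a test value outside the training range simply lands in the first or last bin instead of shifting the bin edges.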