If I have a data matrix, how do I check if the categorical variables have been one-hot encoded or not? I need to use LIME to explain my prediction, and I read that LIME works only if you have category labels instead of one-hot encoded columns. I found code to convert it, but it works only if it has been encoded otherwise the columns get turned to NaNs.
So I need e piece of code that looks at a numpy array with data and tells me if it has been one hot encoded or not.
You can sum all the rows, and see if you get a all 1's array, as in the following example:
Example:
X = np.array(
[
[1, 0, 0],
[0, 1, 0],
[0, 0, 1],
[0, 1, 0],
[1, 0, 0]
]
)
print(f'X is one-hot-encoded: {(X.sum(axis=1)-np.ones(X.shape[0])).sum()==0}')
Result:
X is one-hot-encoded: True
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With