How to check if my data is one-hot encoded

Question

If I have a data matrix, how do I check whether the categorical variables have been one-hot encoded? I need to use LIME to explain my prediction, and I read that LIME works only with category labels rather than one-hot encoded columns. I found code to convert the data back, but it works only if the data has actually been one-hot encoded; otherwise the columns are turned into NaNs.

So I need a piece of code that looks at a NumPy array of data and tells me whether it has been one-hot encoded.

Michael · Accepted Answer

You can sum along each row and check that every row sum equals 1, as in the following example:

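import numpy as np

X = np.array(
    [
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0],
        [1, 0, 0]
    ]
)
# One-hot: every entry is 0 or 1, and every row sums to exactly 1.
# Each row is checked separately so that errors cannot cancel across rows.
is_one_hot = np.isin(X, [0, 1]).all() and (X.sum(axis=1) == 1).all()
print(f'X is one-hot-encoded: {is_one_hot}')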

Result:

X is one-hot-encoded: True
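
If the goal is to feed category labels to LIME, the same idea can be wrapped in a reusable check and paired with a decoding step. The following is only a sketch: the function names `is_one_hot` and `to_labels` are made up for illustration, and it assumes the array passed in contains just the columns of one encoded variable (a full data matrix with extra numeric columns would not pass). The decoded labels are column positions, not the original category names.

```python
import numpy as np


def is_one_hot(block):
    """Return True if every entry is 0 or 1 and each row contains exactly one 1."""
    block = np.asarray(block)
    if block.ndim != 2 or block.size == 0:
        return False
    only_binary = np.isin(block, [0, 1]).all()
    one_per_row = (block.sum(axis=1) == 1).all()
    return bool(only_binary and one_per_row)


def to_labels(block):
    """Map a one-hot block to integer labels (the column index of the 1 in each row)."""
    if not is_one_hot(block):
        raise ValueError("block is not one-hot encoded")
    return np.asarray(block).argmax(axis=1)


X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]])
print(is_one_hot(X))   # True
print(to_labels(X))    # [0 1 2 1 0]

# Rows that sum to 2 and 0 are rejected, even though the total across rows looks right.
bad = np.array([[1, 1, 0], [0, 0, 0]])
print(is_one_hot(bad))  # False
```

To test a wider matrix, one option is to call `is_one_hot` on each group of columns that is suspected to encode a single variable, for example `is_one_hot(data[:, 3:6])`, where the slice is whatever column range applies to your data.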

Tags:

pandas

machine-learning

numpy

deep-learning

scikit-learn

vishak bharadwaj
