
Generate a string representation of a one-hot encoding

In Python, I need to generate a dict that maps a letter to a pre-defined "one-hot" representation of that letter. By way of illustration, the dict should look like this:

{ 'A': '1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0',
  'B': '0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0', # ...
}

There is one bit (represented as a character) per letter of the alphabet. Hence each string will contain 25 zeros and one 1. The position of the 1 is determined by the position of the corresponding letter in the alphabet.

I came up with some code that generates this:

# Character set is explicitly specified for fine grained control
_letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
n = len(_letters)
one_hot = [' '.join(['0']*a + ['1'] + ['0']*b)
            for a, b in zip(range(n), range(n-1, -1, -1))]
outputs = dict(zip(_letters, one_hot))

Is there a more efficient/cleaner/more pythonic way to do the same thing?

E.M. asked Dec 18 '25 00:12


1 Answer

I find this to be more readable:

from string import ascii_uppercase

one_hot = {}
for i, letter in enumerate(ascii_uppercase):
    bits = ['0']*26   # start with all zeros
    bits[i] = '1'     # set the bit at this letter's position
    one_hot[letter] = ' '.join(bits)

If you need a more general alphabet, just enumerate over a string of the characters, and replace ['0']*26 with ['0']*len(alphabet).
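
As a rough sketch of that generalization, the same idea can be wrapped in a function that takes any alphabet. The name one_hot_strings and its alphabet parameter are illustrative choices, not part of the original code, and the sketch assumes the characters in the alphabet are unique:

```python
def one_hot_strings(alphabet):
    """Map each character of `alphabet` to a space-separated one-hot string.

    Hypothetical helper: assumes the characters in `alphabet` are unique.
    """
    n = len(alphabet)
    return {
        # '1' at the character's own position, '0' everywhere else
        ch: ' '.join('1' if j == i else '0' for j in range(n))
        for i, ch in enumerate(alphabet)
    }


encoding = one_hot_strings("ABC")
print(encoding["B"])  # 0 1 0
```

Passing string.ascii_uppercase instead of "ABC" should give the 26-letter dict from the question.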

Autoplectic answered Dec 19 '25 14:12


