 

How to create a 2-column binary NumPy array from a list of strings?

Input:

A string list like this:

['a', 'a', 'a', 'b', 'b', 'a', 'b']

Output I want:

A numpy array like this:

array([[ 1,  0],
       [ 1,  0],
       [ 1,  0],
       [ 0,  1],
       [ 0,  1],
       [ 1,  0],
       [ 0,  1]])

What I tried:

Try 1 - My starting data is actually stored as a single column in a CSV file, so I tried the following:

data1 = genfromtxt('csvname.csv', delimiter=',')

I did this because I thought I could manipulate the CSV data into the form I want once it was loaded into NumPy. However, every value comes back as nan (not a number). I'm not sure how else to go about this efficiently, because I need to do this for a large data set.
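The nan values most likely appear because genfromtxt parses every field as a float by default, and the letters cannot be converted. A minimal sketch of one way around this, assuming the file holds one label per line with no header (the in-memory `csv_text` below is only a stand-in for 'csvname.csv'), is to request strings with dtype=str:

```python
import io
import numpy as np

# Stand-in for 'csvname.csv': one label per line, no header (an assumption).
csv_text = io.StringIO("a\na\na\nb\nb\na\nb\n")

# dtype=str keeps the fields as text instead of converting them to float.
labels = np.genfromtxt(csv_text, delimiter=',', dtype=str)
print(labels)
```

With a real file, the file name would be passed in place of `csv_text`.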

Try 2 - The inefficient method I was thinking of using:

For each element of the list, append [1, 0] if it is 'a' and [0, 1] if it is 'b'.
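For reference, the loop described in Try 2 might look like the sketch below (the names `labels`, `rows`, and `result` are illustrative, not from my actual code):

```python
import numpy as np

labels = ['a', 'a', 'a', 'b', 'b', 'a', 'b']

rows = []
for label in labels:
    # 'a' becomes [1, 0]; 'b' becomes [0, 1]
    if label == 'a':
        rows.append([1, 0])
    elif label == 'b':
        rows.append([0, 1])

# Convert the accumulated rows to a 2-column array in one step.
result = np.array(rows)
```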

Is there a better method?

pr338 asked Jun 21 '26 09:06



2 Answers

Using List comprehension

Code:

import numpy

lst = ['a', 'a', 'a', 'b', 'b', 'a', 'b']
# Map 'a' to [1, 0] and any other value to [0, 1]
numpy.array([[1, 0] if val == "a" else [0, 1] for val in lst])

Output:

array([[1, 0],
       [1, 0],
       [1, 0],
       [0, 1],
       [0, 1],
       [1, 0],
       [0, 1]])

Note:

  • Rather than appending to a list or NumPy array in a loop, building the list in one comprehension is faster
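If the data is already in a NumPy array, the Python-level loop can be avoided altogether. The following is a sketch of an alternative that uses broadcasting instead of the comprehension above; the names `labels`, `classes`, and `one_hot` are illustrative:

```python
import numpy as np

lst = ['a', 'a', 'a', 'b', 'b', 'a', 'b']
labels = np.array(lst)

# Compare an (n, 1) column of labels against the (2,) array of classes.
# Broadcasting yields an (n, 2) boolean array; astype(int) turns it into 0/1.
classes = np.array(['a', 'b'])
one_hot = (labels[:, None] == classes).astype(int)
```

The column order follows the order of `classes`, so 'a' maps to [1, 0] and 'b' to [0, 1].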
The6thSense answered Jun 24 '26 00:06



Building List

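import numpy as np

# Named lst rather than list to avoid shadowing the built-in type
lst = ['a', 'a', 'a', 'b', 'b', 'a', 'b']
# Each row is [is_a, is_b] as booleans; astype(int) converts True/False to 1/0
np.array([[ch == 'a', ch == 'b'] for ch in lst]).astype(int)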

Output

array([[1, 0],
       [1, 0],
       [1, 0],
       [0, 1],
       [0, 1],
       [1, 0],
       [0, 1]])

Does this solve it for you?
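If the labels are not known in advance, or there are more than two of them, a possible generalization is sketched below. It uses np.unique with return_inverse=True and np.eye rather than the comparison above, and the names `classes`, `codes`, and `one_hot` are illustrative:

```python
import numpy as np

lst = ['a', 'a', 'a', 'b', 'b', 'a', 'b']

# classes: sorted distinct labels; codes: index of each element's label in classes
classes, codes = np.unique(lst, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with codes picks the matching row for every element.
one_hot = np.eye(len(classes), dtype=int)[codes]
```

The columns follow the sorted order of the distinct labels, so here column 0 is 'a' and column 1 is 'b'.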

thundergolfer answered Jun 24 '26 01:06



