
How to convert string labels to numeric values

I have a CSV file (delimiter: `,`) containing the following fields:

filename  labels
xyz.png   cat
pqz.png   dog
abc.png   mouse

There is a list containing all the classes:

data_classes = ["cat", "dog", "mouse"]

Question: How do I replace the string labels in the CSV with the index of each label in data_classes (i.e. if the label is cat, it should change to 0) and save the result to a CSV file?

T T asked Sep 18 '25 22:09



1 Answer

Assuming that all classes are present in your list, you can do this with apply, passing the list's index method so that each label is replaced by its ordinal position in the list:

In[5]:
df['labels'].apply(data_classes.index)

Out[5]: 
0    0
1    1
2    2
Name: labels, dtype: int64
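
For anyone who wants to run the apply approach outside an interactive session, here is a minimal self-contained sketch. It assumes the column names filename and labels from the question and builds the DataFrame inline instead of reading the file; the names indices and raised are illustrative only. It also shows why the "all classes are present" assumption matters: list.index raises ValueError for a label that is not in the list.

```python
import pandas as pd

data_classes = ["cat", "dog", "mouse"]

# Stand-in for the CSV contents shown in the question.
df = pd.DataFrame({
    "filename": ["xyz.png", "pqz.png", "abc.png"],
    "labels": ["cat", "dog", "mouse"],
})

# list.index returns the position of each label in data_classes.
indices = df["labels"].apply(data_classes.index)

# A label that is missing from the list makes list.index raise ValueError.
try:
    pd.Series(["horse"]).apply(data_classes.index)
    raised = False
except ValueError:
    raised = True
```

Note that apply returns a new Series; assign it back with df['labels'] = ... if the column itself should change.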

However, IMO it should be faster to define a dict of your mapping and pass it to map, as map is cythonised:

In[7]:
d = dict(zip(data_classes, range(len(data_classes))))
d

Out[7]: {'cat': 0, 'dog': 1, 'mouse': 2}

In[8]:
df['labels'].map(d, na_action='ignore')

Out[8]: 
0    0
1    1
2    2
Name: labels, dtype: int64

If there are classes not present in the dict, NaN is returned for those rows (and the resulting column becomes float64 rather than int64).
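
The snippets above only compute the mapped values; they do not assign them back or save anything. A minimal sketch of the whole round trip follows, assuming the column names filename and labels from the question. The io.StringIO buffers stand in for real file paths, and the names csv_text, mapping, out and result are illustrative, not part of the original question or answer.

```python
import io

import pandas as pd

data_classes = ["cat", "dog", "mouse"]

# Stand-in for the CSV file; pass a real path to read_csv in practice.
csv_text = "filename,labels\nxyz.png,cat\npqz.png,dog\nabc.png,mouse\n"
df = pd.read_csv(io.StringIO(csv_text))

# Build the label -> index mapping from the list itself.
mapping = {label: i for i, label in enumerate(data_classes)}

# Assign the result back so the column is actually replaced.
df["labels"] = df["labels"].map(mapping)

# Labels missing from the mapping come back as NaN; fail loudly on them.
if df["labels"].isna().any():
    raise ValueError("found labels that are not in data_classes")

# With no NaN left, the column can safely be stored as integers.
df["labels"] = df["labels"].astype(int)

# Written to a buffer here; pass a file path to to_csv to save on disk.
out = io.StringIO()
df.to_csv(out, index=False)
result = out.getvalue()
```

Passing index=False keeps the DataFrame's row index out of the saved file, so the output has the same two columns as the input.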

EdChum answered Sep 20 '25 12:09
